fix: cvedb metric refactoring #4955

Open
wants to merge 2 commits into
base: main

22f1001635
Contributor

This PR addresses the issue of handling unknown values in metrics by ensuring the UNKNOWN metric is properly initialized and validated in the CVE database. Key changes include:

✔️ Added UNKNOWN_METRIC_ID to populate_metrics to ensure the "UNKNOWN" metric is inserted into the metrics table.
✔️ Enhanced schema validation in latest_schema to check for the existence of the UNKNOWN metric, triggering a refresh if missing.
✔️ Updated metrics during refresh to ensure metrics are re-populated if the schema is outdated.

These changes ensure that the database consistently handles unknown values and maintains up-to-date metrics, improving reliability and correctness.

Fixes #4812

@jloehel

jloehel commented Mar 19, 2025

@22f1001635 What do you think about creating a constant for the metrics data, used in populate_metrics:

data = [
    (UNKNOWN_METRIC_ID, "UNKNOWN"),
    (EPSS_METRIC_ID, "EPSS"),
    (CVSS_2_METRIC_ID, "CVSS-2"),
    (CVSS_3_METRIC_ID, "CVSS-3"),
]

like:

METRICS = [
    (UNKNOWN_METRIC_ID, "UNKNOWN"),
    (EPSS_METRIC_ID, "EPSS"),
    (CVSS_2_METRIC_ID, "CVSS-2"),
    (CVSS_3_METRIC_ID, "CVSS-3"),
]

and use the new constant in populate_metrics and in latest_schema for iteration like that:

        # Check for metrics table data
        if table_name == "metrics":
            for metrics_id, metrics_name in METRICS:
                result = cursor.execute(
                    "SELECT * FROM metrics WHERE metrics_id=? AND metrics_name=?",
                    (metrics_id, metrics_name),
                )
                if not result.fetchone():
                    schema_latest = False

This has the benefit that it becomes simpler to add further entries to METRICS in the future, such as SSVC.
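To illustrate the suggestion above, here is a minimal standalone sketch of a shared METRICS constant driving both population and validation. It runs against an in-memory SQLite database rather than the project's real one. The numeric ID values, the two-column table layout, and the helper name metrics_up_to_date are assumptions made for the example, not taken from the PR.

```python
import sqlite3

# Placeholder IDs: the real values are defined in the project and assumed here.
UNKNOWN_METRIC_ID = 0
EPSS_METRIC_ID = 1
CVSS_2_METRIC_ID = 2
CVSS_3_METRIC_ID = 3

# Single source of truth for the rows expected in the metrics table.
METRICS = [
    (UNKNOWN_METRIC_ID, "UNKNOWN"),
    (EPSS_METRIC_ID, "EPSS"),
    (CVSS_2_METRIC_ID, "CVSS-2"),
    (CVSS_3_METRIC_ID, "CVSS-3"),
]


def populate_metrics(conn):
    """Insert (or overwrite) every row listed in METRICS."""
    conn.executemany(
        "INSERT OR REPLACE INTO metrics (metrics_id, metrics_name) VALUES (?, ?)",
        METRICS,
    )
    conn.commit()


def metrics_up_to_date(conn):
    """Hypothetical helper: True only if every METRICS row is present."""
    for metrics_id, metrics_name in METRICS:
        row = conn.execute(
            "SELECT 1 FROM metrics WHERE metrics_id=? AND metrics_name=?",
            (metrics_id, metrics_name),
        ).fetchone()
        if row is None:
            return False
    return True


conn = sqlite3.connect(":memory:")
# Assumed table layout for the sketch.
conn.execute(
    "CREATE TABLE metrics (metrics_id INTEGER PRIMARY KEY, metrics_name TEXT)"
)
populate_metrics(conn)
assert metrics_up_to_date(conn)
```

With this shape, adding a new metric would only require one more tuple in METRICS; both functions pick it up without further changes.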

@22f1001635
Contributor Author

Hi @jloehel, I have made the required changes. Can you take a look and let me know?

@jloehel

jloehel commented Mar 20, 2025

> Hi @jloehel , I have made the required changes. Can you take a look and let me know

Hi :-) Thanks. I read the code again, and I am not sure the condition is in the right place: self.refresh_cache_and_update_db() is still only executed if the db does not exist, the db is older than 24 hours, or latest_schema does not match. I think the condition needs to go here:

if (
    not self.latest_schema(
        "cve_severity", self.TABLE_SCHEMAS["cve_severity"]
    )
    or not self.latest_schema("cve_range", self.TABLE_SCHEMAS["cve_range"])
    or not self.latest_schema(
        "cve_exploited", self.TABLE_SCHEMAS["cve_exploited"]
    )
):

... and you don't need to call populate_metrics again; it is already called in populate_db. Only the condition that decides when the database gets updated needs to be modified. Sorry, I should have checked this earlier.
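As a rough sketch of the decision being discussed, the refresh trigger can be written as a pure function with the metrics check sitting alongside the schema checks. The function name needs_refresh, its parameters, and the MAX_DB_AGE constant are hypothetical and exist only for this illustration; the real code calls methods on the database object instead.

```python
from datetime import timedelta

# Assumed staleness threshold, matching the 24 hours mentioned in this thread.
MAX_DB_AGE = timedelta(hours=24)


def needs_refresh(db_exists, db_age, schema_checks, metrics_ok):
    """Hypothetical helper: decide whether the database must be refreshed.

    db_exists     -- whether the database file is present
    db_age        -- timedelta since the last update
    schema_checks -- iterable of booleans, one per validated table schema
    metrics_ok    -- whether all expected metrics rows are present
    """
    if not db_exists or db_age > MAX_DB_AGE:
        return True
    if not all(schema_checks):
        return True
    # The extra condition: a fresh, schema-valid database that lacks an
    # expected metrics row still needs a refresh.
    return not metrics_ok
```

The point of the last line is that a missing metric alone is enough to trigger the update, without any second call to populate the table.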

@22f1001635 force-pushed the cvedb-metric-refactoring branch from 55554fa to 6b08bef on March 20, 2025 19:19
@22f1001635
Contributor Author

@jloehel made changes; take a look and let me know if anything else is needed

@jloehel

jloehel commented Mar 25, 2025

@22f1001635 What do you think about a test case for this scenario?

  • Database exists, is not older than 24 hours and the schemas are all correct, but the UNKNOWN_METRIC is missing.
  • Update the database
  • Check if the UNKNOWN_METRIC exists after the update
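The scenario above could be sketched as a self-contained test along these lines. It uses an in-memory SQLite database and a stand-in update_db function, both of which are assumptions for illustration; a real test would exercise the project's own database class and update path. The ID values and table layout are placeholders as well.

```python
import sqlite3

UNKNOWN_METRIC_ID = 0  # assumed placeholder value
METRICS = [
    (UNKNOWN_METRIC_ID, "UNKNOWN"),
    (1, "EPSS"),
    (2, "CVSS-2"),
    (3, "CVSS-3"),
]


def has_metric(conn, metrics_id, metrics_name):
    """Return True if the given metrics row exists."""
    row = conn.execute(
        "SELECT 1 FROM metrics WHERE metrics_id=? AND metrics_name=?",
        (metrics_id, metrics_name),
    ).fetchone()
    return row is not None


def update_db(conn):
    """Stand-in for the real update: re-populate when any metric is missing."""
    if not all(has_metric(conn, mid, name) for mid, name in METRICS):
        conn.executemany("INSERT OR REPLACE INTO metrics VALUES (?, ?)", METRICS)
        conn.commit()


def test_unknown_metric_restored_on_update():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metrics (metrics_id INTEGER PRIMARY KEY, metrics_name TEXT)"
    )
    # Step 1: database exists with a valid schema, but UNKNOWN is missing.
    conn.executemany("INSERT INTO metrics VALUES (?, ?)", METRICS[1:])
    assert not has_metric(conn, UNKNOWN_METRIC_ID, "UNKNOWN")
    # Step 2: update the database.
    update_db(conn)
    # Step 3: UNKNOWN must exist after the update.
    assert has_metric(conn, UNKNOWN_METRIC_ID, "UNKNOWN")
```

Written as a plain function with bare assertions, the test can be collected by pytest or called directly.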

2 participants