Issue with query fetching vulnerabilities for a large sbom and large number of advisories #1129
Comments
thx for this - on my plate
@JimFuller-RedHat Great. I managed to get past the error by adding more resources to the postgres container, but the query is still really slow. I think it can be easily improved by eliminating
Here's the zipped SBOM for reference. It might not be the best example, but we can test with others as well.
FWIW - I was able to reproduce ...
temporarily setting in postgres:
I can force the indexes to be used and it's faster ... the reason this query plan comes up with seqscans is the STATISTICS setting on the various tables ... I think in this case we can ensure a good query plan if we tweak the stats on some of the larger tables.
This helps some. The query time goes from
to
But that still seems like a lot. Here's the new query plan
I think it would be greatly beneficial to filter for statuses with appropriate CPEs early in the query. At the moment we do this in code after the query: https://github.com/trustification/trustify/blob/main/modules/fundamental/src/sbom/model/details.rs#L225. So even after running this expensive query, we can end up with 0 results. If we coded this as a postgres function and used it, we could significantly improve the speed.
to the query, in which case the execution time goes to
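To illustrate why filtering by CPE early could help, here is a minimal sketch in Python. It is not trustify's code or schema: the `Status` fields, the integer versions, and the `CountingMatcher` helper are all hypothetical stand-ins. It only shows that applying the cheap CPE filter before the expensive version match returns the same rows while running the matcher far less often.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Status:
    """Hypothetical stand-in for one status row joined to its version range."""
    cpe: str   # CPE context the status applies to
    low: int   # simplified inclusive version range: integers, not real versions
    high: int


class CountingMatcher:
    """Stand-in for an expensive version match; counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, version: int, status: Status) -> bool:
        self.calls += 1
        return status.low <= version <= status.high


def filter_late(statuses, version, sbom_cpes, matches):
    # Run the expensive match on every row, then drop rows with the wrong CPE.
    matched = [s for s in statuses if matches(version, s)]
    return [s for s in matched if s.cpe in sbom_cpes]


def filter_early(statuses, version, sbom_cpes, matches):
    # Drop rows with the wrong CPE first, then match only the survivors.
    candidates = [s for s in statuses if s.cpe in sbom_cpes]
    return [s for s in candidates if matches(version, s)]


if __name__ == "__main__":
    rows = [Status("cpe:/o:redhat:enterprise_linux:8", 1, 5)] * 1000
    rows += [Status("cpe:/o:redhat:enterprise_linux:7", 1, 5)] * 10
    cpes = {"cpe:/o:redhat:enterprise_linux:7"}
    late, early = CountingMatcher(), CountingMatcher()
    filter_late(rows, 3, cpes, late)
    filter_early(rows, 3, cpes, early)
    print("matcher calls, late vs early:", late.calls, early.calls)
```

In the database, the equivalent would be moving the CPE condition into the query itself so that fewer rows reach the version-range join. Whether the planner actually benefits would need to be confirmed with a query plan.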
Intro
The issue occurs with a large SBOM (in this case RHEL 7.9, with around 50K packages) and a large number of VEX entries for those packages (in this case all of the Red Hat CSAF files from 2024).
The issue is in the query at https://github.com/trustification/trustify/blob/main/modules/fundamental/src/sbom/model/details.rs#L51, which resolves to
This causes the query to fail with the message
ERROR: could not write to file "base/pgsql_tmp/pgsql_tmp3964.175": No space left on device
meaning there is not enough space for intermediate results.
First analysis
It turns out that for this particular scenario there are around a million entries in the purl_status table for the base purls contained in the SBOM. The query then tries to join these with the version_range table and calls the version_matches function, which leads to resource exhaustion.
Note: this might be a problem only in the local environment, but it could still cause issues when ingesting even more advisories and larger SBOMs.
Here's the query plan
Reproduce
http POST localhost:8080/api/v1/dataset @etc/datasets/ds3.zip
csaf scoop http://localhost:8080/api/v1/sbom etc/test-data/rhel-7.9.z.json
redhat-csaf-vex-2024 importer
Solution
I haven't found a proper solution yet. I made a few attempts at rewriting the query to limit the impact of matching version ranges, but without success so far.