What it will do:
Repository Level:
1. Maintain a list of relevant Repology repository prefixes
2. Generate a list of source package directory URLs from (1)
3. Fetch the list of packages from each URL and store it locally
Package Level:
1. Fetch the list of Repology identifiers for a product
2. Use these identifiers to fetch the relevant Repology projects
3. For each product, filter the list of packages by the list of
repository prefixes
4. For each matching repository, use the package list fetched at the
repository level to generate a comprehensive list of PURLs
5. Finally, save the list of PURLs to disk
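The package-level steps above can be sketched roughly as follows. This is
a minimal sketch, not the project's implementation: the prefix table
REPO_PREFIX_TO_PURL and the helper names are hypothetical, and the package
records are assumed to be dicts carrying the "repo", "srcname" and
"version" fields that Repology reports for each package. Selecting the
latest version per channel and adding PURL qualifiers are left out.

```python
from urllib.parse import quote

# Hypothetical mapping: Repology repository prefix -> (PURL type, PURL namespace).
REPO_PREFIX_TO_PURL = {
    "debian_": ("deb", "debian"),
    "fedora_": ("rpm", "fedora"),
    "pypi": ("pypi", None),
}


def match_prefix(repo, prefixes):
    """Return the longest prefix that the repository name starts with, or None."""
    matches = [p for p in prefixes if repo.startswith(p)]
    return max(matches, key=len) if matches else None


def build_purl(purl_type, namespace, name, version):
    """Assemble pkg:type/namespace/name@version with conservative percent-encoding."""
    segments = [purl_type]
    if namespace:
        segments.append(quote(namespace, safe=""))
    segments.append(quote(name, safe=""))
    return "pkg:" + "/".join(segments) + "@" + quote(version, safe="")


def purls_for_project(packages, prefix_map=REPO_PREFIX_TO_PURL):
    """Filter package records by repository prefix and return sorted, unique PURLs."""
    purls = set()
    for pkg in packages:
        prefix = match_prefix(pkg.get("repo", ""), prefix_map)
        name = pkg.get("srcname") or pkg.get("binname") or pkg.get("visiblename")
        version = pkg.get("version")
        if prefix is None or not name or not version:
            continue  # repository not relevant, or record incomplete
        purl_type, namespace = prefix_map[prefix]
        purls.add(build_purl(purl_type, namespace, name, version))
    return sorted(purls)
```

Matching on the longest prefix keeps the behaviour predictable if one
configured prefix happens to be a prefix of another.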
The final version should deliver a clear and comprehensive list of PURLs
(package URLs) for a given product, where each PURL represents the latest
version of a package available on a specific distribution channel (not
necessarily a Linux distribution).
These PURLs can then be used to augment scan results by generating
feeds for scanning products. The use case could be:
1. Use type/namespace/name to check whether the product is in our
database
2. Compare the version against our list from above to see whether it is
the latest version available on that channel. Give a warning if not.
3. If it is the latest version, check whether that version is
considered supported. Additionally, use the channel's own support
status (such as Debian support dates, repository information) to
provide a clear guarantee of support.
Depending on the results of steps 1 to 3, return a vulnerability rating.
Most of the scanning can probably be done by existing scanners, so we
are looking to bootstrap this by generating a "feed" instead.
Feed Details:
1. A vulnerability feed typically contains information about known
vulnerabilities in various products, using package name, channel, and
version ranges.
2. We can generate such a feed from our PURLs and EOL API. Each
unsupported release cycle can be used to craft a
"pseudo"-vulnerability that triggers on unsupported versions being
detected.
3. The feed will need a lot of exceptions for supported packages on
various channels, which is why we need to scrape Repology.
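As an illustration of point 2, a "pseudo"-vulnerability feed could be
derived from release-cycle records roughly as shown below. This is only a
sketch under assumptions: each cycle is assumed to be a dict with a
"cycle" name and an "eol" value that is either an ISO date string or a
boolean, and the entry layout (the "id", "affected_cycle" and
"exceptions" keys) is hypothetical rather than the format of any existing
scanner.

```python
from datetime import date


def is_eol(cycle, today):
    """Return True if the release cycle is unsupported as of `today`."""
    eol = cycle.get("eol")
    if isinstance(eol, bool):
        return eol
    if isinstance(eol, str):
        return date.fromisoformat(eol) <= today
    return False  # no EOL information: treat as still supported


def build_feed(product, cycles, today, exceptions=()):
    """Create one pseudo-vulnerability entry per unsupported release cycle.

    `exceptions` lists PURL keys (type/namespace/name) of channels that
    still support the package, so a scanner can skip them.
    """
    entries = []
    for cycle in cycles:
        if not is_eol(cycle, today):
            continue
        entries.append({
            "id": f"EOL-{product}-{cycle['cycle']}",
            "product": product,
            "affected_cycle": cycle["cycle"],
            "summary": f"{product} {cycle['cycle']} is no longer supported",
            "exceptions": sorted(exceptions),
        })
    return entries
```

Passing `today` explicitly, instead of calling date.today() inside the
function, keeps feed generation reproducible.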