generic fetcher: Add usage docs and an ADR
Add documentation on how to use the generic fetcher and also an ADR to help move out of the experimental phase. Signed-off-by: Jan Koscielniak <[email protected]>
# Add generic fetcher

- Status: proposed
- Date: 2024-10-30

## Context

The main motivation for this change is to cover the use cases of users who need to download arbitrary files that do not
fit within any established package ecosystem that cachi2 could otherwise support. The target audience is users who
want to use cachi2 to achieve hermetic builds and want an easy way to include these arbitrary files, which cachi2
will account for in the SBOM it produces.

## Decision

This change introduces a generic fetcher, an additional cachi2 package manager. This package manager utilizes a custom
lockfile that is located in the input repository. Based on that lockfile, it will download files, save them into a requested
location, and verify checksums. Below is a more detailed overview of the implementation.

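The checksum verification step can be illustrated with a minimal sketch. This is not cachi2's implementation: the function name `verify_checksums` and the set of supported algorithms are assumptions made for the example. It mirrors the rule stated in this ADR that at least one verifiable checksum must be present, and it ignores algorithms it cannot verify.

```python
import hashlib
from pathlib import Path

# Assumption: the algorithms this sketch treats as verifiable.
SUPPORTED_ALGORITHMS = {"sha256", "sha512"}


def verify_checksums(path: Path, checksums: dict[str, str]) -> None:
    """Raise ValueError if no supported checksum is given or any supported one mismatches."""
    verifiable = {alg: val for alg, val in checksums.items() if alg in SUPPORTED_ALGORITHMS}
    if not verifiable:
        raise ValueError(f"no verifiable checksum provided for {path.name}")

    for algorithm, expected in verifiable.items():
        digest = hashlib.new(algorithm)
        with path.open("rb") as f:
            # Read in chunks so large artifacts are not loaded into memory at once.
            for chunk in iter(lambda: f.read(1024 * 1024), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected.lower():
            raise ValueError(f"{algorithm} checksum mismatch for {path.name}")
```

Calling `verify_checksums(Path("granite-model-1.safetensors"), {"sha256": "..."})` after a download would either return silently or raise, which keeps the failure path explicit.
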
### Lockfile format

Cachi2 expects the lockfile to be named `generic_lockfile.yaml`.
To account for possible future breaking changes, the lockfile contains a `metadata` section with a `version`
field that indicates the version of the lockfile format. It also contains a list of artifacts (files) to download.
Each artifact specifies a URL, one or more checksums, and optionally a target location.

```yaml
metadata:
  # uses X.Y semantic versioning
  version: "1.0"
artifacts:
  - download_url: https://huggingface.co/instructlab/granite-7b-lab/resolve/main/model-00001-of-00003.safetensors?download=true
    target: granite-model-1.safetensors
    checksums:
      sha256: d16bf783cb6670f7f692ad7d6885ab957c63cfc1b9649bc4a3ba1cfbdfd5230c
```

#### Lockfile properties

Below is an explanation of the individual properties of the lockfile.

##### download_url (required)

Specified as a string containing the download URL of the artifact.

##### checksums (required)

Specified as a dictionary of checksum algorithms and their values. At least one cachi2-verifiable checksum must be provided
to ensure at least some degree of confidence in the identity of the artifact.

##### target (optional)

This key is provided mainly for the user's convenience, so that files end up in expected locations. It is optional; if
not specified, it is derived from the `download_url`. Target here means a specific location inside cachi2's output
directory for the generic fetcher (`{cachi2-output-dir}/deps/generic`). Cachi2 will verify that the target locations,
including those derived from download URLs, do not overlap.

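The property rules above can be sketched as a small validation pass over an already-parsed lockfile. This is an illustration under stated assumptions, not cachi2's code: the function name `resolve_targets` is hypothetical, the default target is assumed to be the last path segment of the URL, and "overlap" is simplified to two artifacts resolving to the same target string.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def resolve_targets(lockfile: dict) -> list[tuple[str, str]]:
    """Return (download_url, target) pairs, rejecting invalid or overlapping entries."""
    resolved: list[tuple[str, str]] = []
    seen: set[str] = set()
    for artifact in lockfile.get("artifacts", []):
        url = artifact.get("download_url")
        if not url:
            raise ValueError("artifact is missing download_url")
        if not artifact.get("checksums"):
            raise ValueError(f"artifact {url} has no checksums")

        # Assumption: when target is omitted, fall back to the URL's file name
        # (query string and fragment are dropped by urlparse(...).path).
        target = artifact.get("target") or PurePosixPath(urlparse(url).path).name
        if not target:
            raise ValueError(f"cannot derive a target for {url}")
        if target in seen:
            raise ValueError(f"target {target!r} is used by more than one artifact")
        seen.add(target)
        resolved.append((url, target))
    return resolved
```

For the lockfile example above, the explicit `target` wins; without it, the sketch would resolve the artifact to `model-00001-of-00003.safetensors`.
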
### SBOM components

Artifacts fetched with the generic fetcher will all be recorded in the SBOM that cachi2 produces. Given the inability to
derive any extra information about these files beyond a download location and a filename, these files will always be
recorded as SBOM components with a purl of type `generic`.

Additionally, each SBOM component will contain [externalReferences] of type `distribution` that record the URL used to
download the file, which makes the SBOM easier to handle for tools that process it.

Here is an example SBOM generated for the artifact in the lockfile example above.

```json
{
  "bomFormat": "CycloneDX",
  "components": [
    {
      "name": "granite-model-1.safetensors",
      "purl": "pkg:generic/granite-model-1.safetensors?checksums=sha256:d16bf783cb6670f7f692ad7d6885ab957c63cfc1b9649bc4a3ba1cfbdfd5230c&download_url=https://huggingface.co/instructlab/granite-7b-lab/resolve/main/model-00001-of-00003.safetensors",
      "properties": [
        {
          "name": "cachi2:found_by",
          "value": "cachi2"
        }
      ],
      "type": "file",
      "externalReferences": [
        {
          "url": "https://huggingface.co/instructlab/granite-7b-lab/resolve/main/model-00001-of-00003.safetensors",
          "type": "distribution"
        }
      ]
    }
  ],
  "metadata": {
    "tools": [
      {
        "vendor": "red hat",
        "name": "cachi2"
      }
    ]
  },
  "specVersion": "1.4",
  "version": 1
}
```

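As a rough illustration of how one lockfile artifact maps to the component shown in the example SBOM, the sketch below assembles the same fields. The function name `to_sbom_component` is hypothetical, and the purl is built by plain string formatting to mirror the example; it does not apply the percent-encoding that a real purl library would.

```python
def to_sbom_component(name: str, download_url: str, checksums: dict[str, str]) -> dict:
    """Build a CycloneDX-style component dict for a generically fetched file."""
    # Assumption: multiple checksums are joined with commas in a stable order.
    checksum_str = ",".join(f"{alg}:{val}" for alg, val in sorted(checksums.items()))
    purl = f"pkg:generic/{name}?checksums={checksum_str}&download_url={download_url}"
    return {
        "name": name,
        "purl": purl,
        "properties": [{"name": "cachi2:found_by", "value": "cachi2"}],
        "type": "file",
        # Records where the file was downloaded from.
        "externalReferences": [{"url": download_url, "type": "distribution"}],
    }
```

Serializing the returned dictionary with `json.dumps` and placing it in the `components` list gives an entry shaped like the one in the example.
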
## Consequences

As mentioned before, this package manager enables users to fetch arbitrary files with cachi2 and have them accounted for
in the SBOM. A possible downside is the cost of maintaining the lockfile format, as it is specific to cachi2; versioning
the format should partially mitigate this.

[externalReferences]: https://cyclonedx.org/docs/1.6/json/#components_items_externalReferences