44. SPDX SBOM support

Date: 2024-09-24

Status

Accepted

Glossary

  • SBOM - Software Bill of Materials
  • SPDX - Software Package Data Exchange
  • PURL - Package URL
  • Builder images - Images used in FROM instructions in the Dockerfile
  • Root package - The package representing the source of the SBOM itself

Context

The SPDX SBOM format enables features that are not available in CycloneDX, such as multiple purl attributes per component. SPDX is also a widely adopted standard for software bills of materials. This ADR describes how to enable the use of the SPDX SBOM format in Konflux.

Decision

SBOM lifecycle in build pipeline

At the start of the build pipeline, SBOMs are generated by cachi2 and syft. These two SBOM files are merged into a single SBOM document. At a later phase of the build pipeline, the builder images of the image being built are added to the SBOM as build dependencies of the image. To switch to the SPDX format, all tools that produce or process SBOMs in the pipeline have to be able to work with SPDX. SBOMs of builder images are not processed by the pipeline, so they do not have to be in SPDX format. Consequently, once the tools generating the SBOMs are switched to SPDX, all tools processing SBOMs can expect SPDX input only; no tool needs to handle mixed SPDX and CycloneDX inputs. Tekton tasks should implement the sbomType attribute to specify the expected SBOM format for input and output. This allows tools to be tested with SPDX before the entire pipeline transitions to this format.

CycloneDX -> SPDX conversion

CycloneDX (1.5) is a structured document in JSON format with the following structure (not the full specification):

  • Document
    • Metadata
      • Tools
        • List<Tool>
          • vendor
          • name
      • <Component> (attributes same as below)
    • Components
      • List<Component>
        • name
        • version
        • purl
        • properties
          • List<Property>
            • name
            • value
    • formulations
      • List<Formulation>

SPDX (2.3) is a structured document in JSON format with the following structure (not the full specification):

  • Document
    • name
    • documentNamespace
    • SPDXID
    • creationInfo
      • creators
        • List<String>
      • created
    • packages
      • List<Package>
        • SPDXID
        • name
        • downloadLocation
        • versionInfo
        • externalRefs
          • List<ExternalRef>
            • referenceCategory
            • referenceType
            • referenceLocator
        • annotations
          • List<Annotation>
            • annotationDate
            • annotationType
            • annotator
            • comment
    • relationships
      • List<Relationship>
        • spdxElementId
        • relationshipType
        • relatedSpdxElement

1:1 conversions

The following CycloneDX attributes are converted 1:1 to SPDX attributes, as they represent the same thing.

| CycloneDX Attribute | SPDX Attribute      |
|---------------------|---------------------|
| components          | packages            |
| component.name      | package.name        |
| component.version   | package.versionInfo |

Component.purl

CycloneDX (version 1.5) supports only a single purl attribute per component. SPDX has no direct equivalent attribute; instead, every package includes an externalRefs array which describes all external references for the package. The specification defines reference categories and types. For a PURL, category PACKAGE-MANAGER and type purl are used. The purl itself is stored as referenceLocator.

| CycloneDX Attribute          | SPDX Attribute                                                 |
|------------------------------|----------------------------------------------------------------|
| component.purl = `<PURL>`    | package.externalRefs = [{referenceCategory: "PACKAGE-MANAGER", |
|                              |                          referenceType: "purl",                |
|                              |                          referenceLocator: `<PURL>`            |
|                              |                          }]                                    |
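The 1:1 mapping and the purl mapping above can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the function name `component_to_package` and the hash-based SPDXID scheme are hypothetical (SPDX only requires the ID to be unique within the document), and downloadLocation is set to NOASSERTION as described later in this ADR.

```python
import hashlib


def component_to_package(component: dict) -> dict:
    """Convert one CycloneDX component into an SPDX 2.3 package (sketch)."""
    name = component["name"]
    version = component.get("version", "")
    purl = component.get("purl")

    # Assumption: derive the SPDXID from a hash of the identifying fields.
    # Any scheme works as long as the ID is unique within the document.
    digest = hashlib.sha256(f"{name}@{version}:{purl}".encode()).hexdigest()[:16]

    package = {
        "SPDXID": f"SPDXRef-Package-{digest}",
        "name": name,                       # component.name -> package.name
        "downloadLocation": "NOASSERTION",
    }
    if version:
        package["versionInfo"] = version    # component.version -> package.versionInfo
    if purl:
        # component.purl -> one externalRefs entry of type purl
        package["externalRefs"] = [
            {
                "referenceCategory": "PACKAGE-MANAGER",
                "referenceType": "purl",
                "referenceLocator": purl,
            }
        ]
    return package
```

Because externalRefs is a list, further purls for the same package can later be appended to it, which is what makes the squashing described under "Merging SPDX" possible.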

Component.properties

CycloneDX component properties are a list of string:string name/value pairs for a given component. SPDX packages have nothing similar to CycloneDX properties. Package annotations are the only attribute where custom data can be stored, and their only free-form field is comment, which is a simple string. For that reason, each CycloneDX property in the format {"name": , "value": } is encoded into a JSON string and stored in the comment. Annotations can also be produced by other tools. Therefore, to indicate that an annotation comment is JSON-encoded, the annotator should end with the string ":jsonencoded". To indicate that the annotator was a tool, the prefix "Tool:" has to be included in the field.

| CycloneDX Attribute                       | SPDX Attribute                                   |
|-------------------------------------------|--------------------------------------------------|
| components.properties = [                 | package.annotations = [                          |
|   {"name": …, "value": …}                 |   {..., annotator: "Tool: `<tool>`:jsonencoded"} |
| ]                                         | ]                                                |
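A minimal Python sketch of this encoding, and of reading it back, might look as follows. The function names and the tool name passed in are hypothetical, and the choice of annotationType OTHER is an assumption; only the "Tool:" prefix and the ":jsonencoded" suffix come from this ADR.

```python
import json
from datetime import datetime, timezone


def properties_to_annotations(properties: list, tool: str) -> list:
    """Encode CycloneDX properties as SPDX package annotations (sketch)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return [
        {
            "annotationDate": now,
            "annotationType": "OTHER",  # assumption: not a review annotation
            "annotator": f"Tool: {tool}:jsonencoded",
            # The whole {"name": ..., "value": ...} pair becomes one JSON string.
            "comment": json.dumps({"name": prop["name"], "value": prop["value"]}),
        }
        for prop in properties
    ]


def annotations_to_properties(annotations: list) -> list:
    """Decode only the annotations marked as JSON-encoded tool annotations."""
    properties = []
    for annotation in annotations:
        annotator = annotation.get("annotator", "")
        if annotator.startswith("Tool:") and annotator.endswith(":jsonencoded"):
            properties.append(json.loads(annotation["comment"]))
    return properties
```

Checking both the prefix and the suffix lets a consumer skip annotations written by other tools or by people without attempting to parse their comments as JSON.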

Formulations

CycloneDX formulations describe how the container was manufactured. In SPDX, Relationship elements can be used for the same purpose. All elements in SPDX have an SPDXID attribute, which is an element identifier unique within the whole SBOM document. A Relationship element describes the relation between two elements using their SPDXIDs and a relationship type. The relationship type BUILD_TOOL_OF can be used to express that a package was used to build the container.

| CycloneDX Attribute             | SPDX Attribute                                             |
|---------------------------------|------------------------------------------------------------|
| Formulations.components = [{}]  | Relationships = [                                          |
|                                 |     {                                                      |
|                                 |          spdxElementId = `<A-BUILDER-IMAGE-ID>`,           |
|                                 |          relationshipType=BUILD_TOOL_OF,                   |
|                                 |          relatedSpdxElement=`<ROOT-PACKAGE>`               |
|                                 |     }                                                      |
|                                 | ]                                                          |

Explanation: The document DESCRIBES the ROOT-PACKAGE element, which represents the container itself. BUILDER-IMAGE-ID represents a builder image which was used to build the container. The relationship type BUILD_TOOL_OF expresses that the builder image was used to build the container image.
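As an illustration, adding builder images to an existing SPDX document could be sketched in Python as below. The function name and the shape of its inputs are assumptions made for this example; the only part taken from this ADR is the BUILD_TOOL_OF relationship pointing from each builder image to the root package.

```python
def add_builder_images(document: dict, builder_packages: list, root_spdxid: str) -> dict:
    """Add builder image packages and BUILD_TOOL_OF relationships in place (sketch)."""
    packages = document.setdefault("packages", [])
    relationships = document.setdefault("relationships", [])
    existing_ids = {package["SPDXID"] for package in packages}

    for builder in builder_packages:
        # Add the builder image package only once.
        if builder["SPDXID"] not in existing_ids:
            packages.append(builder)
            existing_ids.add(builder["SPDXID"])
        # <A-BUILDER-IMAGE-ID> BUILD_TOOL_OF <ROOT-PACKAGE>
        relationships.append(
            {
                "spdxElementId": builder["SPDXID"],
                "relationshipType": "BUILD_TOOL_OF",
                "relatedSpdxElement": root_spdxid,
            }
        )
    return document
```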

Metadata.tools

The CycloneDX metadata.tools sub-attributes of most interest are vendor and name. Information about the creation of the SPDX document can be stored in creationInfo. The creationInfo.creators element is a list of strings, and the standard specifies its structure only loosely. Strings should be formatted as <Attribute>: <Value>. For example, a vendor would be stored as Vendor: <vendor>. The recommendation is to use only Tool, as a vendor entry can be misinterpreted as the vendor of the SBOM rather than of the tool which created it.

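| CycloneDX Attribute                             | SPDX Attribute                      |
|-------------------------------------------------|-------------------------------------|
| Metadata.tools = [{"vendor": "X", "name": "Y"}] | CreationInfo.creators = ["Tool: Y"] |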
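The mapping in the table can be sketched as a small Python helper. The function name is hypothetical, and dropping duplicate entries is an assumption of this sketch rather than something the ADR requires.

```python
def tools_to_creators(tools: list) -> list:
    """Map CycloneDX metadata.tools to SPDX creationInfo.creators (sketch)."""
    creators = []
    for tool in tools:
        # Only the tool name is kept; the vendor is intentionally dropped.
        entry = f"Tool: {tool['name']}"
        if entry not in creators:
            creators.append(entry)
    return creators
```

For the input from the table, `tools_to_creators([{"vendor": "X", "name": "Y"}])` returns `["Tool: Y"]`.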

Metadata.component

The metadata component describes the component to which the whole SBOM relates. For example, if the SBOM describes the internal components and dependencies of a container image, this component should represent the container image itself. In SPDX, the equivalent of this component is the root package (see Root packages for how syft handles this package). This package is in the relationship SPDXRef-DOCUMENT DESCRIBES SPDXRef-RootPackage.

Example: We run syft/cachi2 on the source directory of a project which should be built into a container. The generated SBOM contains:

{
  "SPDXID": "SPDXRef-DOCUMENT",
  // ...
  "packages": [
      {
        "name": ".",
        "SPDXID": "SPDXRef-DocumentRoot-Directory-.",
        "supplier": "NOASSERTION",
        "downloadLocation": "NOASSERTION",
        "filesAnalyzed": false,
        "licenseConcluded": "NOASSERTION",
        "licenseDeclared": "NOASSERTION",
        "primaryPackagePurpose": "FILE"
      },
      {
        "name": "attrs",
        "SPDXID": "SPDXRef-Package-python-attrs-eef51168ca2a575f",
        "versionInfo": "24.2.0",
        ...
      }
      ...
  ],
  "relationships": [
      {
        "spdxElementId": "SPDXRef-DOCUMENT",
        "relatedSpdxElement": "SPDXRef-DocumentRoot-Directory-.",
        "relationshipType": "DESCRIBES"
      },
      {
          "spdxElementId": "SPDXRef-DocumentRoot-Directory-.",
          "relatedSpdxElement": "SPDXRef-Package-python-attrs-eef51168ca2a575f",
          "relationshipType": "CONTAINS"
      }
      ...
  ]
}

We want to express that the SBOM is actually generated for a container image, not for a source directory. So we remove the SPDXRef-DocumentRoot-Directory-. package, add a new virtual package representing the container image, and replace the removed package's SPDX ID in relationships with the ID of the new package. The new SBOM should look like this:

{
  "SPDXID": "SPDXRef-DOCUMENT",
  ...
  "packages": [
      {
        "name": "my-image",
        "SPDXID": "SPDXRef-image",
        "versionInfo": "latest",
        ...
        "checksums": [
            {
                "algorithm": "SHA256",
                "checksumValue": "9ac75c1a392429b4a087971cdf9190ec42a854a169b6835bc9e25eecaf851258"
            }
        ],
        ...
        "externalRefs": [
            {
                "referenceCategory": "PACKAGE-MANAGER",
                "referenceType": "purl",
                "referenceLocator": "pkg:oci/my-image@sha256:9ac75c1a392429b4a087971cdf9190ec42a854a169b6835bc9e25eecaf851258?repository_url=container-registry.com/my-org/my-image"
            }
        ],
        "primaryPackagePurpose": "CONTAINER"
      },
      {
        "name": "attrs",
        "SPDXID": "SPDXRef-Package-python-attrs-eef51168ca2a575f",
        "versionInfo": "24.2.0",
        ...
      }
      ...
  ],
  "relationships": [
      {
        "spdxElementId": "SPDXRef-DOCUMENT",
        "relatedSpdxElement": "SPDXRef-image",
        "relationshipType": "DESCRIBES"
      },
      {
          "spdxElementId": "SPDXRef-image",
          "relatedSpdxElement": "SPDXRef-Package-python-attrs-eef51168ca2a575f",
          "relationshipType": "CONTAINS"
      }
      ...
  ]
}
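The root package replacement described above can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes the root package is whatever the document DESCRIBES; it is an illustration of the steps, not the tooling actually used in the pipeline.

```python
def replace_root_package(document: dict, new_root: dict) -> dict:
    """Return a copy of the document whose root package is new_root (sketch)."""
    doc_id = document["SPDXID"]
    # Assumption: the old root packages are the targets of "<document> DESCRIBES".
    old_roots = {
        rel["relatedSpdxElement"]
        for rel in document["relationships"]
        if rel["spdxElementId"] == doc_id and rel["relationshipType"] == "DESCRIBES"
    }
    new_id = new_root["SPDXID"]

    # Drop the old root package(s) and put the new virtual package first.
    packages = [new_root] + [
        package for package in document["packages"] if package["SPDXID"] not in old_roots
    ]

    # Point every relationship that used an old root ID at the new package.
    relationships = []
    for rel in document["relationships"]:
        rel = dict(rel)
        if rel["spdxElementId"] in old_roots:
            rel["spdxElementId"] = new_id
        if rel["relatedSpdxElement"] in old_roots:
            rel["relatedSpdxElement"] = new_id
        relationships.append(rel)

    return {**document, "packages": packages, "relationships": relationships}
```

Returning a new dictionary instead of mutating the input keeps the SBOM produced by syft or cachi2 intact, which can help when comparing the documents before and after the change.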

SPDX specific attributes

There are SPDX attributes which are required to be present in the document but have no CycloneDX equivalent. These attributes are:

Document.documentNamespace

documentNamespace is a URI which provides a way to locate the document or to reference it from other documents. When creating an SPDX document locally via syft or cachi2, this attribute has no real meaning, as it is not yet clear how the document will be published. As stated in the SPDX specification, however, it should be unique. At later stages, when it is clear where the SBOM document will be published, it would make sense to change it to a link to the container containing the SBOM. The URI itself does not necessarily need to be accessible.
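One common way to obtain a unique namespace before the publishing location is known is to append a random UUID to a base URI. The Python sketch below assumes this approach; the function name and the base URL are placeholders, not values used by Konflux.

```python
import uuid


def make_document_namespace(name: str, base: str = "https://example.invalid/spdxdocs") -> str:
    """Build a unique documentNamespace URI (sketch; base URL is a placeholder)."""
    # The random UUID makes the URI unique even if it is never resolvable.
    return f"{base}/{name}-{uuid.uuid4()}"
```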

Package.downloadLocation

downloadLocation is a URI which provides a way to download the package. This is not always available, and it is not clear whether it is useful. Therefore it is set to NOASSERTION.

Merging SPDX

Packages

The packages of two SPDX documents can be merged by concatenating the two lists. In CycloneDX, a component element can have only a single purl attribute, so components representing packages with the same name and version but different purls have to be stored as multiple elements. SPDX package elements can carry multiple purls. Multiple CycloneDX components can therefore be squashed into a single SPDX package element, with their purls concatenated into a single list. The following rule is applied in the generic package merging process:

  • Packages whose purls have the same package name, version, and type are squashed into a single package element

NOTE: Packages cannot be merged based on the SPDXID attribute, as the SPDX standard does not specify how SPDXID should be calculated. Individual tools can calculate it differently while still satisfying the requirement that it be unique across the whole document.
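The squashing rule can be illustrated with the Python sketch below. It uses a deliberately simplified purl parser (`purl_key`, a hypothetical helper that treats the namespace as part of the name and ignores qualifiers and subpath); a real implementation would more likely rely on a dedicated purl library. Packages without any purl are never squashed in this sketch.

```python
import copy


def purl_key(purl: str) -> tuple:
    """Return (type, name, version) of a purl; simplified, not a full parser."""
    rest = purl.split(":", 1)[1]                    # drop the "pkg:" scheme
    rest = rest.split("#", 1)[0].split("?", 1)[0]   # drop subpath and qualifiers
    path, sep, version = rest.rpartition("@")
    if not sep:                                     # purl without a version
        path, version = rest, ""
    ptype, _, name = path.strip("/").partition("/")
    return (ptype.lower(), name, version)


def purls_of(package: dict) -> list:
    """Collect all purl locators stored in a package's externalRefs."""
    return [
        ref["referenceLocator"]
        for ref in package.get("externalRefs", [])
        if ref.get("referenceType") == "purl"
    ]


def merge_packages(main_packages: list, other_packages: list) -> list:
    """Concatenate two package lists, squashing packages with equal purl keys."""
    merged = []
    by_key = {}  # purl key -> package already present in `merged`
    for package in [*main_packages, *other_packages]:
        keys = [purl_key(purl) for purl in purls_of(package)]
        target = next((by_key[key] for key in keys if key in by_key), None)
        if target is None:
            target = copy.deepcopy(package)  # keep the inputs untouched
            merged.append(target)
        else:
            # Squash: append only the external references not seen yet.
            known = {ref["referenceLocator"] for ref in target.get("externalRefs", [])}
            for ref in package.get("externalRefs", []):
                if ref["referenceLocator"] not in known:
                    target.setdefault("externalRefs", []).append(dict(ref))
                    known.add(ref["referenceLocator"])
        for key in keys:
            by_key.setdefault(key, target)
    return merged
```

Note that the SPDXID of the first occurrence wins in this sketch, so relationships that refer to a squashed package's original SPDXID still need to be remapped or dropped afterwards.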

Relationships

SPDX relationships represent a graph/tree structure of relations between elements in the document. The root element is the SPDX document itself (with SPDXID SPDXRef-DOCUMENT). The SPDX document typically contains a package representing the source used for generating the SBOM. This can be a container image, a directory, etc. The document is in a DESCRIBES relationship with this source package, called the root package in this document. Other packages are in specific relationships with the root package. See also Syft specific sbom details.

The relationships of two documents need to be merged into a single graph in a way which keeps the structure of the original graph of the main document (the document into which the other document is merged). Once packages are merged, the relationships of the second document must be cleared of relations which refer to packages not included in the merged package list. Any spdxElementId or relatedSpdxElement that points to the document ID of the second document should be replaced with the document ID of the main document. Likewise, the root package element ID of the second document needs to be replaced with the root package element ID of the main document. Example:

-------------------------------------------------------------------------------------
| DOC 1                     | DOC 2                     | Merged document           |
|---------------------------|---------------------------|---------------------------|
| Doc1                      | Doc2                      | Doc1                      |
|   Packages:               |   Packages:               |   Packages:               |
|     root1                 |     root2                 |     root1                 |
|     p1                    |     p2                    |     p1                    |
|     p2                    |     p3                    |     p2                    |
|   Relationships:          |   Relationships:          |     p3                    |
|     Doc1 describes root1  |     Doc2 describes root2  |   Relationships:          |
|     root1 contains p1     |     root2 contains p2     |     Doc1 describes root1  |
|     root1 contains p2     |     root2 contains p3     |     root1 contains p1     |
|                           |                           |     root1 contains p2     |
|                           |                           |     root1 contains p3     |
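The relationship merge from the example can be sketched in Python as below. The function names are hypothetical, and the sketch assumes that package SPDXIDs have already been reconciled between the two documents (as in the example, where p2 has the same ID in both) and that each document DESCRIBES exactly one root package.

```python
def find_root_package_id(document: dict) -> str:
    """Return the SPDXID of the package the document DESCRIBES."""
    for rel in document["relationships"]:
        if rel["spdxElementId"] == document["SPDXID"] and rel["relationshipType"] == "DESCRIBES":
            return rel["relatedSpdxElement"]
    raise ValueError("document has no DESCRIBES relationship")


def merge_relationships(main: dict, other: dict, merged_package_ids: list) -> list:
    """Merge relationships of `other` into those of `main` (sketch)."""
    # Document ID and root package ID of `other` are replaced by those of `main`.
    remap = {
        other["SPDXID"]: main["SPDXID"],
        find_root_package_id(other): find_root_package_id(main),
    }
    valid_ids = set(merged_package_ids) | {main["SPDXID"]}

    merged, seen = [], set()

    def add(rel: dict) -> None:
        key = (rel["spdxElementId"], rel["relationshipType"], rel["relatedSpdxElement"])
        if key not in seen:  # skip relationships that are already present
            seen.add(key)
            merged.append(rel)

    # Relationships of the main document are kept as they are.
    for rel in main["relationships"]:
        add(dict(rel))

    for rel in other["relationships"]:
        rel = dict(rel)
        rel["spdxElementId"] = remap.get(rel["spdxElementId"], rel["spdxElementId"])
        rel["relatedSpdxElement"] = remap.get(rel["relatedSpdxElement"], rel["relatedSpdxElement"])
        # Drop relations that refer to packages missing from the merged list.
        if rel["spdxElementId"] in valid_ids and rel["relatedSpdxElement"] in valid_ids:
            add(rel)
    return merged
```

Applied to Doc1 and Doc2 from the table with the merged package list root1, p1, p2, p3, this yields the four relationships shown in the merged column.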

Syft specific sbom details

Root packages

Syft generates a "source package" representing the source used to generate the SBOM document. For example, when the SBOM is generated by the command syft scan dir:<dir>, a package with SPDXID SPDXRef-DocumentRoot-Directory-<dir>- is generated. Such a package has its name set to <dir>, no versionInfo, and no attributes. In relationships, this package appears in the relation SPDXRef-DOCUMENT DESCRIBES SPDXRef-DocumentRoot-Directory-<dir>-, and all other packages are then in relation CONTAINS with this virtual package, e.g. openshift4----ose-cluster-update-keys <RELATIONSHIP-TYPE> Package-A.

Consequences

All tooling used in the pipeline needs to support the SPDX SBOM format.

References