Add spdx taxonomy #7

Open
wants to merge 5 commits into
base: main

coderpatros
Member

This adds the property namespace taxonomy in use for CDX and SPDX inter-op in the CLI tool.

I've followed the namespace taxonomy recommendation that names should be lower case. However, I think there might be a case here for not following it.

As the aim of the spdx namespace is inter-op with the SPDX format, I think maybe it should follow SPDX conventions.

For example, `spdx:package:summary` should perhaps become `spdx:PackageSummary`, as that is the name of the tag in the SPDX tag/value format.

I'm leaning towards consistency with the SPDX tag/value format, as that will make it easier for people to find information about a field in the SPDX documentation, which I don't want to have to re-create here.

Signed-off-by: Patrick Dwyer <[email protected]>
@stevespringett
Member

I like this effort. Are there differences in the names of fields between TV and JSON? It would theoretically be possible to validate SPDX properties in a CDX BOM using a custom JSON validator. Obviously outside the scope of CycloneDX, but it would be possible.

Also, since SPDX 3 has breaking changes and is not entirely fleshed out yet, are field names part of what breaks? If so, we may want to put a version in the taxonomy.

@stevespringett
Member

Is the goal here to capture every SPDX field or only the ones where data would be lost in conversion?

@coderpatros
Member Author

coderpatros commented Jan 9, 2022

Really good point about v2 vs v3. I'll revise the namespace to include the major version, i.e. spdx:v2.

There are some differences between tag/value and other serialization formats.

For example, the package version is `PackageVersion` in T/V, while in other serialization formats it is the `versionInfo` property of a `package` object.

I don't have a strong opinion one way or the other. I started off going with the spdx:<OBJECT>:<PROPERTY> approach as that made more sense to me initially.

But T/V format seems to always be the primary examples given in the spec. So I'm starting to think aligning with that specific serialization format makes more sense.

I suspect it is also the most widely used format for SPDX. JSON schema is only a recent addition. And I haven't seen a schema for any other serialization format.
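To make the two naming options concrete, here is a minimal sketch of how a T/V tag could be turned into the lower-case `spdx:v2:<OBJECT>:<PROPERTY>` form. The function name `tv_tag_to_property` and the splitting rule are assumptions for illustration only, inferred from names such as `spdx:v2:package:source-info`; they are not taken from the CLI tool, and acronym-style tags such as `SPDXID` are not handled.

```python
import re


def tv_tag_to_property(tag: str, major_version: int = 2) -> str:
    """Derive a lower-case property name from a CamelCase SPDX tag/value tag.

    Hypothetical helper: the first word becomes the object, the remaining
    words are joined with hyphens to form the property.
    """
    # Split e.g. "PackageSourceInfo" into ["package", "source", "info"].
    words = [w.lower() for w in re.findall(r"[A-Z][a-z0-9]*", tag)]
    if not words:
        raise ValueError(f"not a CamelCase tag: {tag!r}")
    obj, rest = words[0], words[1:]
    parts = ["spdx", f"v{major_version}", obj]
    if rest:
        parts.append("-".join(rest))
    return ":".join(parts)


print(tv_tag_to_property("PackageSummary"))
print(tv_tag_to_property("PackageSourceInfo"))
```

Aligning with T/V instead would simply keep the tag as-is after the versioned prefix, which avoids maintaining a rule like this at all.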

The goal is only to capture information that CycloneDX can't natively handle.

Hashes are a good example. SPDX supports SHA1, SHA224, SHA256, SHA384, SHA512, MD2, MD4, MD5 and MD6 hashes for files. CDX doesn't support MD2, MD4, MD6 or SHA224 (which has a typo in this PR). The algorithms that overlap between SPDX and CDX are SHA1, SHA256, SHA384, SHA512 and MD5.

The hash algorithms we support are being converted to a CDX hash object. Hash algorithms we don't support are being stored in properties.
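As a rough sketch of that split (not the CLI tool's actual code), the overlap listed above can be expressed as a lookup table, with everything else falling back to a property. The function name `convert_checksums` and the property name pattern `spdx:v2:checksum:<algorithm>` are hypothetical; only the CDX `alg`/`content` and `name`/`value` object shapes are taken from the CycloneDX schema.

```python
# SPDX checksum algorithm -> CycloneDX "alg" value, for the overlapping set.
SPDX_TO_CDX_ALG = {
    "SHA1": "SHA-1",
    "SHA256": "SHA-256",
    "SHA384": "SHA-384",
    "SHA512": "SHA-512",
    "MD5": "MD5",
}


def convert_checksums(checksums):
    """Split SPDX checksums into CDX hash objects and fallback properties.

    `checksums` is a list of (algorithm, hex_value) tuples.
    """
    hashes, properties = [], []
    for algorithm, value in checksums:
        cdx_alg = SPDX_TO_CDX_ALG.get(algorithm.upper())
        if cdx_alg is not None:
            # Supported by CDX: emit a native hash object.
            hashes.append({"alg": cdx_alg, "content": value})
        else:
            # Not supported by CDX (e.g. MD2, SHA224): keep it as a property.
            # The property name here is an assumption, not the agreed taxonomy.
            properties.append(
                {"name": f"spdx:v2:checksum:{algorithm.lower()}", "value": value}
            )
    return hashes, properties


hashes, properties = convert_checksums([("SHA256", "ab12"), ("MD2", "cd34")])
print(hashes)
print(properties)
```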

Same with external references. We also have external references but they aren't always the same thing. And although both specs support things like package URLs and CPEs, they are not component identifiers in SPDX like they are in CDX. So there isn't any direct translation for them.

@goneall

goneall commented Jan 14, 2022

RDF/XML is another format used within the SPDX community. It is not as popular as Tag/Value, but it has some very strong supporters.

JSON is quickly becoming a very popular format. We're planning to add more documentation and support for JSON which will likely improve the adoption.

I would also suggest adding a link to the SPDX RDF terms page which defines a similar taxonomy: https://spdx.org/rdf/terms/

@coderpatros
Member Author

Thanks @goneall, that's very useful information.

jkowalleck previously approved these changes Sep 20, 2022
Member

@stevespringett left a comment

Can you resolve conflicts and work with @goneall on some of these?

| `spdx:v2:package:originator:organization` |
| `spdx:v2:package:source-info` |
| `spdx:v2:package:summary` |
| `spdx:v2:package:supplier:organization` |
Member

| `spdx:v2:homepage` |
| `spdx:v2:license-comments` |
| `spdx:v2:license-concluded` |
| `spdx:v2:license-declared` |
Member

Why is this necessary? Wouldn't this map to component/licenses/license? Or am I missing something?

| `spdx:v2:comment` |
| `spdx:v2:download-location` |
| `spdx:v2:files-analyzed` |
| `spdx:v2:homepage` |
Member

Wouldn't this map to bom/externalReferences/type=website?

| --- |
| `spdx:v2:annotation` |
| `spdx:v2:comment` |
| `spdx:v2:download-location` |
Member

Wouldn't this map to bom/externalReferences/type=distribution?

Member

agree.

@coderpatros
Member Author

> Can you resolve conflicts and work with @goneall on some of these?

Can do @stevespringett. I need to get my head back into the conversion stuff. From memory all those things that should map to an existing CDX field are there because there is some loss of information.

For example, supplier in SPDX can be a person or an organisation. So to be able to go SPDX->CDX->SPDX you need to know if it was a person or organisation.
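A minimal sketch of that round trip, assuming the T/V supplier form `Person: <name>` or `Organization: <name>`. The helper names are hypothetical, `spdx:v2:package:supplier:person` is assumed as a sibling of the `spdx:v2:package:supplier:organization` property in this PR, and `NOASSERTION` is deliberately not handled.

```python
PREFIX = "spdx:v2:package:supplier:"


def supplier_to_cdx(spdx_supplier: str) -> dict:
    """Convert an SPDX T/V supplier value into a partial CDX component."""
    kind, _, name = spdx_supplier.partition(":")
    kind, name = kind.strip().lower(), name.strip()
    if kind not in ("person", "organization"):
        raise ValueError(f"unsupported supplier value: {spdx_supplier!r}")
    return {
        # CDX only has one supplier shape, so the kind is lost here...
        "supplier": {"name": name},
        # ...and is preserved in a property for the trip back to SPDX.
        "properties": [{"name": PREFIX + kind, "value": name}],
    }


def supplier_to_spdx(component: dict) -> str:
    """Rebuild the SPDX T/V supplier value from a CDX component."""
    for prop in component.get("properties", []):
        if prop["name"].startswith(PREFIX):
            kind = prop["name"][len(PREFIX):]
            return f"{kind.capitalize()}: {prop['value']}"
    # No hint available: assume an organisation (an assumption of this sketch).
    return f"Organization: {component['supplier']['name']}"


component = supplier_to_cdx("Person: Jane Doe")
print(supplier_to_spdx(component))
```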

But catering for these little differences complicates the conversion process a lot. I'm starting to wonder whether we should even bother chasing as lossless a conversion as possible. Maybe we should just do a best effort CDX->SPDX, and best effort SPDX->CDX. Identify the pain points, and hope they can be reduced in future versions of the specs. That could be tricky, as there are some subtle differences between the two formats that are really problematic for conversion. The latest version of SPDX was supposed to remove some pain points for conversion, but it has added some too.

@goneall

goneall commented Sep 22, 2022

> Maybe we should just do a best effort CDX->SPDX, and best effort SPDX->CDX. Identify the pain points, and hope they can be reduced in future versions of the specs.

I agree with this approach. Learning from the last SPDX release, I would suggest we coordinate more prior to releasing either spec to at least understand when we're introducing something that makes the conversion more difficult.

One other suggestion is to produce an output in the conversion that lists where we know we lost some fidelity in the data during translation (e.g. a list of warnings).

@stevespringett
Member

> One other suggestion is to produce an output in the conversion that lists where we know we lost some fidelity in the data during translation (e.g. a list of warnings).

I think that's a good idea. Provide full transparency to the user about what was converted and what was not.
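One possible shape for this, sketched under assumptions: the conversion returns the converted document together with a list of warnings for everything it could not carry over. `ConversionResult`, `package_to_component` and the tiny `MAPPED_FIELDS` table are hypothetical names for illustration, not part of any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class ConversionResult:
    document: dict
    warnings: list = field(default_factory=list)


# Hypothetical, deliberately tiny mapping: SPDX JSON package field -> CDX field.
MAPPED_FIELDS = {"name": "name", "versionInfo": "version"}


def package_to_component(package: dict) -> ConversionResult:
    """Best-effort conversion that reports every field it had to drop."""
    component = {"type": "library"}
    warnings = []
    for key, value in package.items():
        target = MAPPED_FIELDS.get(key)
        if target is not None:
            component[target] = value
        else:
            # Record the loss instead of dropping the field silently.
            warnings.append(f"dropped SPDX package field '{key}'")
    return ConversionResult(component, warnings)


result = package_to_component(
    {"name": "example", "versionInfo": "1.0", "summary": "An example package"}
)
for warning in result.warnings:
    print("warning:", warning)
```

Returning the warnings alongside the document, rather than only logging them, would let a CLI print them and let a library caller decide whether any loss is acceptable.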

@jkowalleck changed the title from "Add SPDX taxonomy" to "Add spdx taxonomy" Jun 7, 2023
@jkowalleck added the "TLN-registry" label (registration/update of a Top Level Namespace) Jun 7, 2023
@jkowalleck requested a review from a team as a code owner July 12, 2023 15:36
@jkowalleck
Member

Just merged in the latest master and fixed the merge conflicts accordingly.

@jkowalleck added the "help wanted" label (Extra attention is needed) Sep 18, 2024
4 participants