Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor!: maintain the latest spdx model and provide conversions from previous #172

Merged
merged 1 commit into from
Jan 17, 2023

Conversation

kzantow
Copy link
Collaborator

@kzantow kzantow commented Nov 22, 2022

This PR implements a fairly straightforward way to maintain a data model all utilities and consumers are able to use.

The gist of this approach is:

  • define all data models and directly related functions under spdx (spdx/v2_1, spdx/v2_2, etc.)
  • maintain type aliases to the latest/current data model
  • provide migrations between versions
  • all top-level functions only operate on the latest/current model
    • e.g. utilities in builder, examples, idsearcher, licensediff, reporter, spdxlib, utils all operate on spdx.Document and do not have duplication for older formats
  • all top-level format Read functions always convert to the latest/current data model:
    • e.g. tagvalue.Read(<2.1 document>) reads a 2.1 document, converts to and returns the current spdx.Document
    • all format types also provide a ReadInto, however, which converts directly to the provided doc type
  • align naming of all readers/writers to use terms Read and Write
  • remove all function naming with version specifics, instead refactor these to the corresponding spdx/<version> packages
  • make imports as consistent as possible, including aliasing all version-specific model imports to spdx
    • this makes a copy/paste operation of a new version significantly easier as basically only imports need to be updated

Ultimately, this change should allow downstream consumers to utilize this library in a significantly easier manner, as an example:

  1. Consumer incorporates this library to read SPDX files, using some code similar to:
import "github.com/spdx/tools-golang/json"

func SomeFunction() {
  doc, err := json.Read(something)
  // deal with doc, which is `spdx.Document`, currently at v2.3
}
  1. New SPDX version is released (let's say 2.4), tools-golang is updated. Type aliases now point to 2.4 instead of 2.3. E.g. spdx.Document -> v2_4.Document.

  2. Consumer updates library and wants 2.4 support, the code before doesn't need to change, and automatically handles reading older 2.3 SPDX files. The only thing the consumer needs to account for is changes in the data model, which they would have to do in all cases to support a newer SPDX version.

NOTE: I have updated Syft to use this model in a working branch -- see: https://github.com/anchore/syft/compare/main...kzantow-anchore:syft:chore/refactored-spdx-lib?expand=1

TODO:

  • handle tag-value conversions more generically to reduce per-version duplication
  • handle rdf decoding more generically to reduce per-version duplication
  • update documentation

Fixes: #175
Fixes: #167
Fixes: #165
Fixes: #160

@kzantow kzantow mentioned this pull request Dec 15, 2022
Copy link
Collaborator

@lumjjb lumjjb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kzantow this is a humungous effort and this is truly a 1000x improvement for the golang library and SPDX more broadly!!! The proposed changes look good, and I've added some minor comments/questions..

The only part of it that i'd want to discuss more about is with v3, it is going to be a breaking change for consumers of the library. With your changes here, when we upgrade from 2.2 to 2.3, it is truly just a package update - which is SUPER SUPER SUPER awesome!! However, this is likely impossible with v3 - since it has a totally different data model and fields, so i'm wondering if it's maybe still better to put it into a folder as spdx/v2 or just spdxv2/ as the base package name. Thoughts?

convert/struct.go Outdated Show resolved Hide resolved
json/writer.go Outdated Show resolved Hide resolved
@kzantow
Copy link
Collaborator Author

kzantow commented Dec 20, 2022

@lumjjb thanks much for the review -- regarding v3, I would very much like to know more about this, I'll ping you about it!

README.md Show resolved Hide resolved
@ivanayov
Copy link

ivanayov commented Jan 6, 2023

Thank you for addressing this @kzantow ! I've been struggling with subversion upgrades and am looking forward to adopting this!

My only comment is that currently it always defaults to the latest SPDX, while one may want to configure a specific version. For example, v2.2 is ISO approved and v2.3 - not. For that reason some people and organisations use only v2.2 prior the next major version. But others may prefer the latest 2.3. It would be great to allow freedom of configuring which version to default to.

This would make the v3.0 transition easier as well.

Other than that, I very much like how you approach it.

@kzantow
Copy link
Collaborator Author

kzantow commented Jan 6, 2023

Thanks @ivanayov -- I'm actually almost done on a last change to make using specific versions easier. I'll update with some examples shortly.

@kzantow kzantow marked this pull request as ready for review January 6, 2023 22:45
@kzantow
Copy link
Collaborator Author

kzantow commented Jan 6, 2023

@ivanayov I'll write up some usage examples in the documentation after this PR gets merged, but to answer your question about supporting a specific version, I'd suggest this approach:

First: always alias your imports of a specific data model as spdx -- the reason for this is it makes upgrades a lot simpler and keeps your code fairly consistent.

For example:

import (
  "github.com/spdx/tools-golang/spdx/json"
  spdx "github.com/spdx/tools-golang/spdx/v2/v2_2"
)

// you will always get the specific version you are looking for, assuming a conversion is possible:
var doc spdx.Document
json.ReadInto(<reader>, &doc) // reads _any version_ and converts it to a 2.2 version, since this is what we imported

// all the write functions work with the specific version you pass in:
json.Write(&doc) // writes a 2.2 version

This makes upgrading very simple by just updating imports instead of every single usage of the types. If you always want to be at the latest version, then the import is just "github.com/spdx/tools-golang/spdx".

@ivanayov
Copy link

ivanayov commented Jan 9, 2023

I think that works well. Thanks @kzantow !

lumjjb
lumjjb previously approved these changes Jan 11, 2023
Copy link
Collaborator

@lumjjb lumjjb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much again @kzantow on helping with this very complex and extensive PR! LGTM! I have some minor documentation nits... happy to merge this by end of week if no other comments. @swinslow @RishabhBhatnagar

convert/spdx_document_conversion_test.go Show resolved Hide resolved
convert/chain.go Show resolved Hide resolved
@@ -6,7 +6,7 @@

# SPDX tools-golang

tools-golang is a collection of Go packages intended to make it easier for
`tools-golang` is a collection of Go packages intended to make it easier for
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be super helpful to link your usage repo somewhere in this README or as a comment in the root /spdx package as an example of how to use this new refactored version!

Copy link
Collaborator Author

@kzantow kzantow Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with the usage repo is it's just a branch that will probably get deleted. Once this is merged and I've updated Syft, I could have a follow-on to the README that links directly to the Syft usage but probably shouldn't put any "permanent" links before that.

It should also be noted that I've updated all the examples -- is this enough? I could also add some more examples, like converting documents...

json/writer.go Show resolved Hide resolved
Copy link
Collaborator

@lumjjb lumjjb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@lumjjb lumjjb merged commit 908a516 into spdx:main Jan 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants