fix(spdx): use the hasExtractedLicensingInfos field for licenses that are not listed in the SPDX #8077

Open
wants to merge 18 commits into
base: main
129 changes: 129 additions & 0 deletions pkg/licensing/expression/category.go
@@ -1,5 +1,12 @@
package expression

import (
"slices"
"strings"

"github.com/samber/lo"
)

// Canonical names of the licenses.
// ported from https://github.com/google/licenseclassifier/blob/7c62d6fe8d3aa2f39c4affb58c9781d9dc951a2d/license_type.go#L24-L177
const (
@@ -358,4 +365,126 @@ var (
Unlicense,
ZeroBSD,
}

// SpdxLicenseExceptions contains all supported SPDX Exceptions
// cf. https://spdx.org/licenses/exceptions.json
// used `awk -F'"' '/"licenseExceptionId":/ {print toupper("\"" $4 "\"," )}' exceptions.json ` command
Collaborator

Since we need to keep it up-to-date, it should be done by mage spdx or something like that. I think we should create a separate file for the list and add

// Code generated by "mage spdx", DO NOT EDIT.
// source: https://spdx.org/licenses/exceptions.json

to the header.

Contributor Author

Do you want to check the exceptions.json file in tests?
I mean the same way as for mage docs:generate.

This would help keep the file up-to-date, but it can be noisy for PRs when a new version of the file is released.
On the other hand, we can add a separate action to check the file's relevance once a week.

Collaborator

Using curl and awk through go generate is the easiest way, but some environments don't have curl, and CLI flags might be different. Ideally, we should do that in Go.
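
As a rough illustration of doing the extraction in Go instead of curl and awk, the sketch below decodes an exceptions.json document and returns the upper-cased IDs. The `licenseExceptionId` field name comes from the awk command above; the top-level `exceptions` key and the helper name `parseExceptionIDs` are assumptions made for this example, not code from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// exceptionList mirrors only the fields this sketch needs.
// The "exceptions" key is an assumption about the JSON layout;
// "licenseExceptionId" is the field the awk command matches on.
type exceptionList struct {
	Exceptions []struct {
		ID string `json:"licenseExceptionId"`
	} `json:"exceptions"`
}

// parseExceptionIDs (hypothetical helper) decodes an exceptions.json
// document and returns the exception IDs upper-cased and sorted.
func parseExceptionIDs(data []byte) ([]string, error) {
	var list exceptionList
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, fmt.Errorf("decode exceptions: %w", err)
	}
	ids := make([]string, 0, len(list.Exceptions))
	for _, e := range list.Exceptions {
		ids = append(ids, strings.ToUpper(e.ID))
	}
	sort.Strings(ids)
	return ids, nil
}

func main() {
	// Inline sample standing in for the downloaded file.
	sample := []byte(`{"exceptions":[
		{"licenseExceptionId":"LLVM-exception"},
		{"licenseExceptionId":"389-exception"}
	]}`)
	ids, err := parseExceptionIDs(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(ids) // [389-EXCEPTION LLVM-EXCEPTION]
}
```

A generator built on this could write the slice into a separate file with the "Code generated" header suggested earlier in the thread.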

Collaborator

@knqyf263 Dec 17, 2024

> Do you want to check exceptions.json file in tests?

No, we don't need it for now. We can update the file when we notice that.

Contributor Author

Yes, that's what I planned.
This command (using awk and other commands) is just a quick way to get all exceptions.

Collaborator

> we can add a separate action to check the file's relevance once a week.

This sounds better. Also, we don't need to fail the test. We can send a notification to Microsoft Teams instead.

Contributor Author

I updated the PR.
Please take a look when you have time.

spdxLicenseExceptions = []string{
"389-EXCEPTION",
"ASTERISK-EXCEPTION",
"ASTERISK-LINKING-PROTOCOLS-EXCEPTION",
"AUTOCONF-EXCEPTION-2.0",
"AUTOCONF-EXCEPTION-3.0",
"AUTOCONF-EXCEPTION-GENERIC",
"AUTOCONF-EXCEPTION-GENERIC-3.0",
"AUTOCONF-EXCEPTION-MACRO",
"BISON-EXCEPTION-1.24",
"BISON-EXCEPTION-2.2",
"BOOTLOADER-EXCEPTION",
"CLASSPATH-EXCEPTION-2.0",
"CLISP-EXCEPTION-2.0",
"CRYPTSETUP-OPENSSL-EXCEPTION",
"DIGIRULE-FOSS-EXCEPTION",
"ECOS-EXCEPTION-2.0",
"ERLANG-OTP-LINKING-EXCEPTION",
"FAWKES-RUNTIME-EXCEPTION",
"FLTK-EXCEPTION",
"FMT-EXCEPTION",
"FONT-EXCEPTION-2.0",
"FREERTOS-EXCEPTION-2.0",
"GCC-EXCEPTION-2.0",
"GCC-EXCEPTION-2.0-NOTE",
"GCC-EXCEPTION-3.1",
"GMSH-EXCEPTION",
"GNAT-EXCEPTION",
"GNOME-EXAMPLES-EXCEPTION",
"GNU-COMPILER-EXCEPTION",
"GNU-JAVAMAIL-EXCEPTION",
"GPL-3.0-INTERFACE-EXCEPTION",
"GPL-3.0-LINKING-EXCEPTION",
"GPL-3.0-LINKING-SOURCE-EXCEPTION",
"GPL-CC-1.0",
"GSTREAMER-EXCEPTION-2005",
"GSTREAMER-EXCEPTION-2008",
"I2P-GPL-JAVA-EXCEPTION",
"KICAD-LIBRARIES-EXCEPTION",
"LGPL-3.0-LINKING-EXCEPTION",
"LIBPRI-OPENH323-EXCEPTION",
"LIBTOOL-EXCEPTION",
"LINUX-SYSCALL-NOTE",
"LLGPL",
"LLVM-EXCEPTION",
"LZMA-EXCEPTION",
"MIF-EXCEPTION",
"NOKIA-QT-EXCEPTION-1.1",
"OCAML-LGPL-LINKING-EXCEPTION",
"OCCT-EXCEPTION-1.0",
"OPENJDK-ASSEMBLY-EXCEPTION-1.0",
"OPENVPN-OPENSSL-EXCEPTION",
"PCRE2-EXCEPTION",
"PS-OR-PDF-FONT-EXCEPTION-20170817",
"QPL-1.0-INRIA-2004-EXCEPTION",
"QT-GPL-EXCEPTION-1.0",
"QT-LGPL-EXCEPTION-1.1",
"QWT-EXCEPTION-1.0",
"ROMIC-EXCEPTION",
"RRDTOOL-FLOSS-EXCEPTION-2.0",
"SANE-EXCEPTION",
"SHL-2.0",
"SHL-2.1",
"STUNNEL-EXCEPTION",
"SWI-EXCEPTION",
"SWIFT-EXCEPTION",
"TEXINFO-EXCEPTION",
"U-BOOT-EXCEPTION-2.0",
"UBDL-EXCEPTION",
"UNIVERSAL-FOSS-EXCEPTION-1.0",
"VSFTPD-OPENSSL-EXCEPTION",
"WXWINDOWS-EXCEPTION-3.1",
"X11VNC-OPENSSL-EXCEPTION",
}
Copy link


This comment may deserve to be a separate suggestion, but in reading the code I would recommend building the license and exception IDs from the JSON files maintained by the SPDX legal team. The license list is updated every 3 months with new IDs, and keeping a hand-maintained copy in code up to date can be a challenge. What I do in the code I maintain is attempt to access the current JSON files on the website, https://spdx.org/licenses/licenses.json and https://spdx.org/licenses/exceptions.json. If I cannot access the website, or if the user has chosen not to use the online version, I use a cached version of the file.

Contributor Author

There are cases when users run the tool multiple times in a row.
Downloading these files on every run is not good.
But we can save the licenses.json and exceptions.json files in the cache dir and use them.
The files contain a releaseDate field, so we can update the cached copy only once releaseDate + 3 months has passed.

> The license list is updated every 3 months

How strictly is this rule followed?

Anyway, let's move this discussion to another issue/PR.


> What I do in the code I maintain is attempt to access the current JSON files on the website https://spdx.org/licenses/licenses.json and https://spdx.org/licenses/exceptions.json

I found that https://spdx.org/licenses/exceptions.json and https://github.com/spdx/license-list-data/blob/592c2dcb8497c6fe829eea604045f77d3bce770b/json/exceptions.json are different (see harbour-exception).
Which file would be more correct to use?

Copy link


> How strictly is this rule followed?

Not very strictly. There is, however, a license list version field which is reliably incremented on release.

> I found that https://spdx.org/licenses/exceptions.json and https://github.com/spdx/license-list-data/blob/592c2dcb8497c6fe829eea604045f77d3bce770b/json/exceptions.json are different (see harbour-exception).
> Which file would be more correct to use?

Use the lists at https://spdx.org/licenses; these will always be the latest released version. The master branch of the GitHub repo holds the latest in-development version, which may not be stable. The GitHub repo is tagged with release versions, so the tag for the latest release will match what is on the website.
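
If the license list version field is used to decide whether a cached copy is outdated, the versions need a numeric comparison, because a plain string comparison would rank "3.9" above "3.25". The sketch below assumes the version is a dotted sequence of integers; both that format and the helper name `compareListVersions` are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// compareListVersions (hypothetical helper) compares two dotted numeric
// versions such as "3.9" and "3.25.0". It returns -1, 0, or 1, and an error
// if a component is not an integer. Missing components count as 0.
func compareListVersions(a, b string) (int, error) {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(as) || i < len(bs); i++ {
		var x, y int
		var err error
		if i < len(as) {
			if x, err = strconv.Atoi(as[i]); err != nil {
				return 0, fmt.Errorf("invalid version %q: %w", a, err)
			}
		}
		if i < len(bs) {
			if y, err = strconv.Atoi(bs[i]); err != nil {
				return 0, fmt.Errorf("invalid version %q: %w", b, err)
			}
		}
		switch {
		case x < y:
			return -1, nil
		case x > y:
			return 1, nil
		}
	}
	return 0, nil
}

func main() {
	cmp, err := compareListVersions("3.9", "3.25.0")
	if err != nil {
		panic(err)
	}
	fmt.Println(cmp) // -1: the cached 3.9 list is older than 3.25.0
}
```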

Contributor Author

Okay, I updated the exception list in 659f992.

)

var spdxLicenses map[string]struct{}

func initSpdxLicenses() {
licenseSlices := [][]string{
ForbiddenLicenses,
RestrictedLicenses,
ReciprocalLicenses,
NoticeLicenses,
PermissiveLicenses,
UnencumberedLicenses,
}

for _, licenseSlice := range licenseSlices {
spdxLicenses = lo.Assign(spdxLicenses, lo.SliceToMap(licenseSlice, func(l string) (string, struct{}) {
return l, struct{}{}
}))
}

// Save GNU licenses with "-or-later" and "-only" suffixes
for _, l := range GnuLicenses {
license := SimpleExpr{
License: l,
}
spdxLicenses[license.String()] = struct{}{}

license.HasPlus = true
spdxLicenses[license.String()] = struct{}{}
}

}

// ValidSpdxLicense returns true if the SPDX license list contains the license ID and,
// when a " WITH " clause is present, the SPDX exception list contains the exception.
func ValidSpdxLicense(license string) bool {
if spdxLicenses == nil {
initSpdxLicenses()
}

id, exception, ok := strings.Cut(license, " WITH ")
if _, licenseIdFound := spdxLicenses[id]; licenseIdFound && (!ok || slices.Contains(spdxLicenseExceptions, strings.ToUpper(exception))) {
return true
}
return false
}
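
For comparison, here is a self-contained sketch of the same check with two changes that are suggestions rather than part of the PR: the lazy initialization goes through `sync.Once`, so concurrent callers cannot race on the nil check, and the exceptions are stored in a set instead of being scanned with `slices.Contains`. The names `validLicense`, `sampleLicenses`, and `sampleExceptions` are hypothetical stand-ins for the real lists in category.go.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Stand-in data for this sketch; the real lists live in category.go.
var (
	sampleLicenses   = []string{"MIT", "Apache-2.0", "GPL-2.0-only"}
	sampleExceptions = []string{"CLASSPATH-EXCEPTION-2.0", "LLVM-EXCEPTION"}
)

var (
	initOnce     sync.Once
	licenseSet   map[string]struct{}
	exceptionSet map[string]struct{}
)

// toSet converts a slice into a set for constant-time lookups.
func toSet(items []string) map[string]struct{} {
	set := make(map[string]struct{}, len(items))
	for _, item := range items {
		set[item] = struct{}{}
	}
	return set
}

// validLicense (hypothetical) reports whether the license ID is known and,
// if a " WITH " clause is present, whether the exception is known too.
func validLicense(license string) bool {
	// sync.Once makes the lazy initialization safe for concurrent callers.
	initOnce.Do(func() {
		licenseSet = toSet(sampleLicenses)
		exceptionSet = toSet(sampleExceptions)
	})

	id, exception, hasException := strings.Cut(license, " WITH ")
	if _, ok := licenseSet[id]; !ok {
		return false
	}
	if !hasException {
		return true
	}
	// Exceptions are stored upper-cased, so normalize before the lookup.
	_, ok := exceptionSet[strings.ToUpper(exception)]
	return ok
}

func main() {
	fmt.Println(validLicense("MIT"))                                       // true
	fmt.Println(validLicense("GPL-2.0-only WITH Classpath-exception-2.0")) // true
	fmt.Println(validLicense("GPL-2.0-only WITH Unknown-exception"))       // false
	fmt.Println(validLicense("LicenseRef-custom"))                         // false
}
```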