Licenses export #1130

Open
bxf12315 opened this issue Jan 9, 2025 · 1 comment

bxf12315 commented Jan 9, 2025

Port the license export functionality, which is already implemented in version 1, to version 2.

bxf12315 added this to Trustify Jan 9, 2025

bxf12315 commented Jan 22, 2025

Analysis of the license data structure.

An SBOM typically contains three types of license information:

1 License information for the entire SBOM

1.1 CycloneDX SBOM-level License

"licenses" : [
  { 
    "license" : { 
      "id" : "Apache-2.0" 
    }
  }
]

1.2 SPDX SBOM-level License

"dataLicense": "CC0-1.0",

2 Package-level licenses

2.1 CycloneDX

A CycloneDX license entry takes one of three forms: id, name, or expression.
The expression form follows the SPDX license expression standard.

"licenses" : [
  {
    "license" : {
      "id" : "EPL-1.0"
    }
  },
  {
    "license" : {
      "name" : "GNU Lesser General Public License",
      "url" : "http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html"
    }
  }
],

2.2 SPDX

The package license is always an expression and follows the SPDX license expression standard.

"licenseDeclared": "Apache-2.0 OR LicenseRef-GPL-2.0-with-classpath-exception",

2.3 Special Types in SPDX: ExtractedLicensingInfos

"hasExtractedLicensingInfos": [
  {
    "comment": "External License Info is obtained from a build system which predates the SPDX specification and is not strict in accepting valid SPDX licenses.",
    "extractedText": "The license info found in the package meta data is: GPL-2.0-with-classpath-exception. See the specific package info in this SPDX document or the package itself for more details.",
    "licenseId": "LicenseRef-GPL-2.0-with-classpath-exception",
    "name": "GPL-2.0-with-classpath-exception"
  },

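The licenseId links an entry to the license of a package: it is the same LicenseRef-* identifier that appears in the package's license expression (see the licenseDeclared value in 2.2).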
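The association between a package expression and hasExtractedLicensingInfos can be sketched as a lookup by licenseId. This is only a minimal illustration, not trustify code: the struct, both function names, and the whitespace/parenthesis tokenization are assumptions, and document-qualified references (DocumentRef-...:LicenseRef-...) are not handled.

```rust
use std::collections::HashMap;

/// One entry of SPDX `hasExtractedLicensingInfos` (only the fields used here).
/// Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct ExtractedLicensingInfo {
    pub license_id: String,
    pub name: String,
    pub extracted_text: String,
}

/// Index the extracted licensing infos by their `licenseId`.
pub fn index_by_license_id(
    infos: Vec<ExtractedLicensingInfo>,
) -> HashMap<String, ExtractedLicensingInfo> {
    infos
        .into_iter()
        .map(|info| (info.license_id.clone(), info))
        .collect()
}

/// Return the extracted licensing infos referenced by a package expression.
/// Simplification: tokens are split on whitespace and parentheses only.
pub fn resolve_license_refs<'a>(
    expression: &str,
    index: &'a HashMap<String, ExtractedLicensingInfo>,
) -> Vec<&'a ExtractedLicensingInfo> {
    expression
        .split(|c: char| c.is_whitespace() || c == '(' || c == ')')
        .filter(|token| token.starts_with("LicenseRef-"))
        .filter_map(|token| index.get(token))
        .collect()
}
```

With the licenseDeclared value from 2.2, resolve_license_refs would return the single entry whose name is GPL-2.0-with-classpath-exception, while Apache-2.0 is skipped because it is not a LicenseRef.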

2 Current license export requirements

https://docs.google.com/spreadsheets/d/1Ki9hkLl94L3G4A4rfQQ4wuEnQFTPkpNflClVjlk2mQA/edit?gid=0#gid=0

3 Current license data structure design in trustify

pub struct Model {
    #[sea_orm(primary_key)]
    pub id: Uuid,
    pub text: String,
    pub spdx_licenses: Option<Vec<String>>,
    pub spdx_license_exceptions: Option<Vec<String>>,
}

For CycloneDX, the text field holds the Id, the Name, or the complete Expression. When it is an Expression, it is parsed into a list of licenses stored in spdx_licenses, and the license exceptions found during parsing are stored in spdx_license_exceptions.
For SPDX, the text field holds the complete Expression. It is parsed the same way: licenses go into spdx_licenses and license exceptions into spdx_license_exceptions.
SPDX's ExtractedLicensingInfos and the SBOM-level licenses are currently not stored.
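To make the split between spdx_licenses and spdx_license_exceptions concrete, here is a rough sketch of how one expression could be divided into the two lists. It is not the parser trustify uses: ParsedExpression and split_expression are hypothetical names, and the tokenizer only recognises uppercase AND / OR / WITH and parentheses, without validating anything.

```rust
/// Licenses and exceptions collected from one expression.
/// Hypothetical type; the two fields mirror `spdx_licenses` and
/// `spdx_license_exceptions` of the model above.
#[derive(Debug, Default, PartialEq)]
pub struct ParsedExpression {
    pub licenses: Vec<String>,
    pub exceptions: Vec<String>,
}

/// Naive split of an SPDX-style expression into license ids and exception ids.
/// Not a validating parser: the identifier right after WITH is treated as an
/// exception, every other non-operator token as a license.
pub fn split_expression(expression: &str) -> ParsedExpression {
    let mut parsed = ParsedExpression::default();
    let mut after_with = false;
    let tokens = expression
        .split(|c: char| c.is_whitespace() || c == '(' || c == ')')
        .filter(|t| !t.is_empty());
    for token in tokens {
        match token {
            "AND" | "OR" => after_with = false,
            "WITH" => after_with = true,
            id if after_with => {
                parsed.exceptions.push(id.to_string());
                after_with = false;
            }
            id => parsed.licenses.push(id.to_string()),
        }
    }
    parsed
}
```

For an input such as "(GPL-2.0-only WITH Classpath-exception-2.0) OR Apache-2.0", the sketch puts GPL-2.0-only and Apache-2.0 into licenses and Classpath-exception-2.0 into exceptions. The operators themselves are dropped, so the original string still has to be kept in text.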

4 License data structure in GUAC

Image

GUAC parses all expressions into individual licenses and saves them separately. LicenseRef* represents ExtractedLicensingInfos, with its extractedText stored in the inline field.

5 Current design proposal

Image
pub enum LicenseCategory {
    #[sea_orm(string_value = "slc")]
    SPDXDECLARED,
    #[sea_orm(string_value = "sld")]
    SPDXCONCLUDED,
    #[sea_orm(string_value = "clci")]
    CYDXLCID,
    #[sea_orm(string_value = "clcn")]
    CYDXLCNAME,
    #[sea_orm(string_value = "cle")]
    CYDXLEXPRESSION,
    #[sea_orm(string_value = "cd")]
    CLEARLYDEFINED,
    #[sea_orm(string_value = "o")]
    OTHER,
}

SPDXDECLARED: The license from an SPDX SBOM's declared license.
SPDXCONCLUDED: The license from an SPDX SBOM's concluded license.
CYDXLCID: The license from a CycloneDX SBOM's license id.
CYDXLCNAME: The license from a CycloneDX SBOM's license name.
CYDXLEXPRESSION: The license from a CycloneDX SBOM's license expression.
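How a category would be picked during ingestion can be sketched as a simple mapping from the place a license string was found to one of the variants above. The enum below is a plain-Rust mirror without the sea_orm derive, its stored values are copied from the string_value attributes above, and LicenseSource, categorize, and as_db_value are hypothetical names used only for this illustration.

```rust
/// Plain-Rust mirror of the proposed `LicenseCategory` (no sea_orm derive).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LicenseCategory {
    SPDXDECLARED,
    SPDXCONCLUDED,
    CYDXLCID,
    CYDXLCNAME,
    CYDXLEXPRESSION,
    CLEARLYDEFINED,
    OTHER,
}

impl LicenseCategory {
    /// Stored string value, copied from the `string_value` attributes.
    pub fn as_db_value(self) -> &'static str {
        match self {
            LicenseCategory::SPDXDECLARED => "slc",
            LicenseCategory::SPDXCONCLUDED => "sld",
            LicenseCategory::CYDXLCID => "clci",
            LicenseCategory::CYDXLCNAME => "clcn",
            LicenseCategory::CYDXLEXPRESSION => "cle",
            LicenseCategory::CLEARLYDEFINED => "cd",
            LicenseCategory::OTHER => "o",
        }
    }
}

/// Hypothetical description of where a license string was found in an SBOM.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LicenseSource {
    SpdxLicenseDeclared,
    SpdxLicenseConcluded,
    CycloneDxId,
    CycloneDxName,
    CycloneDxExpression,
}

/// Map the origin of a license string to the category it would be stored under.
pub fn categorize(source: LicenseSource) -> LicenseCategory {
    match source {
        LicenseSource::SpdxLicenseDeclared => LicenseCategory::SPDXDECLARED,
        LicenseSource::SpdxLicenseConcluded => LicenseCategory::SPDXCONCLUDED,
        LicenseSource::CycloneDxId => LicenseCategory::CYDXLCID,
        LicenseSource::CycloneDxName => LicenseCategory::CYDXLCNAME,
        LicenseSource::CycloneDxExpression => LicenseCategory::CYDXLEXPRESSION,
    }
}
```

For the CycloneDX example in 2.1, the EPL-1.0 entry would map to CYDXLCID and the "GNU Lesser General Public License" entry to CYDXLCNAME.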

In this design, license_id is the same as name in GUAC, and license_ref_id is a foreign key that links to the Id of extracted_licensing_infos. Two unit tests illustrate the design: https://github.com/trustification/trustify/pull/1164/files#diff-e2251167d81406b13fac64820ca3c5af49095705376eb2d6a21830270605d69cR99.
This design currently has two issues:

  1. The complete expression is not saved. The complete expression carries not only the individual
    licenses but also the relationships between them.
  2. SBOM-level licenses are not saved.
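The first issue can be shown with a small sketch: once an expression is reduced to its individual licenses, two expressions with different meanings become indistinguishable. The license_ids helper is hypothetical and uses a naive tokenizer that only recognises uppercase AND / OR / WITH and parentheses.

```rust
use std::collections::BTreeSet;

/// Reduce an expression to the set of identifiers it mentions,
/// dropping operators and parentheses (naive tokenizer, no validation).
pub fn license_ids(expression: &str) -> BTreeSet<String> {
    expression
        .split(|c: char| c.is_whitespace() || c == '(' || c == ')')
        .filter(|t| !t.is_empty() && !matches!(*t, "AND" | "OR" | "WITH"))
        .map(str::to_string)
        .collect()
}

/// True when two different expressions collapse to the same license set,
/// which is exactly the information lost when only the parts are stored.
pub fn loses_relationship(a: &str, b: &str) -> bool {
    a != b && license_ids(a) == license_ids(b)
}
```

"MIT AND Apache-2.0" and "MIT OR Apache-2.0" both reduce to the same two identifiers, so loses_relationship returns true for that pair. The AND/OR distinction can only be recovered if the complete expression is stored alongside the individual licenses.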
