✨ Add --detect-licenses flag#45

Draft
ddelange wants to merge 2 commits into master from recursive-licenses

Conversation

ddelange (Owner) commented Feb 8, 2021

This draft will most likely not be finished: licensing is a rabbit hole that is practically impossible to get right, given the lack of strictness in the ecosystem.

As pipgrip only has access to wheels, many licenses will not be present (see code comments for examples). Covering those cases would call for a source distribution fallback, which is not trivial.

Assuming authors distribute their packages correctly, legal files should be present in wheels (ref https://wheel.readthedocs.io/en/stable/user_guide.html#including-license-files-in-the-generated-wheel-file, found at https://jwodder.github.io/kbits/posts/pypkg-mistakes/#top-level-readme-or-license-file-in-wheel). Sadly, this is not the case in practice: even pip's vendored licenses aren't reproduced in the pip wheel.
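To check a given wheel for the legal files described above, a minimal sketch could look like the following. It assumes that legal files, when shipped, sit somewhere under the wheel's `.dist-info` directory; the helper name `find_legal_files` and the filename pattern are illustrative assumptions, not part of pipgrip.

```python
import re
import zipfile

# Filename stems commonly used for legal files. This list is an assumption
# for illustration, not an official standard.
LEGAL_FILE_RE = re.compile(r"^(LICEN[CS]E|COPYING|NOTICE|AUTHORS)", re.IGNORECASE)


def find_legal_files(wheel_path):
    """Return members under *.dist-info/ whose basename looks like a legal file."""
    found = []
    with zipfile.ZipFile(wheel_path) as wheel:  # a wheel is a plain zip archive
        for name in wheel.namelist():
            parts = name.split("/")
            # Only consider files inside the .dist-info directory,
            # including subdirectories such as licenses/.
            if len(parts) < 2 or not parts[0].endswith(".dist-info"):
                continue
            if LEGAL_FILE_RE.match(parts[-1]):
                found.append(name)
    return sorted(found)
```

An empty result for a wheel means no license text travels with that distribution, which is the gap this PR runs into.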

ddelange force-pushed the recursive-licenses branch 2 times, most recently from 14c238f to 5b60da8 on February 9, 2021 11:04

jdvala commented Jun 19, 2021

@ddelange I suppose the license information would be available somewhere, right? I mean at the repo level. If setup.py contains information about the repo, we could use APIs to query the license. Just saying.

ddelange (Owner, Author) commented Jun 24, 2021

Hi @jdvala 💥

Indeed, the license can often be found in the (metadata of the) repo. The repo (hosted source) can live on any type of VCS hosting, or even be a plain HTML sitemap or so.

Elaborating on the inline comment: technically speaking, if the license is missing from the wheel (the distribution which pipgrip installs and which the user executes), then for most licenses that counts as a failure to reproduce the license. This violation aside, there is at that point no guarantee that the license you pick up from another distribution (e.g. the hosted source, or an sdist downloaded from PyPI) corresponds to the distribution on your system. Usually, a license is valid only in full text, delivered alongside the actual distribution or embedded in each file or so. There are also other legal files like AUTHORS, which might be required to build a complete/valid 'license info package' (so more than just 'pipgrip': 'BSD-3') for a distribution you want to run.

Some existing tools I've seen provide a 'confidence level' for their license labels, and mostly can't back that label up with the license full text for that specific version. I guess under the 'something is better than nothing' philosophy, and given the lack of licensing standardisation in the Python ecosystem, falling back to e.g. the hosted source, PyPI warehouse metadata, source distributions (sdist) etc. is currently the best alternative.
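As a rough illustration of such a fallback chain, the sketch below walks an ordered list of metadata sources and reports which source the label came from, so the caller can judge how much to trust it. It assumes each source has already been loaded into a mapping with `license` and `classifiers` keys (the shape of the `info` object in PyPI's JSON API); the function names and the 100-character cut-off are hypothetical choices, not an existing tool's behaviour.

```python
def license_from_metadata(info):
    """Return (label, field) from a metadata mapping, or (None, None)."""
    # Trove classifiers are the most structured hint, so try them first.
    classifiers = [
        c for c in (info.get("classifiers") or []) if c.startswith("License ::")
    ]
    if classifiers:
        return classifiers[0].split(" :: ")[-1], "classifier"
    text = (info.get("license") or "").strip()
    # Assumption: long or multi-line values are pasted full texts, not labels.
    if text and len(text) <= 100 and "\n" not in text:
        return text, "license field"
    return None, None


def first_license(sources):
    """Pick the first usable label from (source_name, info) pairs, most trusted first."""
    for source_name, info in sources:
        label, field = license_from_metadata(info)
        if label:
            return {"license": label, "source": source_name, "field": field}
    return {"license": None, "source": None, "field": None}
```

Note that a label found this way is only a hint: as argued above, it does not prove that the matching license text ships with the distribution on your system.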

ddelange (Owner, Author) commented Jul 2, 2024

To get an overview of license information present in wheel metadata (using pip and jq, not pipgrip):

pip install -r requirements.txt -qq --no-deps --ignore-installed --disable-pip-version-check --dry-run --report - | jq -c '.install[].metadata | [.name, .license, (try .classifier | map(select(contains("License"))))]'
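For anyone without jq, roughly the same overview can be produced in Python from the JSON that `--report -` writes. This is a sketch, assuming the report layout the jq filter above relies on (an `install` list whose entries carry a `metadata` mapping with `name`, `license` and `classifier` keys); `summarize_report` is a hypothetical helper name, and unlike the jq filter it returns an empty list when a package has no classifiers.

```python
def summarize_report(report):
    """Return [name, license, license-related classifiers] per package in the report."""
    rows = []
    for item in report.get("install", []):
        meta = item.get("metadata", {})
        # Keep only classifiers that mention "License", like the jq select().
        classifiers = [c for c in (meta.get("classifier") or []) if "License" in c]
        rows.append([meta.get("name"), meta.get("license"), classifiers])
    return rows
```

To use it, pipe the pip command's output into a small script that calls `summarize_report(json.load(sys.stdin))` and prints each row.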
