-
-
Notifications
You must be signed in to change notification settings - Fork 595
Improve license required phrase generation #4237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Put all required phrase validation in one code block Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Ensure it works with all rules at once Add filter to run only a few times and limit the minimum size of rules to gen. Use SPDX keys for rules too. Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Do not generate rules for "license key". Skip short rules that would contain stopwords as they cannot be matched accurately Validate short required phrase rules for stopwords Add tests that highlight the stop word issue Reference: #4238 Signed-off-by: Philippe Ombredanne <[email protected]>
Do not generate require phrase rules for these rules. Reference: #4238 Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
plugin_license2.python
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops... should not be committed. junk local file!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pombredanne thanks++ for handling these. LGTM!
I will make the updates for a few nits and merge this after we merge #4215
@@ -2,6 +2,7 @@ | |||
license_expression: bsd-new | |||
is_license_text: yes | |||
is_continuous: yes | |||
skip_for_required_phrase_generation: yes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we set the minimum_coverage
instead here?
This is better for large parts of texts in rules that we want to detect exactly, more so when we want to match the entire rule exactly.
notes: Note a license | ||
--- | ||
|
||
Some units are dual licensed and some are specifically Artistic-only. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be a false positive, we should instead make this a license clue for artistic, see example at https://github.com/natias/arnon/blob/e1233fdd840601e202d3bf1d3e9e5aff9a2e13d5/docker_with_perl_dbd/xzbnl#L245.
}} | ||
|
||
<licenses><license><name>MPL</name><url>http://www.mozilla.org/MPL/2.0/index.txt</url><distribution>repo</distribution></license><license><name>LGPL</name><url>http://www.gnu.org/licenses/lgpl-2.1.txt</url><distribution>repo</distribution></license><license><name>GPL</name><url>http://www.gnu.org/licenses/gpl-2.0.txt</url><distribution>repo</distribution></license></licenses> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not have this entire rule as a required phrase, instead we should hane the name/urls as required phrases.
}} | ||
|
||
<licenses><license><name>MPL</name><url>http://www.mozilla.org/MPL/2.0/index.txt</url><distribution>repo</distribution></license><license><name>LGPL</name><url>https://www.gnu.org/licenses/lgpl-2.1.txt</url><distribution>repo</distribution></license><license><name>GPL</name><url>https://www.gnu.org/licenses/gpl-2.0.txt</url><distribution>repo</distribution></license></licenses> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not have this entire rule as a required phrase, instead we should hane the name/urls as required phrases.
ignorable_urls: | ||
- http://www.gnu.org/licenses/gpl.txt | ||
- http://www.gnu.org/licenses/lgpl.txt | ||
- https://www.mozilla.org/en-US/MPL/1.1/ | ||
--- | ||
{{ | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not have this entire rule as a required phrase, instead we should hane the name/urls as required phrases.
Return True if this phrase is a minimally suitable to use as a required phrase | ||
Return True if this phrase is a minimally suitable to use as a required phrase. | ||
Use the original rule to ensure we skip when referenced_filenames could be damaged. | ||
Also skip short rules that would contain stopwords as they could not be detected correctly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably also check here for required phrase texts which are quite large as a refinement, as these can be better handled either by minimum coverage or by skipping required phrase generation from the source rules.
Signed-off-by: Philippe Ombredanne <[email protected]>
0227a0a
to
76f8559
Compare
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
76f8559
to
a5116ff
Compare
This PR builds on top of:
Specifically:
Tasks
Run tests locally to check for errors.