Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add link-check subcommand to verify package source URLs #406

Closed
wants to merge 1 commit into from

Conversation

ammernico
Copy link
Member

@ammernico ammernico commented Aug 15, 2024

The previous link checker pull request was blocked due to license issues in the lychee-lib crate. I took the previous commit and made some minor changes to make it work with the current version of lychee-lib and clap crate.

@christophprokop
Copy link
Collaborator

Very helpful, but the error message should also provide the path to the corresponding toml file and the id/section of the url that failed (because multiple entries within the same file could fail)

Copy link
Member

@primeos-work primeos-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine - should be ready for a final test after #405 is merged.

- Implemented a new `link-check` subcommand to validate the source
  URLs of packages.
- Integrated URL validation using `lychee_lib` to ensure links are
  active and correct.

Signed-off-by: Nico Steinle <[email protected]>
Co-authored-by: Matthias Beyer <[email protected]>
@ammernico ammernico marked this pull request as ready for review September 4, 2024 17:02
@primeos-work primeos-work changed the title Link checker Add link-check subcommand to verify package source URLs Sep 12, 2024
Copy link
Member

@primeos-work primeos-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO lychee pulls in way too many additional dependencies and I'm not sure if we really gain much over using reqwest like for the downloads.

We should be able to make a HEAD request 0 and extract the HTTP status code from that 1 (plus handle errors) without much more code and with zero additional dependencies.

@@ -49,6 +49,7 @@ indicatif = "0.17"
indoc = "2"
itertools = "0.13"
lazy_static = "1"
lychee-lib = { version = "0.15", default-features = false }
Copy link
Member

@primeos-work primeos-work Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I just noticed that this adds 91 dependencies (s. Cargo.lock) even with the default-features = false which is kinda insane (grows our dependencies by 25 % from 360 to 451).

IMO this isn't worth it for this feature as the additional maintenance burden is too high (more security advisories and other issues, longer build times, etc.).
Luckily we should be able to implement this with reqwest instead (which we already use for the downloads) without too much additional effort. IMO that's the way to go - what do you think @christophprokop?

Here is the corresponding cargo output:

Adding ahash v0.8.11
Adding allocator-api2 v0.2.18
Adding arc-swap v1.7.1
Adding async-compression v0.4.12
Adding async-stream v0.3.5
Adding async-stream-impl v0.3.5
Adding base64 v0.21.7 (latest: v0.22.1)
Adding by_address v1.2.1
Adding cached v0.49.3 (latest: v0.53.1)
Adding cached_proc_macro v0.20.0 (latest: v0.23.0)
Adding cached_proc_macro_types v0.1.1
Adding cookie v0.17.0 (latest: v0.18.1)
Adding cookie v0.18.1
Adding cookie_store v0.20.0 (latest: v0.21.0)
Adding cookie_store v0.21.0
Adding crossbeam v0.8.4
Adding crossbeam-channel v0.5.13
Adding crossbeam-queue v0.3.11
Adding darling v0.14.4 (latest: v0.20.10)
Adding darling_core v0.14.4 (latest: v0.20.10)
Adding darling_macro v0.14.4 (latest: v0.20.10)
Adding dashmap v5.5.3 (latest: v6.1.0)
Adding derivative v2.2.0
Adding dirs v5.0.1
Adding dirs-sys v0.4.1
Adding email_address v0.2.9
Adding flume v0.10.14 (latest: v0.11.0)
Adding futf v0.1.5
Adding getopts v0.2.21
Adding glob v0.3.1
Adding hashbrown v0.12.3 (latest: v0.14.5)
Adding headers v0.4.0
Adding headers-core v0.3.0
Adding html5ever v0.27.0 (latest: v0.29.0)
Adding html5gum v0.5.7
Adding hyper-rustls v0.26.0 (latest: v0.27.3)
Adding hyper-timeout v0.5.1
Adding idna v0.3.0 (latest: v1.0.2)
Adding indexmap v1.9.3 (latest: v2.5.0)
Adding ip_network v0.4.1
Adding iri-string v0.7.4
Adding jetscii v0.5.3
Adding jsonwebtoken v9.3.0
Adding jwalk v0.8.1
Adding linkify v0.10.0
Adding lychee-lib v0.15.1
Adding mac v0.1.1
Adding markup5ever v0.12.1 (latest: v0.14.0)
Adding nanorand v0.7.0
Adding new_debug_unreachable v1.0.6
Adding num-bigint v0.4.6
Adding num-integer v0.1.46
Adding num_cpus v1.16.0
Adding octocrab v0.38.0 (latest: v0.39.0)
Adding option-ext v0.2.0
Adding par-stream v0.10.2
Adding path-clean v1.0.1
Adding pem v3.0.4
Adding phf v0.11.2
Adding phf_codegen v0.11.2
Adding phf_generator v0.10.0 (latest: v0.11.2)
Adding phf_generator v0.11.2
Adding phf_shared v0.10.0 (latest: v0.11.2)
Adding phf_shared v0.11.2
Adding precomputed-hash v0.1.1
Adding psl-types v2.0.11
Adding publicsuffix v2.2.3
Adding pulldown-cmark v0.9.6 (latest: v0.12.1)
Adding redox_users v0.4.6
Adding reqwest_cookie_store v0.7.0 (latest: v0.8.0)
Adding rustls v0.22.4 (latest: v0.23.13)
Adding rustls-native-certs v0.7.3 (latest: v0.8.0)
Adding secrecy v0.8.0
Adding serde_path_to_error v0.1.16
Adding serde_with v3.9.0
Adding serde_with_macros v3.9.0
Adding shellexpand v3.1.0
Adding simple_asn1 v0.6.2
Adding siphasher v0.3.11 (latest: v1.0.1)
Adding snafu v0.8.4
Adding snafu-derive v0.8.4
Adding string_cache v0.8.7
Adding string_cache_codegen v0.5.2
Adding strsim v0.10.0 (latest: v0.11.1)
Adding tendril v0.4.3
Adding tokio-rustls v0.25.0 (latest: v0.26.0)
Adding tower-http v0.5.2
Adding typed-builder v0.18.2 (latest: v0.20.0)
Adding typed-builder-macro v0.18.2 (latest: v0.20.0)
Adding unicase v2.7.0
Adding utf-8 v0.7.6

Edit: Forgot to copy & paste this part:

# Before:
$ grep -F '[[package]]' Cargo.lock | wc -l
360
$ cargo tree | wc -l
740

# After:
$ grep -F '[[package]]' Cargo.lock | wc -l
451
$ cargo tree | wc -l
1076

# Diff:
>>> 451-360
91
>>> 451/360
1.2527777777777778

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feature is definitely not worth pulling in so many additional dependencies.
It would have been a "nice to have" feature from the beginning.
I suggest we just drop this feature completely. We have more important TODOs :)

@primeos-work primeos-work linked an issue Sep 16, 2024 that may be closed by this pull request
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add URL-Link checker
3 participants