Lazily prepare distribution for candidate #13239

notatallshaw
Member

During resolution it is often useful to check the name and version of a potential distribution without preparing it: if the version doesn't match the requirements, the distribution never needs to be prepared at all.

I’m still investigating if this is the best solution, but this certainly makes writing test cases for the resolver code easier.

@notatallshaw
Member Author

notatallshaw commented Feb 24, 2025

I'm looking for input from maintainers who've touched this code: is there a reason that the distribution is prepared eagerly?

I’m motivated by resolver optimizations, so lazily preparing the distribution would be very useful. For example, say a distribution is a link to a wheel, and we can determine the name and version of that wheel ahead of time (because we only accept wheel filenames that conform to the spec). If the version doesn't match other requirements on that name during the resolution, there's no need to actually download the wheel. That’s not possible with the current interaction between the resolver and the collector, because the distribution is prepared eagerly.
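To illustrate the idea, here is a minimal sketch, not pip's actual code. `LazyWheelCandidate`, `name_and_version_from_wheel`, and `fake_prepare` are hypothetical names, and the version check is a plain set lookup standing in for real specifier matching. Real code would more likely use `packaging.utils.parse_wheel_filename`, which also normalizes the name.

```python
from functools import cached_property
from typing import Callable


def name_and_version_from_wheel(filename: str) -> tuple[str, str]:
    """Read name and version from a spec-conforming wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Conforming filenames have 5 fields, or 6 with the optional build tag.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return parts[0], parts[1]


class LazyWheelCandidate:
    """Hypothetical candidate: name/version are cheap, dist is prepared on demand."""

    def __init__(self, filename: str, prepare: Callable[[str], object]) -> None:
        self.filename = filename
        self.name, self.version = name_and_version_from_wheel(filename)
        self._prepare = prepare

    @cached_property
    def dist(self) -> object:
        # Runs at most once, and only if something actually asks for the dist.
        return self._prepare(self.filename)


prepared: list[str] = []


def fake_prepare(filename: str) -> object:
    # Stand-in for the expensive step (download + metadata read).
    prepared.append(filename)
    return {"filename": filename}


candidates = [
    LazyWheelCandidate(f, fake_prepare)
    for f in ("demo-2.0-py3-none-any.whl", "demo-1.4-py3-none-any.whl")
]
wanted_versions = {"1.4"}  # assumption: stand-in for a real specifier check
matching = [c for c in candidates if c.version in wanted_versions]
for candidate in matching:
    candidate.dist  # only matching candidates are ever prepared
```

In this sketch the 2.0 wheel is filtered out on its filename alone, so `fake_prepare` is never called for it.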

I see there are a few failing tests, but at a glance I think these can be updated.

@uranusjr
Member

I don’t think there’s a particular reason, aside from the fact that the distribution is generally needed for most candidates anyway, so lazily preparing it wasn’t very useful. We have a lot more mechanisms now, so laziness may be worthwhile.

@notatallshaw
Member Author

notatallshaw commented Feb 25, 2025

Okay, so investigating the failures a bit further:

  • If the candidate has invalid metadata, that isn't discovered until later
  • Currently, packages with invalid metadata are skipped as part of collecting
  • However, if preparing the distribution were made lazy, candidates with invalid dependencies wouldn't be discovered until resolvelib calls PipProvider.get_dependencies
  • This means pip would throw an error after the collection phase, and there's no way to tell the resolver to skip to the next candidate

I believe this is a design weakness in the current resolution process, and there should be a two phase collection:

  1. Does the candidate version match the current requirements?
  2. Is the candidate actually valid?

If 1 is False then 2 can be skipped. I will think about whether there's a reasonable way to add this to the resolvelib API, and come back to pip if/when that is done.
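A rough sketch of what the two phases could look like, under stated assumptions: this is not the resolvelib API, and `two_phase_candidates`, `InvalidMetadata`, `version_matches`, and `validate` are all hypothetical names. Candidates here are just `(version, is_valid)` tuples.

```python
from typing import Callable, Iterable, Iterator, TypeVar

C = TypeVar("C")


class InvalidMetadata(Exception):
    """Hypothetical error raised when a candidate's metadata is unusable."""


def two_phase_candidates(
    candidates: Iterable[C],
    version_matches: Callable[[C], bool],
    validate: Callable[[C], None],
) -> Iterator[C]:
    for candidate in candidates:
        # Phase 1: cheap check on name/version only.
        if not version_matches(candidate):
            continue
        # Phase 2: expensive check, reached only if phase 1 passed.
        try:
            validate(candidate)
        except InvalidMetadata:
            continue  # skip to the next candidate instead of failing
        yield candidate


validated = []


def validate(candidate):
    # Stand-in for preparing the distribution and checking its metadata.
    validated.append(candidate)
    if not candidate[1]:
        raise InvalidMetadata(candidate[0])


allowed = {"2.0", "1.9"}  # assumption: stand-in for real specifier matching
found = list(
    two_phase_candidates(
        [("3.0", True), ("2.0", False), ("1.9", True), ("1.0", True)],
        version_matches=lambda c: c[0] in allowed,
        validate=validate,
    )
)
```

Only the two candidates that pass the version check are validated, and the invalid one is skipped rather than raising after collection.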

@ichard26
Member

I believe this is a design weakness in the current resolution process, and there should be a two phase collection:

This is essentially the same conclusion (pip's resolution design doesn't support this) I came to while writing #13160 when I originally included the lazifying change, although I didn't think through redesigning the resolution process 🙂

@uranusjr
Member

Historically (before static metadata was a thing), you could only get the version after preparation, so this was likely a design decision made to simplify the implementation.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 12, 2025