Skip to content

client/updater design: let user download targets? #1142

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jku opened this issue Sep 14, 2020 · 7 comments
Closed

client/updater design: let user download targets? #1142

jku opened this issue Sep 14, 2020 · 7 comments
Labels
client Related to the client (updater) implementation

Comments

@jku
Copy link
Member

jku commented Sep 14, 2020

This is a future wishlist item for client API (#1135), based on working on pip integration.

Current issues with TUF from pip point of view:

  • No way to implement progress indication (a current feature) in pip
  • Difficult to implement parallel downloads in pip (a potential future feature)
  • No way to control low level HTTP details (authentication, timeouts, concurrency)

Some of these could be fixed by adding more complexity to TUF... but potentially the correct fix is to separate downloading from the other TUF functionality and let the clients do that themselves.

High level functionality list of the client currently looks like this:

  1. generic metadata update (client.updater.Updater.refresh())
  2. target metadata update (client.updater.Updater.get_one_valid_targetinfo())
  3. resolving potential download URLs for target (mirrors.get_list_of_mirrors(), happens during all metadata and target downloads)
  4. local cache check (client.updater.Updater.updated_targets())
  5. downloading target (client.updater.Updater.download_target())
  6. verifying target (happens inside client.updater.Updater.download_target())

Maybe we can optionally let user handle "downloading target" and expose "verifying target" step in a more usable way to the user. This means that "resolving download URLs" must be better exposed to user. Also "local cache check" need re-design: if user downloads files, they also control caching so we can't expect tuf cache directory structure to exist.

@jku
Copy link
Member Author

jku commented Sep 14, 2020

See also #1135 for more details on mirror configuration (which relates to resolving download URLs part)

@joshuagl joshuagl added the client Related to the client (updater) implementation label Sep 21, 2020
@joshuagl joshuagl added this to the Client Refactor milestone Sep 21, 2020
@sechkova
Copy link
Contributor

sechkova commented Oct 7, 2020

I had a look on the current implementation and I would vote for implementing minimal changes in the current Updater code trying to achieve "let user download targets" sooner and then transfer this to the client refactor.

I have some questions and comments for @jku but I'd appreciate general comments related to the client API:

let user handle "downloading target" and expose "verifying target"

I don't see a reason why we cannot expose a verify_target function and keep the current download_target (which I would rename to a more explicit download_verified_target but this will change the current API so maybe not).
Anyway, the community may have its security reasons about not letting the client handle downloads on its own and I would like to hear them before implementing it. (@joshuagl @lukpueh @trishankatdatadog )

"resolving download URLs" must be better exposed to user

I can speculate what do you mean @jku but can you be more specific :) Do you mean adding get_list_of_mirrors or similar to the client module or you dislike something from the function internals?

"local cache check" need re-design

Currently "updated_targets" looks for targets in a path constructed by filepath and destination_directory. Do you think this logic has to be handled by user too? Maybe an example form the pip local cache that don't match tuf's expectations?

@jku
Copy link
Member Author

jku commented Oct 7, 2020

So I think I was wrong and tuf can handle the cache -- and probably needs to (pip should just disable http caching for these downloads).

  1. user wants to download target name, checks cache with Updater.updated_targets()
  2. User asks updater what needs to be downloaded, Updater returns URLs that should be tried in sequence
  3. user downloads URLs until one works, returns file object to Updater
  4. Updater verifies the file object
    • if failure, returns error
    • if success, inserts file object to cache and returns filepath

So the issue about "resolving URLs" is just that there are multiple potential URLs for a target name

One potential solution might be to keep the API as is, but add an optional download handler function argument to download_target(). This user provided download handler would be called with a download URL argument (potentially multiple times) before download_target() returns.

@sechkova
Copy link
Contributor

sechkova commented Oct 7, 2020

I start liking the idea of passing a download handler, or even not passing it as a parameter to the download function but adding it as a config on updater init.

In TUF the download function signature is:

def _download_file(url, required_length, STRICT_REQUIRED_LENGTH=True):

I guess the requirement for a client would be to pass a download handler that can ensure the downloaded file size?

@joshuagl
Copy link
Member

joshuagl commented Oct 7, 2020

the community may have its security reasons about not letting the client handle downloads on its own and I would like to hear them before implementing it.

Protecting against endless data attacks require us to ensure we are only downloading up to a metadata defined number of bytes. It's easier for the framework to make assertions about that if the framework is controlling downloads.

If the handler implements a pre-defined interface, we can offload that responsibility to the client integrating the framework for targets downloads.

@trishankatdatadog
Copy link
Member

Anyway, the community may have its security reasons about not letting the client handle downloads on its own and I would like to hear them before implementing it. (@joshuagl @lukpueh @trishankatdatadog )

As long as we use 3rd-party code to download, we can never guarantee what it actually does.

@jku
Copy link
Member Author

jku commented Mar 18, 2021

This is now possible with #1250!

@jku jku closed this as completed Mar 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client Related to the client (updater) implementation
Projects
None yet
Development

No branches or pull requests

4 participants