Passing credentials for staging in/out files #169

Open
uniqueg opened this issue Nov 5, 2021 · 6 comments

@uniqueg
Contributor

uniqueg commented Nov 5, 2021

The specs currently do not provide any specific support for handling/passing credentials to services that TES implementations need to pull data from or push data to. This might be taken care of in the wider discussion of using Passports in DRS and WES, but it might be good to have an issue for this here as well, for anything TES-specific.

@uniqueg
Contributor Author

uniqueg commented Nov 5, 2021

Very much requested by ELIXIR Cloud & AAI DP (all participating nodes) and probably pretty much everyone else, I would assume 🙃

@uniqueg
Contributor Author

uniqueg commented Nov 5, 2021

Might be overlapping with #151, though perhaps not quite. I guess both authorization for the TES itself (and its compute) and passing through of credentials to third party services need to be addressed, and perhaps best to discuss these points together (and with WES, DRS, Passport and FASP)...

@kellrott
Member

kellrott commented Nov 5, 2021

We're looking at https://www.ga4gh.org/ga4gh-passports/ as a possible solution

@uniqueg
Contributor Author

uniqueg commented Nov 6, 2021

Absolutely. But we should make sure that access to TES (and WES, if they're not using TES) compute resources is covered as well, not just data access.

@jmfernandez

From my point of view, 3rd party credentials needed either to fetch inputs or to push outputs should perhaps be declared in a way similar to the response from https://ga4gh.github.io/data-repository-service-schemas/preview/release/drs-1.2.0/docs/#operation/GetAccessURL , where both the URL and the headers needed to successfully complete the request are provided.

In the case of outputs, in addition to the HTTP headers, an extra field such as the HTTP verb might also be needed.

But HTTP headers are too focused on... HTTP. Other protocols might require authentication to be defined in different ways (a private key for SFTP / Aspera, a complex JSON for Google Cloud Storage, etc.).
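
To make the idea concrete, here is a minimal sketch of what such a per-object declaration could look like, loosely modelled on the DRS `GetAccessURL` response (a `url` plus a list of `headers`). The `method` field, the `AccessDescriptor` class and the `header_dict` helper are hypothetical names made up for illustration; none of them are part of the TES or DRS specs, and the example URL is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AccessDescriptor:
    """Hypothetical per-object access info; not part of the TES spec."""
    url: str
    # Header lines in "Name: value" form, as in the DRS AccessURL response.
    headers: List[str] = field(default_factory=list)
    # Assumed extra field for outputs, e.g. "PUT" or "POST".
    method: str = "GET"


def header_dict(access: AccessDescriptor) -> Dict[str, str]:
    """Split "Name: value" header lines into a dict for an HTTP client."""
    result = {}
    for line in access.headers:
        name, _, value = line.partition(":")
        result[name.strip()] = value.strip()
    return result


# An output that would have to be pushed with an authenticated PUT request.
output = AccessDescriptor(
    url="https://storage.example.org/results/out.txt",
    headers=["Authorization: Bearer <token>"],
    method="PUT",
)
print(output.method, header_dict(output))
```

As noted above, this shape only covers HTTP-like transfers; a private key or a JSON key file would not fit into `headers`.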

@uniqueg
Contributor Author

uniqueg commented Oct 31, 2022

Agreed, we would probably need this on a per-object basis, like the example you are citing, @jmfernandez.

If request size is a potential issue and reuse of the same credentials for multiple objects is common, we could also define credentials separately, each with a short identifier, then either refer to credentials directly via that identifier, or provide a one-to-many mapping of credential to object identifiers.
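
A rough sketch of the first variant (credentials declared once, referenced by identifier) could look like the following. The field names `credentials` and `credential_id`, the credential contents and the `resolve_credentials` helper are all assumptions for illustration, not existing TES fields.

```python
from typing import Any, Dict, List

# Hypothetical task fragment: credentials declared once, keyed by a short identifier.
task = {
    "credentials": {
        "s3-main": {"type": "bearer", "token": "<token>"},
    },
    "inputs": [
        {"url": "s3://bucket/a.bam", "credential_id": "s3-main"},
        {"url": "s3://bucket/b.bam", "credential_id": "s3-main"},
        {"url": "https://example.org/public.txt"},  # no credentials needed
    ],
}


def resolve_credentials(task: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each input with the credential it references, or None if it has none."""
    credentials = task.get("credentials", {})
    resolved = []
    for item in task["inputs"]:
        cred_id = item.get("credential_id")
        if cred_id is not None and cred_id not in credentials:
            raise KeyError(f"unknown credential identifier: {cred_id}")
        resolved.append({"url": item["url"], "credential": credentials.get(cred_id)})
    return resolved


for entry in resolve_credentials(task):
    print(entry["url"], "->", "with credentials" if entry["credential"] else "anonymous")
```

The one-to-many variant would invert this: each credential would carry a list of object identifiers instead of each object carrying a `credential_id`.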

And yes, we should probably not use the name headers in this case. I guess the interesting question is how we specify the schema such that each TES implementation knows what to do with it.

But if I'm not mistaken, the list of supported storage/transfer protocols is enumerated somewhere, so I guess we could use anyOf and define schemas and instructions for each (possibly some can be reused).
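
Purely as an illustration of the `anyOf` idea, here is a hand-rolled check in Python. The protocol names and required fields (`headers`, `username`, `private_key`, `service_account_json`) are invented for this sketch, and it only looks at which keys are present; a real implementation would more likely validate against the OpenAPI/JSON Schema definitions with a proper validator.

```python
from typing import Any, Dict, List

# Hypothetical per-protocol credential shapes, reduced to their required keys.
CREDENTIAL_SCHEMAS = {
    "http": {"headers"},
    "sftp": {"username", "private_key"},
    "gcs": {"service_account_json"},
}


def matching_schemas(credential: Dict[str, Any]) -> List[str]:
    """Return the names of all schemas whose required keys are present."""
    keys = set(credential)
    return sorted(
        name for name, required in CREDENTIAL_SCHEMAS.items() if required <= keys
    )


def validate(credential: Dict[str, Any]) -> List[str]:
    """anyOf semantics: accept if at least one schema matches, else raise."""
    matches = matching_schemas(credential)
    if not matches:
        raise ValueError("credential matches none of the known schemas")
    return matches


print(validate({"username": "alice", "private_key": "<key>"}))
```

Because `anyOf` accepts an object that matches several schemas at once, an explicit discriminator field (for example a protocol name) might still be useful so that an implementation knows which instructions to follow.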
