Discover and use the cache authorization token #24

Merged
merged 2 commits into from
Jan 31, 2025

Conversation

bbockelm
Collaborator

If a cache authorization token file is provided in the plugin's configuration, then periodically read it out and use it in the generated curl requests.

If the plugin is configured to use the cache authorization token,
then read it periodically from the file and add it to the HTTP
request via the query parameters.

Includes relevant unit and integration tests.
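As a rough illustration of the "read it periodically from the file" behavior described above, here is a minimal sketch. The class name `CacheTokenFile`, its methods, and the refresh-interval parameter are all hypothetical and are not taken from the plugin's source; the real implementation may differ in how it schedules reads and handles errors.

```cpp
#include <chrono>
#include <fstream>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical helper (names are illustrative, not the plugin's API):
// caches a token read from a file and re-reads it once the cached copy
// is older than a configurable refresh interval.
class CacheTokenFile {
  public:
    using Clock = std::chrono::steady_clock;

    CacheTokenFile(std::string path, std::chrono::seconds refresh)
        : m_path(std::move(path)), m_refresh(refresh) {}

    // Return the current token, re-reading the file if needed.
    // An empty string means no token has been read successfully yet.
    std::string Get(Clock::time_point now = Clock::now()) {
        std::lock_guard<std::mutex> guard(m_mutex);
        if (!m_loaded || now - m_last_read >= m_refresh) {
            Reload();
            m_last_read = now;
            m_loaded = true;
        }
        return m_token;
    }

  private:
    // Use the first non-blank line of the file, with surrounding
    // whitespace trimmed. If the file cannot be opened or holds no
    // token, the previously cached value is kept.
    void Reload() {
        std::ifstream in(m_path);
        if (!in) return;
        std::string line;
        while (std::getline(in, line)) {
            const auto begin = line.find_first_not_of(" \t\r\n");
            if (begin == std::string::npos) continue;
            const auto end = line.find_last_not_of(" \t\r\n");
            m_token = line.substr(begin, end - begin + 1);
            return;
        }
    }

    std::string m_path;
    std::chrono::seconds m_refresh;
    std::mutex m_mutex;
    std::string m_token;
    Clock::time_point m_last_read{};
    bool m_loaded{false};
};
```

A caller would construct one instance per configured token file and call `Get()` while preparing each request, so that a rotated token is picked up within one refresh interval without restarting the process.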
Member

@jhiemstrawisc jhiemstrawisc left a comment

I only had a few light questions/comments, so I'll approve ahead of time.

// Configure a curl handle to use the current cache token
//
// Adds the token to the curl handle URL's query string
// parameter `access_token`, as specified in RFC 6750, Sec 2.3.
Member

Two sets of questions here:

  1. Section 2.3 of the RFC you link to warns:

     > Because of the security weaknesses associated with the URI method,
     > including the high likelihood that the URL containing the access token
     > will be logged, it SHOULD NOT be used unless it is impossible to
     > transport the access token in the "Authorization" request header field
     > or the HTTP request entity-body.

My assumption is that you're using the query param instead of the Authz header because the Authz header likely contains a JWT for the origin. Can you confirm or deny that with a brief comment? If I'm not correct, what's the logic for doing something the RFC explicitly warns against?

  2. Does this imply the origin receives two tokens: one client token, potentially in the Authz header, and one cache token as a URL query param under access_token? Why not use a simple JWT like we use everywhere else in Pelican Authz?

Collaborator Author

We need to get two tokens to the origin -- one identifying the cache, one showing that there was at least one valid user request to the cache.

I started with the idea of putting both in the Authorization header (as suggested in the RFC text you quote). However, that ran aground in upstream XRootD because sending two Authorization headers is not kosher: a header should only be repeated in very limited cases (where the header is explicitly allowed to repeat or where its value is a comma-separated list).

So, eliminating that, this was the remaining option. The "good news" here is that we already have log scrubbing for tokens, meaning there's mitigation for the explicit issue called out. This URL is also not exposed to the browser, meaning there's no likelihood of a user copy/pasting it.
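To make the log-scrubbing mitigation concrete, here is a minimal sketch of the general idea: redact the value of any `access_token` query parameter before a URL is written to a log. The function name `ScrubAccessToken` and the `REDACTED` placeholder are assumptions for illustration; this is not the scrubbing code that Pelican or XRootD actually ships.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical scrubber (illustrative only): replaces the value of every
// `access_token` query parameter with a fixed placeholder.
std::string ScrubAccessToken(std::string_view url) {
    constexpr std::string_view key = "access_token=";

    // Anything from '#' onward is a fragment, not part of the query string.
    const std::size_t query_end = std::min(url.find('#'), url.size());
    const std::size_t query_start = url.substr(0, query_end).find('?');
    if (query_start == std::string_view::npos) {
        return std::string(url);  // No query string, nothing to scrub.
    }

    // Copy everything up to and including the '?'.
    std::string result(url.substr(0, query_start + 1));

    // Walk the '&'-separated parameters and redact matching ones.
    std::size_t pos = query_start + 1;
    while (pos < query_end) {
        std::size_t amp = url.find('&', pos);
        if (amp == std::string_view::npos || amp > query_end) {
            amp = query_end;
        }
        const std::string_view param = url.substr(pos, amp - pos);
        if (param.substr(0, key.size()) == key) {
            result.append(key);
            result.append("REDACTED");
        } else {
            result.append(param);
        }
        if (amp < query_end) {
            result.push_back('&');
        }
        pos = amp + 1;
    }

    // Re-attach the fragment, if any.
    result.append(url.substr(query_end));
    return result;
}
```

Applying a function like this at the logging boundary, rather than where the URL is built, keeps the real token in the request while keeping it out of the log line.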

Comment on lines +1112 to +1117
std::string_view url{url_char};
auto has_query_string = url.find('?') != std::string::npos;
std::string final_url{url};
final_url += has_query_string ? "&" : "?";
final_url += "access_token=";
final_url += token;
Member

This type of URL parsing feels like it should be common enough (if not already, then eventually) to make a utility function.
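For discussion, such a utility might look something like the sketch below. The names `PercentEncode` and `AppendQueryParam` are hypothetical, and this version goes slightly beyond the lines under review by percent-encoding the value and keeping any `#fragment` at the end of the URL; those extras are assumptions about what a shared helper would want, not behavior of the current patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: percent-encode everything except the RFC 3986
// "unreserved" characters (ALPHA / DIGIT / "-" / "." / "_" / "~").
std::string PercentEncode(std::string_view value) {
    static constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(value.size());
    for (unsigned char c : value) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') || c == '-' || c == '.' ||
            c == '_' || c == '~';
        if (unreserved) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back('%');
            out.push_back(hex[c >> 4]);
            out.push_back(hex[c & 0x0F]);
        }
    }
    return out;
}

// Hypothetical utility: append `key=value` to the query string of `url`,
// choosing '?' or '&' as the separator and keeping any fragment last.
std::string AppendQueryParam(std::string_view url, std::string_view key,
                             std::string_view value) {
    const std::size_t frag = std::min(url.find('#'), url.size());
    const std::string_view base = url.substr(0, frag);

    std::string result(base);
    if (base.find('?') == std::string_view::npos) {
        result.push_back('?');
    } else if (base.back() != '?' && base.back() != '&') {
        result.push_back('&');
    }
    result.append(PercentEncode(key));
    result.push_back('=');
    result.append(PercentEncode(value));
    result.append(url.substr(frag));
    return result;
}
```

Since a JWT consists only of base64url characters and `.`, all of which are unreserved, encoding leaves such a token unchanged; it only matters if the token file ever holds something else.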

Collaborator Author

Agreed!

This has a lot of the characteristics of a generic URL parser -- but only targets extracting a single parameter. I think I'm going to keep it as-is for this PR but tackle things next time we need a "one-off".

@bbockelm
Collaborator Author

Looks like GitHub was having A Morning today and managed to duplicate my comments a few times! Let's see what happens when I merge... 😆

@bbockelm bbockelm merged commit 7bb9680 into PelicanPlatform:main Jan 31, 2025
1 check passed