Conversation
rtibbles
left a comment
There was a problem hiding this comment.
Thanks for this, this is a good start!
This can be done more simply, and more globally by leveraging the information we get from authenticating the user against the Studio server. Please ask any follow up questions about the directions I have suggested.
Also, please make these commits independently of the commits from your other PR.
ricecooker/managers/tree.py
Outdated
| ) | ||
| if response.status_code == 200: | ||
| return | ||
| url_response = config.SESSION.post(config.get_upload_url(), json=data) |
There was a problem hiding this comment.
Not sure why this code block has been dedented? It wasn't in your other PR.
| max_retries = 5 | ||
| request_headers = DEFAULT_HEADERS | ||
|
|
||
| configure_download_session(sess, user_email) |
There was a problem hiding this comment.
Note, I think we should be setting this more globally on config.DOWNLOAD_SESSION, not just within the the downloader utility. While it is most acute when we are downloading HTML pages, it is better we do this more broadly.
Once we have authenticated against Studio with the token, we should be able to use the user email from there, rather than requiring it be explicitly passed in.
See here where we receive the username (which is the email address) once we have authenticated with Studio: https://github.com/learningequality/ricecooker/blob/develop/ricecooker/commands.py#L90
Once we have that information in hand, we can do the required DOWNLOAD_SESSION update.
There was a problem hiding this comment.
Yes, passing an email manually every time can be tedious. I will work on an approach and make sure that the code is more of Don't Repeat Yourself and improves overall authentication and session management. But, I would like to ask if there any specific requirements or constraints for setting the download session that I need to take care of?
There was a problem hiding this comment.
Hi @Divyanshi750 - I'm sorry, I don't understand what isn't clear here - I have outlined which session and needs to be updated, and also shown where in the code this should happen.
|
Improved session configuration by adding global email setup after Studio authentication. Changes made:
This makes the download session configuration more consistent and eliminates the need |
rtibbles
left a comment
There was a problem hiding this comment.
In spite of tests being present, the code is full of issues that are evident from a code read through.
This has the appearance of being generated with a lot of LLM assistance, and not sufficient critical reflection of the generated code.
It is also not fixing the issue in the life cycle of a SushiChef class being executed. The header needs to be set on the central DOWNLOAD_SESSION in the config module - and should be done in the place where the user has already been authenticated with a token with the Studio server.
| cache.sqlite | ||
|
|
||
| chefdata/ | ||
| audio_cache.sqlite |
There was a problem hiding this comment.
This has been added in develop from your other PR - you are still committing the file itself here as well too, so that needs to be removed from the commit history.
| @@ -0,0 +1,11 @@ | |||
| [[source]] | |||
There was a problem hiding this comment.
This still needs to be removed, as does the file below.
| "console_scripts": [ | ||
| "corrections = ricecooker.utils.corrections:correctionsmain", | ||
| "jiro = ricecooker.cli:main", | ||
| "ricecooker = ricecooker.cli:main", |
| ) | ||
|
|
||
|
|
||
| def test_performance_overhead(): |
| baseline_time = timeit.timeit(baseline_request, number=100) | ||
| custom_email_time = timeit.timeit(custom_email_request, number=100) | ||
|
|
||
| assert custom_email_time - baseline_time < 0.01 |
There was a problem hiding this comment.
Why are we making assertions in the module level code? This looks like it was generated by an LLM and unthinkingly pasted in here. Please focus only on testing the code you have added around updating the user agent.
ricecooker/utils/__init__.py
Outdated
|
|
||
|
|
||
| def configure_download_session(session, user_email=None): | ||
| import ricecooker |
There was a problem hiding this comment.
Why is this being imported twice?
ricecooker/utils/downloader.py
Outdated
| from ricecooker.utils.html import download_file | ||
| from ricecooker.utils.html import replace_links | ||
| from ricecooker.utils.zip import create_predictable_zip | ||
| import logging |
There was a problem hiding this comment.
Why has logging been added here?
ricecooker/commands.py
Outdated
| Studio authentication method that also configures the download session. | ||
| """ | ||
| try: | ||
| auth_response = authenticate_studio(username, password) |
There was a problem hiding this comment.
Why is this function making a recursive call here?
ricecooker/utils/downloader.py
Outdated
|
|
||
|
|
||
| from . import configure_download_session | ||
| def download_file(url, clear_cookies=False, headers=None, timeout=60, |
|
|
||
| def make_request( | ||
| url, clear_cookies=False, headers=None, timeout=60, session=None, *args, **kwargs | ||
| url, clear_cookies=False, headers=None, timeout=60, session=None, user_email=None, *args, **kwargs |
There was a problem hiding this comment.
I don't think we need to pass this in, we instead copy the user agent from the globally configured DOWNLOAD_SESSION that should have had its user agent headers set properly. So far this PR is not doing this though, so is missing the main point of this issue.
|
The majority of comments here are still unaddressed - the main approach needs to be tweaked to focus on the config.DOWNLOAD_SESSION and to be triggered when the tree manager authenticates to Studio. |
|
This has been force pushed - but there are still conflicts to be resolved, and the principle feedback given is still unaddressed. |
Hi @rtibbles , I was thinking to approach the problem in this way: |
|
We just need to update the session on the core config object, and have that update when authentication has happened with Studio, so we have the registered email address. That's all that is needed here. |
|
This PR has not been updated in over a month, closing. |
Summary
This pull request enhances the User-Agent header generation mechanism in the Ricecooker download utility. The main changes made are:
configure_download_session()function to generate User-Agent headersmake_request()function to support optional user email identificationReferences
Fixes: #483
Reviewer guidance
Added test cases to: