Retry requests for Python API #964


Closed · wants to merge 10 commits

Conversation

@Kully (Contributor) commented Mar 8, 2018

Addresses #949

@Kully (Contributor, Author) commented Mar 9, 2018

@scjody I'm opting to modify the parameters for retrying.retry slightly for requests to the Python API servers.

Currently using:

@retry(wait_random_min=20000, wait_random_max=30000,
       stop_max_attempt_number=7)

It's quite slow right now, and I may decide to lower wait_random_min, but I want to ensure a good amount of time between tries, since the ServerError advises waiting 30 seconds.

Any thoughts?

NB: the current test failure is due to streaming failures.
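For reference, here is a pure-Python sketch of what those parameters mean. This models the behavior of the retrying package's @retry decorator rather than reproducing its implementation; the sleep argument is injectable (an assumption for illustration, not part of the real API) so the example can run without actually waiting 20-30 seconds:

```python
import random
import time

def retry_sketch(func, wait_random_min=20000, wait_random_max=30000,
                 stop_max_attempt_number=7, sleep=time.sleep):
    """Rough model of retrying.retry with the parameters above:
    up to 7 attempts, with a random 20-30 s pause between attempts.
    wait_random_min/wait_random_max are in milliseconds."""
    for attempt in range(1, stop_max_attempt_number + 1):
        try:
            return func()
        except Exception:
            if attempt == stop_max_attempt_number:
                raise  # out of attempts: re-raise the last error
            sleep(random.uniform(wait_random_min, wait_random_max) / 1000.0)
```

With these numbers, a request that fails all 7 times spends roughly 2 to 3 minutes sleeping before giving up, which is why the test runs feel slow.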

@scjody commented Mar 12, 2018

Please use the timings from https://github.com/plotly/streambed/pull/10614/files unless you have some clear, well tested reasons to change them.

One change that could be useful is to remove or significantly increase stop_max_delay. That's set to 30 seconds for requests from the streambed backend to imageservers since these requests are made as part of an overall request (generally from a browser to the streambed backend) that has a 30 second timeout. I don't see a need for that for requests from plotly.py to the streambed backend.

Regarding a 20 second minimum, I don't see any evidence to suggest that this is necessary, and your tests show that it slows things down significantly. Note that the 502 error from the Google load balancers that you're referring to is only one possible failure type for a backend request, and the "30 seconds" suggestion is to set expectations for a human so they don't become frustrated, not overall systems engineering guidance.
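The stop_max_delay point can be illustrated with a hypothetical sketch (not the streambed code, and units simplified to seconds, whereas retrying takes milliseconds): a total-elapsed-time cap can cut retries short well before the attempt budget is spent.

```python
import time

def retry_with_caps(func, stop_max_attempt_number=7, stop_max_delay=None,
                    wait=25.0, sleep=time.sleep, clock=time.monotonic):
    """Hypothetical model: stop retrying when either the attempt count
    is exhausted or (if set) stop_max_delay seconds have elapsed since
    the first attempt. clock/sleep are injectable for testing."""
    start = clock()
    for attempt in range(1, stop_max_attempt_number + 1):
        try:
            return func()
        except Exception:
            out_of_attempts = attempt == stop_max_attempt_number
            out_of_time = (stop_max_delay is not None
                           and clock() - start >= stop_max_delay)
            if out_of_attempts or out_of_time:
                raise
            sleep(wait)
```

With ~25 s waits and a 30 s delay cap, only a few attempts ever run; dropping the cap lets all 7 attempts proceed, which is the behavior suggested above for plotly.py-to-streambed requests.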

@Kully (Contributor, Author) commented Mar 12, 2018

> and the "30 seconds" suggestion is to set expectations for a human so they don't become frustrated, not overall systems engineering guidance.

Great, I didn't know this. Useful to know.

@Kully (Contributor, Author) commented Mar 15, 2018

@scjody Is there a canonical way to test whether adding the retrying will reduce or completely eliminate the intermittent server errors? Rebuilding the tests on Circle and counting the number of failures due to server errors doesn't make sense, for example.

@scjody commented Mar 16, 2018

@Kully I wouldn't say there's a canonical way, but there are lots of ways to get more information here. One suggestion is to run the failing test in a loop on your vagrant and capture the output. Then you can analyze the output to determine how many failures occurred versus the total number of test runs.
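The loop-and-count suggestion could be sketched like this; failure_rate is a hypothetical helper, and the test command shown is a placeholder, not a path from this repo:

```python
import subprocess

def failure_rate(cmd, runs=50):
    """Run a test command repeatedly and report the fraction of runs
    that failed (non-zero exit code). `cmd` is whatever invokes the
    flaky test, e.g. ["pytest", "path/to/flaky_test.py"] (hypothetical)."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode != 0:
            failures += 1
    return failures / runs
```

Comparing the rate before and after enabling retries (over enough runs for the difference to be meaningful) gives a rough answer to the question above.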

@Kully (Contributor, Author) commented Mar 19, 2018

Note to self: the fact that there was a server error on the last rebuild tells me I should test the retrying params.

@jonmmease (Contributor)

This logic has now been rolled into #942 and will be part of 3.0.0

I stole it because I needed it to get the circleci test suite passing 🙂

cc @Kully @scjody @cldougl @chriddyp

@jonmmease jonmmease closed this Jul 5, 2018
@nicolaskruchten nicolaskruchten deleted the retry-requests branch June 19, 2020 16:14
3 participants