Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rearrange async code to fix httpx pool timeouts #566

Merged
merged 8 commits into from
Feb 9, 2025

Conversation

mbarton
Copy link
Member

@mbarton mbarton commented Feb 8, 2025

Fixes #560

Before we tried to run validation tasks for ALL the patients in parallel, using a task group for each patient.

That led to HTTPX pool timeouts with real data - the default is 5 seconds.

This rearranges the async handling to do bounded parallelism at the patient level - ie only handle max 5 patients in flight at once.

The old task group was actually redundant as we now parallelise lookups in external_patient_validators and external_visit_validators using asyncio.gather

Before we tried to run validation tasks for ALL the patients in parallel, using a task group for each patient.

That led to HTTPX pool timeouts with real data - the default is 5 seconds.

This rearranges the async handling to do bounded parallelism at the patient level - ie only handle max 5 patients in flight at once.

The old task group was actually redundant as we now parallelise lookups in external_patient_validators and external_visit_validators using asyncio.gather
Didn't handle patients that couldn't be saved
Was always using the last patient (needed to be a parameter to the task)
@mbarton mbarton force-pushed the mbarton/actually-do-parallelism-properly branch from e4abd44 to bb57803 Compare February 9, 2025 15:40
@@ -366,11 +423,6 @@ def test_invalid_nhs_number(test_user, single_row_valid_df):
errors = csv_upload_sync(test_user, single_row_valid_df)
assert "nhs_number" in errors[0]

# Catastrophic - Patient not save
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine to save an invalid NHS number, it will be flagged in the data quality report and UI

@mbarton mbarton marked this pull request as ready for review February 9, 2025 15:44
@mbarton mbarton merged commit a5cd6e6 into live Feb 9, 2025
1 check passed
@mbarton mbarton deleted the mbarton/actually-do-parallelism-properly branch February 9, 2025 20:33
@mbarton
Copy link
Member Author

mbarton commented Feb 9, 2025

Seen on STAGING, LIVE (merged by @mbarton 7 minutes and 46 seconds ago) Please check your changes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pool timeout uber crash when uploading real data
1 participant