refactor concurrency to not use qsize #323
Comments
to be honest @paolodi, I think the reliance on
|
taking some notes here:
|
WOW that's great investigative work, @StevieSong !!! Let me digest it a bit, and let's get some more eyes on it. FYI: @lucidtronix @ndiamant @mklarqvist see above some possible side effects of relying on |
dug some more, this line was at least partially culprit for this log, easy fix as |
EDIT: split off the deadlock issue that spawned this issue into a standalone bug report at #326
What
This segment of code tries to synchronize the workers before proceeding: https://github.com/broadinstitute/ml/blob/e3540e1eff2fc45301255c1e89b87c8bb5d18405/ml4cvd/tensor_generators.py#L429-L430
This is a task that is traditionally accomplished with barriers. Additionally, the Python multiprocessing documentation for the qsize method used here states that the returned value is not reliable, and that the method may raise NotImplementedError on Unix platforms like macOS where sem_getvalue() is not implemented, so the code is potentially not portable: https://docs.python.org/3.6/library/multiprocessing.html#multiprocessing.Queue.qsize
Why
reusing familiar coding patterns is good for readability
additionally, portability of code is important
How
implement barriers in tensor generators
Acceptance Criteria
barriers in tensor generators
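For illustration, a minimal sketch of what barrier-based synchronization could look like with the standard library's multiprocessing.Barrier. This is not the ml4cvd implementation: the names worker and run_workers, the shared counter, and the 30-second timeouts are all hypothetical and chosen only to keep the example self-contained. Each worker registers its arrival, waits at the barrier, and only then reads the counter, so every worker should observe that all workers have arrived, without ever calling qsize.

```python
import multiprocessing as mp


def worker(worker_id, barrier, arrived, results):
    """Hypothetical worker: finish phase 1, then wait for all other workers."""
    # Phase 1: stand-in for the per-worker work that must complete first.
    with arrived.get_lock():
        arrived.value += 1

    # Block until every party has called wait(). If a worker fails to
    # arrive within the timeout, BrokenBarrierError is raised instead of
    # hanging forever.
    barrier.wait(timeout=30)

    # Phase 2: every increment happened before the barrier released,
    # so each worker sees the full count here.
    results.put((worker_id, arrived.value))


def run_workers(num_workers):
    """Start num_workers processes and return {worker_id: observed_count}."""
    barrier = mp.Barrier(num_workers)
    arrived = mp.Value("i", 0)  # shared integer counter with its own lock
    results = mp.Queue()

    procs = [
        mp.Process(target=worker, args=(i, barrier, arrived, results))
        for i in range(num_workers)
    ]
    for p in procs:
        p.start()

    # Drain the queue before joining so no child blocks on a full pipe.
    observed = dict(results.get(timeout=30) for _ in range(num_workers))
    for p in procs:
        p.join()
    return observed


if __name__ == "__main__":
    print(run_workers(4))
```

The barrier expresses the intent ("nobody proceeds until everyone is here") directly, and it does not depend on sem_getvalue(), so the same code path should apply on both Linux and macOS.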