I'm running ~200 jobs that use NetMHCpan at once on a supercomputer cluster. Some of these jobs are throwing this error message:
sh: fork: retry: Resource temporarily unavailable
sh: fork: retry: Resource temporarily unavailable
sh: fork: retry: Resource temporarily unavailable
Traceback (most recent call last):
  File "NetMHCpan_trials_all.py", line 120, in <module>
    epitope_predictions = get_epitope_predictions(HLA_alleles, vcf_file)
  File "NetMHCpan_trials_all.py", line 39, in get_epitope_predictions
    original_epitope_predictions = predictor.predict_subsequences(original_sequences).to_dataframe()
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/base_predictor.py", line 128, in predict_subsequences
    binding_predictions = self.predict_peptides(sorted(peptide_set))
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/base_commandline_predictor.py", line 309, in predict_peptides
    temp_dir_list=dirs)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/base_commandline_predictor.py", line 256, in _run_commands_and_collect_predictions
    process_limit=self.process_limit)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/process_helpers.py", line 141, in run_multiple_commands_redirect_stdout
    add_to_queue(p)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/process_helpers.py", line 126, in add_to_queue
    process.start()
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/process_helpers.py", line 51, in start
    self.process = Popen(self.args, stdout=stdout, stderr=stderr)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/subprocess.py", line 707, in __init__
    restore_signals, start_new_session)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/subprocess.py", line 1260, in _execute_child
    restore_signals, start_new_session, preexec_fn)
BlockingIOError: [Errno 11] Resource temporarily unavailable
sh: fork: retry: Resource temporarily unavailable
Do you have any advice on how to deal with this situation? Would placing a mutex on the file be the right thing to do, or should I perhaps retry the prediction line if it fails due to the resource being temporarily unavailable? Thanks for your help.
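For concreteness, the retry option I have in mind would look roughly like the sketch below. It is only an illustration, not anything provided by mhctools: the helper name `retry_on_eagain` and its `max_attempts`, `base_delay`, and `sleep` parameters are my own hypothetical choices. It assumes the failure always surfaces as a `BlockingIOError` with errno `EAGAIN` (errno 11 on Linux), as in the traceback above, and it waits with exponential backoff plus random jitter so that many jobs do not all retry at the same moment.

```python
import errno
import random
import time


def retry_on_eagain(func, *args, max_attempts=5, base_delay=1.0,
                    sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying when it fails with EAGAIN.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except BlockingIOError as exc:
            # Retry only "Resource temporarily unavailable"; re-raise any
            # other error, and give up once the last attempt has failed.
            if exc.errno != errno.EAGAIN or attempt == max_attempts:
                raise
            # Exponential backoff with jitter, so concurrent jobs spread
            # their retries out instead of forking again in lockstep.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            sleep(delay)
```

In my script this would wrap the failing call, e.g. `retry_on_eagain(predictor.predict_subsequences, original_sequences)`, followed by `.to_dataframe()` on the result. I realize this only works around the symptom: as far as I understand, a failed fork with this message usually means a process or memory limit was hit on the node, so retrying helps only if the pressure is temporary.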