IOErrors when running multiple processes at once. #108

I'm running ~200 jobs that use NetMHCpan at once on a supercomputer cluster. Some of these jobs are throwing this error message:

```
sh: fork: retry: Resource temporarily unavailable
sh: fork: retry: Resource temporarily unavailable
sh: fork: retry: Resource temporarily unavailable
Traceback (most recent call last):
  File "NetMHCpan_trials_all.py", line 120, in <module>
    epitope_predictions = get_epitope_predictions(HLA_alleles, vcf_file)
  File "NetMHCpan_trials_all.py", line 39, in get_epitope_predictions
    original_epitope_predictions = predictor.predict_subsequences(original_sequences).to_dataframe()
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/base_predictor.py", line 128, in predict_subsequences
    binding_predictions = self.predict_peptides(sorted(peptide_set))
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/base_commandline_predictor.py", line 309, in predict_peptides
    temp_dir_list=dirs)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/base_commandline_predictor.py", line 256, in _run_commands_and_collect_predictions
    process_limit=self.process_limit)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/process_helpers.py", line 141, in run_multiple_commands_redirect_stdout
    add_to_queue(p)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/process_helpers.py", line 126, in add_to_queue
    process.start()
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/site-packages/mhctools/process_helpers.py", line 51, in start
    self.process = Popen(self.args, stdout=stdout, stderr=stderr)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/subprocess.py", line 707, in __init__
    restore_signals, start_new_session)
  File "/n/home05/aewhatley/anaconda3/lib/python3.6/subprocess.py", line 1260, in _execute_child
    restore_signals, start_new_session, preexec_fn)
BlockingIOError: [Errno 11] Resource temporarily unavailable
sh: fork: retry: Resource temporarily unavailable

```

Do you have any advice on how to deal with this situation? Would placing a mutex on the file be the right thing to do, or should I retry the prediction call when it fails because the resource is temporarily unavailable? Thanks for your help.
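
To make the retry option concrete, this is roughly what I have in mind. It is only a sketch: `retry_on_eagain`, `max_attempts`, and `base_delay` are names I made up for illustration and are not part of mhctools. It assumes the failure always surfaces as the `BlockingIOError` shown in the traceback, and I am not sure whether subprocesses that were already started before the failure get cleaned up, so a retry may redo some work.

```python
import random
import time


def retry_on_eagain(fn, *args, max_attempts=5, base_delay=1.0, **kwargs):
    """Call fn(*args, **kwargs), retrying if fork() fails with EAGAIN.

    Hypothetical helper, not an mhctools API. BlockingIOError is what
    subprocess.Popen raises in the traceback above (Errno 11).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args, **kwargs)
        except BlockingIOError:
            if attempt == max_attempts:
                # Out of attempts: let the original error propagate.
                raise
            # Exponential backoff with jitter, so that many jobs hitting
            # the limit at the same moment do not all retry together.
            delay = base_delay * 2 ** (attempt - 1)
            time.sleep(delay * random.uniform(0.5, 1.5))


# Intended use in my script (predictor and original_sequences as above):
# predictions = retry_on_eagain(
#     predictor.predict_subsequences, original_sequences
# ).to_dataframe()
```

This would only paper over the problem if the cluster's per-user process limit is the real cause, so I would still like to know whether there is a better way to cap how many processes get started.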
