Traceback (most recent call last):
  File "..././torchrl_test_mps_fail.py", line 9, in <module>
    collector = MultiSyncDataCollector(
  File ".../.venv/lib/python3.10/site-packages/torchrl/collectors/collectors.py", line 1779, in __init__
    self._run_processes()
  File ".../.venv/lib/python3.10/site-packages/torchrl/collectors/collectors.py", line 1976, in _run_processes
    proc.start()
  File "/nix/store/ra1l4hyhxw3zlq62y8vg6fpxysq9ln6s-python3-3.10.16/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/nix/store/ra1l4hyhxw3zlq62y8vg6fpxysq9ln6s-python3-3.10.16/lib/python3.10/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/nix/store/ra1l4hyhxw3zlq62y8vg6fpxysq9ln6s-python3-3.10.16/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
  File "/nix/store/ra1l4hyhxw3zlq62y8vg6fpxysq9ln6s-python3-3.10.16/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/nix/store/ra1l4hyhxw3zlq62y8vg6fpxysq9ln6s-python3-3.10.16/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/nix/store/ra1l4hyhxw3zlq62y8vg6fpxysq9ln6s-python3-3.10.16/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/nix/store/ra1l4hyhxw3zlq62y8vg6fpxysq9ln6s-python3-3.10.16/lib/python3.10/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File ".../.venv/lib/python3.10/site-packages/torch/multiprocessing/reductions.py", line 607, in reduce_storage
    metadata = storage._share_filename_cpu_()
  File ".../.venv/lib/python3.10/site-packages/torch/storage.py", line 450, in wrapper
    return fn(self, *args, **kwargs)
  File ".../.venv/lib/python3.10/site-packages/torch/storage.py", line 529, in _share_filename_cpu_
    return super()._share_filename_cpu_(*args, **kwargs)
RuntimeError: _share_filename_: only available on CPU
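The last frames of this traceback show the error being raised while the process object is pickled by ForkingPickler, before the worker starts. A minimal standard-library sketch of that mechanism follows. It assumes nothing about torch or torchrl internals beyond what the traceback shows: FakeStorage, rebuild_storage, the local reduce_storage and can_send_to_spawned_worker are hypothetical names made up for illustration, and the RuntimeError is raised by hand to mimic the message above.

```python
from multiprocessing.reduction import ForkingPickler


class FakeStorage:
    """Hypothetical stand-in for a tensor storage; this is NOT a torch class."""

    def __init__(self, data, device):
        self.data = list(data)
        self.device = device

    def cpu(self):
        # Return a copy that lives on the (pretend) CPU device.
        return FakeStorage(self.data, "cpu")


def rebuild_storage(data):
    # Called in the receiving process to reconstruct the object.
    return FakeStorage(data, "cpu")


def reduce_storage(storage):
    # Hypothetical reducer: only CPU-resident data can be shared.
    # The message is copied from the traceback and raised by hand here.
    if storage.device != "cpu":
        raise RuntimeError("_share_filename_: only available on CPU")
    return rebuild_storage, (storage.data,)


# Register the custom reducer, so every ForkingPickler uses it.
ForkingPickler.register(FakeStorage, reduce_storage)


def can_send_to_spawned_worker(obj):
    """Return True if obj survives the pickling step used when spawning."""
    try:
        ForkingPickler.dumps(obj)
        return True
    except RuntimeError:
        return False


mps_like = FakeStorage([1.0, 2.0], device="mps")
print(can_send_to_spawned_worker(mps_like))        # pickling is rejected
print(can_send_to_spawned_worker(mps_like.cpu()))  # CPU copy pickles fine
```

In this sketch, whether the object can be handed to a spawned worker depends only on the device recorded on it at pickling time, which is consistent with the failure occurring inside proc.start() rather than inside the worker.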
Describe the bug
When running experiments with multiprocess-based sampling of trajectories on macOS, the initialization of the data collectors fails.
To Reproduce
This fails as follows:
System info
Reason and Possible fixes
I suspect this issue boils down to:
- limitations of the mps device, which does not work well with a pickle-based sharing of parameters
- limitations of torchrl, which assumes a spawn-based multiprocessing library as opposed to a fork-based multiprocess context; forcing fork through multiprocessing.set_start_method('fork') gives a warning and makes collectors crash
  - the spawn context is imposed by torchrl (rl/torchrl/__init__.py, line 38 in 619fec6)
  - a spawn multiprocessing context uses pickle to copy the state of a process onto a newly spawned one
Checklist