Add pause/resume/context to workers #101
@petrhosek - I added a new dependency, psutil, here.
So it turns out this only sends SIGSTOP to the Python worker, not to the clang subprocess it spawned. I could either:
Alternatively, I could figure out a way to maintain a record of the PIDs of spawned clang processes, do the "cancel and requeue" version, or accept that some clang processes will run to completion when asked to stop. I'll probably get back to this next week.
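One of the options mentioned here, keeping a record of spawned subprocesses so they can be stopped alongside the worker, could look roughly like the sketch below. It is a minimal illustration and not the code in this PR: `ChildRegistry`, `spawn`, `pause_all`, and `resume_all` are hypothetical names, it uses only the standard library rather than psutil, and it assumes a POSIX system where SIGSTOP and SIGCONT are available.

```python
import signal
import subprocess
from typing import Dict, List


class ChildRegistry:
  """Hypothetical helper that tracks subprocesses so they can be paused together."""

  def __init__(self) -> None:
    # Maps PID -> Popen handle for every child started through spawn().
    self._procs: Dict[int, subprocess.Popen] = {}

  def spawn(self, cmdline: List[str]) -> subprocess.Popen:
    proc = subprocess.Popen(cmdline)
    self._procs[proc.pid] = proc
    return proc

  def _signal_all(self, sig: int) -> List[int]:
    signalled = []
    for pid, proc in list(self._procs.items()):
      if proc.poll() is not None:
        # The child already exited; drop it from the record.
        del self._procs[pid]
        continue
      proc.send_signal(sig)
      signalled.append(pid)
    return signalled

  def pause_all(self) -> List[int]:
    """Sends SIGSTOP to every live child and returns the PIDs signalled."""
    return self._signal_all(signal.SIGSTOP)

  def resume_all(self) -> List[int]:
    """Sends SIGCONT to every live child and returns the PIDs signalled."""
    return self._signal_all(signal.SIGCONT)
```

This only covers children started through `spawn`; anything those children fork themselves would still need a process-tree walk or a process group.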
  from contextlib import AbstractContextManager
  from multiprocessing import connection
- from typing import Any, Callable, Dict, Optional
+ from typing import Any, Callable, Dict, Optional, List
nit: put List in alphabetical order.
compiler_opt/distributed/worker.py
Outdated
  ContextAwareWorker can check for this with isinstance(obj, ContextAwareWorker)
  """

  def set_context(self, local: bool) -> None:
ContextAwareWorker isn't used anywhere; remove it for now.
  def __init__(self):
@dataclasses.dataclass
class ProcData:
A thought: the motivating scenario here is the validator. For validation, we're actually OK with letting compilation run longer than x seconds, because the goal is to get a thorough idea of what would happen if this model were shipped. So, how about:
- no ProcData
- just have the validator use a very large timeout, like 20 minutes (i.e. ~half of that in real compilation time)
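
The large-timeout suggestion could be sketched as below. This is only an illustration under stated assumptions: `run_with_timeout` and `VALIDATION_TIMEOUT_SECONDS` are hypothetical names, and it assumes the compilation is launched as a plain subprocess, which may not match how the validator actually invokes clang.

```python
import subprocess
from typing import List, Optional

# Assumption: 20 minutes, the figure floated in the comment above.
VALIDATION_TIMEOUT_SECONDS = 20 * 60


def run_with_timeout(
    cmdline: List[str],
    timeout: float = VALIDATION_TIMEOUT_SECONDS) -> Optional[int]:
  """Returns the exit code, or None if the command hit the timeout."""
  try:
    return subprocess.run(cmdline, timeout=timeout, check=False).returncode
  except subprocess.TimeoutExpired:
    # subprocess.run kills the direct child and waits for it before raising.
    return None
```

Note that `subprocess.run` only kills the direct child on timeout; processes that child spawned would need separate handling.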

- Allows a user to start/stop processes at will, via OS signals SIGSTOP and SIGCONT.
- Allows a user to bind processes to specific CPUs.
- Allows local_worker_pool to be used outside of a context manager.
- Switches workers to be Protocol-based, so Workers are effectively duck-typed (i.e. anything that has the required methods passes as a Worker).

Part of google#96
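
The Protocol-based duck typing described in this commit message can be illustrated with a small sketch. The names here (`PausableWorker`, `SleepyWorker`, `NotAWorker`, and the `pause`/`resume` methods) are assumptions made for the example and are not taken from the PR's code.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class PausableWorker(Protocol):
  """Hypothetical protocol; the method names are assumptions."""

  def pause(self) -> None:
    ...

  def resume(self) -> None:
    ...


class SleepyWorker:
  """Does not inherit from PausableWorker, but has the required methods."""

  def __init__(self) -> None:
    self.paused = False

  def pause(self) -> None:
    self.paused = True

  def resume(self) -> None:
    self.paused = False


class NotAWorker:
  """Has pause() but lacks resume(), so it does not satisfy the protocol."""

  def pause(self) -> None:
    pass
```

With `@runtime_checkable`, `isinstance(obj, PausableWorker)` checks only that the required method names are present, not their signatures, so static type checking is still what catches mismatched arguments.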
Part of #96