Description
It would be very convenient to run builds in parallel (e.g. all platforms at once) but have a global synchronizer embedded within the wrappers for `$CC`, `$FC`, `$AS`, etc. that causes them to wait, so that we can coordinate across multiple builds happening on the same machine. This trades a small amount of RAM (many processes are spun up but waiting to execute) for avoiding unnecessary serialization of multiple independent build processes.
We'd probably want to have a bit of customization here, potentially a few sub-domains that have limited parallel capacity. E.g. we wouldn't want to try to link all LLVM builds at the same time, because linking is a very memory-intensive procedure.
I'm thinking a rough architecture would be:
- Global synchronizer (allows `N = Sys.CPU_THREADS + 1` jobs)
  - `ld` sub-synchronizer (allows `max(1, N/4)` jobs; all linker-like tools get placed into this bucket, and it still obeys the global synchronizer rules)
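To make the nesting concrete, here is a minimal in-process sketch of the two-level scheme, written in Python purely for illustration (the real thing would live in the Julia wrappers). The names `capacities`, `Synchronizer`, and `slot` are hypothetical, and I'm assuming `N/4` means integer division. A linker-like job takes an `ld` slot first and then a global slot, so it never sits on a global slot while waiting for linker capacity.

```python
import os
import threading
from contextlib import contextmanager


def capacities(cpu_threads):
    """Return (global_jobs, ld_jobs) for a machine with `cpu_threads` threads."""
    n = cpu_threads + 1          # N = Sys.CPU_THREADS + 1
    return n, max(1, n // 4)     # assumption: N/4 is rounded down


class Synchronizer:
    """Hypothetical two-level synchronizer: a global pool plus an `ld` sub-pool."""

    def __init__(self, cpu_threads=None):
        if cpu_threads is None:
            cpu_threads = os.cpu_count() or 1
        self.n_global, self.n_ld = capacities(cpu_threads)
        self._global = threading.BoundedSemaphore(self.n_global)
        self._ld = threading.BoundedSemaphore(self.n_ld)

    @contextmanager
    def slot(self, is_linker=False):
        # Linker-like jobs must hold an `ld` slot *and* a global slot.
        if is_linker:
            self._ld.acquire()
        try:
            self._global.acquire()
            try:
                yield
            finally:
                self._global.release()
        finally:
            if is_linker:
                self._ld.release()
```

A wrapper would then run the real tool inside `with sync.slot(is_linker=True): ...`. This only coordinates threads within one process; sharing across processes needs a different primitive.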
Then in our wrapper definitions, we would determine which wrappers belong to which synchronizer, if any. For instance, I think `make` and `ninja` and tools like those probably should not belong to any synchronizer to avoid recursive invocation problems.
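As a rough sketch of that classification (again in Python for illustration only), something like the following could map a wrapper name to its synchronizer. The function name `synchronizer_for` and the exact contents of the two sets are assumptions on my part, as is stripping a target-triplet prefix by splitting on the last `-`.

```python
import os

# Hypothetical tool lists; the real ones would come from the wrapper definitions.
LINKER_LIKE = {"ld", "ld.bfd", "ld.gold", "ld.lld", "lld"}
UNSYNCHRONIZED = {"make", "ninja"}  # these spawn the compilers themselves


def synchronizer_for(tool):
    """Return "ld", "global", or None (no synchronization) for a wrapper."""
    # Assumption: wrappers may be named like `x86_64-linux-gnu-ld`, so keep
    # only the part after the last `-`.
    name = os.path.basename(tool).rsplit("-", 1)[-1]
    if name in UNSYNCHRONIZED:
        return None
    if name in LINKER_LIKE:
        return "ld"
    return "global"
```

Leaving `make` and `ninja` out matters because they hold no slot while their children wait for one; if they held a slot themselves, a fully loaded machine could end up with every slot owned by a build driver and none left for the compilers.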
One implementation detail I'd like to get right is that it should be possible for multiple Julia processes to use the same synchronization socket, or however we implement this. This will make it possible for all the build agents on Yggdrasil to share resources intelligently. We'd probably still want to split all the platforms for each build into their own job, but it would obviate the need to specify a CPU limit for each individual build.
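One possible way to get cross-process sharing without a daemon is a directory of lock files, one per slot. This is not the socket design mentioned above, just an alternative sketch in Python, assuming a Unix host with `flock` and a directory visible to every build agent. The name `shared_slot` and the `slot-<i>.lock` file layout are hypothetical. A nice property is that the kernel drops the lock if the holder dies, so a crashed compiler cannot leak a slot.

```python
import fcntl
import os
import time
from contextlib import contextmanager


@contextmanager
def shared_slot(lock_dir, capacity, poll_interval=0.05):
    """Hold one of `capacity` machine-wide slots for the duration of the block."""
    os.makedirs(lock_dir, exist_ok=True)
    fd = None
    while fd is None:
        for i in range(capacity):
            path = os.path.join(lock_dir, f"slot-{i}.lock")
            candidate = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
            try:
                # Non-blocking exclusive lock; fails if another holder has it.
                fcntl.flock(candidate, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except BlockingIOError:
                os.close(candidate)
                continue
            fd = candidate
            break
        else:
            # Every slot is busy: back off briefly and scan again.
            time.sleep(poll_interval)
    try:
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

Polling is the obvious weakness here; a socket-based server could hand out slots in FIFO order and wake waiters immediately, at the cost of needing something to own the socket.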