[Documentation] How does `SingleNodeExecutor` touch the file system?

I'm trying to understand where `SingleNodeExecutor` with a `cache_directory` gets its instructions to actually write its result to a file.

For instance, consider the following:

```python
from executorlib import SingleNodeExecutor

def foo(x):
    return x + 1

with SingleNodeExecutor() as exe:
    f = exe.submit(foo, 1, resource_dict={"cache_key": "my_key", "cache_directory": "my_dir"})
    print("Result", f.result())
```

My understanding of the path of events is:

Initialization:
- `SingleNodeExecutor.__init__` triggers `BaseExecutor.__init__` with a `DependencyTaskScheduler` as its underlying `_task_scheduler`
- That `DependencyTaskScheduler` is initialized with a `OneProcessTaskScheduler` as its `executor` arg, which gets stored in `_process_kwargs["executor"]`

Submission:
- `SingleNodeExecutor.submit` is inherited directly from `BaseExecutor.submit`
- `BaseExecutor.submit` passes everything (including the `resource_dict` as a single kwarg) to the `_task_scheduler.submit`
- Since `DependencyTaskScheduler._generate_dependency_graph` ends up as `False` with all the default values, the future is generated by a plain `super()` call to `TaskSchedulerBase.submit`
- `TaskSchedulerBase.submit` sends our information (function, args, kwargs, resource dict, empty future) to `self._future_queue.put`

Here I get out of my depth, but it seems to me that putting items on the future queue is what feeds the associated `Thread`, which all the `TaskSchedulerBase` children initialize in `_set_process` from some target function and the `_process_kwargs` (which include the `_future_queue`!). On that assumption, the `self._future_queue.put` call we reached from `TaskSchedulerBase.submit` would route back to the `Thread` set in the `DependencyTaskScheduler._set_process` invocation -- i.e. `_execute_tasks_with_dependencies`
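To make my mental model concrete, here is a minimal standard-library sketch of the queue-plus-thread pattern I am assuming. `ToyScheduler` and `_toy_worker` are hypothetical names of my own, not executorlib code; the only point is that the thread is already running and blocked on `get()`, so a `put()` simply hands it the next task.

```python
import queue
from concurrent.futures import Future
from threading import Thread


def _toy_worker(future_queue):
    """Hypothetical target function: consume task dicts until told to stop."""
    while True:
        task = future_queue.get()  # blocks until submit() puts something
        if task.get("shutdown"):
            break
        future = task["future"]
        try:
            future.set_result(task["fn"](*task["args"], **task["kwargs"]))
        except Exception as exc:
            future.set_exception(exc)


class ToyScheduler:
    """Hypothetical stand-in for a scheduler that owns a queue and a thread."""

    def __init__(self):
        self._future_queue = queue.Queue()
        # The same queue object is shared with the thread via its kwargs.
        self._process_kwargs = {"future_queue": self._future_queue}
        self._process = Thread(target=_toy_worker, kwargs=self._process_kwargs)
        self._process.start()

    def submit(self, fn, *args, **kwargs):
        future = Future()
        self._future_queue.put(
            {"fn": fn, "args": args, "kwargs": kwargs, "future": future}
        )
        return future

    def shutdown(self):
        self._future_queue.put({"shutdown": True})
        self._process.join()


scheduler = ToyScheduler()
result = scheduler.submit(lambda x: x + 1, 1).result(timeout=5)
scheduler.shutdown()
print("Result", result)
```

If executorlib works along these lines, then whatever function is passed as the thread target is the place to keep reading, since `submit` itself never executes anything.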

Continuing submission:
- `_execute_tasks_with_dependencies` indeed takes an `executor`, which is `DependencyTaskScheduler._process_kwargs["executor"]` i.e. our `OneProcessTaskScheduler`; there's lots going on, but...
  - I don't see any reference to the cache, so I don't think the file system interaction is happening here
  - It looks like we ultimately do an `executor_queue.put` onto the underlying `OneProcessTaskScheduler._future_queue`
  - I.e. we move to `_execute_task_in_separate_process`
- `_execute_task_in_separate_process` is in turn re-directing to `_wrap_execute_task_in_separate_process` and both of these are now taking a `spawner: type[BaseSpawner]` argument
- But here I'm at the end of the line: I don't see anything other than the default `MpiExecSpawner` being leveraged, and I never find any references to the `cache_directory` or any of the `task_scheduler.file` module tools
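The two-stage hand-off I am describing can be sketched as two chained queues. Again, this is a toy under my own assumptions (`outer_worker`, `inner_worker`, and the task-dict layout are hypothetical, not the library's functions). It only illustrates that the `resource_dict` can pass through the outer stage untouched, so the inner consumer, or whatever it spawns, is one place where `cache_directory` could first be read.

```python
import queue
from concurrent.futures import Future
from threading import Thread


def outer_worker(future_queue, executor_queue):
    """Hypothetical dependency stage: forwards each task dict unchanged."""
    while True:
        task = future_queue.get()
        executor_queue.put(task)  # the resource_dict travels along untouched
        if task.get("shutdown"):
            break


def inner_worker(future_queue, seen_resource_dicts):
    """Hypothetical execution stage: the first place the task is acted on."""
    while True:
        task = future_queue.get()
        if task.get("shutdown"):
            break
        # A real implementation could inspect task["resource_dict"] here.
        seen_resource_dicts.append(task["resource_dict"])
        try:
            task["future"].set_result(task["fn"](*task["args"], **task["kwargs"]))
        except Exception as exc:
            task["future"].set_exception(exc)


def foo(x):
    return x + 1


outer_queue, inner_queue, seen = queue.Queue(), queue.Queue(), []
threads = [
    Thread(target=outer_worker, args=(outer_queue, inner_queue)),
    Thread(target=inner_worker, args=(inner_queue, seen)),
]
for thread in threads:
    thread.start()

future = Future()
outer_queue.put({
    "fn": foo,
    "args": (1,),
    "kwargs": {},
    "resource_dict": {"cache_key": "my_key", "cache_directory": "my_dir"},
    "future": future,
})
result = future.result(timeout=5)

outer_queue.put({"shutdown": True})  # forwarded to the inner stage as well
for thread in threads:
    thread.join()
print("Result", result, "seen", seen)
```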

With the `SlurmClusterExecutor` we sometimes route through `create_file_executor`, in which case the file system connection is obvious, but in the other case we're still going through `DependencyTaskScheduler` -- this time with an `SrunSpawner` instead of an `MpiExecSpawner`. In this latter case I also don't see the connection to file system tools, so I feel like I must be missing something at the diverging point: `DependencyTaskScheduler`.

What am I missing here? When does the `SingleNodeExecutor` figure out it needs to leverage the `"cache_directory"` field in the `resource_dict`?
