Commit 35aca9a

docs(guide): add guidance for manager
1 parent fe6704f commit 35aca9a

2 files changed

+124
-0
lines changed

guide/src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@
 - [Running a Coordinator](guide/coordinator.md)
 - [Running a Worker](guide/worker.md)
 - [Running a Client](guide/client.md)
+- [Managing multiple Workers](guide/manager.md)
 - [Architecture Overview](guide/architecture.md)
 - [Troubleshooting](guide/troubleshooting.md)

guide/src/guide/manager.md

Lines changed: 123 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,123 @@
# Use Manager to manage multiple workers

The Manager provides a convenient way to manage multiple Workers from a single command interface. It allows you to spawn, monitor, and terminate Workers efficiently.

## Overview

The manager component is designed to handle worker lifecycle management, including:

- **Status monitoring**: Check the status of all running workers
- **Worker spawning**: Launch multiple workers with configurable parameters
- **Worker termination**: Kill all running workers at once

## Commands

### Status

Check the status of all running workers:

```bash
mito manager status
```

This command:

- Lists all currently running `mito worker` processes
- Shows detailed process information (PID, CPU usage, memory, etc.)
- Displays the total count of active workers
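
A similar check can be done by hand with standard tools. The following is a minimal sketch, assuming `pgrep` and `ps` are installed. It only approximates the output described above and is not the manager's actual implementation; `list_workers` is a hypothetical helper name.

```bash
# Hypothetical helper, not part of mito: list processes whose command line
# matches a pattern, then print how many were found.
# The [m] bracket stops the pattern from matching a command line that merely
# contains the pattern text itself.
list_workers() {
    pattern=${1:-'[m]ito worker'}
    # pgrep -f matches against the full command line; join the PIDs with commas.
    # pgrep exits non-zero when nothing matches, so tolerate that.
    pids=$(pgrep -f "$pattern" | paste -sd, -) || true
    if [ -z "$pids" ]; then
        echo "workers running: 0"
        return 0
    fi
    # Show PID, CPU %, memory %, elapsed time, and command for each match.
    ps -o pid,pcpu,pmem,etime,args -p "$pids"
    count=$(printf '%s\n' "$pids" | tr ',' '\n' | wc -l | tr -d ' ')
    echo "workers running: $count"
}
```

Calling `list_workers` with no argument uses the default pattern. Passing a different pattern as the first argument narrows or widens the match.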

### Spawn Workers

Launch multiple worker processes:

```bash
mito manager spawn <count> [worker-options]
```

- `count`: Number of workers to spawn (must be greater than 0)
- `worker-options`: Any option accepted by `mito worker` can be passed to the spawn command.

**Example:**

```bash
# Spawn 5 workers with default configuration
mito manager spawn 5

# Spawn 3 workers with custom coordinator and tags
mito manager spawn 3 --coordinator "127.0.0.1:5000" --tags "gpu,cuda"

# Spawn workers with specific groups and file logging
mito manager spawn 2 --groups "batch-processing" --file-log --log-path "/var/log/mito"
```

**Spawning Process:**

- Shows a progress bar during worker creation
- Workers are spawned as detached processes (background execution)
- Each worker runs independently with its own process ID
- After spawning, displays the current total count of running workers
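
The detached-spawn behavior can be approximated in plain shell. This is a rough sketch rather than the manager's actual implementation: `spawn_detached` is a hypothetical helper, and `nohup` with background execution is an assumed stand-in for whatever mechanism the manager uses internally.

```bash
# Hypothetical helper, not part of mito: start COUNT copies of a command
# in the background and print the PID of each one.
# Usage: spawn_detached COUNT COMMAND [ARGS...]
spawn_detached() {
    count=$1
    shift
    # Reject anything that is not a positive integer, mirroring the
    # documented rule that the count must be greater than 0.
    case $count in
        ''|*[!0-9]*) echo "count must be a positive integer" >&2; return 1 ;;
    esac
    if [ "$count" -le 0 ]; then
        echo "count must be greater than 0" >&2
        return 1
    fi
    i=0
    while [ "$i" -lt "$count" ]; do
        # nohup plus & lets the process keep running after this shell exits.
        # NO_COLOR=1 matches the variable the manager is documented to set.
        NO_COLOR=1 nohup "$@" >/dev/null 2>&1 &
        echo "spawned PID $!"
        i=$((i + 1))
    done
}
```

For example, `spawn_detached 3 mito worker --coordinator "127.0.0.1:5000"` would start three independent background workers, each with its own PID.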

### Kill Workers

Terminate all running worker processes:

```bash
mito manager kill
```

This command:

- Finds all processes matching `mito worker`
- Terminates them using `pkill`
- Confirms successful termination or reports if no workers were found
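
The two outcomes in the last bullet map naturally onto `pkill`'s exit status (0 when at least one process was signaled, 1 when nothing matched). A minimal sketch of that reporting logic follows, assuming `pkill` is installed; `kill_workers` is a hypothetical helper and the manager's real messages may differ.

```bash
# Hypothetical helper, not part of mito: signal every process whose command
# line matches a pattern and report the result from pkill's exit status.
kill_workers() {
    pattern=${1:-'[m]ito worker'}
    status=0
    # -f matches against the full command line, not just the process name.
    pkill -f "$pattern" || status=$?
    case $status in
        0) echo "workers terminated" ;;
        1) echo "no workers found" ;;
        *) echo "pkill failed with status $status" >&2; return 1 ;;
    esac
}
```

The bracketed `[m]` in the default pattern keeps it from matching a shell whose own command line contains the pattern text.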
73+
74+
## Environment Variables
75+
76+
When spawning workers, the manager automatically sets:
77+
78+
- `NO_COLOR=1`: Disables colored output for consistent logging
79+
- `MITO_FILE_LOG_LEVEL`: Controls file logging level, if will inherit the current environment variable value and default to `info` if not set
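
The inheritance rule for `MITO_FILE_LOG_LEVEL` behaves like a shell default expansion. A minimal sketch of that rule follows; `resolve_log_level` is a hypothetical helper used only for illustration and is not part of `mito`.

```bash
# Hypothetical helper: mirrors the documented rule for MITO_FILE_LOG_LEVEL.
# Prints the inherited value, or "info" when the variable is not set.
resolve_log_level() {
    printf '%s\n' "${MITO_FILE_LOG_LEVEL-info}"
}

# Show the environment a spawned worker would see under this rule.
echo "NO_COLOR=1 MITO_FILE_LOG_LEVEL=$(resolve_log_level)"
```

Because the value is inherited, setting the variable for the spawn command itself, for example `MITO_FILE_LOG_LEVEL=debug mito manager spawn 2 --file-log`, should change the file logging level of the spawned workers.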

## Best Practices

1. **Monitor before spawning**: Always check current worker status before spawning new ones
2. **Resource awareness**: Consider system resources when spawning multiple workers
3. **Configuration consistency**: Use consistent worker configurations across spawned instances
4. **Logging strategy**: Enable file logging for persistent worker logs

## Example Workflow

```bash
# Check current worker status
mito manager status

# Spawn 4 workers with GPU tags for ML workloads
mito manager spawn 4 --tags "gpu,ml" --groups "training" --file-log

# Monitor status after spawning
mito manager status

# When done, terminate all workers
mito manager kill
```

## Troubleshooting

**No workers showing in status:**

- Ensure workers are properly spawned
- Check if coordinator is running and accessible
- Verify worker configuration parameters

**Failed to spawn workers:**

- Check system resources (memory, CPU)
- Verify coordinator connectivity
- Ensure proper permissions for process creation

**Workers not terminating:**

- Check for leftover or zombie processes: `ps aux | grep mito`
- Force kill if necessary: `sudo pkill -9 -f "mito worker"`
- Restart the manager if processes are stuck
