Commit 4e1479c

Introduce dev-doc for processes management
ilab should be able to run detached processes, re-attach to them, and monitor them.

Signed-off-by: Charlie Doern <[email protected]>
1 parent d6f77b1 commit 4e1479c

2 files changed

+70
-0
lines changed

.spellcheck-en-custom.txt

+6
@@ -29,6 +29,7 @@ Containerfile
29 29
cpp
30 30
cuBLAS
31 31
CUDA
32+
ctrl
32 33
customizations
33 34
CVE
34 35
CVEs
@@ -131,6 +132,7 @@ Params
131 132
Pareja
132 133
PEFT
133 134
Pereira
135+
PID
134 136
PlantUML
135 137
PLOS
136 138
pluggable
@@ -188,8 +190,10 @@ Standup
188 190
subcommand
189 191
subcommands
190 192
subdirectory
193+
subprocess
191 194
Sudalairaj
192 195
supportability
196+
systemd
193 197
Taj
194 198
tatsu
195 199
TBD
@@ -209,6 +213,8 @@ ui
209 213
unquantized
210 214
unstaged
211 215
USM
216+
UUID
217+
UUIDs
212 218
UX
213 219
venv
214 220
Vishnoi

docs/cli/ilab-processes.md

+64
@@ -0,0 +1,64 @@
1+
# Processes in InstructLab
2+
3+
The ability to detach from processes is crucial to the user experience of InstructLab. However, multiprocessing, process management, and process monitoring are all complex topics.
4+
5+
It is important to introduce this capability as simply as possible, then expand state reporting, logging, and other features as we go along.
6+
7+
## Phased approach to InstructLab Processes
8+
9+
This document describes phase 1 of implementing processes in InstructLab, referred to as the "ilab simple process management system". Phase 1 will depend purely on Python packages, PID tracking, and log files to create the experience of detachable processes. The key concept is the UUID, which would allow a future REST API to keep track of InstructLab processes using these unique identifiers.
10+
11+
We can revisit all of this in phase 2, when we discuss whether to use something like systemd or a more in-depth process-monitoring repository to track processes.
12+
13+
### Phase 1
14+
15+
Phase 1 would focus on adding the ability to detach from processes, re-attach to them, and manage the various artifacts they produce.
16+
17+
Process management would apply only to `ilab data generate` and `ilab model train` in the first iteration, followed by commands like `ilab model evaluate`, `ilab model serve`, and `ilab model download`. All of these commands start long-running processes that would benefit from detachment.
18+
19+
The workflow would allow for:
20+
21+
`ilab data generate -dt` (run a detached generation process)
22+
`ilab model train -dt` (run a detached training process)
23+
24+
`ilab process list`
25+
26+
```console=
27+
+------------+-------+--------------------------------------+------------------------------------------------------------------------------------------------------------------+----------+
28+
| Type | PID | UUID | Log File | Runtime |
29+
+------------+-------+--------------------------------------+------------------------------------------------------------------------------------------------------------------+----------+
30+
| Generation | 39832 | 82d00a5b-5ed5-4cfd-9a75-a87e4f420b27 | /Users/charliedoern/.local/share/instructlab/logs/generation/generation-82d00a5b-5ed5-4cfd-9a75-a87e4f420b27.log | 69:26:28 |
31+
| Generation | 40791 | 09f9d301-4fd9-4045-bfda-8a56f1d96016 | /Users/charliedoern/.local/share/instructlab/logs/generation/generation-09f9d301-4fd9-4045-bfda-8a56f1d96016.log | 68:45:40 |
32+
| Generation | 47390 | 4ccabfa5-604f-49c6-b5c3-730ce328d62a | /Users/charliedoern/.local/share/instructlab/logs/generation/generation-4ccabfa5-604f-49c6-b5c3-730ce328d62a.log | 67:26:33 |
33+
| Generation | 50872 | 093ac2e9-080c-45fe-89c5-43d508d6369c | /Users/charliedoern/.local/share/instructlab/logs/generation/generation-093ac2e9-080c-45fe-89c5-43d508d6369c.log | 05:24:56 |
34+
+------------+-------+--------------------------------------+------------------------------------------------------------------------------------------------------------------+----------+
35+
```
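The `Runtime` column above could be derived from the stored start time. The following is a minimal sketch, assuming the column shows elapsed time as `HH:MM:SS` with hours allowed to exceed 24; `format_runtime` is a hypothetical helper, not an existing `ilab` function.

```python
from datetime import datetime, timedelta


def format_runtime(start_time: datetime, now: datetime) -> str:
    """Return the elapsed time as HH:MM:SS; hours are not wrapped at 24."""
    total_seconds = int((now - start_time).total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Example: a process started 69 hours, 26 minutes, and 28 seconds ago.
start = datetime(2024, 1, 1, 0, 0, 0)
print(format_runtime(start, start + timedelta(hours=69, minutes=26, seconds=28)))
```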
36+
37+
`ilab process attach <UUID>`
38+
39+
This command would re-attach to the given process, allowing the user to view the live logs of the process. `attach` would tail the log file and listen for user input to kill the process.
40+
41+
These commands will initially be implemented in a very simple way, using the following architecture:
42+
43+
1. A detached process will be re-attachable by tailing its log file, and the user can then ctrl+c the process as normal, handled via `KeyboardInterrupt`.
44+
2. A process registry will track, for each process, a UUID created via the `uuid` Python package, the PID of the actual process, a `log_file` to which the process writes its logs so that the user can re-attach, and the start time of the process. The log file directory will be tracked using our `DEFAULTS` package and will remain standard across releases.
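
The registry described above could be sketched as a small JSON file keyed by UUID. This is only an illustration under stated assumptions: the JSON storage format, the `ProcessEntry` class, and the `add_entry` and `get_entry` helpers are all hypothetical names chosen here, not part of the proposal or of `ilab`.

```python
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class ProcessEntry:
    """One registry record (hypothetical layout)."""

    pid: int
    process_type: str
    log_file: str
    start_time: str  # ISO 8601 timestamp


def add_entry(registry_path: Path, pid: int, process_type: str, log_file: str) -> str:
    """Record a process under a fresh UUID and return that UUID."""
    registry = {}
    if registry_path.exists():
        registry = json.loads(registry_path.read_text())
    process_uuid = str(uuid.uuid4())
    started = datetime.now(timezone.utc).isoformat()
    registry[process_uuid] = asdict(ProcessEntry(pid, process_type, log_file, started))
    registry_path.write_text(json.dumps(registry, indent=2))
    return process_uuid


def get_entry(registry_path: Path, process_uuid: str) -> ProcessEntry:
    """Look up a process by UUID; raises KeyError if it is not registered."""
    registry = json.loads(registry_path.read_text())
    return ProcessEntry(**registry[process_uuid])
```

Keying the file by UUID keeps lookups for `ilab process attach <UUID>` trivial, and the same records could later be served by a REST API without changing the identifiers.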
45+
46+
The general flow would be:
47+
48+
1. A user runs `ilab data generate -dt`.
49+
2. A UUID, PID, and log file are added to the process registry.
50+
3. The command would print the UUID of the SDG run and exit, leaving the generation process running in the background.
51+
4. The user could attach to this process using `ilab process attach <UUID>`.
52+
5. This command would look in the process registry for the PID and/or UUID, get the log file, tail it, and listen for a ctrl+c keyboard interrupt.
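
The flow above can be sketched with the standard library alone. This is a rough illustration, not the planned implementation: `start_detached` and `attach` are hypothetical names, `start_new_session=True` is POSIX-only, and sending `SIGTERM` on ctrl+c is an assumption about how "kill the process" would be carried out.

```python
import os
import signal
import subprocess
import sys
import time
from pathlib import Path


def start_detached(command: list[str], log_file: Path) -> int:
    """Start command in its own session, writing output to log_file; return its PID."""
    log_file.parent.mkdir(parents=True, exist_ok=True)
    with open(log_file, "ab") as log:
        process = subprocess.Popen(
            command,
            stdout=log,
            stderr=subprocess.STDOUT,
            stdin=subprocess.DEVNULL,
            start_new_session=True,  # POSIX only: detach from the caller's terminal
        )
    return process.pid


def attach(pid: int, log_file: Path, poll_interval: float = 0.5) -> None:
    """Tail log_file until ctrl+c, then ask the process to terminate."""
    try:
        with open(log_file, "r") as log:
            while True:
                line = log.readline()
                if line:
                    sys.stdout.write(line)
                else:
                    time.sleep(poll_interval)  # no new output yet
    except KeyboardInterrupt:
        os.kill(pid, signal.SIGTERM)  # assumption: ctrl+c stops the process
```

Note that this sketch keeps tailing even after the process has finished; a fuller version would also check whether the PID is still alive and return once the log is drained.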
53+
54+
This allows us to detach from processes while they keep running in the background, and to maintain log files, all without using anything other than `uuid` and `subprocess`.
55+
56+
#### Log file management
57+
58+
If the various libraries already write their own log files, those will be used in this scenario. If they do not, InstructLab will manage writing process logs to disk. Regardless of whether the libraries maintain their own log files, InstructLab will need to co-locate the log files in a centralized directory.
59+
60+
If a log file exists, it will be copied to a path of the following form and renamed accordingly:
61+
62+
`~/.local/share/instructlab/logs/<command_name>/<command_name>-<timestamp>.log`
63+
64+
If the log file does not exist, InstructLab will create one with this format. Libraries that already write logs are responsible for standardizing where those logs are stored, so that the Core package can access them in a uniform fashion and copy them to the proper directory.
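
A minimal sketch of this co-location step follows. The base directory comes from the path format shown above; the helper names `central_log_path` and `colocate_log`, and the `%Y%m%dT%H%M%S` timestamp format, are assumptions made for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional

# Base directory taken from the path format above.
LOG_ROOT = Path.home() / ".local" / "share" / "instructlab" / "logs"


def central_log_path(command_name: str, timestamp: datetime, log_root: Path = LOG_ROOT) -> Path:
    """Build <log_root>/<command_name>/<command_name>-<timestamp>.log."""
    stamp = timestamp.strftime("%Y%m%dT%H%M%S")  # hypothetical timestamp format
    return log_root / command_name / f"{command_name}-{stamp}.log"


def colocate_log(
    command_name: str,
    timestamp: datetime,
    existing_log: Optional[Path] = None,
    log_root: Path = LOG_ROOT,
) -> Path:
    """Copy a library's log into the central directory, or create an empty one."""
    target = central_log_path(command_name, timestamp, log_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    if existing_log is not None and existing_log.exists():
        shutil.copy2(existing_log, target)  # copy and rename in one step
    else:
        target.touch()  # InstructLab would write process logs here itself
    return target
```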
