# SLURM: HPC scheduler
If you have written some scripts and want to execute them, it is advisable to send them to the scheduler.
The scheduler (SLURM) distributes the jobs across the cluster (6+ machines) and makes sure that there are no CPU or memory conflicts when multiple people submit jobs at the same time.
This is the essential job of a scheduler.
## First steps
Sending jobs to SLURM in R is supported via the R package [clustermq](https://github.com/mschubert/clustermq).
```{block, type='rmdcaution'}
The R interpreters and packages are not shared with RSW.
Therefore, all R packages your script needs must be reinstalled on the HPC for the respective R version.
```
Rather than calling an R script directly, you need to wrap your code into a function and invoke it with `clustermq::Q()`.
Instead of using `clustermq` directly, you can make use of R packages like [{targets}](https://cran.r-project.org/web/packages/targets/index.html) or [{drake}](https://cran.r-project.org/web/packages/drake/index.html), which wrap your whole analysis so that all of its layers are executed on the HPC.
There is no other way to submit your R jobs to the compute nodes of the cluster than by using one of the tools mentioned above.
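As a minimal sketch of what such a call looks like (the function and the values are purely illustrative):

```{r eval = FALSE}
# Illustrative only: wrap the per-element work in a function and let
# clustermq::Q() run it as SLURM jobs (here split across 2 jobs).
square_one <- function(x) x^2
clustermq::Q(square_one, x = 1:100, n_jobs = 2)
```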
Also, it is essential to load all required system libraries (e.g. GDAL, PROJ) via environment modules so that they are available on all nodes.
```{block, type='rmdcaution'}
Note that most likely the versions of these libraries will differ from the ones used in the RSW container.
For reproducibility it is worth not deviating too much, or even using the same versions on the HPC and within RSW.
```
## SLURM commands
While the execution of jobs is explained in more detail in [Chapter 4](#submit-jobs), the following section aims to familiarize you with the basic usage of the scheduler.
The scheduler is queried via the terminal, i.e. you need to `ssh` into the server or switch to the "Terminal" tab in RStudio.
The most important SLURM commands are
- `sinfo`: An overview of the current state of the nodes
```sh
sinfo
PARTITION     AVAIL  TIMELIMIT  NODES  STATE  NODELIST
all*             up   infinite      4  alloc  c[0-2],edi
all*             up   infinite      2  idle   c[3-4]
frontend         up   infinite      1  alloc  edi
threadripper     up   infinite      4  alloc  c[0-2],edi
opteron          up   infinite      2  idle   c[3-4]
```
- `squeue`: An overview of the current jobs that are queued, including information about running jobs
```sh
squeue
    JOBID     PARTITION     NAME     USER  ST     TIME  NODES  NODELIST(REASON)
129_[2-5]  threadripper  cmq7381  patrick  PD     0:00      1  (Resources)
    121_2  threadripper  cmq7094  patrick   R  6:24:17      1  c1
    121_3  threadripper  cmq7094  patrick   R  6:24:17      1  c2
    129_1  threadripper  cmq7381  patrick   R  5:40:44      1  c0
```
- `sacct`: Overview of jobs that were submitted in the past including their end state
```sh
sacct
JobID  JobName  Partition     Account  AllocCPUS  State      ExitCode
122    cmq7094  threadripper  (null)   0          COMPLETED  0:0
123    cmq7094  threadripper  (null)   0          PENDING    0:0
121    cmq7094  threadripper  (null)   0          PENDING    0:0
125    cmq6623  threadripper  (null)   0          FAILED     1:0
126    cmq6623  threadripper  (null)   0          FAILED     1:0
127    cmq6623  threadripper  (null)   0          FAILED     1:0
128    cmq6623  threadripper  (null)   0          FAILED     1:0
124    cmq6623  threadripper  (null)   0          FAILED     1:0
130    cmq7381  threadripper  (null)   0          PENDING    0:0
```
- `scancel`: Cancel running or pending jobs by their job ID
If you want to cancel all jobs of your user, you can call `scancel -u <username>`.
## Submitting jobs {#submit-jobs}
### `clustermq` setup
Every job submission is done via `clustermq::Q()` (either directly or via `drake`).
See the [clustermq](https://mschubert.github.io/clustermq/) documentation for instructions on how to set up the package.
First, you need to set some options in your `.Rprofile` (on the master node or in your project root when you use {renv} or {packrat}):
```r
options(
clustermq.scheduler = "slurm",
clustermq.template = "/path/to/file"
)
```
See the [package vignette](https://mschubert.github.io/clustermq/articles/userguide.html#slurm) on how to set up the file.
Note that you can have multiple `.Rprofile` files on your system:
1. Your default R interpreter will use the `.Rprofile` found in the home directory (`~/`).
1. But you can also save an `.Rprofile` file in the root directory of a (RStudio) project (which will be preferred over the one in $HOME).
This way you can use customized `.Rprofile` files tailored to a project.
At this stage you should be able to run the [example](https://github.com/mschubert/clustermq) at the top of the `README` of the {clustermq} package.
It is a very simple example which finishes in a few seconds.
If it does not work, you either did something wrong or the nodes are busy.
Check with `sinfo` and `squeue`.
Otherwise see the [troubleshooting](#troubleshooting) chapter.
```{block, type='rmdcaution'}
Be sure to set `n_cpus` in the `template` argument of `clustermq::Q()` if your submitted job is parallelized!
If you submit a parallelized job without telling the scheduler, the scheduler will reserve 1 core for this job (because it thinks it is sequential) while in fact multiple processes will spawn.
This can affect all running processes on the node, since the scheduler accepts more work than the node can actually handle.
```
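For example, if your worker function parallelizes internally, a matching call could look like the following sketch (the function, data, and resource values are purely illustrative):

```{r eval = FALSE}
# Illustrative sketch: the worker uses 4 cores internally via mclapply(),
# so the scheduler is told to reserve 4 CPUs per job through the template.
fit_one <- function(i) {
  parallel::mclapply(1:100, function(j) sqrt(i * j), mc.cores = 4)
}
clustermq::Q(fit_one, i = 1:10, n_jobs = 2,
             template = list(n_cpus = 4, memory = 4096))
```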
### The scheduler template
To successfully submit jobs to the scheduler, you need to set the `.Rprofile` options given above.
Note that you can add any bash commands into the scripts between the `SBATCH` section and the final R call.
For example, a template could look as follows:
```sh
#!/bin/sh
#SBATCH --job-name={{ job_name }}
#SBATCH --partition=all
#SBATCH --output={{ log_file | /dev/null }} # you can add .%a for array index
#SBATCH --error={{ log_file | /dev/null }}
#SBATCH --cpus-per-task={{ n_cpus }}
#SBATCH --mem={{ memory }}
#SBATCH --array=1-{{ n_jobs }}
source ~/.bashrc
cd /full/path/to/project
# load desired R version via an env module
module load r-3.5.2-gcc-9.2.0-4syrmqv
CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'
```
Note: The `#` signs are not mistakes here; they do not act as comment characters in this context.
The `#SBATCH` lines are directives that are read and applied by the scheduler.
You can simply copy it and adjust it to your needs.
You only need to set the right path to your project and specify the R version you want to use.
### Allocating resources
There are two approaches/packages you can use:
- `drake` / `targets`
- `clustermq`
The `drake` approach is only valid if you have set up your project as a `drake` or `targets` project.
```{r eval = FALSE}
drake::make(plan, parallelism = "clustermq", jobs = 1,
            template = list(n_cpus = <X>, log_file = <Y>, memory = <Z>))
```
```{r eval = FALSE}
clustermq::Q(template = list(n_cpus = <X>, log_file = <Y>, memory = <Z>))
```
(The individual components of these calls are explained in more detail below.)
Note that `drake` uses `clustermq` under the hood.
Notation like `<X>` is meant to be read as a placeholder that needs to be replaced with valid content.
When submitting jobs via `clustermq::Q()`, it is important to tell the scheduler how many cores and how much memory should be reserved for you.
If you specify fewer cores than you actually use in your script (e.g. through internal parallelization), the scheduler will plan with X cores although your submitted code spawns Y processes in the background.
This might overload the node and eventually cause your script and, more importantly, the processes of others to crash.
There are two ways to specify these settings, depending on which approach you use:
1. via `clustermq::Q()` directly
Pass the values via the `template` argument, e.g. `template = list(n_cpus = <X>, memory = <Y>)`.
They are then substituted into the `clustermq.template` file (frequently named `slurm_clustermq.tmpl`), which contains the following lines:
```sh
#SBATCH --cpus-per-task={{ n_cpus }}
#SBATCH --mem={{ memory }}
```
This tells the scheduler how many resources (here CPUs and memory) your job needs.
2. via `drake::make()`
Again, set the options via the `template` argument, e.g. `template = list(n_cpus = <X>, memory = <Y>)` (see the sketch after this list).
See section ["The resources column for transient workers"](https://ropenscilabs.github.io/drake-manual/hpc.html#advanced-options) in the drake manual.
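A filled-in version could look like the following sketch (assuming an existing `drake` plan object named `plan`; the resource values are placeholders, not recommendations):

```{r eval = FALSE}
# Sketch: run the plan with clustermq workers on SLURM, reserving 4 CPUs and
# 8 GB of memory per worker (values are illustrative).
drake::make(
  plan,
  parallelism = "clustermq",
  jobs = 2,
  template = list(n_cpus = 4, log_file = "worker.log", memory = 8192)
)
```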
```{block, type='rmdcaution'}
Please think upfront about how many CPUs and how much memory your task requires.
The following two examples show you the implications of wrong specifications.
```
```{block, type='rmdcaution'}
`mclapply(mc.cores = 20)` (in your script) > `n_cpus = 16`
In this case, four workers will always be in "waiting mode" since your resource request only allows 16 CPUs to be used.
This slows down your parallelization but does no harm to other users.
```
```{block, type='rmdcaution'}
`mclapply(mc.cores = 11)` < `n_cpus = 16`
In this case, you reserve 16 CPUs on the machine but use at most 11.
This blocks five CPUs of the machine for no reason, potentially causing other people's jobs to be queued rather than processed immediately.
```
Furthermore, if you want to use all resources of a node and run into memory problems, try reducing the number of CPUs (if you already increased the memory to its maximum).
If you scale down the number of CPUs, you will have more memory per CPU available.
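As a sketch (the function and the values are illustrative and depend on your nodes), such a trade-off could look like this:

```{r eval = FALSE}
# Illustrative: a hypothetical memory-hungry worker function.
memory_hungry_fun <- function(x) {
  m <- matrix(rnorm(2e7), ncol = 1000)  # large temporary object
  colSums(m) * x
}
# Reserving 8 instead of 16 CPUs while keeping a large memory request
# leaves more memory per CPU for each job (values are illustrative).
clustermq::Q(memory_hungry_fun, x = 1:10, n_jobs = 2,
             template = list(n_cpus = 8, memory = 60000))
```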
### Monitoring progress
When submitting jobs, you can track their progress by specifying a `log_file` in the `clustermq::Q()` call, e.g. `clustermq::Q(..., template = list(log_file = "/path/to/file"))` (see the sketch at the end of this section).
For `drake`, the equivalent is the `console_log_file` argument of `make()` or `drake_config()`.
If your jobs are running on a node, you can SSH into the node, e.g. `ssh c0`.
There you can take a look at the current load by using `htop`.
Note that you can only log in if you have a running process on that specific node.
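As a sketch of the `log_file` option mentioned above (the function name and path are illustrative):

```{r eval = FALSE}
# Illustrative: write worker logs to a file so progress can be followed,
# e.g. with `tail -f`, while the jobs are running.
my_fun <- function(x) x + 1
clustermq::Q(my_fun, x = 1:10, n_jobs = 2,
             template = list(log_file = "/home/user/logs/worker.log"))
```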
### `renv` specifics
If {renv} is used and jobs should be sent from within RSW, Slurm tries to load {clustermq} and {renv} from the following library
```
<your/project>/renv/library/linux-centos-7/R-4.0/x86_64-pc-linux-gnu/
```
This library is not used by default, only in this very specific situation (Slurm + RSW).
The reason for this is that Slurm thinks it is on CentOS when invoking the `CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'` call and tries to find {clustermq} in this specific library.
When working directly on the HPC via a terminal, the {renv} library path is `renv/library/R-4.0/x86_64-pc-linux-gnu/`.
Simply copying {clustermq} and {renv} to this location is enough:
```
mkdir -p renv/library/linux-centos-7/R-4.0/x86_64-pc-linux-gnu
cp -R renv/library/R-4.0/x86_64-pc-linux-gnu/clustermq renv/library/linux-centos-7/R-4.0/x86_64-pc-linux-gnu/
cp -R renv/library/R-4.0/x86_64-pc-linux-gnu/renv renv/library/linux-centos-7/R-4.0/x86_64-pc-linux-gnu/
```
### RStudio Slurm Job Launcher Plugin
While using the Launcher GUI in RStudio would simplify some things, the problem is that it requires the R versions to be shared across all nodes.
Since the RSW container uses its own R versions and is decoupled from the R environment modules used on the HPC, adding these would duplicate the R versions in the container and create confusion.
Also, the RStudio GUI does not seem to allow loading additional env modules, which is required for certain R packages.
## Summary
1. Set up your `.Rprofile` with `options(clustermq.scheduler = "slurm", clustermq.template = "/path/to/file")`.
1. Decide which approach you want to use: `drake`/`targets` or `clustermq` directly.
1. Create the SLURM template file itself (in your `$HOME` or project directory) and make sure `clustermq.template` points to it.