# Slurm

We have changed schedulers from SGE to Slurm. Both are job schedulers for HPC clusters, but they differ in architecture, commands, and features.

Here is a comparative table of the main SGE commands and their equivalents in Slurm. (Since we had a very custom SGE install, a few commands may behave slightly differently or may not have been available in our previous setup.)

<figure id="rosetta">
<div class="center">
<img src="../img/Rosetta_Slurm-SGE.png" style="width:150mm" alt="SGE and Slurm command Rosetta stone" />
</div>
<figcaption><em>SGE and Slurm command Rosetta stone</em></figcaption>
</figure>

[Full Slurm Rosetta stone of Workload Managers available as PDF](https://slurm.schedmd.com/rosetta.html)

## How do I submit a job to Slurm?

You can submit a bash jobscript that has Slurm directives in it, or pass Slurm directives to the submit commands on the command line.

### Using a bash jobscript

To submit a job to the scheduler you need to write a jobscript that contains the resources the job is asking for and the actual commands/programs you want to run. This jobscript is then submitted using the `sbatch` command:

```
sbatch myjobscript
```

Lines beginning with `#SBATCH` in your jobscript contain options to `sbatch`: Slurm reads each of these lines and treats the rest of the line after `#SBATCH` as an option. The job will be put into the queue and will begin running on the compute nodes at some later point, once it has been allocated resources.

Slurm allows you to specify various options for how your job is laid out across nodes.

```
#!/bin/bash

# Request two nodes on Kathleen and run 40 tasks per node, one cpu each:
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40
#SBATCH --cpus-per-task=1
#SBATCH --time=00:10:00
```

This will give you the same layout on Kathleen:

```
#!/bin/bash

# Request 80 tasks with one cpu each:
#SBATCH --ntasks=80
#SBATCH --cpus-per-task=1
#SBATCH --time=00:10:00
```
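
You are not limited to one cpu per task. As an illustrative sketch only (not one of our standard examples, and assuming the same 40-core nodes as above), a multithreaded code could request 8 tasks with 10 cpus each, which again fills two nodes:

```
#!/bin/bash

# Illustrative only: 8 tasks with 10 cpus each (80 cpus, i.e. two 40-core nodes)
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=10
#SBATCH --time=00:10:00
```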

You can launch parallel tasks by using `srun` inside the jobscript you are running with `sbatch`. For most programs, this replaces the `mpirun` command and our previous `gerun` wrapper.

```
#!/bin/bash

# Request 80 tasks:
#SBATCH --ntasks=80
#SBATCH --time=00:10:00

srun myprog
```

Practical examples of how to run parallel tasks can be found in [Slurm Example Jobscripts](Slurm_Example_Jobscripts.md).

### Passing in command-line arguments

You can also pass options directly to the `sbatch` command; these override the corresponding settings in your jobscript. This can be useful if you are scripting your job submissions in more complicated ways.

For example, if you want to change the name of the job for this one instance of the job you can submit your script with:

```
sbatch --job-name=NewName myscript.sh
```

Or if you want to alter the wallclock time in the existing script to 24 hours:

```
sbatch -t 0-24:00:00 myscript.sh
```

You can also use `srun` to submit your jobscript. `srun` does not read `#SBATCH` directives, so all options must be given on the command line, and unlike `sbatch` it runs in the foreground and waits for the job to finish.

```
srun --ntasks=80 --time=00:10:00 myjobscript
```

You can submit jobs with dependencies by using the `--depend` option. For example, the command below submits a job that won't run until job 12345 has finished:

```
sbatch --depend=12345 myscript.sh
```
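
If you only want the second job to run when the first one completed successfully, you can also give a dependency type. A minimal sketch using the standard Slurm `afterok` type (12345 is a placeholder job ID):

```
# Only start this job if job 12345 finished with exit code zero
sbatch --depend=afterok:12345 myscript.sh
```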

For future reference, it helps to put these options inside your jobscript rather than passing them on the command line whenever possible, so there is one place to check what your past jobs were requesting.
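
For example, the two command-line overrides above could instead be kept in the jobscript itself:

```
#!/bin/bash

# Same settings as the command-line examples above, kept in the script
#SBATCH --job-name=NewName
#SBATCH --time=0-24:00:00
```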

### Checking your previous jobscripts

If you want to check what you submitted for a specific job ID, you can still do this with the `scriptfor` utility. (This now runs the `sacct` command for you with relevant options.)

```
scriptfor 12345
```

As mentioned above, this will not show any command line options you passed in.
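
If you prefer to call Slurm directly, recent versions of `sacct` can print the stored batch script themselves; a rough equivalent of what the wrapper does (assuming the cluster is configured to store job scripts) is:

```
# Print the batch script that was submitted for job 12345
sacct -j 12345 --batch-script
```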

### Checking your whole submit-time environment

By default, jobs copy the environment you have on the login node at the time you submit (`--export=ALL` in the command comparison table). This is different from our previous setup. You can view the whole environment that your job was submitted with using the `envfor` utility. (This also runs `sacct` with relevant options.)

```
envfor 12345
```

This output can be long and contain multi-line shell functions.
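
Again, on recent Slurm versions `sacct` can print the stored submission environment directly; a rough equivalent (assuming the cluster is configured to store the job environment) is:

```
# Print the environment the job was submitted with
sacct -j 12345 --env-vars
```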

## How do I monitor a job?

### squeue

The `squeue --me` command shows the status of your jobs. (Run with no options, `squeue` shows everyone's jobs; adding `--me` makes it easier to keep track of your own.)

The output will look something like this:

```
 JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    22  kathleen lammps_b  ccxxxxx  R       0:04      2 node-c11b-[002-003]
```

This shows you the job ID, the partition it is using, the first 8 characters of the name you have given the job, your username, the state the job is in, how long it has been running (0:00 if it has not started yet), the number of nodes it requested, and the nodes it is running on (or the reason it is still waiting).

You can get a little more information with the `-l` or `--long` option:

```
Tue Jun 17 12:15:57 2025
 JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
    22  kathleen lammps_b  ccxxxxx  RUNNING       1:16   2:00:00      2 node-c11b-[002-003]
```

`squeue --help` will show how you can format the output differently.
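
As a sketch, you can pick your own columns with the standard `squeue` format codes (choose whichever fields you care about):

```
# Job ID, partition, full job name, state, run time, time limit, nodes
squeue --me -o "%.10i %.9P %.24j %.10T %.10M %.10l %.6D %R"
```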

### scontrol

`scontrol` displays detailed information about a specified job that is still queued or running:

```
scontrol show job 123454
```

### scancel

You use `scancel` to delete a job from the queue (or stop it if it is already running):

```
scancel 123454
```

You can delete all your jobs at once (this acts on your own user):

```
scancel '*'
```
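
You can also cancel jobs selectively with the standard `scancel` filter options, for example by state or by job name (a sketch; `myjob` is a placeholder name):

```
# Cancel only your pending (not yet running) jobs
scancel --user=$USER --state=PENDING

# Cancel all your jobs with a particular name
scancel --user=$USER --name=myjob
```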

### More scheduler commands

Have a look at `man squeue` and note the commands shown in the SEE ALSO section of the manual page. Exit the manual page and then look at the man pages for those. (You will not be able to run all of them.)

### Past jobs

Once a job ends, it no longer shows up in `squeue`. To see information about your finished jobs (when they started and ended, and which nodes they ran on), use the `sacct` command.

```
# show my jobs from the last two days
sacct -X -o "jobid,user,jobname,start,end,alloccpus,nodelist,state,exitcode" -S now-2days
```

```
JobID        User      JobName    Start               End                 AllocCPUS  NodeList        State      ExitCode
------------ --------- ---------- ------------------- ------------------- ---------- --------------- ---------- --------
19           ccxxxxx   lammps_bt+ 2025-06-16T14:13:39 2025-06-16T14:13:44 160        node-c11b-[002+ FAILED     2:0
22           ccxxxxx   lammps_bt+ 2025-06-17T12:14:41 Unknown             160        node-c11b-[002+ RUNNING    0:0
```

If a job ended and didn't create the files you expect, check the start and end times to see whether it ran out of wallclock time.
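
You can also compare the time a job actually used against its limit directly; `elapsed` and `timelimit` are standard `sacct` fields (12345 is a placeholder job ID):

```
# Did the job hit its wallclock limit?
sacct -X -j 12345 -o "jobid,elapsed,timelimit,state,exitcode"
```

A job that ran out of wallclock time will normally show a `TIMEOUT` state.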

If a job only ran for seconds and didn't produce the expected output, there was probably something wrong in your script. Check the `.out.txt` and `.err.txt` files in the directory you submitted the job from for errors.