the resource manager [Slurm](https://slurm.schedmd.com/). Users can select the
resources for their notebooks from the JupyterHub interface thanks to the
[JupyterHub MOdular Slurm Spawner](https://github.com/silx-kit/jupyterhub_moss),
which leverages [batchspawner](https://github.com/jupyterhub/batchspawner) to
submit jobs to Slurm on the user's behalf to launch the single-user server.
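
With jupyterhub_moss, the resources offered on the spawn page are declared as
Slurm partitions in `jupyterhub_config.py`. The snippet below is only a generic
sketch of that mechanism following the jupyterhub_moss documentation; the
partition name, limits and paths are invented and are not the values used in
this deployment.

```python
# Generic jupyterhub_moss sketch; all values below are placeholders.
import batchspawner  # noqa: F401  (imported for its side effects: registers batchspawner's API handler)
import jupyterhub_moss

c = get_config()  # provided by JupyterHub when it loads jupyterhub_config.py
jupyterhub_moss.set_config(c)

# Each entry becomes a selectable set of resources in the JupyterHub interface.
c.MOSlurmSpawner.partitions = {
    "example_partition": {                 # hypothetical partition name
        "architecture": "x86_64",
        "description": "Example partition",
        "gpu": None,
        "max_nprocs": 36,                  # cores selectable by the user
        "max_runtime": 8 * 3600,           # maximum walltime in seconds
        "simple": True,                    # show in the simplified spawn form
        "venv": "/apps/jupyter/venv/bin",  # hypothetical path to the Jupyter environment
    },
}
```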
The main particularity of our setup is that such jobs are not submitted to
Slurm from the host running JupyterHub, but from the login nodes of the HPC
cluster, which provide a Slurm installation capable of submitting jobs to the
HPC cluster.
## Rootless
JupyterHub is run by a non-root user in a rootless container. Setting up a
rootless container is well described in the
[podman rootless tutorial](https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md).

We use a [custom system service](container_host/etc/systemd/system/container-jupyterhub.service)
to start the container with `podman` as the non-root user `jupyterhub` (*aka*
the JupyterHub operator). At every (re)start, this service replaces any running
container with a new one created from the container image. This approach
ensures a clean state of the container and makes it easy to recover from any
runtime issue.

The root filesystem in the container is read-only. The only writable space is
the home directory of the non-root user running the container. We also [bind a
few read-only mounts](container_host/etc/systemd/system/container-jupyterhub.service#L38)
with sensitive configuration files for JupyterHub, SSL certificates for the web
server and SSH keys to connect to the login nodes. Provisioning these files in
the container through bind-mounts keeps the container images free of secrets
and lets us deploy configuration updates to the hub seamlessly.
## Network
from JupyterHub:
* [URLs of the VSC OAuth](container/Dockerfile#L72-L76) are defined in the
  environment of the container

* [OAuth secrets](container/.config/jupyterhub_config.py#L43-L48) are
  defined in JupyterHub's configuration file

* local users beyond the non-root user running JupyterHub are **not needed**
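
In practice this means the authenticator can read its endpoint URLs from the
container environment while the secrets only exist in the bind-mounted
configuration file. The sketch below illustrates that split with a generic
OAuth2 authenticator; the class choice, environment variable names and values
are placeholders rather than this repository's actual settings.

```python
# Illustrative OAuth wiring only; names and values are placeholders.
import os

from oauthenticator.generic import GenericOAuthenticator  # assumed authenticator class

c = get_config()  # provided by JupyterHub when it loads jupyterhub_config.py

c.JupyterHub.authenticator_class = GenericOAuthenticator

# Endpoint URLs are baked into the container image environment (see the Dockerfile).
c.GenericOAuthenticator.authorize_url = os.environ["OAUTH_AUTHORIZE_URL"]  # placeholder variable
c.GenericOAuthenticator.token_url = os.environ["OAUTH_TOKEN_URL"]          # placeholder variable
c.GenericOAuthenticator.userdata_url = os.environ["OAUTH_USERDATA_URL"]    # placeholder variable

# Secrets never enter the image; they live in the bind-mounted config file.
c.GenericOAuthenticator.client_id = "REPLACE-WITH-CLIENT-ID"
c.GenericOAuthenticator.client_secret = "REPLACE-WITH-CLIENT-SECRET"
```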
## Slurm
Integration with Slurm is leveraged through a custom Spawner called
[VSCSlurmSpawner](container/.config/jupyterhub_config.py#L63) based on
[MOSlurmSpawner](https://github.com/silx-kit/jupyterhub_moss).
`VSCSlurmSpawner` allows JupyterHub to generate the user's environment needed
to spawn its single-user server without any local users. All user settings are
taken from `vsc-config`.
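
Because the container has no local accounts, the spawner cannot rely on `pwd`
lookups to build the user's environment. A minimal sketch of that idea is shown
below; it assumes `MOSlurmSpawner` can be imported from `jupyterhub_moss` and
uses a stand-in lookup function instead of the real `vsc-config` client, so it
is not the actual `VSCSlurmSpawner` implementation.

```python
# Sketch only; the real VSCSlurmSpawner lives in container/.config/jupyterhub_config.py.
from jupyterhub_moss import MOSlurmSpawner  # assumed import path


def lookup_vsc_account(username: str) -> dict:
    """Stand-in for the vsc-config lookup; returns invented account settings."""
    return {"home": f"/user/homes/{username}", "shell": "/bin/bash"}


class VSCSlurmSpawner(MOSlurmSpawner):
    """Builds the user's job environment without any local user accounts."""

    def get_env(self):
        env = super().get_env()
        account = lookup_vsc_account(self.user.name)
        # Provide what a pwd lookup would normally supply for a local user.
        env.update(USER=self.user.name, HOME=account["home"], SHELL=account["shell"])
        return env
```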

We modified the [submission command](container/.config/jupyterhub_config.py#L317)
to execute `sbatch` on the login nodes of the HPC cluster through SSH.
The login nodes already run Slurm and are the sole systems handling job
submission in our cluster. Delegating job submission to them avoids having to
install and configure Slurm in the container running JupyterHub. The hub
environment is passed over SSH with strict control over the variables that
are [sent](container/.ssh/config) and [accepted](slurm_host/etc/ssh/sshd_config)
on both ends.
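
In batchspawner terms, delegating submission amounts to prefixing the submit
command with an SSH hop to a login node, where `sudo` switches to the target
user; the hub environment itself is typically forwarded through the
`SendEnv`/`AcceptEnv` lists of the linked SSH client and server configurations.
The snippet below is a sketch of that pattern only; the host alias and sudo
policy are assumptions, not this repository's actual values.

```python
# Sketch: submit the generated batch script on a login node over SSH.
c = get_config()  # provided by JupyterHub when it loads jupyterhub_config.py

# batchspawner's default exec_prefix is "sudo -E -u {username}" run locally;
# here the user switch happens remotely on the login node instead.
c.VSCSlurmSpawner.exec_prefix = ""
c.VSCSlurmSpawner.batch_submit_cmd = (
    "ssh login.hpc.example.org "            # hypothetical login node alias
    "sudo -u {username} sbatch --parsable"  # run sbatch as the target user
)
```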
The SSH connection is established by the non-root user running JupyterHub (the
hub container does not have other local users). This `jupyterhub` user has
special `sudo` permissions on the login nodes to submit jobs to Slurm as other
users. The specific group of users and list of commands allowed to the
`jupyterhub` user are defined in the [sudoers file](slurm_host/etc/sudoers).
Single-user server spawn process:

   hub environment

5. single-user server job script fully [resets the
   environment](container/.config/jupyterhub_config.py#L286-L312) before any
   other step is taken to minimize tampering from the user's own environment

6. single-user server is launched **without** the mediation of `srun` to be