Merge pull request #6 from lexming/jh12
update configuration files and documentation for deployment of JupyterHub 3.1
wpoely86 authored Jul 4, 2023
2 parents c6a1463 + ea9e30d commit df7d621
Showing 8 changed files with 158 additions and 221 deletions.
46 changes: 24 additions & 22 deletions jupyterhub/README.md
@@ -12,7 +12,7 @@ the resource manager [Slurm](https://slurm.schedmd.com/). Users can select the
resources for their notebooks from the JupyterHub interface thanks to the
[JupyterHub MOdular Slurm Spawner](https://github.com/silx-kit/jupyterhub_moss),
which leverages [batchspawner](https://github.com/jupyterhub/batchspawner) to
submit jobs to Slurm in user's behalf that will launch the single-user server.
submit jobs to Slurm on the user's behalf to launch the single-user server.
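
A minimal sketch of this wiring, assuming the documented APIs of the two
linked projects (the partition name and values below are illustrative):

```python
from jupyterhub_moss import set_config

c = get_config()  # provided when JupyterHub parses jupyterhub_config.py
set_config(c)     # installs MOSlurmSpawner and its resource-selection form

# one entry per Slurm partition that users may request from the spawn page
c.MOSlurmSpawner.partitions = {
    "batch": {
        "architecture": "x86_64",
        "description": "Default CPU partition",
        "simple": True,
    },
}
```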

The distinctive feature of our setup is that these jobs are not submitted to
Slurm from the host running JupyterHub, but from the login nodes of the HPC
@@ -24,21 +24,23 @@ installation capable of submitting jobs to the HPC cluster.
## Rootless

JupyterHub is run by a non-root user in a rootless container. Setting up a
rootless container is well described in the [podman rootless
tutorial](https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md).

We use a [system service](host/etc/systemd/system/jupyterhub.service) to
execute `podman` by a non-root user `jupyterhub` (*aka* JupyterHub operator).
This service relies on a [custom shell script](host/usr/local/bin/jupyterhub-init.sh)
to automatically initialize a new image of the rootless container or start an
existing one.

The container [binds a few mounts with sensitive configuration
files](host/usr/local/bin/jupyterhub-init.sh#L59-L66) for JupyterHub, SSL
certificates for the web server and SSH keys to connect to the login nodes.
Provisioning these files in the container through bind-mounts allows to have
secret-free container images and seamlessly deploy updates to the configuration
of the hub.
rootless container is well described in the
[podman rootless tutorial](https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md).

We use a [custom system service](container_host/etc/systemd/system/container-jupyterhub.service)
to start the container with `podman` as the non-root user `jupyterhub` (*aka*
the JupyterHub operator).
On every (re)start, the service replaces any running container with a fresh
one created from the container image. This guarantees a clean container state
and makes it easy to recover from any runtime issue.

The root filesystem in the container is read-only. The only writable space is
the home directory of the non-root user running the container. We also [bind a
few read-only mounts](container_host/etc/systemd/system/container-jupyterhub.service#L38)
with sensitive configuration files for JupyterHub, SSL certificates for the web
server and SSH keys to connect to the login nodes. Provisioning these files
through bind-mounts keeps the container images free of secrets and lets
configuration updates reach the hub seamlessly.
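
On the JupyterHub side these mounts are consumed as plain paths; a minimal
sketch in `jupyterhub_config.py`, assuming illustrative file names under the
mounts listed above:

```python
c = get_config()  # as in any jupyterhub_config.py

# TLS material for the web server comes from the read-only ~/.ssl mount
c.JupyterHub.ssl_cert = "/home/jupyterhub/.ssl/jupyterhub.crt"
c.JupyterHub.ssl_key = "/home/jupyterhub/.ssl/jupyterhub.key"

# other secrets are read from the mounted ~/.config at runtime, so the
# container image itself never contains them
c.JupyterHub.cookie_secret_file = "/home/jupyterhub/.config/jupyterhub_cookie_secret"
```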

## Network

@@ -70,34 +72,34 @@ from JupyterHub:
* [URLs of the VSC OAuth](container/Dockerfile#L72-L76) are defined in the
environment of the container

* [OAuth secrets](container/.config/jupyterhub_config.py#L40-L45) are
* [OAuth secrets](container/.config/jupyterhub_config.py#L43-L48) are
defined in JupyterHub's configuration file

* local users beyond the non-root user running JupyterHub are **not needed**
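
A hedged sketch of how these two halves could meet in `jupyterhub_config.py`,
assuming oauthenticator's `GenericOAuthenticator`; the environment variable
names and client credentials below are placeholders:

```python
import os

from oauthenticator.generic import GenericOAuthenticator

c = get_config()
c.JupyterHub.authenticator_class = GenericOAuthenticator

# endpoint URLs are taken from the container environment
c.GenericOAuthenticator.authorize_url = os.environ["OAUTH2_AUTHORIZE_URL"]
c.GenericOAuthenticator.token_url = os.environ["OAUTH2_TOKEN_URL"]

# client credentials live only in the bind-mounted config, never in the image
c.GenericOAuthenticator.client_id = "jupyterhub-vsc"         # placeholder
c.GenericOAuthenticator.client_secret = "not-a-real-secret"  # placeholder
```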

## Slurm

Integration with Slurm is provided through a custom Spawner called
[VSCSlurmSpawner](container/.config/jupyterhub_config.py#L60) based on
[VSCSlurmSpawner](container/.config/jupyterhub_config.py#L63) based on
[MOSlurmSpawner](https://github.com/silx-kit/jupyterhub_moss).
`VSCSlurmSpawner` lets JupyterHub generate the user environment needed to
spawn the single-user server without any local user accounts. All user
settings are taken from `vsc-config`.
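
As an illustration of the idea (not the actual implementation linked above),
such a spawner only needs to override `user_env()` to build the job
environment from an external user database; `vsc_home()` is a hypothetical
stand-in for the `vsc-config` lookup:

```python
from jupyterhub_moss import MOSlurmSpawner


def vsc_home(username: str) -> str:
    """Hypothetical stand-in for the vsc-config home-directory lookup."""
    return f"/user/{username}"


class VSCSlurmSpawner(MOSlurmSpawner):
    """Simplified sketch, not the class used in production."""

    def user_env(self, env):
        # no local accounts: identity and home come from the VSC database
        env["USER"] = self.user.name
        env["HOME"] = vsc_home(self.user.name)
        env["SHELL"] = "/bin/bash"
        return env
```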

We modified the [submission command](container/.config/jupyterhub_config.py#L295)
We modified the [submission command](container/.config/jupyterhub_config.py#L317)
to execute `sbatch` on the login nodes of the HPC cluster through SSH.
The login nodes already run Slurm and are the sole systems handling job
submission in our cluster. Delegating job submission to them avoids having to
install and configure Slurm in the container running JupyterHub. The hub
environment is passed over SSH with strict control over the variables that
are [sent](container/.ssh/config) and [accepted](slurm_login/etc/ssh/sshd_config)
are [sent](container/.ssh/config) and [accepted](slurm_host/etc/ssh/sshd_config)
on both ends.
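
In batchspawner terms this amounts to prefixing every Slurm client command
with an SSH invocation; a sketch with an assumed host alias, not the exact
commands of the linked configuration:

```python
c = get_config()

# run all Slurm client commands on a login node; the alias "login" would be
# resolved through the bind-mounted ~/.ssh/config
c.SlurmSpawner.exec_prefix = "ssh login sudo -u {username} "

# the login node then runs a plain sbatch, as for any other job
c.SlurmSpawner.batch_submit_cmd = "sbatch --parsable"
```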

The SSH connection is established by the non-root user running JupyterHub (the
hub container does not have other local users). This jupyterhub user has
special `sudo` permissions on the login nodes to submit jobs to Slurm as other
users. The specific group of users and list of commands allowed to the
jupyterhub user are defined in the [sudoers file](slurm_login/etc/sudoers).
jupyterhub user are defined in the [sudoers file](slurm_host/etc/sudoers).

Single-user server spawn process:

@@ -114,7 +116,7 @@ Single-user server spawn process:
hub environment

5. single-user server job script fully [resets the
environment](container/.config/jupyterhub_config.py#L264-L285) before any
environment](container/.config/jupyterhub_config.py#L286-L312) before any
other step is taken, to minimize tampering from the user's own environment

6. single-user server is launched **without** the mediation of `srun`, to
   avoid issues with MPI software
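
With batchspawner's stock Slurm job-script template, step 6 can be expressed
by clearing the `srun` prefix; a sketch (the linked configuration may achieve
the same differently):

```python
c = get_config()

# an empty prefix makes the job script run the server command directly
c.SlurmSpawner.req_srun = ""
```
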
118 changes: 73 additions & 45 deletions jupyterhub/container/.config/jupyterhub_config.py
@@ -20,12 +20,15 @@
#
#------------------------------------------------------------------------------
# Network configuration
# - listen on all interfaces: proxy is in localhost, users are external and
# spawners are internal
#------------------------------------------------------------------------------
# Listen on all interfaces
# proxy is in localhost, users are external and spawners are internal
# Public facing proxy
c.JupyterHub.bind_url = 'https://0.0.0.0:8000'
c.JupyterHub.port = 8000
# Internal hub
c.JupyterHub.hub_ip = '0.0.0.0'
# IP address or hostname that spawners should use to connect to the Hub API
c.JupyterHub.hub_port = 8081
c.JupyterHub.hub_connect_ip = 'jupyterhub.internal.domain'

#------------------------------------------------------------------------------
@@ -95,6 +98,8 @@ def user_env(self, env):
# - SSH connection established as JupyterHub operator
# - define job script parameters and commands launching the notebook
#------------------------------------------------------------------------------
JHUB_VER = "3.1.1"

set_config(c)
c.JupyterHub.spawner_class = VSCSlurmSpawner
c.Spawner.start_timeout = 600 # seconds from job submit to job start
@@ -104,98 +109,119 @@
vub_lab_environments = {
"2022_default": {
# Text displayed for this environment select option
"description": "2022a: Python v3.10.4 + kernels (default)",
"description": "2022a Default: minimal with all modules available",
# Space separated list of modules to be loaded
"modules": "JupyterHub/2.3.1-GCCcore-11.3.0",
"modules": f"JupyterHub/{JHUB_VER}-GCCcore-11.3.0",
# Path to Python environment bin/ used to start jupyter on the Slurm nodes
"path": "",
# Toggle adding the environment to shell PATH (default: True)
"add_to_path": False,
"group": "Python v3.10.4",
},
"2022_scipy": {
"description": "2022a DataScience: SciPy-bundle + matplotlib + dask",
"modules": (
f"JupyterHub/{JHUB_VER}-GCCcore-11.3.0 "
"SciPy-bundle/2022.05-foss-2022a "
"ipympl/0.9.3-foss-2022a "
"dask-labextension/6.0.0-foss-2022a "
),
"path": "",
"add_to_path": False,
"group": "Python v3.10.4",
},
"2022_nglview": {
"description": "2022a Molecules: DataScience + nglview + 3Dmol",
"modules": (
f"JupyterHub/{JHUB_VER}-GCCcore-11.3.0 "
"SciPy-bundle/2022.05-foss-2022a "
"ipympl/0.9.3-foss-2022a "
"dask-labextension/6.0.0-foss-2022a "
"nglview/3.0.3-foss-2022a "
"py3Dmol/2.0.1.post1-GCCcore-11.3.0 "
),
"path": "",
"add_to_path": False,
"group": "Python v3.10.4",
},
"2022_rstudio": {
"description": "2022a: Python v3.10.4 + RStudio",
"description": "2022a RStudio with R v4.2.1",
"modules": (
"JupyterHub/2.3.1-GCCcore-11.3.0 "
f"JupyterHub/{JHUB_VER}-GCCcore-11.3.0 "
"jupyter-rsession-proxy/2.1.0-GCCcore-11.3.0 "
"RStudio-Server/2022.07.2+576-foss-2022a-Java-11-R-4.2.1 "
"IRkernel/1.3.2-foss-2022a-R-4.2.1 "
),
"path": "",
"add_to_path": False,
"group": "Python v3.10.4",
},
"2022_matlab": {
"description": "2022a: Python v3.10.4 + MATLAB",
"description": "2022a MATLAB",
"modules": (
"MATLAB/2022a-r5 "
"JupyterHub/2.3.1-GCCcore-11.3.0 "
f"JupyterHub/{JHUB_VER}-GCCcore-11.3.0 "
"jupyter-matlab-proxy/0.5.0-GCCcore-11.3.0 "
),
"path": "",
"add_to_path": False,
"group": "Python v3.10.4",
},
"2022_dask": {
"description": "2022a: Python v3.10.4 + dask",
"modules": (
"JupyterHub/2.3.1-GCCcore-11.3.0 "
"dask-labextension/6.0.0-foss-2022a "
),
"2021_default": {
"description": "2021a Default: minimal with all modules available",
"modules": "JupyterHub/2.3.1-GCCcore-10.3.0",
"path": "",
"add_to_path": False,
"group": "Python v3.9.5",
},
"2022_nglview": {
"description": "2022a: Python v3.10.4 + nglview",
"2021_scipy": {
"description": "2021a DataScience: SciPy-bundle + matplotlib + dask",
"modules": (
"JupyterHub/2.3.1-GCCcore-11.3.0 "
"nglview/3.0.3-foss-2022a "
f"JupyterHub/{JHUB_VER}-GCCcore-10.3.0 "
"SciPy-bundle/2021.05-foss-2021a "
"ipympl/0.8.8-foss-2021a "
"dask-labextension/5.3.1-foss-2021a "
),
"path": "",
"add_to_path": False,
"group": "Python v3.9.5",
},
"2021_default": {
"description": "2021a: Python v3.9.5 + kernels (default)",
"modules": "JupyterHub/2.3.1-GCCcore-10.3.0",
"2021_nglview": {
"description": "2021a Molecules: DataScience + nglview",
"modules": (
f"JupyterHub/{JHUB_VER}-GCCcore-10.3.0 "
"SciPy-bundle/2021.05-foss-2021a "
"ipympl/0.8.8-foss-2021a "
"dask-labextension/5.3.1-foss-2021a "
"nglview/3.0.3-foss-2021a "
),
"path": "",
"add_to_path": False,
"group": "Python v3.9.5",
},
"2021_rstudio": {
"description": "2021a: Python v3.9.5 + RStudio",
"description": "2021a RStudio with R v4.1.0",
"modules": (
"JupyterHub/2.3.1-GCCcore-10.3.0 "
f"JupyterHub/{JHUB_VER}-GCCcore-10.3.0 "
"jupyter-rsession-proxy/2.1.0-GCCcore-10.3.0 "
"RStudio-Server/1.4.1717-foss-2021a-Java-11-R-4.1.0 "
"IRkernel/1.2-foss-2021a-R-4.1.0 "
),
"path": "",
"add_to_path": False,
"group": "Python v3.9.5",
},
"2021_matlab": {
"description": "2021a: Python v3.9.5 + MATLAB",
"description": "2021a MATLAB",
"modules": (
"MATLAB/2021a "
"JupyterHub/2.3.1-GCCcore-10.3.0 "
f"JupyterHub/{JHUB_VER}-GCCcore-10.3.0 "
"jupyter-matlab-proxy/0.3.4-GCCcore-10.3.0 "
"MATLAB-Kernel/0.17.1-GCCcore-10.3.0 "
),
"path": "",
"add_to_path": False,
},
"2021_dask": {
"description": "2021a: Python v3.9.5 + dask",
"modules": (
"JupyterHub/2.3.1-GCCcore-10.3.0 "
"dask-labextension/5.3.1-foss-2021a "
),
"path": "",
"add_to_path": False,
},
"2021_nglview": {
"description": "2021a: Python v3.9.5 + nglview",
"modules": (
"JupyterHub/2.3.1-GCCcore-10.3.0 "
"nglview/3.0.3-foss-2021a "
),
"path": "",
"add_to_path": False,
"group": "Python v3.9.5",
},
}
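
# Hedged sketch (illustrative partition name and fields) of the assumed
# hand-off to MOSS, which advertises these environments on the spawn page
# through each partition's "jupyter_environments" entry:
c.MOSlurmSpawner.partitions = {
    "skylake": {
        "architecture": "x86_64",
        "description": "Intel Skylake nodes",
        "simple": True,
        "jupyter_environments": vub_lab_environments,
    },
}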

@@ -302,7 +328,9 @@ def user_env(self, env):
c.SlurmSpawner.batch_cancel_cmd = "scancel {{job_id}} "
# protect argument quoting in squeue and sinfo sent through SSH
c.SlurmSpawner.batch_query_cmd = r"squeue -h -j {{job_id}} -o \'%T %B\' "
c.MOSlurmSpawner.slurm_info_cmd = r"sinfo -a --noheader -o \'%R %D %C %G %m\'"
c.MOSlurmSpawner.slurm_info_cmd = (
r"sinfo -N -a --noheader -O \'PartitionName,StateCompact,CPUsState,Gres,GresUsed,Memory,Time\'"
)

# directly launch single-user server (without srun) to avoid issues with MPI software
# job environment is already reset before any step starts
18 changes: 9 additions & 9 deletions jupyterhub/container/Dockerfile
@@ -20,11 +20,11 @@
#
###
#
# JupyterHub 2.3 + Oauthenticator + batchspawner
# JupyterHub 3.1 + Oauthenticator + batchspawner + jupyterhub_moss
# based on https://github.com/jupyterhub/oauthenticator/blob/main/examples/full/Dockerfile
# JupyterHub run as non-root user

FROM jupyterhub/jupyterhub:2.3
FROM jupyterhub/jupyterhub:3.1

MAINTAINER VUB-HPC <[email protected]>

@@ -44,16 +44,16 @@ RUN python3 -m pip install --upgrade pip
# install Oauthenticator
RUN python3 -m pip install oauthenticator
# install BatchSpawner and Modular Slurm Spawner (vub-hpc fork)
RUN python3 -m pip install https://github.com/vub-hpc/batchspawner/archive/refs/tags/v1.2.1.tar.gz
RUN python3 -m pip install https://github.com/vub-hpc/jupyterhub_moss/archive/refs/tags/v5.5.2.tar.gz
RUN python3 -m pip install https://github.com/vub-hpc/batchspawner/archive/refs/tags/v1.2.2.tar.gz
RUN python3 -m pip install https://github.com/vub-hpc/jupyterhub_moss/archive/refs/tags/v6.2.1.tar.gz
# install vsc-config
ADD vsc-config /opt/vsc-config
RUN python3 -m pip install vsc-base
COPY vsc-config-master.tar.gz /usr/local/src/
RUN python3 -m pip install /usr/local/src/vsc-config-master.tar.gz
RUN python3 -m pip install /opt/vsc-config
# install static resources for theming
COPY vub-hpc-logo-horiz-color.png /usr/local/share/jupyterhub/static/images/
COPY vub-hpc-logo-square-color.png /usr/local/share/jupyterhub/static/images/
COPY vsc-logo.png /usr/local/share/jupyterhub/static/images/
COPY assets/vub-hpc-logo-horiz-color.png /usr/local/share/jupyterhub/static/images/
COPY assets/vub-hpc-logo-square-color.png /usr/local/share/jupyterhub/static/images/
COPY assets/vsc-logo.png /usr/local/share/jupyterhub/static/images/

# --- JupyterHub operator: non-root user ---
# create user with same UID as outside of container
52 changes: 52 additions & 0 deletions jupyterhub/container_host/etc/systemd/system/container-jupyterhub.service
@@ -0,0 +1,52 @@
# Copyright 2023 Vrije Universiteit Brussel
#
# This file is part of notebook-platform,
# originally created by the HPC team of Vrije Universiteit Brussel (http://hpc.vub.be),
# with support of Vrije Universiteit Brussel (http://www.vub.be),
# the Flemish Supercomputer Centre (VSC) (https://www.vscentrum.be),
# the Flemish Research Foundation (FWO) (http://www.fwo.be/en)
# and the Department of Economy, Science and Innovation (EWI) (http://www.ewi-vlaanderen.be/en).
#
# https://github.com/vub-hpc/notebook-platform
#
# notebook-platform is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License v3 as published by
# the Free Software Foundation.
#
# notebook-platform is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
###
#
# Unit file for a service running a rootless container in Podman
# generated with `podman generate systemd`
# based on:
# - https://www.redhat.com/sysadmin/podman-shareable-systemd-services
# - https://www.redhat.com/sysadmin/podman-run-pods-systemd-services
#

[Unit]
After=network-online.target
Description=Podman rootless container for JupyterHub
RequiresMountsFor=%t
Wants=network-online.target

[Service]
Environment="PODMAN_SYSTEMD_UNIT=%n"
ExecStart=/usr/bin/podman run --cidfile=%t/%n/%n.ctr-id --cgroups=no-conmon --sdnotify=conmon --rm --replace -d --read-only --mount=type=tmpfs,tmpfs-size=128M,destination=/home/jupyterhub,chown -v /home/jupyterhub/.ssh:/home/jupyterhub/.ssh:ro -v /home/jupyterhub/.config:/home/jupyterhub/.config:ro -v /home/jupyterhub/.ssl:/home/jupyterhub/.ssl:ro --log-driver=journald -v /dev/log:/dev/log -p 8000:8000/tcp -p 8081:8081/tcp --userns=keep-id --name=jupyterhub ghcr.io/vub-hpc/azure-pipelines-jupyterhub:latest jupyterhub -f /home/jupyterhub/.config/jupyterhub_config.py
ExecStartPre=/bin/rm -f %t/%n/%n.ctr-id
ExecStop=/usr/bin/podman stop --ignore --cidfile=%t/%n/%n.ctr-id
ExecStopPost=/usr/bin/podman rm -f --ignore --cidfile=%t/%n/%n.ctr-id
NotifyAccess=all
Restart=on-failure
RuntimeDirectory=%n
TimeoutStopSec=70
Type=notify
User=jupyterhub
Group=jupyterhub

[Install]
WantedBy=multi-user.target
