Add Dockerfile #8

Open
wants to merge 2 commits into
base: master
22 changes: 22 additions & 0 deletions Dockerfile
@@ -0,0 +1,22 @@
FROM azul/zulu-openjdk:21-jdk-crac-latest AS builder
WORKDIR /application

COPY ./.mvn .mvn/
COPY ./mvnw mvnw
COPY ./pom.xml pom.xml
COPY ./src src/
COPY ./.git .git/
RUN ./mvnw -V clean package -DskipTests --no-transfer-progress && \
    cp target/*.jar application.jar && \
    java -Djarmode=layertools -jar application.jar extract

FROM azul/zulu-openjdk:21-jdk-crac-latest
WORKDIR /application

COPY --from=builder application/dependencies/ ./
COPY --from=builder application/spring-boot-loader/ ./
COPY --from=builder application/snapshot-dependencies/ ./
COPY --from=builder application/application/ ./
COPY entrypoint.sh ./

ENTRYPOINT ["/application/entrypoint.sh"]
65 changes: 65 additions & 0 deletions README.md
@@ -38,3 +38,68 @@ jcmd target/example-spring-boot-0.0.1-SNAPSHOT.jar JDK.checkpoint
```
$JAVA_HOME/bin/java -XX:CRaCRestoreFrom=cr
```

## Docker image

### Building

Create a Docker image using the provided [`Dockerfile`](./Dockerfile) with the following command:

```
docker build -t example-spring-boot .
```

### Running

Run the built Docker image with the following command:

```
docker run -p 8080:8080 \
--cap-add CHECKPOINT_RESTORE \
--cap-add NET_ADMIN \
Member

Do you really need NET_ADMIN and SYS_ADMIN in here?

Author

As far as I can tell from testing, NET_ADMIN is required on restore. Without it, you get the following errors:

Restore checkpoint from /var/crac
Error (criu/libnetlink.c:54): -1 reported by netlink: Operation not permitted
Error (criu/net.c:3744): Unable to create a veth pair: -1
2023-11-27T09:32:22.200Z  INFO 10 --- [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Restarting Spring-managed lifecycle beans after JVM restore
2023-11-27T09:32:22.204Z  INFO 10 --- [Attach Listener] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port 8080 (http) with context path ''
2023-11-27T09:32:22.205Z  INFO 10 --- [Attach Listener] o.s.c.support.DefaultLifecycleProcessor  : Spring-managed lifecycle restart completed (restored JVM running for 24 ms)

Similarly, SYS_ADMIN is required at checkpoint. Without this option you will get the following error:

2023-11-27T09:34:05.291Z  INFO 10 --- [Attach Listener] jdk.crac                                 : Starting checkpoint
CR: Checkpoint ...
/application/entrypoint.sh: line 13:    10 Killed                  java -XX:CRaCCheckpointTo=$CHECKPOINT_RESTORE_FILES_DIR org.springframework.boot.loader.launch.JarLauncher
Error (criu/cr-restore.c:1518): Can't fork for 10: Read-only file system
Error (criu/cr-restore.c:1835): Pid 140 do not mat

Strictly speaking, checkpoint and restore require different options, but normally you would want to pass one set of options that works in both cases.

Member

Sorry, I overlooked this PR.

The errors caused by the lack of NET_ADMIN are not critical; it is a bug that they are reported as errors.

Regarding SYS_ADMIN, I believe it can be avoided with changes to entrypoint.sh, for which I have a very dirty PoC: remove-extra-caps.diff.txt

But before going that route, what do you think of #12, which also demonstrates creating a Docker container?

--cap-add SYS_PTRACE \
--cap-add SYS_ADMIN \
-v /tmp/crac:/var/crac \
-e CHECKPOINT_RESTORE_FILES_DIR=/var/crac \
--rm \
example-spring-boot
```

The container prints logs like the following. A checkpoint is created 10 seconds after the application starts. You can change this delay with the `SLEEP_BEFORE_CHECKPOINT` environment variable, which is read in [`entrypoint.sh`](./entrypoint.sh).

```
  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::                (v3.2.0)

2023-11-27T07:51:47.375Z INFO 8 --- [ main] com.example.springboot.Application : Starting Application v0.0.1-SNAPSHOT using Java 21.0.1 with PID 8 (/application/BOOT-INF/classes started by root in /application)
2023-11-27T07:51:47.378Z INFO 8 --- [ main] com.example.springboot.Application : No active profile set, falling back to 1 default profile: "default"
2023-11-27T07:51:48.138Z INFO 8 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port 8080 (http)
2023-11-27T07:51:48.146Z INFO 8 --- [ main] o.apache.catalina.core.StandardService : Starting service [Tomcat]
2023-11-27T07:51:48.146Z INFO 8 --- [ main] o.apache.catalina.core.StandardEngine : Starting Servlet engine: [Apache Tomcat/10.1.16]
2023-11-27T07:51:48.176Z INFO 8 --- [ main] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
2023-11-27T07:51:48.177Z INFO 8 --- [ main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 747 ms
2023-11-27T07:51:48.459Z INFO 8 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port 8080 (http) with context path ''
2023-11-27T07:51:48.471Z INFO 8 --- [ main] com.example.springboot.Application : Started Application in 1.433 seconds (process running for 1.673)
Picked up JAVA_TOOL_OPTIONS: -XX:+ExitOnOutOfMemoryError
8:
2023-11-27T07:51:57.048Z INFO 8 --- [Attach Listener] jdk.crac : Starting checkpoint
CR: Checkpoint ...
/application/entrypoint.sh: line 13: 8 Killed java -XX:CRaCCheckpointTo=$CHECKPOINT_RESTORE_FILES_DIR org.springframework.boot.loader.launch.JarLauncher
2023-11-27T07:52:00.491Z INFO 8 --- [Attach Listener] o.s.c.support.DefaultLifecycleProcessor : Restarting Spring-managed lifecycle beans after JVM restore
2023-11-27T07:52:00.494Z INFO 8 --- [Attach Listener] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port 8080 (http) with context path ''
2023-11-27T07:52:00.495Z INFO 8 --- [Attach Listener] o.s.c.support.DefaultLifecycleProcessor : Spring-managed lifecycle restart completed (restored JVM running for 30 ms)
```
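
The delay described above comes from shell default-value expansion in `entrypoint.sh`. The following is a minimal sketch of that behaviour only; the function name `checkpoint_delay` is hypothetical and is not part of the script.

```shell
#!/bin/bash
# Hypothetical helper (not in entrypoint.sh): prints the number of seconds
# the script would sleep before running `jcmd ... JDK.checkpoint`.
checkpoint_delay() {
  # Falls back to 10 when SLEEP_BEFORE_CHECKPOINT is unset or empty.
  echo "${SLEEP_BEFORE_CHECKPOINT:-10}"
}

unset SLEEP_BEFORE_CHECKPOINT
checkpoint_delay                               # uses the default

SLEEP_BEFORE_CHECKPOINT=30 checkpoint_delay    # overridden for this call only
```

With Docker, the same override would be passed by adding `-e SLEEP_BEFORE_CHECKPOINT=30` to the `docker run` command.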

Stop the container with Ctrl+C, then run the same command again. This time the application is restored from the checkpoint and prints logs like the following. The restored JVM reports that it is running after about 30 ms, compared with 1.433 seconds for the initial startup shown above.

```
Restore checkpoint from /var/crac
2023-11-27T07:52:19.196Z INFO 8 --- [Attach Listener] o.s.c.support.DefaultLifecycleProcessor : Restarting Spring-managed lifecycle beans after JVM restore
2023-11-27T07:52:19.199Z INFO 8 --- [Attach Listener] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port 8080 (http) with context path ''
2023-11-27T07:52:19.200Z INFO 8 --- [Attach Listener] o.s.c.support.DefaultLifecycleProcessor : Spring-managed lifecycle restart completed (restored JVM running for 32 ms)
```
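
Whether a run creates a checkpoint or restores from one depends on whether `core-*.img` files exist in the checkpoint directory. The sketch below mirrors that test so it can be checked from the host; the function name `has_checkpoint` is hypothetical, and `/tmp/crac` is assumed to be the host directory mounted in the `docker run` command above.

```shell
#!/bin/bash
# Hypothetical helper mirroring the test in entrypoint.sh: a checkpoint is
# treated as present when the directory holds at least one core-*.img file.
has_checkpoint() {
  local dir="$1"
  [ -n "$(ls "$dir"/core-*.img 2>/dev/null)" ]
}

# /tmp/crac is the host side of the -v /tmp/crac:/var/crac mount.
if has_checkpoint /tmp/crac; then
  echo "next run will restore from the checkpoint"
else
  echo "next run will create a checkpoint"
fi
```

If the check reports an existing checkpoint and you want a fresh one, emptying that directory before the next run should make the script take the checkpoint branch again.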
24 changes: 24 additions & 0 deletions entrypoint.sh
@@ -0,0 +1,24 @@
#!/bin/bash
mkdir -p "$CHECKPOINT_RESTORE_FILES_DIR"
export JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS} -XX:+ExitOnOutOfMemoryError"

if [ -z "$(ls "$CHECKPOINT_RESTORE_FILES_DIR"/core-*.img 2>/dev/null)" ]; then
  echo "Save checkpoint to $CHECKPOINT_RESTORE_FILES_DIR" 1>&2
  java -XX:CRaCCheckpointTo="$CHECKPOINT_RESTORE_FILES_DIR" org.springframework.boot.loader.launch.JarLauncher &
  sleep "${SLEEP_BEFORE_CHECKPOINT:-10}"
  jcmd org.springframework.boot.loader.launch.JarLauncher JDK.checkpoint
  sleep "${SLEEP_AFTER_CHECKPOINT:-3}"
else
  echo "Restore checkpoint from $CHECKPOINT_RESTORE_FILES_DIR" 1>&2
fi

(echo 128 > /proc/sys/kernel/ns_last_pid) 2>/dev/null || while [ "$(cat /proc/sys/kernel/ns_last_pid)" -lt 128 ]; do :; done
java -XX:CRaCRestoreFrom="$CHECKPOINT_RESTORE_FILES_DIR" &
JAVA_PID=$!

stop_java_app() {
  kill -SIGTERM "$JAVA_PID"
}

trap stop_java_app SIGINT
wait "$JAVA_PID"