Commit a11f01e

Add example for Kubernetes (#15)
1 parent ecb1f1d commit a11f01e

3 files changed

+199
-0
lines changed

Dockerfile.k8s

+37
@@ -0,0 +1,37 @@
# syntax=docker/dockerfile:1.3-labs

FROM example-spring-boot-checkpoint
RUN apt-get update && apt-get install -y ncat
ENV CRAC_FILES_DIR=/cr

# This script is going to be used in the checkpointing job
COPY <<'EOF' /checkpoint.sh
#!/bin/sh

mkdir -p $CRAC_FILES_DIR
rm $CRAC_FILES_DIR/* || true

# After receiving connection on port 1111 trigger the checkpoint (using numeric address to avoid IPv6 problems)
(nc -v -l -p 1111 && jcmd example-spring-boot.jar JDK.checkpoint) &
# we cannot exec java ... because the pod would be marked as failed when it exits
# with exit code 137 after checkpoint
java -XX:CRaCCheckpointTo=$CRAC_FILES_DIR -XX:CRaCMinPid=128 -jar /example-spring-boot.jar &
PID=$!
trap "kill $PID" SIGINT SIGTERM
wait $PID || true
EOF

COPY <<'EOF' /restore-or-start.sh
#!/bin/sh

if [ -z "$(ls -A $CRAC_FILES_DIR 2> /dev/null)" ]; then
  echo "No checkpoint found, starting the application normally..."
  exec java -jar /example-spring-boot.jar
else
  echo "Checkpoint is present, restoring the application..."
  exec java -XX:CRaCRestoreFrom=$CRAC_FILES_DIR
fi
EOF

ENTRYPOINT [ "bash" ]
CMD [ "/restore-or-start.sh" ]

README.md

+74
@@ -135,3 +135,77 @@ export URL=$(gcloud run services describe example-spring-boot-direct --format 'v
curl $URL
Greetings from Spring Boot!
```

## Preparing checkpoint and running in Kubernetes

One way to run in Kubernetes is to perform the checkpoint locally or as part of the Docker build, as we have done in the previous examples. Here we will show you how to do it end-to-end inside Kubernetes.

Let's begin by starting a new Minikube cluster. We will create a new namespace `example` and use it for the demo. The `eval $(minikube docker-env)` line points the local Docker CLI at the Docker daemon inside Minikube, so the images built below are available to the cluster without pushing them to a registry:

```bash
minikube start
eval $(minikube docker-env)
kubectl create ns example
kubectl config set-context --current --namespace=example
```

Now we can build an image from `Dockerfile.k8s`. It is based on `example-spring-boot-checkpoint`, an image that already contains the built application, and adds the `netcat` utility and two scripts:
* `checkpoint.sh` starts the application with `-XX:CRaCCheckpointTo=...` and a `netcat` server listening on port 1111. When a client connects to this port, the checkpoint is triggered via `jcmd`.
* `restore-or-start.sh` checks for checkpoint image files and either restores from that image or falls back to a regular application startup.

```bash
docker build -f Dockerfile.checkpoint -t example-spring-boot-checkpoint .
docker build -f Dockerfile.k8s -t example-spring-boot-k8s .
```

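The decision made by `restore-or-start.sh` can be tried out without a JVM. The following is a minimal sketch, not part of the commit: the function name `choose_mode` and the file name `dump.img` are made up for illustration, and the function only prints the chosen mode instead of launching `java`.

```bash
# choose_mode DIR: hypothetical helper using the same emptiness test as
# restore-or-start.sh. Prints "restore" when DIR contains any file and
# "start" when DIR is empty or does not exist.
choose_mode() {
  if [ -z "$(ls -A "$1" 2> /dev/null)" ]; then
    echo "start"
  else
    echo "restore"
  fi
}

demo_dir=$(mktemp -d)
choose_mode "$demo_dir"        # empty directory: prints "start"
touch "$demo_dir/dump.img"     # stand-in for a checkpoint image file
choose_mode "$demo_dir"        # non-empty directory: prints "restore"
rm -rf "$demo_dir"
```

Because the test only looks at whether the directory is non-empty, any leftover file in `$CRAC_FILES_DIR` is treated as a checkpoint; this is presumably why `checkpoint.sh` clears the directory before starting.
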
Now we can apply the resources from `k8s.yaml`. This file defines a PersistentVolumeClaim representing the storage (in Minikube it is bound automatically to a PersistentVolume), a Deployment that runs the application using the `restore-or-start.sh` script, and a Job that creates the checkpoint image. Apply it and observe that two pods have been created:

```bash
kubectl apply -f k8s.yaml
kubectl get po
```
```
NAME                                 READY   STATUS    RESTARTS   AGE
create-checkpoint-fsfs4              2/2     Running   0          4s
example-spring-boot-68b69cc8-bbxnx   1/1     Running   0          4s
```

When you explore the application logs (`kubectl logs example-spring-boot-68b69cc8-bbxnx`) you will find that the application started normally, because the checkpoint image had not been created yet. The other pod hosts two containers: one runs `checkpoint.sh`; the other warms the application up using `siege` and then triggers the checkpoint by connecting to port 1111 (this is not a built-in feature; remember that we run `netcat` in the background).

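The warm-up container in `k8s.yaml` waits for port 8080 with an unbounded `while` loop. If the application never comes up, that loop spins forever. A bounded variant could look like the sketch below; `wait_for` is a hypothetical helper, not something the commit provides, and the demo uses `true` and `false` as stand-ins for the `nc -z` probe so that it runs anywhere.

```bash
# wait_for TRIES DELAY CMD...: hypothetical helper, not part of the commit.
# Runs CMD up to TRIES times, sleeping DELAY seconds after each failure.
# Returns 0 as soon as CMD succeeds, 1 if every attempt failed.
wait_for() {
  tries=$1
  delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    "$@" && return 0
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# Stand-in commands instead of a real port probe:
wait_for 3 0 true && echo "ready"
wait_for 2 0 false || echo "gave up"

# In the warm-up container this might become (assumption, untested there):
#   wait_for 600 0.1 nc -z localhost 8080 || exit 1
```
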
After a while the job completes:

```bash
kubectl get job
NAME                STATUS     COMPLETIONS   DURATION   AGE
create-checkpoint   Complete   1/1           19s        44m
```

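The job can reach `Complete` even though, according to the comments in `checkpoint.sh`, the JVM exits with code 137 after the checkpoint: the script runs `java` in the background and discards its exit status with `wait $PID || true`. The sketch below reproduces that structure with a stand-in child process; `run_tolerating_exit` is a made-up name used only for this illustration.

```bash
# run_tolerating_exit CMD...: hypothetical helper mirroring the structure of
# checkpoint.sh. Runs CMD in the background, forwards INT/TERM to it and
# ignores its exit status, so the wrapper itself finishes with status 0.
run_tolerating_exit() {
  "$@" &
  child=$!
  trap 'kill "$child" 2> /dev/null' INT TERM
  wait "$child" || true
}

# Stand-in for the JVM being killed after the checkpoint (128 + 9 = 137).
run_tolerating_exit sh -c 'exit 137'
echo "wrapper status: $?"   # prints "wrapper status: 0"
```

Had the script used `exec java ...` instead, the container's exit status would be that of the JVM, which is the failure mode the comment in `checkpoint.sh` warns about.
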
Now you can roll out the deployment again, this time restoring the application from the checkpoint image:

```bash
kubectl rollout restart deployment/example-spring-boot
```

After a short moment the application is back up:

```
NAME                                   READY   STATUS      RESTARTS   AGE
create-checkpoint-fsfs4                0/2     Completed   0          95s
example-spring-boot-79b98966db-ml2pj   1/1     Running     0          15s
```

In the logs you can see that it performed the restore:

```
2024-09-30T07:52:11.858Z INFO 129 --- [Attach Listener] o.s.c.support.DefaultLifecycleProcessor : Restarting Spring-managed lifecycle beans after JVM restore
2024-09-30T07:52:11.866Z INFO 129 --- [Attach Listener] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port 8080 (http) with context path ''
2024-09-30T07:52:11.868Z INFO 129 --- [Attach Listener] o.s.c.support.DefaultLifecycleProcessor : Spring-managed lifecycle restart completed (restored JVM running for 45 ms)
```

Finally, let's verify that the application responds to our requests. You should get the "Greetings from Spring Boot!" reply:

```bash
kubectl expose deployment example-spring-boot --type=NodePort --port=8080
URL=$(minikube service example-spring-boot -n example --url)
curl $URL
```

k8s.yaml

+88
@@ -0,0 +1,88 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: crac-image
  namespace: example
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
  storageClassName: "standard"
---
apiVersion: batch/v1
kind: Job
metadata:
  name: create-checkpoint
  namespace: example
spec:
  template:
    spec:
      containers:
      - name: workload
        image: example-spring-boot-k8s
        imagePullPolicy: IfNotPresent
        env:
        - name: CRAC_FILES_DIR
          value: /var/crac/image
        args:
        - /checkpoint.sh
        securityContext:
          capabilities:
            add:
            - CHECKPOINT_RESTORE
            - SYS_PTRACE
        volumeMounts:
        - mountPath: /var/crac
          name: crac-image
      - name: warmup
        image: jstarcher/siege
        imagePullPolicy: IfNotPresent
        command:
        - /bin/sh
        - -c
        - |
          while ! nc -z localhost 8080; do sleep 0.1; done
          siege -c 1 -r 100000 -b http://localhost:8080
          echo "Do checkpoint, please" | nc -v localhost 1111
      restartPolicy: Never
      volumes:
      - name: crac-image
        persistentVolumeClaim:
          claimName: crac-image
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-spring-boot
  namespace: example
  labels:
    app: example-spring-boot
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-spring-boot
  template:
    metadata:
      labels:
        app: example-spring-boot
    spec:
      containers:
      - name: workload
        image: example-spring-boot-k8s
        imagePullPolicy: IfNotPresent
        env:
        - name: CRAC_FILES_DIR
          value: /var/crac/image
        ports:
        - containerPort: 8080
        volumeMounts:
        - mountPath: /var/crac
          name: crac-image
      volumes:
      - name: crac-image
        persistentVolumeClaim:
          claimName: crac-image
          readOnly: true
