The basic idea was to see whether I could create a CI/CD pipeline that deploys a scoring model in a microservice and meets the following requirements:
- Always on, even through new code deployments
- Resilient to failures during deployments and other possible issues
- Arbitrarily scalable, allowing auto scaling with increased load
- Fast enough to allow real-time, on-the-fly scoring
- Integrated testing, making sure that the model and the code pass all tests before they are deployed
So, the point of the exercise is more about studying the build and deployment process than the deployed service or machine learning.
- A Git repository that you can mess around with. You may just want to clone this one
- A running docker machine
- A Jenkins instance. You can use the image "docker pull jonasberlin/my-jenkins" or your favorite one
- A functioning Kubernetes cluster (either minikube or your favorite)
- An image repository for docker images. You can use Docker Hub or a private one
- Jenkinsfile: contains the Jenkins pipeline script used to build and deploy the application
- Dockerfile: used by Jenkins to build the final docker image that contains the service
- deploy.yaml: contains the Kubernetes deployment instructions. Note that it references one of my Docker Hub repositories. You have to change this to your own. See below
- pom.xml: contains the Maven build instructions for the Spring Boot application
- deeplearning.zip: contains an H2O Mojo deployment file which was created from the H2O deep learning example using the MNIST dataset. The basic idea of the model is to recognize handwritten images of the digits 0 through 9
The server application that gets deployed on the Kubernetes cluster exposes a RESTful web service which loads the deeplearning.zip Mojo model file on startup and then waits for requests. Once a request is received, it:
- Parses the incoming JSON
- Creates an H2O RowData object
- Scores the row through the model
- Returns a JSON object representing the result
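The four steps above can be sketched roughly as follows. This is a minimal, library-free illustration and not the actual server code: the class name `ScoringFlowSketch`, the regex-based parsing of a flat JSON object, the response field `label`, and the `Function` standing in for the loaded Mojo model are all assumptions made for the example. A real Spring Boot service would more likely rely on its bundled JSON library and on the H2O genmodel classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScoringFlowSketch {
    // Matches "key": "text" or "key": number pairs in a flat JSON object (assumed request shape).
    private static final Pattern PAIR =
        Pattern.compile("\"([^\"]+)\"\\s*:\\s*(?:\"([^\"]*)\"|(-?\\d+(?:\\.\\d+)?))");

    // Steps 1 and 2: parse the incoming JSON into a column -> value row,
    // which plays the role of the H2O row object here.
    public static Map<String, String> parseRow(String json) {
        Map<String, String> row = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(json);
        while (m.find()) {
            String value = m.group(2) != null ? m.group(2) : m.group(3);
            row.put(m.group(1), value);
        }
        return row;
    }

    // Steps 3 and 4: score the row through the model and wrap the label in JSON.
    // The model parameter is a hypothetical stand-in for the loaded Mojo model.
    public static String score(String json, Function<Map<String, String>, String> model) {
        Map<String, String> row = parseRow(json);
        String label = model.apply(row);
        return "{\"label\":\"" + label + "\"}";
    }

    public static void main(String[] args) {
        // Dummy model for demonstration only: answers "0" when the first pixel is 0.
        Function<Map<String, String>, String> dummy =
            row -> "0".equals(row.get("C1")) ? "0" : "1";
        System.out.println(score("{\"C1\": 0, \"C2\": 255}", dummy));
    }
}
```

Keeping the model behind a plain function makes the request handling easy to unit test without loading deeplearning.zip.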
A separate client application is available which can be used to test either your local copy of the application or one deployed on a Kubernetes cluster. This application simply:
- Reads a test data file
- Creates a JSON object for each record
- Calls the scoring web service (server application)
- Receives the result
- Writes out the result
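As a rough sketch of the client loop body, the code below turns one CSV record into a JSON object and posts it to the scoring service. It is an illustration under stated assumptions, not the actual client: the class name `ClientRecordSketch`, the column names `C1`, `C2`, ... (H2O's default names for files without a header), the convention that the last column holds the actual digit, and the use of an HTTP POST with a JSON body are all guesses about the real application. It uses `java.net.http`, so it needs Java 11 or later.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ClientRecordSketch {
    // Builds a flat JSON object from one CSV record.
    // Assumes every cell is numeric and that the last cell is the label, which is left out.
    public static String toJson(String csvLine) {
        String[] cells = csvLine.trim().split(",");
        StringBuilder json = new StringBuilder("{");
        for (int i = 0; i < cells.length - 1; i++) {
            if (i > 0) {
                json.append(',');
            }
            json.append("\"C").append(i + 1).append("\":").append(cells[i].trim());
        }
        return json.append('}').toString();
    }

    // Returns the last cell of the record, assumed to be the actual digit.
    public static String actualLabel(String csvLine) {
        String[] cells = csvLine.trim().split(",");
        return cells[cells.length - 1].trim();
    }

    // Sends the JSON to the scoring service and returns the raw response body.
    public static String callService(HttpClient client, String url, String json) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) {
        // A shortened, made-up record: three pixel values followed by the label.
        String record = "0,0,255,7";
        System.out.println(toJson(record) + " actual=" + actualLabel(record));
    }
}
```

The `main` method only exercises the conversion, so it runs without a deployed service. Calling `callService` requires a reachable URL.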
You can run the client in your favorite way with the following arguments:
-url <url> the url to hit
-data <file path> the path used to load test data
-delay <number> the number of milliseconds to delay each call
-threads <number> the number of threads to use to hit the url
-showGraphics [none | error | all] the level of output for image rendering
For example:
java org.theberlins.citest.ImageRecognitionClientApplication -url http://192.168.99.101:32700 -data test.csv -threads 2 -delay 100 -showGraphics error
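For readers who want to see how such flag and value pairs can be handled, here is a minimal sketch of an argument parser. It is not taken from the client: the class name `ClientArgsSketch` and the default values for `delay`, `threads`, and `showGraphics` are assumptions chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class ClientArgsSketch {
    // Parses "-name value" pairs into a map keyed by the flag name without the dash.
    public static Map<String, String> parse(String[] args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("Every flag needs a value");
        }
        Map<String, String> options = new HashMap<>();
        // Assumed defaults; the real client may use different ones.
        options.put("delay", "0");
        options.put("threads", "1");
        options.put("showGraphics", "none");
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("-")) {
                throw new IllegalArgumentException("Expected a flag but got: " + args[i]);
            }
            options.put(args[i].substring(1), args[i + 1]);
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options = parse(new String[] {
            "-url", "http://192.168.99.101:32700", "-data", "test.csv",
            "-threads", "2", "-delay", "100", "-showGraphics", "error"
        });
        System.out.println(options.get("url") + " with " + options.get("threads") + " threads");
    }
}
```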
If you need a copy of the test data it is available here:
https://s3.amazonaws.com/h2o-public-test-data/bigdata/laptop/mnist/test.csv.gz
Once you get it to run, the output should look something like this:
15:11:53.001 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9949 Actual=0 Predicted=0 in 5 ms
15:11:53.004 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9950 Actual=2 Predicted=2 in 3 ms
15:11:53.007 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9951 Actual=8 Predicted=8 in 3 ms
15:11:53.011 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9952 Actual=4 Predicted=4 in 4 ms
15:11:53.016 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9953 Actual=3 Predicted=3 in 5 ms
15:11:53.019 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9954 Actual=3 Predicted=3 in 3 ms
15:11:53.023 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9955 Actual=7 Predicted=7 in 4 ms
15:11:53.027 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9956 Actual=1 Predicted=1 in 4 ms
15:11:53.032 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9957 Actual=1 Predicted=1 in 4 ms
15:11:53.036 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9958 Actual=0 Predicted=0 in 3 ms
15:11:53.041 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9959 Actual=7 Predicted=7 in 4 ms
15:11:53.045 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9960 Actual=2 Predicted=2 in 4 ms
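If you capture this output, a small helper can summarize it. The sketch below is not part of the repository; the class name `ClientLogSummary` is made up, and the pattern simply assumes the log format shown above. It computes the share of correct predictions and the mean response time.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClientLogSummary {
    // Mirrors the client output: Sample=<n> Actual=<d> Predicted=<d> in <t> ms
    private static final Pattern LINE =
        Pattern.compile("Sample=(\\d+) Actual=(\\d+) Predicted=(\\d+) in (\\d+) ms");

    // Returns {accuracy, mean latency in ms} over all lines that match the pattern.
    public static double[] summarize(List<String> lines) {
        int total = 0;
        int correct = 0;
        long millis = 0;
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.find()) {
                continue; // skip anything that is not a scoring line
            }
            total++;
            if (m.group(2).equals(m.group(3))) {
                correct++;
            }
            millis += Long.parseLong(m.group(4));
        }
        if (total == 0) {
            return new double[] {0.0, 0.0};
        }
        return new double[] {(double) correct / total, (double) millis / total};
    }

    public static void main(String[] args) {
        double[] summary = summarize(List.of(
            "15:11:53.001 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9949 Actual=0 Predicted=0 in 5 ms",
            "15:11:53.004 [main] INFO org.theberlins.citest.ImageRecognitionClientApplication - Sample=9950 Actual=2 Predicted=2 in 3 ms"));
        System.out.println("accuracy=" + summary[0] + " meanMs=" + summary[1]);
    }
}
```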
The Jenkins build process does the following:
- Waits for a change to happen to the configured repository (see configuration below)
- Pulls the Jenkinsfile from the repository, which contains the following build steps:
- Checks out the code from the Git repository
- Uses Maven to build the Java Spring Boot application that runs the service
- Uses the configured Docker machine and the Dockerfile instructions to build a new docker image containing the application and the deeplearning.zip Mojo model file
- Uploads the new docker image to Docker Hub (you can use your favorite repository)
- Creates a new deployment configuration file from the template deploy.yaml and uploads it to the configured Kubernetes cluster
The only thing you should have to add manually is the service. To do this use:
kubectl expose deployment image-recognizer
after the first build (which will create the deployment).
If you want to set up a Kubernetes ingress, the ingress.yaml file is available as an example. Remember to enable the NGINX ingress controller:
minikube addons enable ingress
Instead of launching the docker image on your local Kubernetes cluster as described above, you can also run it directly on Amazon ECS (Elastic Container Service). The easiest way is to use the getting started wizard and specify the repo location "docker.io/jonasberlin/ci-demo:<version>". Check Docker Hub for the version number. Just use the defaults and don't specify a load balancer. If you do, you will have to deal with VPCs, security groups, health monitors, and so on.
To get it all to work you have to set up a few things in Jenkins.
Create a new Jenkins pipeline and set up the connection to the Git repository. To do this you also have to set up credentials to connect to your repository. My setup looks like this:
The Kubernetes configuration is a little trickier, as you have to make sure the Jenkins build machine has access to the Kubernetes certs. My setup looks like this: