oc-spark-cluster

It is based on the https://github.com/gettyimages/docker-spark project. The difference is that it can launch a Spark cluster of two or more nodes on the latest Spark version, and it provides a Dockerfile that lets you inject your own dependencies into the Spark jars folder.

spark

A debian:jessie based Spark container. Use it in a standalone cluster with the accompanying docker-compose.yml, or as a base for more complex recipes.

docker example

To run SparkPi, run the image with Docker:

docker run --rm -it -p 4040:4040 wassim/oc-spark-cluster bin/run-example SparkPi 10

To start spark-shell with your AWS credentials:

docker run --rm -it -e "AWS_ACCESS_KEY_ID=YOURKEY" -e "AWS_SECRET_ACCESS_KEY=YOURSECRET" -p 4040:4040 wassim/oc-spark-cluster bin/spark-shell

To run a PySpark script that counts the elements of a small RDD:

printf 'import pyspark\nprint(pyspark.SparkContext().parallelize(range(0, 10)).count())\n' > count.py
docker run --rm -it -p 4040:4040 -v "$(pwd)/count.py":/count.py wassim/oc-spark-cluster bin/spark-submit /count.py
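Whether `echo` expands `\n` depends on the shell, so a quoted heredoc is a more predictable way to create the script. The sketch below writes a slightly longer `count.py` that also stops the context when it is done. The application name `count-example` is an illustrative choice, not something defined by this project.

```shell
# Write count.py with a quoted heredoc so the shell leaves the body untouched.
cat > count.py <<'EOF'
from pyspark import SparkContext

# "count-example" is an arbitrary, illustrative application name.
sc = SparkContext(appName="count-example")
print(sc.parallelize(range(0, 10)).count())
sc.stop()
EOF
```

The file can then be submitted with the same docker run ... bin/spark-submit /count.py command shown above.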

docker-compose example

To create a simplistic standalone cluster with docker-compose:

docker-compose up

The Spark UI will be available at http://${YOUR_DOCKER_HOST}:8082 with one worker listed. To run pyspark, exec into the master container:

docker exec -it ocsparkcluster_master_1 /bin/bash
bin/pyspark

To run SparkPi, exec into the master container:

docker exec -it ocsparkcluster_master_1 /bin/bash
bin/run-example SparkPi 10

Scale In/Out

To add or remove workers, use the scale command of docker-compose:

docker-compose scale worker=2
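Later docker-compose 1.x releases mark the standalone `scale` command as deprecated and recommend `up --scale` instead. The sketch below wraps that form in a small helper. The function name `scale_workers` and the `COMPOSE` override variable are hypothetical; the `worker` service name is taken from the command above.

```shell
# Hypothetical helper: set the number of worker containers.
# COMPOSE can be overridden (for example with "docker compose") and
# defaults to docker-compose.
scale_workers() {
  count="$1"
  # Reject empty or non-numeric input before calling Compose.
  case "$count" in
    ''|*[!0-9]*)
      echo "usage: scale_workers <number-of-workers>" >&2
      return 1
      ;;
  esac
  # -d keeps the cluster running in the background while it is resized.
  ${COMPOSE:-docker-compose} up -d --scale "worker=$count"
}
```

After sourcing the function, `scale_workers 3` grows the cluster to three workers and `scale_workers 1` shrinks it back.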

Access the worker or master nodes

docker exec -ti ocsparkcluster_master_1 bash
docker exec -ti ocsparkcluster_worker_1 bash
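Container names such as `ocsparkcluster_master_1` are derived from the project directory name, so they can differ if the repository is cloned under another name or run with a different Compose version. A possible alternative is `docker-compose exec`, which addresses containers by service name. The sketch below assumes the services are called `master` and `worker` (inferred from the container names above); the helper name `node_shell` and the `COMPOSE` override variable are hypothetical.

```shell
# Hypothetical helper: open a shell in a service container by service name.
# Usage: node_shell master      or      node_shell worker 2
node_shell() {
  service="$1"
  index="${2:-1}"   # which replica of the service; defaults to the first
  ${COMPOSE:-docker-compose} exec --index="$index" "$service" bash
}
```

This has to be run from the directory that contains docker-compose.yml, since Compose resolves the service names from that file.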
