- Make sure Docker is installed and running. If it is not, install the latest Docker version: https://docs.docker.com/engine/install/
- Clone this project with git clone.
- cd revive. Make sure you don't have a MySQL service already running on port 3306. If you do, stop the existing MySQL service.
- Build and start the containers:
docker-compose build
docker-compose up
- Wait for the Spark job to finish. The step 4 logs should end with "spark exited with code 0". If you ran the step 4 command in detached (-d) mode, wait about 20 seconds instead.
- Unit tests run automatically as part of step 4; it takes around 30 seconds before the unit test cases start executing. To run the unittest container on demand, run:
docker-compose up unittest
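The check for "spark exited with code 0" can also be scripted. The following is a minimal sketch, not part of the project: spark_exit_code is a hypothetical helper, compose.log is a hypothetical file name, and the pattern assumes the container name contains "spark", as the log line quoted above suggests.

```shell
#!/bin/sh
# Hypothetical helper (not part of this project): read docker-compose output
# on stdin and print the exit code from the last line of the form
# "<name containing spark> exited with code N". Prints nothing if no such
# line is present.
spark_exit_code() {
    sed -n 's/.*spark[^ ]* exited with code \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Example usage, assuming the output of step 4 was saved to a file, e.g.:
#   docker-compose up 2>&1 | tee compose.log
# and then, from another terminal:
#   spark_exit_code < compose.log
```

An empty result means the Spark container has not exited yet (or uses a different name); any printed value other than 0 means the job failed.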
- If you get a "Port already in use" error while running step 3 (from "How to run"), find and kill the process already running on that port, then re-run step 4.
- Clean up existing containers:
docker-compose down --remove-orphans
ERROR: for mysql Cannot start service mysql: Ports are not available: listen tcp 0.0.0.0:3306: bind: address already in use
- Make sure no MySQL service is already running on port 3306.
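One way to check port 3306 before starting the containers is sketched below. It is an illustration only: port_in_use is a hypothetical helper, it relies on bash's /dev/tcp redirection (so it needs bash rather than plain sh), and it only probes the host you pass in, so a service bound to a different interface could be missed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this project): succeed if something accepts
# TCP connections on host:port. The connection attempt runs in a subshell so
# the file descriptor is closed again as soon as the check finishes.
port_in_use() {
    local host="$1" port="$2"
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Check the MySQL port used by this project before running docker-compose up.
if port_in_use 127.0.0.1 3306; then
    echo "Port 3306 is busy: stop the local MySQL service first"
else
    echo "Port 3306 looks free"
fi
```

If the port is reported as busy, stop the local MySQL service and then re-run docker-compose up.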
- The Flask app is currently not production ready. Move the Flask app to nginx.
- Make the batch pipeline more scalable by adding Spark executors.
- Take input data from an S3/GCS bucket.