This project is an end-to-end stock market prediction system that automates data extraction, transformation, model training, and prediction delivery. It is built with AlphaVantage, FastAPI, MLflow, Docker, and GitHub Actions, and runs live on an AWS EC2 instance.
🔗 API Base URL: http://13.60.179.140/
📘 Swagger Docs: http://13.60.179.140/docs
- ✅ ETL pipeline to fetch and process stock data via AlphaVantage
- ✅ 170+ technical indicators with statistical feature selection (41 selected)
- ✅ MLPClassifier model for binary stock movement prediction
- ✅ FastAPI server for real-time prediction and historical data access
- ✅ MLflow for experiment tracking and model versioning
- ✅ Dockerized for easy deployment
- ✅ CI/CD with GitHub Actions
- ✅ Deployed and running on AWS EC2
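As a rough illustration of the ETL step listed above, the sketch below builds an AlphaVantage `TIME_SERIES_DAILY` request URL, flattens a response into rows, and loads them into SQLite. The function names, the `daily_prices` table, and the sample payload values are hypothetical and are not taken from this repository's `api_data_fetcher.py` or `database_handler.py`; the payload only mirrors the shape of AlphaVantage's daily response so the example runs without network access.

```python
import sqlite3
from urllib.parse import urlencode

ALPHAVANTAGE_URL = "https://www.alphavantage.co/query"


def build_daily_url(symbol: str, api_key: str) -> str:
    """Build the request URL for AlphaVantage's TIME_SERIES_DAILY function."""
    params = {"function": "TIME_SERIES_DAILY", "symbol": symbol, "apikey": api_key}
    return f"{ALPHAVANTAGE_URL}?{urlencode(params)}"


def parse_daily(payload: dict) -> list:
    """Flatten the 'Time Series (Daily)' mapping into rows sorted by date."""
    rows = []
    for day, bar in payload["Time Series (Daily)"].items():
        rows.append((
            day,
            float(bar["1. open"]),
            float(bar["2. high"]),
            float(bar["3. low"]),
            float(bar["4. close"]),
            int(bar["5. volume"]),
        ))
    return sorted(rows)


def load_rows(conn: sqlite3.Connection, rows: list) -> int:
    """Upsert rows into a (hypothetical) daily_prices table and return its row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_prices ("
        "date TEXT PRIMARY KEY, open REAL, high REAL, low REAL, close REAL, volume INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO daily_prices VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM daily_prices").fetchone()[0]


# Made-up sample payload in the shape of a TIME_SERIES_DAILY response.
SAMPLE = {
    "Time Series (Daily)": {
        "2024-01-03": {"1. open": "101.0", "2. high": "103.5", "3. low": "100.2",
                       "4. close": "102.8", "5. volume": "1200000"},
        "2024-01-02": {"1. open": "100.0", "2. high": "101.9", "3. low": "99.4",
                       "4. close": "101.1", "5. volume": "950000"},
    }
}

rows = parse_daily(SAMPLE)
conn = sqlite3.connect(":memory:")  # in-memory database for the demo
row_count = load_rows(conn, rows)
```

In a real run, the payload would come from fetching `build_daily_url(symbol, api_key)`, and the connection would point at the database file named by `DB_NAME`.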
stock-etl-pipeline/
│
├── app/                          # FastAPI app logic
│   ├── main.py
│   ├── models/                   # Saved models
│   ├── mlruns/                   # MLflow tracking
│   ├── data/
│   │   ├── raw/                  # Raw dataset
│   │   ├── processed/            # Dataset after feature engineering
│   │   └── results/              # Model training results
│   │
│   └── scripts/
│       ├── api_data_fetcher.py   # ETL scripts
│       ├── config.py
│       ├── database_handler.py
│       ├── feature_engineering.py
│       ├── model_trainer.py
│       └── statistical_measures.py
│
├── .github/workflows/            # CI/CD configs
├── Dockerfile
├── requirements.txt
└── README.md
- Clone the repo
git clone https://github.com/your-username/stock-etl-pipeline.git
cd stock-etl-pipeline
- Set environment variable
Create a v.env file with your API key from AlphaVantage:
API_KEY="your_key_here"
DB_NAME="data.sqlite"
MODEL_DIR="models"
- Build and run Docker containers
docker-compose up --build
- Access API locally
Go to http://localhost:8000/docs
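The three environment variables from the setup steps above could be read in Python along the following lines. This is a minimal sketch, not the contents of the repository's `config.py`: the `Settings` class and `load_settings` function are hypothetical names, and it assumes the variables have already been exported into the process environment (for example by Docker Compose).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values defined in the env file."""
    api_key: str
    db_name: str
    model_dir: str


def load_settings(env=None) -> Settings:
    """Read API_KEY, DB_NAME and MODEL_DIR from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("API_KEY", "")
    if not api_key:
        # Fail early: the AlphaVantage requests cannot work without a key.
        raise RuntimeError("API_KEY is not set; add it to the env file")
    return Settings(
        api_key=api_key,
        db_name=env.get("DB_NAME", "data.sqlite"),  # fallback mirrors the example value
        model_dir=env.get("MODEL_DIR", "models"),   # fallback mirrors the example value
    )


# Passing a plain dict makes the function easy to exercise without touching os.environ.
settings = load_settings({"API_KEY": "your_key_here"})
```

Failing fast on a missing key tends to surface configuration mistakes at startup rather than at the first API call.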
- Type: MLPClassifier (Multi-layer perceptron)
- Input: 41 selected technical indicators
- Output: Up or Down stock movement
- Training Pipeline: Includes feature scaling, selection, and resampling (if needed)
- Tracking: Experiments tracked using MLflow
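The model description above can be sketched as a scikit-learn pipeline. This is an illustration under stated assumptions, not the code in `model_trainer.py`: the data is synthetic (`make_classification` with 170 features standing in for the technical indicators), `SelectKBest` with `f_classif` is an assumed stand-in for the project's statistical feature selection, the hidden layer sizes are arbitrary, and resampling and MLflow logging are left out.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the indicator matrix: 170 features, binary target (1 = Up, 0 = Down).
X, y = make_classification(
    n_samples=600, n_features=170, n_informative=20, random_state=0
)

# A random split is used here for brevity; with real price data a chronological
# split would usually be preferable to avoid look-ahead leakage.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

pipeline = Pipeline([
    ("scale", StandardScaler()),                          # feature scaling
    ("select", SelectKBest(score_func=f_classif, k=41)),  # keep 41 features (assumed method)
    ("mlp", MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)),
])
pipeline.fit(X_train, y_train)

predictions = pipeline.predict(X_test)
labels = ["Up" if p == 1 else "Down" for p in predictions]  # map classes to movement labels
accuracy = pipeline.score(X_test, y_test)
n_selected = int(pipeline.named_steps["select"].get_support().sum())
```

Keeping scaling and selection inside the pipeline means both are fitted on the training split only and then reapplied unchanged at prediction time.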
- GitHub Actions runs linting, testing, and Docker build on push
- Deployment is handled via SSH to EC2 or via Docker Compose on the server
- ML models are versioned and logged via MLflow
- POST /historical_prediction/ – Make predictions based on existing data
- POST /predict_stock_directions/ – Get prediction for input features
- POST /load_data_from_api/ – Get data from AlphaVantage
- GET /docs – Interactive API documentation (Swagger UI)
- GET /model_accuracy – Get model accuracy
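A small client for the endpoints listed above might look like the sketch below, using only the standard library. The helper names are hypothetical, the base URL assumes the local Docker setup, and the code assumes the endpoints return JSON. Request body fields for the POST endpoints are not documented here, so they are left to the caller; the Swagger UI at `/docs` is the place to check the actual schemas.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000/"  # assumes the API is running locally


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path such as '/model_accuracy'."""
    return urljoin(base_url, path.lstrip("/"))


def build_request(method: str, path: str, payload=None, base_url: str = BASE_URL) -> Request:
    """Create a request object; a payload, if given, is sent as a JSON body."""
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(endpoint_url(path, base_url), data=data, headers=headers, method=method)


def call(method: str, path: str, payload=None, base_url: str = BASE_URL, timeout: float = 10):
    """Send the request and decode the response body as JSON."""
    request = build_request(method, path, payload, base_url)
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires the server to be running):
#   call("GET", "/model_accuracy")
```

Separating `build_request` from `call` lets the URL and body construction be checked without a live server.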
- Python · FastAPI · Scikit-learn · MLflow · Docker · GitHub Actions · AWS EC2 · AlphaVantage API
Ridwan Yusuf: Data Scientist | Exploring MLOps