- Python: Core programming language.
- Scikit-learn: For machine learning models.
- Pandas & NumPy: For data manipulation and analysis.
- XGBoost: For gradient boosting models.
- Streamlit: For creating an interactive user interface.
- Data Preprocessing: Handles cleaning, normalization, and feature selection.
- Prediction Engine: Combines multiple classifiers (Random Forest, Gradient Boosting, etc.) to enhance accuracy.
- Recommendation System: Generates advice based on prediction outcomes.
- Cross-validation: K-Fold cross-validation was used to check that model performance holds up across different train/test splits.
- Performance Metrics: Models were evaluated on accuracy, precision, recall, F1 score, and ROC-AUC.
- Best Model: The ensemble classifier achieved an accuracy of 91.9%.
- Feature Importance: Highlighted the key symptoms contributing to accurate predictions.
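
The ensemble and evaluation steps listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the data is a synthetic stand-in generated with `make_classification`, the task is assumed to be binary, and the choice of soft voting, five folds, and the hyperparameters are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.model_selection import KFold, cross_validate

# Synthetic stand-in for the symptom dataset (assumption: binary labels).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Soft voting averages the class probabilities predicted by each model.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ],
    voting="soft",
)

# K-Fold cross-validation scored on the metrics named in the list above.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]
scores = cross_validate(ensemble, X, y, cv=cv, scoring=scoring)

# Average each metric over the folds.
mean_scores = {m: scores[f"test_{m}"].mean() for m in scoring}
for metric, value in mean_scores.items():
    print(f"{metric}: {value:.3f}")

# Feature importances come from the fitted random forest member,
# since VotingClassifier itself does not expose them.
ensemble.fit(X, y)
importances = ensemble.named_estimators_["rf"].feature_importances_
top_features = importances.argsort()[::-1][:3]
print("Top feature indices:", top_features.tolist())
```

For a multi-class target, the `precision`, `recall`, `f1`, and `roc_auc` scorers would need averaged variants such as `f1_macro` and `roc_auc_ovr`. An `xgboost.XGBClassifier` could be added as a third estimator in the same way.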
- Real-Time Data Integration: Incorporating real-time data to enhance prediction accuracy.
- Mobile Application: Developing a mobile-friendly version for wider accessibility.
- Additional Features: Expanding the system to include preventive recommendations.

- Hastie, T., Tibshirani, R., & Friedman, J. The Elements of Statistical Learning.
- Chen, T., & Guestrin, C. XGBoost: A Scalable Tree Boosting System.
- Pedregosa, F., et al. Scikit-learn: Machine Learning in Python.