Heart-Disease-Classification

This Google Colab notebook predicts the presence of cardiovascular disease from patient examination results in the Kaggle Cardiovascular Disease dataset (70,000 records). The dataset is split by gender, and K-Modes clustering is applied to further divide it into 4 clusters. This approach improved accuracy by almost 10% compared to previous models.
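The gender-based split described above can be sketched with pandas. The column name `gender` and its integer codes follow the public Kaggle cardio dataset's conventions and are assumptions here, not details taken from this notebook; the tiny frame below is illustrative, not real patient data.

```python
import pandas as pd

def split_by_gender(df: pd.DataFrame) -> dict:
    """Split the frame into one sub-frame per gender code.

    Assumes a 'gender' column coded 1/2, as in the public
    Kaggle cardio dataset (an assumption, not confirmed here).
    """
    return {g: sub.reset_index(drop=True) for g, sub in df.groupby("gender")}

# Tiny illustrative frame (not real patient data)
df = pd.DataFrame({
    "gender": [1, 2, 1, 2, 2],
    "age":    [50, 55, 61, 48, 59],
    "cardio": [0, 1, 1, 0, 1],
})
parts = split_by_gender(df)
```

Each resulting sub-frame can then be clustered and modelled separately, which is what lets gender-specific patterns improve accuracy.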

The algorithms used in this project are XGBoost, Random Forest, and a Multi-Layer Perceptron. The highest accuracy achieved was 87.28%, with the Multi-Layer Perceptron model.
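A minimal sketch of the train-and-compare step, using Scikit-Learn's Random Forest and Multi-Layer Perceptron on synthetic data standing in for the cardio features. XGBoost is omitted here to keep the sketch dependency-light, and none of the hyperparameters below are taken from the notebook; they are placeholder assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the tabular cardio features
X, y = make_classification(n_samples=600, n_features=11, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)  # fit on the training split
    scores[name] = accuracy_score(y_te, model.predict(X_te))
```

In the notebook the same loop runs per gender/cluster subset, and the best-scoring model is kept.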

To run this notebook, open it in Google Colab and follow the instructions. Make sure to upload the dataset to the notebook environment before running it.

FEATURES

• Gender-based dataset splitting: the dataset is split by gender for better accuracy.

• K-Modes clustering: K-Modes clustering further divides the dataset into 4 clusters, which improves accuracy.

• Multiple algorithms: the project uses XGBoost, Random Forest, and Multi-Layer Perceptron classifiers.

• High accuracy: the highest accuracy achieved was 87.28%, with the Multi-Layer Perceptron model.
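K-Modes is k-means adapted to categorical data: distance is the count of mismatching columns (Hamming distance) and each cluster centre is the per-column mode of its members. A minimal self-contained sketch, assuming features are coded as non-negative integers (the notebook itself likely uses a library implementation such as the `kmodes` package):

```python
import numpy as np

def k_modes(X, k=4, n_iter=10, seed=0):
    """Minimal K-Modes sketch for non-negative integer-coded categories."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    # Initialise centres as k distinct random rows
    centres = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each row to the centre with the fewest mismatching columns
        dist = (X[:, None, :] != centres[None, :, :]).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Update each centre to the column-wise mode of its cluster
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centres[j] = [np.bincount(col).argmax() for col in members.T]
    return labels, centres

rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(60, 5))  # 60 rows, 5 categorical columns coded 0-2
labels, centres = k_modes(X, k=4)
```

Splitting the data into such clusters before training lets each model specialise on a more homogeneous subgroup, which is the source of the accuracy gain claimed above.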

We hope this project helps with the early detection and prevention of cardiovascular disease.

Notebook contains:

  • Exploratory data analysis (EDA) - the process of going through a dataset and finding out more about it.
  • Model training - create model(s) to learn to predict a target variable based on other variables.
  • Model evaluation - evaluating a model's predictions using problem-specific evaluation metrics.
  • Model comparison - comparing several different models to find the best one.
  • Model fine-tuning - once we've found a good model, how can we improve it?
  • Feature importance - since we're predicting the presence of heart disease, are there some things which are more important for prediction?
  • Cross-validation - if we do build a good model, can we be sure it will work on unseen data?
  • Reporting what we've found - if we had to present our work, what would we show someone?
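The cross-validation step in the list above can be sketched with Scikit-Learn's `cross_val_score`, again on synthetic stand-in data; the fold count and estimator settings are placeholder assumptions, not values from the notebook:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=11, random_state=0)
clf = RandomForestClassifier(n_estimators=50, random_state=0)
# 5-fold cross-validation: five accuracy scores, one per held-out fold
cv_scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
```

Agreement across the folds (a small spread in `cv_scores`) is what gives confidence that the model will hold up on unseen data.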

To work through these topics, we'll use pandas, Matplotlib, and NumPy for data analysis, as well as Scikit-Learn for machine learning and modelling tasks.
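For the feature-importance step, Scikit-Learn's tree ensembles expose a `feature_importances_` attribute that pandas makes easy to rank. The column names below echo the public cardio dataset and are hypothetical labels for illustration, fitted here on synthetic data:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names echoing the cardio dataset's columns
cols = ["age", "height", "weight", "ap_hi", "ap_lo", "cholesterol", "gluc"]
X, y = make_classification(n_samples=400, n_features=len(cols), random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
# Impurity-based importances, normalised to sum to 1, sorted descending
importances = pd.Series(clf.feature_importances_, index=cols)
importances = importances.sort_values(ascending=False)
```

Plotting `importances` as a bar chart with Matplotlib is a common way to report which examination results drive the prediction.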
