Pulsar Dataset Analysis - First Year Data Science Project

This repository contains the code and analysis for our first-year data science project at the University of British Columbia (UBC). The goal of the project was to train a model that classifies the observations in a pulsar dataset, focusing on eight key variables:

Mean of the integrated profile.
Standard deviation of the integrated profile.
Excess kurtosis of the integrated profile.
Skewness of the integrated profile.
Mean of the DM-SNR curve.
Standard deviation of the DM-SNR curve.
Excess kurtosis of the DM-SNR curve.
Skewness of the DM-SNR curve.
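
As a rough illustration of what these summary statistics measure, the sketch below computes all four for a single numeric vector in base R. The function name `profile_stats`, the toy values, and the choice of population (moment-based) formulas are assumptions made for this example; the dataset's authors may have used different estimators.

```r
# Summarise a numeric vector with the four statistics named above.
# NOTE: the moment-based formulas are an assumption; the dataset may have
# been built with a different (e.g. bias-corrected) estimator.
profile_stats <- function(x) {
  m  <- mean(x)
  m2 <- mean((x - m)^2)  # second central moment (population variance)
  m3 <- mean((x - m)^3)  # third central moment
  m4 <- mean((x - m)^4)  # fourth central moment
  c(
    mean            = m,
    sd              = sqrt(m2),
    excess_kurtosis = m4 / m2^2 - 3,  # 0 for a normal distribution
    skewness        = m3 / m2^1.5     # 0 for a symmetric distribution
  )
}

# Hypothetical values, for illustration only (not taken from the dataset).
toy_profile <- c(1, 2, 3, 4, 5)
stats <- profile_stats(toy_profile)
print(stats)
```

In the dataset itself, the same four statistics are reported twice per observation: once for the integrated profile and once for the DM-SNR curve.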

Dataset and License

The dataset used for this project is publicly available on Kaggle and is distributed under the CC BY-NC-SA 4.0 license.

Team Members

Danial Ramzan (me) - Code implementation using R in Jupyter Notebook
L. L. - Text and analysis
X. L. - Text and analysis

Project Overview

In this project, we used the R programming language in a Jupyter Notebook to analyze the data and build a classification model. The dataset contains information on pulsars, and our objective was to classify whether a given observation corresponds to a pulsar star or not. We worked with the eight variables listed above, which we identified as the ones most important for making accurate predictions.

Approach

Our team followed an iterative and collaborative approach to complete the project successfully. I primarily worked on implementing the code using R, handling data preprocessing, feature selection, and model training in a Jupyter Notebook. L. L. and X. L. took charge of the textual aspects, including data exploration, analysis of results, and crafting the project report.

Through trial and error, along with thorough analysis and discussion among team members, we fine-tuned our model and achieved a classification accuracy of 95%. Interestingly, our hard work and dedication also earned us a project grade of 95%.
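
For readers unfamiliar with how a classification accuracy figure is obtained, here is a minimal sketch in base R. It is not the notebook's code: the data are synthetic, the column names (`feature_a`, `feature_b`, `is_pulsar`) are hypothetical, and logistic regression via `glm` is an assumed stand-in for the project's model, which this README does not name.

```r
set.seed(42)  # reproducible synthetic data

# Synthetic stand-in for the pulsar data: two hypothetical features and a
# binary label. None of these values come from the real dataset.
n <- 400
label <- rbinom(n, 1, 0.5)
toy <- data.frame(
  feature_a = rnorm(n, mean = ifelse(label == 1, 2, 0)),
  feature_b = rnorm(n, mean = ifelse(label == 1, -1, 1)),
  is_pulsar = label
)

# Hold out 25% of the rows for testing.
test_idx <- sample(n, size = n / 4)
train <- toy[-test_idx, ]
test  <- toy[test_idx, ]

# ASSUMPTION: logistic regression, chosen only because it ships with base R.
fit <- glm(is_pulsar ~ feature_a + feature_b, data = train, family = binomial)

# Predict probabilities on the held-out rows and threshold at 0.5.
prob <- predict(fit, newdata = test, type = "response")
pred <- as.integer(prob > 0.5)

# Accuracy = fraction of held-out observations classified correctly.
accuracy <- mean(pred == test$is_pulsar)
print(accuracy)
```

Measuring accuracy on rows the model never saw during training is what makes the figure a fair estimate of how the model would perform on new observations.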

Conclusion

This project provided us with a valuable opportunity to apply our data science skills and collaborate effectively as a team. We successfully analyzed the pulsar dataset, developed a robust classification model, and achieved outstanding results. We wish to express our gratitude to our course instructors for their guidance and support throughout the project and for designing an amazing course.

Note: The full names of our team members, L. L. and X. L., have been redacted in this README for privacy reasons.
