To complete the IBM Data Science Professional Certificate, I built a capstone project that applies what I learned in the earlier courses. In this project, I analyze historical SpaceX rocket launch data to predict the outcome of future launch attempts. The project is split into 8 Jupyter notebooks, each briefly described below.
In this module, I performed the following tasks:
- Used the SpaceX API to collect launch data through GET requests
- Cleaned and parsed the data with pandas and NumPy
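The collection and cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the endpoint URL in the comment, the helper names `fetch_json` and `launches_to_frame`, and the sample records are all assumptions, and the standard library's `urllib` stands in for whichever HTTP client the notebook used.

```python
import json
from urllib.request import urlopen

import pandas as pd


def fetch_json(url, timeout=10):
    """Issue a GET request and decode the JSON body (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def launches_to_frame(records):
    """Flatten a list of launch records and apply basic cleaning."""
    df = pd.json_normalize(records)
    # Parse ISO timestamps into timezone-aware datetimes.
    df["date_utc"] = pd.to_datetime(df["date_utc"], utc=True)
    # Drop launches whose outcome is unknown, then encode the outcome as 0/1.
    df = df.dropna(subset=["success"]).reset_index(drop=True)
    df["success"] = df["success"].astype(int)
    return df


# Made-up records shaped like an API response, so the sketch runs offline.
sample_records = [
    {"flight_number": 1, "date_utc": "2006-03-24T22:30:00.000Z", "success": False},
    {"flight_number": 2, "date_utc": "2007-03-21T01:10:00.000Z", "success": None},
    {"flight_number": 3, "date_utc": "2008-09-28T23:15:00.000Z", "success": True},
]
launches = launches_to_frame(sample_records)

# With network access, the records could instead come from the API, e.g.
# (assumed endpoint): fetch_json("https://api.spacexdata.com/v4/launches/past")
```

Keeping the download and the cleaning in separate functions makes the cleaning logic easy to check without network access.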
In this module, I performed the following tasks:
- Used pandas and NumPy to perform exploratory data analysis (EDA)
- Identified important metrics and patterns in the imported data
In this module, I performed the following tasks:
- Extracted launch data (HTML tables) from Wikipedia
- Parsed the tables and converted them to a pandas DataFrame
- Retrieved the pages with HTTP GET requests
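As a rough sketch of the table-parsing step, the snippet below turns an HTML table into a pandas DataFrame with BeautifulSoup. The helper name `table_to_frame` and the inline HTML are illustrative assumptions; the real Wikipedia tables are larger and messier, and would first be downloaded with a GET request.

```python
import pandas as pd
from bs4 import BeautifulSoup


def table_to_frame(html):
    """Parse the first <table> in an HTML string into a DataFrame (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find("table").find_all("tr")
    # The first row is assumed to hold the column headers.
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]
    data = [
        [td.get_text(strip=True) for td in row.find_all("td")]
        for row in rows[1:]
    ]
    return pd.DataFrame(data, columns=headers)


# Made-up HTML standing in for a downloaded page.
sample_html = """
<table>
  <tr><th>Flight No.</th><th>Launch site</th><th>Outcome</th></tr>
  <tr><td>1</td><td>CCAFS</td><td>Success</td></tr>
  <tr><td>2</td><td>VAFB</td><td>Failure</td></tr>
</table>
"""
launch_table = table_to_frame(sample_html)
```

All scraped values arrive as strings, so numeric columns would still need converting before analysis.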
In this module, I performed the following tasks:
- Used Matplotlib and Seaborn to visualize and compare different features of the dataset
- Performed feature engineering and created dummy variables
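The dummy-variable step could look something like the following. The column names and values here are invented for illustration; the notebook's actual feature set may differ.

```python
import pandas as pd

# Made-up feature table with one numeric and one categorical column.
features = pd.DataFrame(
    {
        "PayloadMass": [500.0, 3000.0, 1500.0],
        "Orbit": ["LEO", "GTO", "LEO"],
    }
)

# One-hot encode the categorical column; dtype=float keeps every column numeric,
# which is convenient for the scaling and modelling steps later on.
encoded = pd.get_dummies(features, columns=["Orbit"], dtype=float)
```

Each category becomes its own 0/1 column (`Orbit_GTO`, `Orbit_LEO`), so classifiers that expect numeric input can use the orbit type.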
In this module, I performed the following tasks:
- Connected to IBM's Db2 database to run SQL queries
- Used SQL to perform further EDA
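To give a flavour of the SQL-based EDA, here is a small self-contained sketch. It uses Python's built-in `sqlite3` as a stand-in for Db2 so that it runs anywhere; the table name, column names, and rows are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the Db2 instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE launches (launch_site TEXT, payload_mass_kg REAL, landing_outcome TEXT)"
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO launches VALUES (?, ?, ?)",
    [
        ("CCAFS SLC-40", 500.0, "Failure"),
        ("CCAFS SLC-40", 1500.0, "Success"),
        ("KSC LC-39A", 3000.0, "Success"),
    ],
)

# Launch count and average payload mass per site.
query = """
SELECT launch_site,
       COUNT(*) AS launches,
       AVG(payload_mass_kg) AS avg_payload_kg
FROM launches
GROUP BY launch_site
ORDER BY launch_site
"""
site_summary = conn.execute(query).fetchall()
conn.close()
```

The same `GROUP BY` query should carry over to Db2 largely unchanged, since it only relies on standard SQL aggregates.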
In this module, I performed the following tasks:
- Used Folium to mark launch sites on a map of the USA
- Marked successful and failed launches on the map
- Grouped markers into clusters using Folium's MarkerCluster
- Calculated distances from each launch site to nearby points of interest
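The distance calculation in the last step can be sketched with the haversine formula, a common choice for great-circle distances between latitude/longitude pairs. Whether the notebook used exactly this formula is an assumption, as are the function name and the Earth radius constant.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# One degree of longitude along the equator, roughly 111 km.
one_degree_km = haversine_km(0.0, 0.0, 0.0, 1.0)
```

The resulting distance can then be shown on the Folium map, for example as a label next to a line drawn between the launch site and the point of interest.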
In this module, I performed the following tasks:
- Used JupyterDash to create a Dash application
- Designed the app layout using HTML
- Used callback functionality within the app
- Ran the app on an external server
In this module, I performed the following tasks:
- Used scikit-learn's preprocessing module to standardize the data
- Split the data into training and test sets
- Applied 4 different classification algorithms to predict future launch outcomes: Logistic Regression, SVM, Decision Tree, and KNN
- Used grid search to find the best hyperparameters for each model, plotted the predictions as a confusion matrix, and identified the best-performing model
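The modelling workflow above might be sketched as follows for one of the four classifiers. Synthetic data from `make_classification` stands in for the launch dataset, and the parameter grid, split size, and random seeds are illustrative assumptions rather than the notebook's settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data standing in for the launch features.
X, y = make_classification(n_samples=90, n_features=8, random_state=2)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2, stratify=y
)

# Standardize inside the pipeline so the scaler is fit on training folds only.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Small illustrative grid over the regularization strength.
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
cm = confusion_matrix(y_test, search.predict(X_test))
```

Repeating the same search with `SVC`, `DecisionTreeClassifier`, and `KNeighborsClassifier`, then comparing test accuracy and confusion matrices, gives a like-for-like way to pick the best model.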
- Python 3
- pandas
- NumPy
- BeautifulSoup
- Matplotlib
- Seaborn
- SQL
- Folium
- Dash
- HTML
- scikit-learn