Predicting Penguins species

ML project: supervised learning

Made by: Koustubh Sinha

This project uses the penguin dataset to predict the species of a penguin from certain features, using an appropriate model.

About penguin dataset

It is a great introductory dataset for data exploration and visualization.

The dataset consists of 7 columns.

- species: penguin species (Chinstrap, Adélie, or Gentoo)
- culmen_length_mm: culmen length (mm)
- culmen_depth_mm: culmen depth (mm)
- flipper_length_mm: flipper length (mm)
- body_mass_g: body mass (g)
- island: island name (Dream, Torgersen, or Biscoe) in the Palmer Archipelago (Antarctica)
- sex: penguin sex

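What are culmen length & depth?

The culmen is the upper ridge of a bird's bill; culmen length and depth are the two bill measurements recorded in the dataset.

What are flippers?

Flippers are a penguin's wings, adapted for swimming.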

The description of the dataset and a detailed exploratory analysis of it are in the predicting_penguins.ipynb notebook. Please go through it.

Getting started

Workflow:

* Step 1: Data cleaning and visualization

The data is cleaned and then visualized through different plots.

* Step 2: Exploratory data analysis and data preprocessing

Understand the data and prepare the dataset to be fitted to a model (label encoding, normalization).

[Screenshots: 2022-12-18 192449, 2022-12-18 191547]
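
To make the two preprocessing ideas named in Step 2 concrete, here is a minimal pure-Python sketch of label encoding and normalization. It is an illustration only: the helper names `label_encode` and `min_max_normalize` are hypothetical, the sample rows are made up, and min-max scaling is an assumption (the notebook may use scikit-learn's `LabelEncoder` and a different scaler).

```python
# Illustrative sketch only; not the notebook's actual preprocessing code.


def label_encode(values):
    """Map each distinct label to an integer, using sorted label order."""
    classes = sorted(set(values))
    mapping = {label: index for index, label in enumerate(classes)}
    return [mapping[v] for v in values], classes


def min_max_normalize(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical rows, not taken from the real dataset.
species = ["Adelie", "Gentoo", "Chinstrap", "Adelie"]
flipper_length_mm = [180.0, 220.0, 200.0, 190.0]

encoded_species, species_classes = label_encode(species)
scaled_flipper = min_max_normalize(flipper_length_mm)

print(species_classes)  # ['Adelie', 'Chinstrap', 'Gentoo']
print(encoded_species)  # [0, 2, 1, 0]
print(scaled_flipper)   # [0.0, 1.0, 0.5, 0.25]
```

Normalization matters for distance-based models such as KNeighborsClassifier, because otherwise a column with large values (body mass in grams) would dominate one with small values (culmen depth in millimetres).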

* Step 3: Model training and selection

[Screenshot: 2022-12-18 191609]

As seen from step 2, KNeighborsClassifier (KNC) is the best fit for our dataset.

About KNeighborsClassifier

  • KNeighborsClassifier (KNC) is one of the simplest machine learning algorithms and is based on the supervised learning technique.
  • The KNC algorithm measures the similarity between a new data point and the available data points, and puts the new point into the category it is most similar to.
  • The KNC algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can easily be classified into a well-suited category.
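
The points above can be sketched from scratch in a few lines. This is a simplified illustration of the nearest-neighbour idea, not scikit-learn's implementation: the function `knn_predict` is hypothetical, the training points are made-up values assumed to be already normalized, and Euclidean distance with `k=3` is an assumption.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every stored training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count how often each label occurs.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, already-normalized (culmen length, flipper length) pairs.
train_points = [(0.10, 0.20), (0.15, 0.25), (0.20, 0.15),
                (0.80, 0.90), (0.85, 0.80), (0.90, 0.95)]
train_labels = ["Adelie", "Adelie", "Adelie",
                "Gentoo", "Gentoo", "Gentoo"]

print(knn_predict(train_points, train_labels, (0.12, 0.22)))  # Adelie
print(knn_predict(train_points, train_labels, (0.88, 0.85)))  # Gentoo
```

Note that there is no separate training step: the "model" is just the stored data, and all the work happens when a new point is classified.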

* Step 4: Testing new data points

[Screenshot: 2022-12-18 193326]
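
As a rough sketch of what testing a new data point can look like with scikit-learn (assumed to be installed), the example below fits a small pipeline and predicts the species of one unseen penguin. The measurements are made-up values, and `StandardScaler` and `n_neighbors=3` are assumptions; the notebook's actual scaler and tuned parameters are not reproduced here.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical training rows:
# [culmen_length_mm, culmen_depth_mm, flipper_length_mm, body_mass_g]
X_train = [
    [39.1, 18.7, 181.0, 3750.0],
    [38.5, 18.0, 185.0, 3600.0],
    [40.2, 19.1, 190.0, 3800.0],
    [46.1, 13.2, 211.0, 4500.0],
    [48.7, 14.1, 215.0, 5100.0],
    [50.0, 15.2, 220.0, 5550.0],
]
y_train = ["Adelie", "Adelie", "Adelie", "Gentoo", "Gentoo", "Gentoo"]

# Scale the features first, then classify by the 3 nearest neighbours.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
model.fit(X_train, y_train)

# A new, unseen penguin (made-up measurements).
new_penguin = [[47.5, 14.0, 214.0, 4900.0]]
prediction = model.predict(new_penguin)
print(prediction[0])
```

Putting the scaler and the classifier in one pipeline means a new data point is scaled with the same statistics as the training data before it is classified.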