EIE4121 Password Strength Classification Project

Project Overview

This is a mini-project for the EIE4121 Machine Learning for Cyber-security course at The Hong Kong Polytechnic University. The goal is to develop a machine learning model to classify password strength on a scale from 0 to 4, where 0 represents very weak passwords and 4 represents very strong passwords.

Team Members

21106181D Chen Chen
JamesHsu-porcupine

Repository Structure

├── data/                      # Password datasets
│   └── password_Set1.csv      # Main dataset file
├── docs/                      # Project documentation and reports
│   ├── MiniProject.doc        # Project document (Word format)
│   └── Miniproject.pdf        # Project document (PDF format)
├── model/MachineLearning/     # Traditional ML models
│   ├── KNN                    # K-Nearest Neighbors model
│   └── RF                     # Random Forest model
├── notebooks/DeepLearning/    # Deep learning implementation notebooks
│   ├── GP19project_EIE4121_DEEPLEA...  # Deep learning implementation
│   └── GP19project_EIE4121_EDA_Che...  # EDA notebook
├── .gitattributes             # Git attributes file
└── README.md                  # This file

Project Description

This project focuses on developing and comparing different machine learning approaches for password strength classification. We implement both traditional machine learning algorithms (KNN, Random Forest) and deep learning models to classify passwords into five strength categories.

Dataset

The dataset (password_Set1.csv) contains password samples with the following columns:

password: The password string
strength: Password strength level (0-4)
- 0: Very Weak
- 1: Weak
- 2: Average
- 3: Strong
- 4: Very Strong
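
Before any modelling, it can help to check how many samples fall into each strength level. The following is a minimal sketch using only the Python standard library. It assumes the CSV has a header row with `password` and `strength` columns as described above; the helper name `load_label_counts` and the sample rows are illustrative and not taken from the repository.

```python
import csv
import io
from collections import Counter

STRENGTH_NAMES = {0: "Very Weak", 1: "Weak", 2: "Average", 3: "Strong", 4: "Very Strong"}

def load_label_counts(handle):
    """Count rows per strength level from an open CSV file handle.

    Assumes a header row with 'password' and 'strength' columns.
    Rows whose strength is not an integer in 0-4 are skipped.
    """
    counts = Counter()
    for row in csv.DictReader(handle):
        try:
            level = int(row["strength"])
        except (KeyError, TypeError, ValueError):
            continue
        if level in STRENGTH_NAMES:
            counts[level] += 1
    return counts

# Illustrative rows only. For the real file, pass something like
# open("data/password_Set1.csv", newline="", encoding="utf-8") instead.
sample = io.StringIO("password,strength\n123456,0\nqwerty12,1\nTr0ub4dor&3xY!,4\nabc,0\n")
counts = load_label_counts(sample)
for level in sorted(STRENGTH_NAMES):
    print(f"{level} ({STRENGTH_NAMES[level]}): {counts.get(level, 0)}")
```

A strongly skewed distribution at this step is the usual motivation for class weighting later in training.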

Methodology

Our approach involves:

Exploratory Data Analysis (EDA) to understand password characteristics
Feature Engineering to extract meaningful features from passwords:
- Length, character diversity, entropy
- Character type counts and ratios
- Pattern detection (sequential and repeated characters)
Model Implementation:
- Traditional ML: K-Nearest Neighbors, Random Forest
- Deep Learning: Hybrid CNN-LSTM model with character embeddings
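
The feature engineering step above can be sketched roughly as follows. This is an illustration under assumptions, not the notebooks' actual code: the function names, the choice of Shannon entropy over the password's own character distribution, and the definition of a "sequential" run as consecutive ascending code points are all hypothetical choices made for this example.

```python
import math
from collections import Counter

def shannon_entropy(password):
    """Shannon entropy (bits per character) of the password's character distribution."""
    if not password:
        return 0.0
    n = len(password)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(password).values())

def longest_repeat_run(password):
    """Length of the longest run of one repeated character, e.g. 'aaab' gives 3."""
    best = run = 0
    prev = None
    for ch in password:
        run = run + 1 if ch == prev else 1
        best = max(best, run)
        prev = ch
    return best

def longest_sequential_run(password):
    """Length of the longest run of consecutive ascending code points, e.g. 'abc' or '123'."""
    best = run = 0
    prev = None
    for ch in password:
        run = run + 1 if prev is not None and ord(ch) - ord(prev) == 1 else 1
        best = max(best, run)
        prev = ch
    return best

def extract_features(password):
    """Return a dict of numerical features for one password (assumed feature set)."""
    n = len(password)
    lower = sum(ch.islower() for ch in password)
    upper = sum(ch.isupper() for ch in password)
    digit = sum(ch.isdigit() for ch in password)
    other = n - lower - upper - digit  # symbols, spaces, caseless letters
    denom = n or 1  # avoid division by zero for the empty string
    return {
        "length": n,
        "unique_chars": len(set(password)),
        "entropy": shannon_entropy(password),
        "lower_count": lower,
        "upper_count": upper,
        "digit_count": digit,
        "other_count": other,
        "lower_ratio": lower / denom,
        "upper_ratio": upper / denom,
        "digit_ratio": digit / denom,
        "other_ratio": other / denom,
        "longest_repeat": longest_repeat_run(password),
        "longest_sequence": longest_sequential_run(password),
    }

features = extract_features("aaa123")
print(features)
```

Features of this kind can feed the KNN and Random Forest models directly, and can also serve as the numerical side input to the deep learning model.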

Deep Learning Model

Our deep learning approach combines:

Character-level embeddings to learn a dense vector representation of each character
CNN layers to detect local patterns
LSTM layers to understand sequential patterns
Numerical features to incorporate password characteristics
Class weighting to handle imbalanced data
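
Two of the ingredients above, the integer encoding that feeds a character embedding layer and the class weights, can be illustrated without any deep learning framework. The sketch below is an assumption-laden example rather than the notebook's implementation: the reserved indices, the helper names, and the "balanced" weighting formula (n_samples / (n_classes * class_count), the same heuristic scikit-learn uses for `class_weight="balanced"`) are choices made here for illustration.

```python
from collections import Counter

PAD, UNK = 0, 1  # reserved indices: padding and unknown character (assumed convention)

def build_vocab(passwords):
    """Map each character seen in the training passwords to an index >= 2."""
    chars = sorted({ch for pw in passwords for ch in pw})
    return {ch: i + 2 for i, ch in enumerate(chars)}

def encode(password, vocab, max_len):
    """Convert a password to a fixed-length index list: truncate, then right-pad."""
    ids = [vocab.get(ch, UNK) for ch in password[:max_len]]
    return ids + [PAD] * (max_len - len(ids))

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

vocab = build_vocab(["ab", "ba1"])
encoded = encode("abz", vocab, max_len=5)   # 'z' was never seen, so it maps to UNK
weights = balanced_class_weights([0, 0, 0, 1])
print(vocab, encoded, weights)
```

The encoded sequences would be the input to the embedding layer, and a dictionary like `weights` is the form that Keras accepts through the `class_weight` argument of `Model.fit`, so that rare strength levels contribute more to the loss.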

Results

Each model is evaluated using the following metrics:

Accuracy
Precision, Recall, F1-score
Confusion matrix
Per-class performance
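
To make the relationship between these metrics concrete, the sketch below derives accuracy and per-class precision, recall, and F1 from a confusion matrix in plain Python. The labels and predictions are made up for illustration and are not results from this project; in practice scikit-learn's `confusion_matrix` and `classification_report` provide the same quantities.

```python
def confusion_matrix(y_true, y_pred, n_classes=5):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal."""
    total = sum(map(sum, matrix))
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

def per_class_report(matrix):
    """Precision, recall, and F1 for each class, with 0.0 where undefined."""
    n = len(matrix)
    report = {}
    for c in range(n):
        tp = matrix[c][c]
        predicted = sum(matrix[r][c] for r in range(n))  # column sum
        actual = sum(matrix[c])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return report

# Made-up labels for illustration only.
y_true = [0, 0, 1, 1, 2, 3, 4, 4]
y_pred = [0, 1, 1, 1, 2, 3, 4, 3]
cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
report = per_class_report(cm)
print(acc)
for cls, scores in report.items():
    print(cls, scores)
```

Per-class scores matter here because, with imbalanced strength levels, a high overall accuracy can hide poor recall on the rarer classes.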

Usage

To use the notebooks:

Clone the repository

Install required dependencies:

pip install pandas numpy tensorflow scikit-learn matplotlib seaborn

Run the notebooks in the following order:
- EDA notebook
- Deep learning implementation

Future Work

Implement ensemble methods
Explore additional feature engineering techniques
Optimize hyperparameters
Develop a user-friendly password strength checker

References

Course materials from EIE4121
Relevant research papers on password strength classification
Documentation for scikit-learn, TensorFlow, and other libraries used

KrisameReimu/Password-Strength-Classification_ex
