priyankaz-ml/toxicity-prediction-gnn

🧪 Molecular Toxicity Prediction with Graph Neural Networks (GNNs)

Predicting molecular toxicity is crucial for drug discovery, chemical safety, and environmental research. This project explores both classical ML models and Graph Neural Networks (GNNs) on molecular graph data, comparing their performance on toxicity prediction.

🚀 Features

Dataset: Tox21 – benchmark dataset for molecular toxicity prediction

Models Implemented:

Random Forest – robust baseline

XGBoost – gradient boosting model

Graph Isomorphism Network (GIN) – deep learning model for graph data

Evaluation Metrics: ROC-AUC for classification tasks
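To make the GIN entry in the model list above concrete, here is a minimal, framework-free sketch of one GIN layer followed by a sum readout. It is not the repository's implementation: the function names, the toy molecule, and the identity weights are all hypothetical, and biases and training are left out. In a PyTorch Geometric project the same update would normally come from `torch_geometric.nn.GINConv`.

```python
# Framework-free sketch of one GIN layer; illustrative only, not the repo's code.

def matvec(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def relu(vec):
    return [max(0.0, x) for x in vec]

def gin_layer(features, neighbors, w1, w2, eps=0.0):
    """Apply h_v <- MLP((1 + eps) * h_v + sum of neighbor features) to every node.

    features:  list of per-atom feature vectors
    neighbors: neighbors[v] lists the atoms bonded to atom v
    w1, w2:    weight matrices of a two-layer MLP (biases omitted for brevity)
    """
    updated = []
    for v, h_v in enumerate(features):
        # GIN aggregates neighbors by summation rather than mean or max.
        agg = [(1.0 + eps) * x for x in h_v]
        for u in neighbors[v]:
            agg = [a + x for a, x in zip(agg, features[u])]
        updated.append(matvec(w2, relu(matvec(w1, agg))))
    return updated

def sum_readout(features):
    """Pool node features into one graph-level vector by summation."""
    return [sum(column) for column in zip(*features)]

# Toy 3-atom chain 0-1-2 with hypothetical 2-dimensional atom features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
neighbors = [[1], [0, 2], [1]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # identity weights keep the arithmetic easy to follow

hidden = gin_layer(features, neighbors, identity, identity)
graph_embedding = sum_readout(hidden)
```

The graph-level vector would then feed a classifier head with one output per toxicity task; stacking several such layers lets each atom see a larger neighborhood.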

📊 Model Performance

| Model | ROC-AUC | Notes |
|---|---|---|
| Random Forest | 0.84 | Best baseline |
| XGBoost | 0.79 | Strong classical model |
| GIN (GNN) | 0.819 | Per-task ROC-AUC: 0.70–0.87 |
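Because the GIN entry reports a per-task range, the headline numbers are presumably aggregated over tasks. Tox21 is a multi-task dataset in which not every molecule is labeled for every assay, so a common approach is to score each task on its labeled molecules only and then average. The sketch below assumes that approach; the helper names and toy data are hypothetical, and whether the repository averages this way is an assumption. In practice `sklearn.metrics.roc_auc_score` would replace the hand-written `roc_auc`.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the chance a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when only one class is present
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def per_task_auc(label_rows, score_rows):
    """Score each task (column) separately, skipping molecules whose label is None."""
    n_tasks = len(label_rows[0])
    aucs = []
    for t in range(n_tasks):
        pairs = [(ys[t], ss[t]) for ys, ss in zip(label_rows, score_rows) if ys[t] is not None]
        aucs.append(roc_auc([y for y, _ in pairs], [s for _, s in pairs]))
    return aucs

def mean_auc(aucs):
    """Average over the tasks where ROC-AUC is defined."""
    valid = [a for a in aucs if a is not None]
    return sum(valid) / len(valid)

# Toy data: 4 molecules x 2 tasks; None marks a missing label.
toy_labels = [[1, 0], [0, None], [1, 1], [0, 0]]
toy_scores = [[0.9, 0.2], [0.3, 0.5], [0.4, 0.7], [0.6, 0.1]]

task_aucs = per_task_auc(toy_labels, toy_scores)
overall = mean_auc(task_aucs)
```

Reporting both the mean and the per-task spread, as the table does for GIN, shows whether a model is uniformly good or carried by a few easy tasks.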

Key Insights:

Random Forest (0.84) still outperforms the GIN (0.819) on this dataset, while XGBoost (0.79) trails both, so the classical advantage is specific to Random Forest.

GNNs are competitive and scale well for graph-structured data.

Visualizations (training curves, boxplots) reveal per-task variability and model stability.

🛠 Tech Stack

Python 🐍

PyTorch Geometric (GNNs)

Scikit-learn (Random Forest) and XGBoost (gradient boosting; a separate library from scikit-learn)

Pandas & NumPy

Matplotlib

📈 Visuals

Figure: Per-task ROC-AUC comparison

Figure: GNN training curve

Figure: Boxplot of per-task AUCs

🔮 Future Work

Explore other GNNs like GCN, GraphSAGE, GAT for better performance.

Apply hyperparameter tuning and model ensembling.

Add explainability to understand key molecular features.

Expand datasets and try data augmentation for improved generalization.

Build a web app/API for real-time toxicity prediction.
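The model ensembling item in the list above could start from something as simple as averaging predicted probabilities across models. This is a sketch of that idea only: the function name, the optional weights, and the toy probabilities are hypothetical and do not come from the repository's results.

```python
def average_ensemble(prob_sets, weights=None):
    """Combine per-model probability lists into one list by (weighted) averaging.

    prob_sets: one list of predicted probabilities per model, all the same length
    weights:   optional per-model weights; defaults to equal weighting
    """
    if weights is None:
        weights = [1.0] * len(prob_sets)
    total = sum(weights)
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_sets)) / total
        for i in range(len(prob_sets[0]))
    ]

# Hypothetical probabilities for two molecules from two models.
rf_probs = [0.75, 0.25]
gin_probs = [0.25, 0.5]

blended = average_ensemble([rf_probs, gin_probs])
rf_heavy = average_ensemble([rf_probs, gin_probs], weights=[3.0, 1.0])
```

Averaging tends to help most when the models make different kinds of errors, which is plausible for a fingerprint-style baseline and a graph model, though that would need to be checked on held-out data.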

📝 Author

Priyanka
