GitHub - bzzling/colon-cancer-classification

Colon Cancer Classification using Neural Networks

This project aims to classify images in the given dataset as colon adenocarcinoma or benign colon tissue, making use of various neural network architectures and the PyTorch library.

Overview:

Currently, diagnosing colon cancer involves a relatively simple procedure called a colonoscopy. While physicians can generally extract potentially cancerous polyps, they cannot tell by visual inspection alone whether the polyps are cancerous; the polyps must be sent to a lab for further analysis. Errors can occur in the lab when determining whether polyps are cancerous, since certain polyps may be mistakenly overlooked. Here, machine learning holds promise for supporting physicians' screening efforts, helping to prevent colorectal cancer cases and improve early detection. By training a neural network on existing lab samples of colon tissue, we can hope to accurately identify polyp samples (which are simply extensions of tissue) when they reach the lab. The graphic below shows 16 labelled image samples that were used to train the model. The labels "colon-aca" and "colon-n" denote colon adenocarcinoma and benign colon tissue, respectively.

Results:

Several model architectures were tested, each reaching a different accuracy on the test data:
- CNN: 55.6%
- Fine-tuning VGG-16: N/A
- Fine-tuning ResNet: N/A
- VGG-16 + Random Forest: 99.0%
- ResNet + Random Forest: 99.8%
Best hyperparameters overall:
- SGD optimizer
- cross-entropy loss
- learning rate (lr) from 0.0004 to 0.004
- batch size of 32
- 30 epochs
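As a rough illustration of the settings listed above, the sketch below wires an SGD optimizer and a cross-entropy loss into a single training epoch in PyTorch. It is an assumption-laden example rather than the project's training script: the helper names are hypothetical, and the default `lr=0.004` is simply the upper end of the stated range (the README does not say whether the range describes a search or a schedule).

```python
import torch
from torch import nn


def make_training_objects(model, lr=0.004):
    """Create the loss and optimizer named in the hyperparameter list."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    return criterion, optimizer


def train_one_epoch(model, loader, criterion, optimizer):
    """Run one pass over the loader and return the mean per-sample loss."""
    model.train()
    total_loss, n_samples = 0.0, 0
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        # Weight each batch loss by its size so the mean is per sample.
        total_loss += loss.item() * images.size(0)
        n_samples += images.size(0)
    return total_loss / n_samples
```

A full run under the listed settings would call `train_one_epoch` 30 times with a `DataLoader` built using `batch_size=32`.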

Data & Data Processing:

10 000 images of colon adenocarcinoma and benign colon tissue
- training data / validation data split = 80/20
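One way to obtain an 80/20 split like the one above is `torch.utils.data.random_split`. This is a sketch under assumptions: the helper name `split_train_val` and the fixed seed are illustrative, and the README does not state how the split was actually produced.

```python
import torch
from torch.utils.data import random_split


def split_train_val(dataset, train_fraction=0.8, seed=0):
    """Randomly split a dataset into training and validation subsets."""
    n_train = int(round(train_fraction * len(dataset)))
    n_val = len(dataset) - n_train
    # A seeded generator makes the split reproducible across runs.
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [n_train, n_val], generator=generator)
```

For a dataset of 10 000 images this yields 8 000 training and 2 000 validation samples.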
Various transforms were applied to the image data for better generalization:
- resize to 180 x 180 pixels for the CNN or 224 x 224 pixels for VGG-16/ResNet
- random horizontal flip for the CNN
- random rotation by 10 degrees for the CNN
- transform to tensor
- normalize tensor data

Try it:

Refer to the report and code for further details and analysis.

Try the most accurate model (ResNet feature extraction + random forest) by uploading your own samples here, as demonstrated in the image below.

Acknowledgments:

Borkowski AA, Bui MM, Thomas LB, Wilson CP, DeLand LA, Mastorides SM. Lung and Colon Cancer Histopathological Image Dataset (LC25000). arXiv:1912.12142v1 [eess.IV], 2019

