cmishra/Text-Clustering

Text-Clustering

An implementation of clustering Yelp reviews with Naive Bayes and KNN. Requires Java 8, since Java Streams are used for multithreading.

The code covers the entire process: reading in files, stemming, eliminating stop words, and storing the results as sparse matrices or language models, then producing clusters with Naive Bayes and random-projection KNN.
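As a rough illustration of the preprocessing steps, here is a minimal sketch that turns raw review text into sparse term-count maps using a parallel stream. It is not the repository's code: the class `ReviewPreprocessor`, its methods, the tiny stop-word list, and the `crudeStem` placeholder (which only strips a trailing "s") are all hypothetical stand-ins, and a real pipeline would likely use a proper stemmer such as Porter's.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the repository.
class ReviewPreprocessor {
    // Tiny illustrative stop-word list (assumption); a real list is much longer.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "an", "and", "was", "is", "it"));

    // Placeholder stemmer (assumption): strips one trailing "s" from words
    // longer than three letters. This is NOT the Porter algorithm.
    static String crudeStem(String token) {
        return token.length() > 3 && token.endsWith("s")
                ? token.substring(0, token.length() - 1)
                : token;
    }

    // Lowercase, split on non-letters, drop stop words, stem, and count terms.
    // Only terms that occur are stored, so the map acts as a sparse vector.
    static Map<String, Integer> termCounts(String review) {
        return Arrays.stream(review.toLowerCase(Locale.ROOT).split("[^a-z]+"))
                .filter(t -> !t.isEmpty())
                .filter(t -> !STOP_WORDS.contains(t))
                .map(ReviewPreprocessor::crudeStem)
                .collect(Collectors.toMap(t -> t, t -> 1, Integer::sum, TreeMap::new));
    }

    // Process reviews in parallel; the result list keeps the input order.
    static List<Map<String, Integer>> vectorize(List<String> reviews) {
        return reviews.parallelStream()
                .map(ReviewPreprocessor::termCounts)
                .collect(Collectors.toList());
    }
}
```

Calling `ReviewPreprocessor.vectorize(reviews)` on a list of review strings yields one term-count map per review, which could then feed either a Naive Bayes language model or a KNN distance computation.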

The models were compared using cross-validation, which is also implemented here.
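For readers unfamiliar with the idea, k-fold cross-validation can be sketched as follows. This is a generic illustration under stated assumptions, not the repository's implementation: the class `CrossValidation`, the fixed-seed shuffle, and the `scorer` callback (which is assumed to train on the first index list and return a score on the second) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.BiFunction;

// Hypothetical helper, not part of the repository.
class CrossValidation {
    // Shuffle the indices 0..n-1 with a fixed seed and deal them into k folds.
    static List<List<Integer>> folds(int n, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) indices.add(i);
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) folds.add(new ArrayList<>());
        for (int i = 0; i < n; i++) folds.get(i % k).add(indices.get(i));
        return folds;
    }

    // Hold out each fold once as the test set, train on the rest,
    // and return the mean of the k scores reported by the scorer.
    static double crossValidate(int n, int k, long seed,
            BiFunction<List<Integer>, List<Integer>, Double> scorer) {
        List<List<Integer>> folds = folds(n, k, seed);
        double total = 0.0;
        for (int f = 0; f < k; f++) {
            List<Integer> test = folds.get(f);
            List<Integer> train = new ArrayList<>();
            for (int g = 0; g < k; g++) {
                if (g != f) train.addAll(folds.get(g));
            }
            total += scorer.apply(train, test);
        }
        return total / k;
    }
}
```

Running `crossValidate` once per model with the same `n`, `k`, and `seed` gives each model identical train/test splits, so the averaged scores are directly comparable.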
