CategorizeAI: Multi-Model NLP Text Classification Platform

A versatile NLP text classification application that integrates supervised learning, zero-shot classification, and LLM-powered classification with reasoning in a single, intuitive interface. Designed for precision and adaptability, it supports text categorization across diverse domains and use cases, along with dynamic experimentation and testing.

🌟 Key Features

Multi-Model Architecture: Choose from three distinct classification approaches based on your specific domain, technical needs and data availability.
Supervised Classification: Complete pipeline from experiment creation and analysis to model training and prediction.
Zero-Shot Classification: Classify text into any categories without specific training using state-of-the-art transformer models.
LLM-Powered Classification: Utilize local LLMs through Ollama for context-aware, reasoning-based classification.
Experiment Management: Save, load, and track different classification experiments with customizable settings for reproducible results.
Responsive Design: Modern, responsive UI with animated components and intuitive navigation.

Streamlit Demo APP

👏 Acknowledgments

📋 Prerequisites

Python 3.10 or higher
Streamlit
Transformers
For LLM Classification: Ollama API (latest version)
8GB+ RAM (16GB+ recommended for LLM classification)
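
A quick way to check the software prerequisites above from Python is sketched below. It only verifies the Python version and whether the listed packages are importable; the function name `check_environment` is hypothetical and not part of this repository, and RAM and Ollama are not checked.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # taken from the prerequisites list
REQUIRED_PACKAGES = ("streamlit", "transformers")


def check_environment(version_info=None, packages=REQUIRED_PACKAGES):
    """Return a dict mapping each requirement to True (met) or False."""
    version_info = sys.version_info if version_info is None else version_info
    report = {"python>=3.10": tuple(version_info[:2]) >= MIN_PYTHON}
    for name in packages:
        # find_spec returns None when a top-level package cannot be imported
        report[name] = importlib.util.find_spec(name) is not None
    return report


for requirement, ok in check_environment().items():
    print(f"{'OK     ' if ok else 'MISSING'}  {requirement}")
```

Running it before `pip install -r requirements.txt` should simply report the packages as missing.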

⚙️ Installation

Clone the Repository

git clone https://github.com/TsLu1s/categorizeai.git
cd categorizeai

Set Up Conda Environment

First, ensure you have Conda installed. Then create and activate a new environment with Python 3.10:

# Create new environment
conda create -n categorizeai_env python=3.10

# Activate the environment
conda activate categorizeai_env

Install Dependencies

pip install -r requirements.txt

Install Ollama

Visit Ollama API and follow the installation instructions for your operating system.

Start the Application

streamlit run navigation.py

💻 Usage & Architecture

🏠 Home Page

Overview of all classification approaches, use cases, and documentation resources. Designed for easy navigation and quick understanding of application capabilities.

Homepage dashboard showcasing the three core classification methodologies
Explore detailed technical specifications and compare capabilities across approaches
Review use case scenarios with recommended external sample data sources

🎯 Supervised Classification

Train specialized models with your labeled data. Ideal for domain-specific classifications, efficiency requirements, and data with very consistent patterns.

Experiments Analysis
- Upload and configure your experiments with specific context and labeled classes
- Generate comprehensive distributional and statistical analysis
- Visualize lexical distributions, word frequency patterns, and semantic clustering
Model Training
- Select experiment datasets and configure train/test split parameters
- Process text data with automated preprocessing mechanisms
- Visualize performance metrics including classification reports and word feature importance
Real-Time Prediction
- Select from trained models across different experiments
- Enter text directly for immediate classification analysis
- Visualize prediction probability distribution across all potential categories
Batch Prediction
- Process and classify multiple text files simultaneously using trained classification models
- Visualize prediction distributions and confidence levels across files
- Export detailed prediction results with probability scores for each category
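
To make the train, split, and predict steps above concrete, here is a minimal sketch. The README does not say which algorithm or library the app trains with, so this uses a small multinomial Naive Bayes written in pure Python as a stand-in; the class name, the toy data, and the labels are all assumptions for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real preprocessing."""
    return text.lower().split()


def train_test_split(rows, test_size=0.25, seed=42):
    """Shuffle (text, label) rows and split them into train and test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_size))
    return rows[n_test:], rows[:n_test]


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows):
        self.class_counts = Counter(label for _, label in rows)
        self.word_counts = defaultdict(Counter)
        for text, label in rows:
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict_proba(self, text):
        """Return {label: probability} for a single text."""
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            log_scores[label] = score
        # Normalize log scores, shifting by the max for numerical stability
        top = max(log_scores.values())
        exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
        norm = sum(exp_scores.values())
        return {k: v / norm for k, v in exp_scores.items()}

    def predict(self, text):
        probs = self.predict_proba(text)
        return max(probs, key=probs.get)


# Toy labeled data (hypothetical)
data = [
    ("refund my order please", "billing"),
    ("charged twice on my card", "billing"),
    ("invoice amount is wrong", "billing"),
    ("app crashes on startup", "technical"),
    ("cannot log in to the app", "technical"),
    ("error message when uploading", "technical"),
]

train_rows, test_rows = train_test_split(data, test_size=0.33)
model = NaiveBayesTextClassifier().fit(train_rows)
correct = sum(model.predict(text) == label for text, label in test_rows)
print(f"held-out accuracy: {correct}/{len(test_rows)}")
print(model.predict_proba("my card was charged twice"))
```

The dictionary returned by `predict_proba` corresponds to the per-category probability distribution the prediction pages visualize; a real model would need far more data than this toy set.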

🔮 Zero-Shot Classification

Classify text without training using pre-trained transformer models. Well suited to dynamic categorization needs, scenarios with little or no labeled data, and cases that call for a fast, effective implementation.

Batch Prediction
- Select from state-of-the-art transformer-based models optimized for zero-shot classification
- Configure custom classification labels and hypothesis templates with domain-specific contexts
- Visualize prediction distributions and confidence levels across files
- Export comprehensive results with classification metadata
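
The batch flow above can be sketched with the Hugging Face Transformers `zero-shot-classification` pipeline. The README does not name the models the app offers, so `facebook/bart-large-mnli` is an assumption here, as are the function name `zero_shot_batch` and the default hypothesis template.

```python
def zero_shot_batch(texts, labels, hypothesis_template="This text is about {}.",
                    classifier=None, model_name="facebook/bart-large-mnli"):
    """Classify each text against `labels`; return one result dict per text."""
    if classifier is None:
        # Imported lazily so the model is only loaded when actually needed.
        from transformers import pipeline
        classifier = pipeline("zero-shot-classification", model=model_name)

    outputs = classifier(list(texts), candidate_labels=list(labels),
                         hypothesis_template=hypothesis_template)
    if isinstance(outputs, dict):  # a single input may come back as one dict
        outputs = [outputs]

    results = []
    for out in outputs:
        results.append({
            "text": out["sequence"],
            "predicted": out["labels"][0],  # labels are sorted by score
            "scores": dict(zip(out["labels"], out["scores"])),
        })
    return results
```

Calling `zero_shot_batch(texts, labels)` without a `classifier` downloads and loads the model on first use. The `{}` in the hypothesis template is where each candidate label is substituted, which is how domain context can be injected.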

🤖 LLM Text Classification

Leverage any of the Ollama LLMs for advanced classification tasks. Best suited for complex classification contexts, nuanced text understanding, and reasoning-based categorization.

LLMs Management
- Browse and download Large Language Models through the Ollama API
- Manage installed models through an intuitive interface with detailed technical information
- Install custom models by name with real-time download tracking
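
For reference, listing and downloading models can be done against Ollama's local REST API roughly as follows. This is a sketch that assumes a default local Ollama server at `http://localhost:11434`; the helper names are hypothetical and not taken from this repository.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default local address


def list_models(base_url=OLLAMA_URL):
    """Return the names of locally installed models via GET /api/tags."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return [m["name"] for m in json.load(resp).get("models", [])]


def parse_progress(line):
    """Turn one streamed /api/pull JSON line into (status, percent or None)."""
    event = json.loads(line)
    total, done = event.get("total"), event.get("completed")
    percent = round(100 * done / total, 1) if total and done is not None else None
    return event.get("status", ""), percent


def pull_model(name, base_url=OLLAMA_URL):
    """Download a model via POST /api/pull, yielding progress as it streams."""
    body = json.dumps({"model": name}).encode("utf-8")
    req = urllib.request.Request(f"{base_url}/api/pull", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        for raw in resp:  # the response is newline-delimited JSON
            if raw.strip():
                yield parse_progress(raw)
```

Iterating over `pull_model("some-model")` yields `(status, percent)` pairs that a UI could feed into a progress bar.
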
Batch Prediction
- Select a downloaded LLM to process multiple text files simultaneously
- Create detailed context prompts with domain-specific instructions for classification purposes
- Visualize prediction distributions and confidence levels across files
- Export structured results with probabilities for each category
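
A minimal sketch of prompt-based classification through Ollama's `/api/generate` endpoint is shown below. The prompt wording, the JSON reply shape, and the function names are assumptions for illustration, not the app's actual prompts; the sketch returns a label and a short rationale and does not attempt per-category probabilities.

```python
import json
import urllib.request


def build_prompt(text, labels, context=""):
    """Compose a classification prompt that asks for a JSON answer."""
    return (
        f"{context}\n"
        f"Classify the text into exactly one of: {', '.join(labels)}.\n"
        'Reply as JSON: {"label": "<category>", "reasoning": "<one sentence>"}\n\n'
        f"Text: {text}"
    )


def parse_answer(raw, labels):
    """Return (label, reasoning); label is None if the reply is unusable."""
    try:
        answer = json.loads(raw)
    except json.JSONDecodeError:
        return None, ""
    if not isinstance(answer, dict):
        return None, ""
    label = answer.get("label")
    return (label if label in labels else None), answer.get("reasoning", "")


def classify_with_ollama(text, labels, model, context="",
                         base_url="http://localhost:11434"):
    """Send one non-streaming request to a local Ollama server."""
    payload = {
        "model": model,
        "prompt": build_prompt(text, labels, context),
        "stream": False,   # return a single JSON object
        "format": "json",  # ask the model to emit valid JSON
    }
    req = urllib.request.Request(f"{base_url}/api/generate",
                                 data=json.dumps(payload).encode("utf-8"),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        raw = json.load(resp)["response"]
    return parse_answer(raw, labels)
```

Validating the returned label against the allowed list guards against the model inventing a category; batch prediction would call `classify_with_ollama` once per file.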

🤝 Contributing

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

Distributed under the MIT License. See LICENSE for more information.

🔗 Contact

Luis Santos - LinkedIn
