This is a Streamlit app that classifies Reddit users' comments into categories. A BERT transformer model was trained on the comment data, and the resulting classifier is what the app uses.

Jess607/Reddit-usernames-comments

Reddit Comment Classification App

This project is a Streamlit web application that classifies Reddit comments into three categories: "Veterinarian", "Medical Doctor", or "Others". The classification is performed using a fine-tuned BERT model.

Features

  • Upload a CSV file containing a column named 'comments'
  • Preprocess comments by removing duplicates and preserving order
  • Classify each comment into one of three categories
  • Download the classified results as a CSV file
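
The preprocessing step (dropping duplicate comments while keeping the original order) could look roughly like the sketch below. It uses only the standard library so that it runs anywhere; the function name `load_unique_comments` and the sample data are hypothetical, and the app itself may well do this differently (for example with pandas).

```python
import csv
import io


def load_unique_comments(csv_text, column="comments"):
    """Return the values of `column`, duplicates dropped, first occurrence kept."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError(f"CSV must contain a column named '{column}'")
    # dict.fromkeys preserves insertion order, so the first copy of each
    # comment survives and the original ordering is kept.
    return list(dict.fromkeys(row[column] for row in reader))


# Hypothetical sample input: the first comment appears twice.
sample = "comments\nMy cat is limping\nI am a GP\nMy cat is limping\n"
unique = load_unique_comments(sample)
```

Here `unique` holds two comments, in the order they first appeared in the file.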

Setup Instructions

Prerequisites

  • Python 3

Clone the Repository

  1. Open your terminal or command prompt.

  2. Clone the repository using the following command:

    git clone <your-repo-url>
  3. Navigate to the project directory:

    cd <your-repo-name>

Set Up the Virtual Environment

  1. Create a virtual environment:

    python3 -m venv myenv
  2. Activate the virtual environment:

    • On macOS/Linux:

      source myenv/bin/activate
    • On Windows:

      myenv\Scripts\activate
  3. Install the required packages:

    pip install -r requirements.txt

Run the Streamlit App

  1. Ensure the virtual environment is activated.

  2. Run the Streamlit app:

    streamlit run app.py
  3. Open your web browser and navigate to the URL provided by Streamlit (usually http://localhost:8501).

Usage

  1. Upload a CSV file containing a column named 'comments'.
  2. The app will preprocess and classify each comment.
  3. View the classified results in the app.
  4. Download the classified results as a CSV file.
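
The flow above, from a list of comments to a downloadable CSV, can be sketched as follows. This is an illustration under stated assumptions, not the app's actual code: `placeholder_classifier` is a keyword rule standing in for the fine-tuned BERT model so the example runs without any model files, and the output column name `label` is made up for the example.

```python
import csv
import io

# The three categories named in this README.
LABELS = ("Veterinarian", "Medical Doctor", "Others")


def placeholder_classifier(comment):
    """Keyword stand-in for the BERT model (assumption, for illustration only)."""
    text = comment.lower()
    if "veterinarian" in text:
        return "Veterinarian"
    if "doctor" in text or "physician" in text:
        return "Medical Doctor"
    return "Others"


def classify_to_csv(comments, classify=placeholder_classifier):
    """Classify each comment and return the results as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["comments", "label"])  # 'label' is a hypothetical column name
    for comment in comments:
        label = classify(comment)
        if label not in LABELS:
            raise ValueError(f"Unexpected label: {label!r}")
        writer.writerow([comment, label])
    return out.getvalue()


result_csv = classify_to_csv(
    ["I work as a veterinarian", "I am a doctor in the ER", "Nice weather"]
)
```

Passing the classifier in as a function keeps the CSV handling independent of the model, so the real BERT-based predictor can be swapped in without touching the rest.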

Error Handling

  1. If the Streamlit app throws a 'Cannot import BertTokenizer' error, refresh the page to reload the app.
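
This README does not say what causes the import error. If it is a transient failure, one possible mitigation is to retry the import a small number of times before giving up. The helper below is a hypothetical sketch of that idea; `import_with_retry` and its parameters are not part of the app.

```python
import importlib
import time


def import_with_retry(module_name, attr, attempts=2, delay=0.5):
    """Try to import `attr` from `module_name`; return None if every attempt fails."""
    for attempt in range(attempts):
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError):
            # Wait briefly before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(delay)
    return None
```

In the app this might be called as `import_with_retry("transformers", "BertTokenizer")`, showing a message on the page when the result is `None` instead of letting the exception surface.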
