A web application to upload PDFs, embed their content into a Qdrant vector store, and interactively ask questions that a local LLM (Ollama) answers from the PDF content.

## Features
- Upload PDF documents and extract text chunks
- Generate embeddings with a pre-trained embedder
- Store embeddings in a Qdrant vector database per document (see the sketch after this list)
- Query vector store to retrieve relevant chunks
- Context-aware question answering via a local llama3 model through the Ollama API
- React frontend with live chat interface
- FastAPI backend serving API endpoints with CORS support
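The indexing half of this pipeline is small. Here is a minimal sketch, assuming `sentence-transformers` with the common `all-MiniLM-L6-v2` model and one Qdrant collection per document; the repo's actual model choice, collection naming, and payload layout may differ:

```python
# Hypothetical indexing sketch: embed text chunks and upsert them into a
# per-document Qdrant collection. Model and payload layout are assumptions.
import uuid

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim sentence embeddings
client = QdrantClient(host="localhost", port=6333)

def index_chunks(doc_id: str, chunks: list[str]) -> None:
    vectors = embedder.encode(chunks)  # shape: (len(chunks), 384)
    client.recreate_collection(
        collection_name=doc_id,
        vectors_config=VectorParams(size=vectors.shape[1], distance=Distance.COSINE),
    )
    client.upsert(
        collection_name=doc_id,
        points=[
            PointStruct(id=str(uuid.uuid4()), vector=vec.tolist(), payload={"text": chunk})
            for vec, chunk in zip(vectors, chunks)
        ],
    )
```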
## Tech Stack

- Backend: FastAPI, Qdrant, Python
- Frontend: React, Vite, Axios
- Embedding: Sentence Transformers
- LLM: Local Ollama server running llama3
- Storage: Qdrant vector DB + pickled text chunks on disk (see the sketch after this list)
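The pickled-chunks side of the storage layer could look like this minimal sketch; the one-pickle-file-per-document layout under `vector_store_data/` is an assumption based on the project tree, not the repo's exact code:

```python
# Hypothetical on-disk chunk store: one pickle file per document ID.
import pickle
from pathlib import Path

DATA_DIR = Path("backend/vector_store_data")

def save_chunks(doc_id: str, chunks: list[str]) -> None:
    DATA_DIR.mkdir(parents=True, exist_ok=True)
    with open(DATA_DIR / f"{doc_id}.pkl", "wb") as f:
        pickle.dump(chunks, f)

def load_chunks(doc_id: str) -> list[str]:
    with open(DATA_DIR / f"{doc_id}.pkl", "rb") as f:
        return pickle.load(f)
```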
## Requirements

- NVIDIA GPU recommended (RTX 3060 or better)
- At least 16 GB RAM, 32 GB+ recommended
- Python 3.8+
- Node.js 16+
- Ollama installed and running locally with llama3 model
- NVIDIA CUDA drivers for GPU acceleration (a quick GPU check is sketched after this list)
- Qdrant (install the Python client with `pip install qdrant-client`; run the server natively or via Docker)
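To confirm the CUDA drivers are visible to PyTorch (which `sentence-transformers` runs on), a quick check:

```python
# GPU sanity check; embedding falls back to CPU if CUDA is unavailable.
import torch

if torch.cuda.is_available():
    print("CUDA OK:", torch.cuda.get_device_name(0))
else:
    print("No CUDA device found; embeddings will run on CPU.")
```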
## Installation

Clone the repository:

```bash
git clone https://github.com/NutrinoDaya/chat-with-your-pdf.git
cd chat-with-your-pdf
```

Set up the backend:

```bash
cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

Set up the frontend:

```bash
cd ../frontend
npm install
```

Make sure Ollama is installed and the llama3 model is pulled:
```bash
ollama pull llama3
ollama run llama3
```
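Once Ollama is running, you can verify the model responds over its HTTP API (Ollama listens on port 11434 by default):

```python
# Minimal non-streaming request to the local Ollama API.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Reply with the single word: ready", "stream": False},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["response"])
```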
Start Qdrant, either via Docker or a native install:

```bash
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
# or start a locally installed Qdrant natively (check the Qdrant docs)
```
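A quick way to confirm Qdrant is reachable before starting the backend:

```python
# Should print the (possibly empty) list of collections without raising.
from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
print(client.get_collections())
```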
## Running

Start the backend:

```bash
cd backend
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```

Start the frontend:

```bash
cd frontend
npm run dev
```

Open your browser at http://localhost:5173.
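To check the backend is up, hit the interactive docs that FastAPI serves at `/docs` by default:

```python
import requests

# 200 means the FastAPI app is serving requests.
print(requests.get("http://localhost:8000/docs", timeout=5).status_code)
```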
## Usage

- Upload a PDF via the frontend UI
- Wait for the PDF to be processed and embedded (a document ID is generated)
- Ask questions about the PDF content in the chatbox
- Answers are generated from the retrieved vector chunks by your local LLM (see the sketch after this list)
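A minimal sketch of the answering half, matching the indexing sketch above; the prompt template and `top_k` are assumptions, not the repo's exact code:

```python
# Hypothetical retrieval-augmented answer flow: embed the question, fetch the
# nearest chunks from the document's collection, and ask llama3 via Ollama.
import requests
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = QdrantClient(host="localhost", port=6333)

def answer(doc_id: str, question: str, top_k: int = 4) -> str:
    hits = client.search(
        collection_name=doc_id,
        query_vector=embedder.encode(question).tolist(),
        limit=top_k,
    )
    context = "\n\n".join(hit.payload["text"] for hit in hits)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```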
## Project Structure

```
/
├── backend/
│   ├── main.py                  # FastAPI app entrypoint
│   ├── routes/
│   │   ├── pdf.py               # PDF upload & processing endpoints
│   │   └── chat.py              # Chat endpoint querying LLM
│   ├── services/
│   │   ├── embedder.py          # Embedding model wrapper
│   │   ├── qdrant_store.py      # Qdrant vector store wrapper
│   │   ├── llm.py               # Ollama LLM API interface
│   │   └── pdf_processor.py     # PDF text extraction & chunking
│   └── vector_store_data/       # Stored chunks per doc
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── ChatBox.jsx
│   │   │   └── FileUploader.jsx
│   │   ├── App.jsx
│   │   └── main.jsx
│   └── vite.config.js
└── README.md
```
## Troubleshooting

- **Ollama connection issues:** the backend returns an error if Ollama is unreachable or the model is not running. Make sure the Ollama service is running on http://localhost:11434 (a minimal reachability check is sketched below).
- **Missing document vector store:** occurs when a chat is requested for a document ID that was never processed or has been deleted.
- **Empty or no relevant chunks:** happens when the PDF content is too sparse or the question is unrelated to it.
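A reachability check the backend could run before calling the model, using Ollama's `/api/tags` endpoint (which lists locally pulled models); the helper name is illustrative:

```python
# Returns True if the Ollama service answers on its default port.
import requests

def ollama_available(base_url: str = "http://localhost:11434") -> bool:
    try:
        return requests.get(f"{base_url}/api/tags", timeout=2).status_code == 200
    except requests.exceptions.RequestException:
        return False
```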
## License

MIT License © Mohammad Dayarneh