A web application to upload PDFs, embed their content into a Qdrant vector store, and interactively ask questions that a local LLM (Ollama) answers from the PDF content.

## Features
- Upload PDF documents and extract text chunks
- Generate embeddings with a pre-trained embedder
- Store embeddings in a Qdrant vector database per document (see the sketch after this list)
- Query vector store to retrieve relevant chunks
- Context-aware question answering via a local llama3 model through the Ollama API
- React frontend with live chat interface
- FastAPI backend serving API endpoints with CORS support
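The indexing half of this pipeline is small. Here is a minimal sketch, assuming `sentence-transformers` with the common `all-MiniLM-L6-v2` model and one Qdrant collection per document; the repo's actual model choice, collection naming, and payload layout may differ:

```python
# Hypothetical indexing sketch: embed text chunks and upsert them into a
# per-document Qdrant collection. Model and payload layout are assumptions.
import uuid

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim sentence embeddings
client = QdrantClient(host="localhost", port=6333)

def index_chunks(doc_id: str, chunks: list[str]) -> None:
    vectors = embedder.encode(chunks)  # shape: (len(chunks), 384)
    client.recreate_collection(
        collection_name=doc_id,
        vectors_config=VectorParams(size=vectors.shape[1], distance=Distance.COSINE),
    )
    client.upsert(
        collection_name=doc_id,
        points=[
            PointStruct(id=str(uuid.uuid4()), vector=vec.tolist(), payload={"text": chunk})
            for vec, chunk in zip(vectors, chunks)
        ],
    )
```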
## Tech Stack

- Backend: FastAPI, Qdrant, Python
- Frontend: React, Vite, Axios
- Embedding: Sentence Transformers
- LLM: Local Ollama server running llama3
- Storage: Qdrant vector DB + pickled text chunks on disk (see the sketch after this list)
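The pickled-chunks side of the storage layer could look like this minimal sketch; the one-pickle-file-per-document layout under `vector_store_data/` is an assumption based on the project tree, not the repo's exact code:

```python
# Hypothetical on-disk chunk store: one pickle file per document ID.
import pickle
from pathlib import Path

DATA_DIR = Path("backend/vector_store_data")

def save_chunks(doc_id: str, chunks: list[str]) -> None:
    DATA_DIR.mkdir(parents=True, exist_ok=True)
    with open(DATA_DIR / f"{doc_id}.pkl", "wb") as f:
        pickle.dump(chunks, f)

def load_chunks(doc_id: str) -> list[str]:
    with open(DATA_DIR / f"{doc_id}.pkl", "rb") as f:
        return pickle.load(f)
```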
## Requirements

- NVIDIA GPU recommended (RTX 3060 or better)
- At least 16 GB RAM, 32 GB+ recommended
- Python 3.8+
- Node.js 16+
- Ollama installed and running locally with llama3 model
- NVIDIA CUDA drivers for GPU acceleration (a quick GPU check is sketched after this list)
- Qdrant (install the Python client with `pip install qdrant-client`; run the server natively or via Docker)
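To confirm the CUDA drivers are visible to PyTorch (which `sentence-transformers` runs on), a quick check:

```python
# GPU sanity check; embedding falls back to CPU if CUDA is unavailable.
import torch

if torch.cuda.is_available():
    print("CUDA OK:", torch.cuda.get_device_name(0))
else:
    print("No CUDA device found; embeddings will run on CPU.")
```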
## Installation

Clone the repository:

```bash
git clone https://github.com/NutrinoDaya/chat-with-your-pdf.git
cd chat-with-your-pdf
```

Set up the backend:

```bash
cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

Set up the frontend:

```bash
cd ../frontend
npm install
```

Make sure Ollama is installed and the llama3 model is pulled:
```bash
ollama pull llama3
ollama run llama3
```
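Once Ollama is running, you can verify the model responds over its HTTP API (Ollama listens on port 11434 by default):

```python
# Minimal non-streaming request to the local Ollama API.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Reply with the single word: ready", "stream": False},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["response"])
```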
Start Qdrant, either via Docker or a native install:

```bash
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
# or start a locally installed Qdrant natively (check the Qdrant docs)
```
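A quick way to confirm Qdrant is reachable before starting the backend:

```python
# Should print the (possibly empty) list of collections without raising.
from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
print(client.get_collections())
```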
## Running

Start the backend:

```bash
cd backend
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```

Start the frontend:

```bash
cd frontend
npm run dev
```

Open your browser at http://localhost:5173.
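To check the backend is up, hit the interactive docs that FastAPI serves at `/docs` by default:

```python
import requests

# 200 means the FastAPI app is serving requests.
print(requests.get("http://localhost:8000/docs", timeout=5).status_code)
```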
## Usage

- Upload a PDF via the frontend UI
- Wait for the PDF to be processed and embedded (a document ID is generated)
- Ask questions about the PDF content in the chatbox
- Answers are generated from the retrieved vector chunks by your local LLM (see the sketch after this list)
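A minimal sketch of the answering half, matching the indexing sketch above; the prompt template and `top_k` are assumptions, not the repo's exact code:

```python
# Hypothetical retrieval-augmented answer flow: embed the question, fetch the
# nearest chunks from the document's collection, and ask llama3 via Ollama.
import requests
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = QdrantClient(host="localhost", port=6333)

def answer(doc_id: str, question: str, top_k: int = 4) -> str:
    hits = client.search(
        collection_name=doc_id,
        query_vector=embedder.encode(question).tolist(),
        limit=top_k,
    )
    context = "\n\n".join(hit.payload["text"] for hit in hits)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```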
## Project Structure

```
/
├── backend/
│   ├── main.py                  # FastAPI app entrypoint
│   ├── routes/
│   │   ├── pdf.py               # PDF upload & processing endpoints
│   │   └── chat.py              # Chat endpoint querying LLM
│   ├── services/
│   │   ├── embedder.py          # Embedding model wrapper
│   │   ├── qdrant_store.py      # Qdrant vector store wrapper
│   │   ├── llm.py               # Ollama LLM API interface
│   │   └── pdf_processor.py     # PDF text extraction & chunking
│   └── vector_store_data/       # Stored chunks per doc
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── ChatBox.jsx
│   │   │   └── FileUploader.jsx
│   │   ├── App.jsx
│   │   └── main.jsx
│   └── vite.config.js
└── README.md
```
## Troubleshooting

- **Ollama connection issues:** the backend returns an error if Ollama is unreachable or the model is not running. Make sure the Ollama service is running on http://localhost:11434 (a minimal reachability check is sketched below).
- **Missing document vector store:** occurs when a chat is requested for a document ID that was never processed or has been deleted.
- **Empty or no relevant chunks:** happens when the PDF content is too sparse or the question is unrelated to it.
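A reachability check the backend could run before calling the model, using Ollama's `/api/tags` endpoint (which lists locally pulled models); the helper name is illustrative:

```python
# Returns True if the Ollama service answers on its default port.
import requests

def ollama_available(base_url: str = "http://localhost:11434") -> bool:
    try:
        return requests.get(f"{base_url}/api/tags", timeout=2).status_code == 200
    except requests.exceptions.RequestException:
        return False
```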
## License

MIT License © Mohammad Dayarneh