CoC Inheritance 2025
GitSage: Code Confusion? We've Git You Covered
By Team GitSage
GitSage is an AI-powered repository intelligence system that enables users to:
- Ask natural language questions about any GitHub repository
- Automatically generate structured documentation
- Compare two repositories intelligently
It solves the problem of developer onboarding and repository understanding using:
- Retrieval-Augmented Generation (RAG)
- Code + text embeddings
- Persistent vector database search
- Large Language Models for reasoning
GitSage transforms raw source code into structured insights.
- 🔗 GitHub Repository – Explore GitSage
- 🎥 Demo Video – Watch the Demo
- 🖼 Screenshots – View Project Gallery
graph LR
A[User Input] --> B[FastAPI Backend]
B --> C[Ingestion Pipeline]
C --> D[Embedding Pipeline]
D --> E[ChromaDB Vector Store]
E --> F[Retriever]
F --> G[LLM - Groq API]
G --> H[Final Response]
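The flow in the diagram (ingest, embed, store, retrieve) can be sketched end to end with the standard library alone. This is a minimal illustration, not GitSage's actual code: the function names are hypothetical, the bag-of-words `embed` stands in for a real embedding model, a plain list stands in for ChromaDB, and the LLM call is omitted.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_lines: int = 20) -> list[str]:
    """Split a file into fixed-size line windows (a simple stand-in for real chunking)."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + max_lines]) for i in range(0, len(lines), max_lines)]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would call an embedding model here."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def ingest(files: dict[str, str]) -> list[dict]:
    """Chunk and embed every file, returning an in-memory 'vector store'."""
    store = []
    for path, content in files.items():
        for chunk in chunk_text(content):
            store.append({"path": path, "text": chunk, "vector": embed(chunk)})
    return store


def retrieve(store: list[dict], question: str, k: int = 2) -> list[dict]:
    """Return the k chunks most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(store, key=lambda r: cosine(query_vector, r["vector"]), reverse=True)
    return ranked[:k]


# Hypothetical two-file repository used only for this example.
files = {
    "auth.py": "def login(user, password):\n    return check_password(user, password)",
    "README.md": "This project exposes a REST API for managing invoices.",
}
store = ingest(files)
top = retrieve(store, "how does login check the password", k=1)
print(top[0]["path"])
```

In a real deployment, the retrieved chunks would then be passed to the LLM as context for the final response.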
Frontend:
- React (Vite)
- TypeScript
- Tailwind CSS
- Lucide Icons
- Responsive UI
Backend:
- FastAPI
- Python 3.11
- Async ingestion pipeline
- RESTful API architecture
- Modular service design
AI and retrieval:
- Retrieval-Augmented Generation (RAG)
- Code embedding model
- Sentence embedding model
- Groq LLM API
- Prompt engineering with hallucination control
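One common way to get hallucination control from prompt engineering is to restrict the model to the retrieved context and give it an explicit refusal phrase. The sketch below shows that pattern under assumptions: `build_grounded_prompt`, `REFUSAL`, the chunk shape, and the wording of the instructions are all hypothetical and not taken from the GitSage source.

```python
# Hypothetical refusal phrase the model is told to return when context is insufficient.
REFUSAL = "I could not find this in the repository."


def build_grounded_prompt(question: str, chunks: list[dict], max_chars: int = 6000) -> str:
    """Assemble a prompt that restricts the model to the retrieved context."""
    parts = []
    used = 0
    for i, chunk in enumerate(chunks, start=1):
        block = f"[{i}] {chunk['path']}\n{chunk['text']}"
        # Stop adding context once the character budget would be exceeded.
        if used + len(block) > max_chars:
            break
        parts.append(block)
        used += len(block)
    context = "\n\n".join(parts) if parts else "(no context retrieved)"
    return (
        "You answer questions about a GitHub repository.\n"
        "Use ONLY the context below and cite sources as [n].\n"
        f"If the context does not contain the answer, reply exactly: {REFUSAL}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Example chunk in the assumed shape: a file path plus the chunk text.
chunks = [{"path": "auth.py", "text": "def login(user, password): ..."}]
prompt = build_grounded_prompt("How does login work?", chunks)
print(prompt)
```

Numbering the context blocks makes it possible to check afterwards whether each cited `[n]` corresponds to a chunk that was actually supplied.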
Vector storage:
- ChromaDB (Persistent Vector Database)
- Metadata-based filtering
- Separate collections for code and text embeddings
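The split into code and text collections, combined with metadata filtering, could look roughly like the sketch below. It uses plain Python lists in place of ChromaDB so that it runs without dependencies. The collection names, extension sets, and helper functions are assumptions, and the equality-only `query` filter is only loosely analogous to a vector store's metadata `where` filter.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumed extension sets; a real project would likely use a longer list.
CODE_EXTENSIONS = {".py", ".ts", ".tsx", ".js", ".go", ".rs", ".java"}
TEXT_EXTENSIONS = {".md", ".rst", ".txt"}


def pick_collection(path: str) -> Optional[str]:
    """Route a file to the code or text collection, or skip it (None)."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in CODE_EXTENSIONS:
        return "code_chunks"
    if suffix in TEXT_EXTENSIONS:
        return "text_chunks"
    return None


def build_record(repo: str, path: str, chunk_index: int, text: str) -> dict:
    """Create a record with a stable id and the metadata used for filtering."""
    return {
        "id": f"{repo}:{path}:{chunk_index}",
        "document": text,
        "metadata": {"repo": repo, "path": path, "chunk": chunk_index},
    }


def query(records: list[dict], where: dict) -> list[dict]:
    """Keep only records whose metadata matches every key in `where`."""
    return [r for r in records if all(r["metadata"].get(k) == v for k, v in where.items())]


# Hypothetical files from two repositories, including one binary file that is skipped.
collections = {"code_chunks": [], "text_chunks": []}
repo_files = [
    ("octo/demo", "src/app.py", "def main(): ..."),
    ("octo/demo", "README.md", "Demo project"),
    ("octo/other", "lib/util.ts", "export const x = 1"),
    ("octo/demo", "logo.png", ""),
]
for repo, path, text in repo_files:
    name = pick_collection(path)
    if name is None:
        continue  # Unsupported file types are not embedded.
    collections[name].append(build_record(repo, path, 0, text))

demo_code = query(collections["code_chunks"], where={"repo": "octo/demo"})
print([r["id"] for r in demo_code])
```

Keeping the repository name in each record's metadata is what allows several repositories to share one collection while queries stay scoped to a single repository.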
Repository Q&A:
- Natural language repository queries
- Context-aware retrieval
- Grounded LLM responses
- Controlled inference to limit hallucination
Auto-documentation:
- Structured documentation generation
- Overview, architecture, modules
- Tech stack detection
- Setup and usage instructions
Repository comparison:
- Side-by-side metadata comparison
- LLM-based architectural analysis
- Strengths, trade-offs, verdict
- Feature comparison table
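As an illustration of the side-by-side metadata comparison, the sketch below renders two metadata dictionaries as a Markdown table. The field names and the `comparison_table` helper are hypothetical; the LLM-based analysis and verdict are out of scope here.

```python
def comparison_table(name_a: str, meta_a: dict, name_b: str, meta_b: dict) -> str:
    """Render two metadata dicts as a Markdown table, using '-' for missing fields."""
    # Preserve the field order of the first repo, then append fields only the second has.
    keys = list(meta_a) + [k for k in meta_b if k not in meta_a]
    rows = [f"| Field | {name_a} | {name_b} |", "| --- | --- | --- |"]
    for key in keys:
        rows.append(f"| {key} | {meta_a.get(key, '-')} | {meta_b.get(key, '-')} |")
    return "\n".join(rows)


# Hypothetical metadata for two repositories.
meta_a = {"language": "Python", "stars": 120}
meta_b = {"language": "TypeScript", "license": "MIT"}
table = comparison_table("repo-a", meta_a, "repo-b", meta_b)
print(table)
```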
Incremental ingestion:
- Detects repository updates
- Avoids redundant embeddings
- Maintains ingestion consistency
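One way to detect updates and avoid redundant embeddings is to keep a manifest of content hashes and re-embed only files whose hash changed. The sketch below assumes that approach. GitSage may instead compare commit SHAs or timestamps, and `plan_ingestion` and the manifest format are hypothetical.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Stable content hash used to decide whether a file needs re-embedding."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def plan_ingestion(
    files: dict[str, str], manifest: dict[str, str]
) -> tuple[list[str], list[str], dict[str, str]]:
    """Compare current files against the stored manifest.

    Returns (paths to embed, paths whose vectors should be deleted, new manifest).
    """
    to_embed = [p for p, c in files.items() if manifest.get(p) != fingerprint(c)]
    to_delete = [p for p in manifest if p not in files]
    new_manifest = {p: fingerprint(c) for p, c in files.items()}
    return sorted(to_embed), sorted(to_delete), new_manifest


# First ingestion: nothing is in the manifest, so every file is embedded.
v1 = {"a.py": "print(1)", "b.py": "print(2)"}
first_embed, first_delete, manifest = plan_ingestion(v1, {})

# Second ingestion: a.py is unchanged, b.py was removed, c.py is new.
v2 = {"a.py": "print(1)", "c.py": "print(3)"}
second_embed, second_delete, manifest = plan_ingestion(v2, manifest)
print(second_embed, second_delete)
```

Deleting vectors for removed files alongside skipping unchanged ones is what keeps the store consistent with the repository's current state.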
Future enhancements:
- Advanced tech stack inference
- AST-based deeper code analysis
- Performance optimization for large repositories
- Query caching system
- Cloud deployment with scalable vector storage
- Multi-repository cross-analysis
- Visual architecture diagram generation
- Authentication & saved workspaces
- Enterprise-level CI/CD integration
Use cases:
- Developer Onboarding – Understand unfamiliar codebases quickly
- Open Source Exploration – Analyze large repositories before contributing
- Code Review Support – Gain instant architectural insights
- Academic Learning – Explore algorithm-heavy repositories
- Technical Interviews – Evaluate GitHub projects efficiently
Prerequisites:
- Python 3.11+
- Node.js 18+ and npm
- Groq API Key
- GitHub Personal Access Token (PAT)
Create a .env file inside the backend/ folder (after cloning the repository):
GROQ_API_KEY=your_groq_api_key
GITHUB_PAT=your_github_pat
Clone the repository:
git clone https://github.com/your-username/GitSage.git
cd GitSage
Set up and start the backend:
cd backend
python3.11 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload
Backend runs at:
http://127.0.0.1:8000
Set up and start the frontend (in a second terminal, from the repository root):
cd frontend
npm install
npm run dev
Frontend runs at:
http://localhost:5173
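Because the backend depends on both `GROQ_API_KEY` and `GITHUB_PAT`, it can help to fail fast at startup when either is missing. The sketch below is one possible check, not the project's actual configuration code: `load_settings` is a hypothetical helper, and it assumes the values from `.env` have already been loaded into the process environment by whatever mechanism the backend uses.

```python
import os
from typing import Mapping, Optional

# The two variables named in the .env instructions above.
REQUIRED_VARS = ("GROQ_API_KEY", "GITHUB_PAT")


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the required settings, raising a clear error if any are missing or blank."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: source[name].strip() for name in REQUIRED_VARS}


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"GROQ_API_KEY": "your_groq_api_key", "GITHUB_PAT": "your_github_pat"})
print(sorted(settings))
```

Passing an explicit mapping keeps the check easy to test without touching real credentials.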
Team:
- Ankita Sagar – https://github.com/Sagarankita
- Rudrakshi Chincholkar – https://github.com/RudrakshiChincholkar
- Soham Rane – https://github.com/soham30rane
- Harshal Kamble – https://github.com/xyz-harshal
- Sakshi Bhirud – https://github.com/bsakshiii
Highlights:
- Modular AI architecture
- Persistent vector search
- Dual embedding pipeline
- Structured LLM reasoning
- Real-world developer problem solving
- End-to-end full-stack system
⭐ Built with intelligence. Powered by code understanding.