Gem - Your Real Time AI Assistant
Problem Statement 5 – Build the future of AI Computer Control with Screenpipe's Terminator
Traditional AI assistants often feel unresponsive and rigid, behaving more like preprogrammed robots that passively wait for user commands. This stiffness makes interactions feel unnatural and disconnected.
Additionally, most traditional agents operate without real awareness of what the user is doing, acting only when explicitly told to. They function like isolated machines, reactive rather than proactive.
While this design still works today, it leaves much room for improvement. By introducing real-time assistance that actively understands and adapts to the user’s context, we can create a far more fluid, helpful, and human-like experience.
Gem is a next-generation AI agent designed to assist users seamlessly in their everyday tasks. Unlike traditional AI assistants that passively wait for instructions, Gem actively observes the user's screen, intelligently searching for opportunities to help. When it detects something it can assist with, Gem proactively prompts the user — offering support exactly when it's needed.
Byte Warriors
- Anshu Sharma (AnshuSharma111 / https://www.linkedin.com/in/anshu-sharma-618917287/ / Leader)
- Hardik Kumar (hardikgoesgit / https://www.linkedin.com/in/hardik-kumar-rrh/ / Member)
- We chose this problem because it seemed like the most fun one to tackle and build.
- One of the biggest challenges we faced was getting the Screenpipe SDK to work. Apart from that, creating the installer and making sure it worked properly also took its fair share of time.
- A key part of building Gem was designing its architecture: keeping it responsive and fast while also processing large amounts of data through LLMs. The 5-layer architecture is what we are most proud of.
- Frontend: C++, Qt6
- Backend: JavaScript, Node.js, Redis
- APIs: Groq SDK, Screenpipe SDK
- Hosting: None; Gem ships as an installer for Windows
- [✅] Groq: We used Groq for fast, low-latency processing of data from Screenpipe. From cleaning that data to suggesting actions by performing sentiment analysis on it, Groq provided a fast and reliable API with a range of LLMs to choose from.
- [✅] Screenpipe: We used Screenpipe to perform OCR and capture what is happening on the user's screen so that the LLM can suggest actions.
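The flow described above (screen text is cleaned, then turned into a suggested action) can be pictured as a chain of layers, each passing its output to the next. The following is only a minimal sketch of that idea, not Gem's actual code: the layer names `cleanOcr` and `suggestAction`, the `Complete` type, and the prompts are all hypothetical, and only two layers are shown rather than five. The LLM call is injected as a function so the sketch runs without an API key; in a real setup it might wrap a Groq chat completion request.

```typescript
// Any function that sends a prompt to an LLM and resolves with its reply.
// Assumption: in a real setup this might wrap a Groq chat completion call.
// It is injected here so the sketch runs without network access or a key.
type Complete = (prompt: string) => Promise<string>;

// One layer: takes text from the previous layer, returns text for the next.
type Layer = (input: string, complete: Complete) => Promise<string>;

// Hypothetical layer: ask the model to strip OCR noise.
const cleanOcr: Layer = (raw, complete) =>
  complete(`Clean up this OCR text, keeping only readable content:\n${raw}`);

// Hypothetical layer: ask the model for one suggested action.
const suggestAction: Layer = (cleaned, complete) =>
  complete(`Suggest one helpful action for a user seeing:\n${cleaned}`);

// Run the layers in order, feeding each output into the next layer.
async function runLayers(
  raw: string,
  layers: Layer[],
  complete: Complete
): Promise<string> {
  let current = raw;
  for (const layer of layers) {
    current = await layer(current, complete);
  }
  return current;
}

// Stub model for demonstration only: returns the last line of the prompt
// in upper case, so the data flow is visible without a real LLM.
const stubComplete: Complete = async (prompt) =>
  prompt.slice(prompt.lastIndexOf("\n") + 1).toUpperCase();

runLayers("meeting at 5pm", [cleanOcr, suggestAction], stubComplete).then(
  (suggestion) => console.log(suggestion)
);
```

Keeping each layer as a small function with the same signature makes it straightforward to add, remove, or reorder layers without touching the runner.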
The key features of Gem are as follows:
- ✅ Real-time Agent: Gem works when you work. Just switch her ON and she will observe what is happening on the screen and chip in to help.
- ✅ Fast and Alive Agent: Thanks to Groq, data moves through all 5 layers quickly, so Gem acts fast. She feels responsive and smooth.
- ✅ Privacy Controls: You can select which apps you do not want Gem to monitor, and she won't monitor them. This protects user privacy.
- ✅ Logs: Press Ctrl + Shift + D while Gem is open to bring up the Debug Log Viewer, where you can safely review everything Gem is doing.
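As an illustration of the privacy control above, one simple approach is to drop captured events from excluded apps before any further processing. This is a sketch under stated assumptions rather than Gem's implementation: the `ScreenEvent` shape, its field names, and the helper `makePrivacyFilter` are hypothetical and do not reflect Screenpipe's actual schema.

```typescript
// Hypothetical shape of one captured screen event. The field names are
// assumptions for illustration, not Screenpipe's actual schema.
interface ScreenEvent {
  appName: string;
  text: string;
}

// Normalize app names so "Chrome.exe" and "chrome" compare as equal.
function normalizeAppName(name: string): string {
  return name.trim().toLowerCase().replace(/\.exe$/, "");
}

// Build a predicate that returns true when an event may be processed.
function makePrivacyFilter(
  blockedApps: string[]
): (event: ScreenEvent) => boolean {
  const blocked = new Set(blockedApps.map(normalizeAppName));
  return (event) => !blocked.has(normalizeAppName(event.appName));
}

// Example: the user has excluded a password manager and a chat app.
const allow = makePrivacyFilter(["KeePass.exe", "Signal"]);

const events: ScreenEvent[] = [
  { appName: "Code.exe", text: "function main() {}" },
  { appName: "signal", text: "private chat" },
];

// Only events from apps that are not on the blocklist are kept.
const visible = events.filter(allow);
console.log(visible.map((e) => e.appName));
```

Filtering at the very start of processing means text from excluded apps never reaches the LLM at all.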
- Demo Video Link: https://youtu.be/5jpG8ykBkFY
- Pitch Deck / PPT Link: https://drive.google.com/file/d/1qPjkg2eDsrbL4hA5GOTKZQpat8zFOzTx/view?usp=sharing
- [✅] All members of the team completed the mandatory task - Followed at least 2 of our social channels and filled the form (Details in Participant Manual)
- [✅] All members of the team completed Bonus Task 1 - Sharing of Badges and filled the form (2 points) (Details in Participant Manual)
- [✅] All members of the team completed Bonus Task 2 - Signing up for Sprint.dev and filled the form (3 points) (Details in Participant Manual)
[FOR WINDOWS ONLY]
- Go to Screenpipe's repository (https://github.com/mediar-ai/screenpipe/tree/main) and install Screenpipe on your machine using the command given there
- Download the installer from this link: https://drive.google.com/file/d/1tp6eun473hlaHl6h7f8ktVwSbGr6LLUR/view?usp=sharing
- [Optional] Have WSL installed on your system, with Redis installed inside it
What’s Next for Gem:
📈 Expanding Tools and Capabilities: Thanks to Gem’s modular architecture, the core boilerplate is already in place, making it easy to build and expand. We can rapidly add new tools, such as scheduling meetings, taking notes, and much more, tailored to user needs.
🛡️ Stronger Privacy Controls: Future updates will give users even greater control over what Gem monitors. Users will also have the option to switch to fully offline LLMs (such as through Ollama) for maximum privacy and local processing.
🌐 Localization and Accessibility: We plan to make Gem accessible to a global audience by introducing multi-language support and improving accessibility features, so that everyone can benefit from real-time assistance.
- Groq SDK: https://console.groq.com/home
- Screenpipe SDK: https://docs.screenpi.pe/terminator/js-sdk-reference
It was a lot of fun building this application. We are grateful to the organisers for holding such a well organised event at such a large scale. We are also grateful to our fellow participants for making this event such a resounding success. May the best hacker win! Cheers!


