Milestones
Mission: Build the market's best single-node inference server - methodical yet surprisingly fast.
We're tightening Cortex's foundation this month: improving security protocols, enhancing reliability, and making configuration more flexible. We'll also roll out new benchmarking tools to measure our progress and create tutorials to help users get the most out of Cortex.
April is all about making Cortex more accessible. We're launching a redesigned documentation website and a revamped Model Hub with improved search and metrics. Two videos are in the pipeline - an introduction to Cortex on YouTube and a technical presentation at C++Now to showcase our capabilities to new audiences.
Key Deliverables:
- Hardened security measures
- Flexible configuration options
- Comprehensive benchmarking suite (see the sketch after this list)
- New documentation website
- Revamped Model Hub interface
- Introduction and technical videos
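To give a flavor of what the benchmarking tools might measure, here is a minimal sketch that times end-to-end completion latency against a locally running Cortex server. The endpoint URL, port, and model id are assumptions - adjust them to whatever your installation actually exposes.

```python
# Minimal latency benchmark sketch. The endpoint URL, port, and model id are
# placeholders, not a specification of Cortex's API.
import time
import statistics
import requests

URL = "http://localhost:39281/v1/chat/completions"  # assumed local endpoint
MODEL = "llama3.2"  # hypothetical model id; use whatever you have pulled

def time_one_request(prompt: str) -> float:
    """Send a single chat completion and return wall-clock seconds."""
    start = time.perf_counter()
    resp = requests.post(URL, json={
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }, timeout=120)
    resp.raise_for_status()
    return time.perf_counter() - start

if __name__ == "__main__":
    samples = [time_one_request("Say hello in one sentence.") for _ in range(5)]
    print(f"median latency: {statistics.median(samples):.2f}s "
          f"(min {min(samples):.2f}s, max {max(samples):.2f}s)")
```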
Mission: Expand Cortex's reach through multiple platforms and enhanced performance.
This month focuses on making Cortex available across more platforms and boosting performance. We're integrating with multiple package managers for different operating systems, enhancing our Python Engine with vLLM integration, and adding support for more GPU architectures including Intel GPUs. These improvements will dramatically expand our compatibility and processing capabilities.
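For a sense of what the vLLM integration enables, here is a minimal sketch of the kind of call a vLLM-backed Python engine would wrap. It uses vLLM's public API directly; how Cortex's Python Engine will actually expose this is not specified here, and the model id is a placeholder.

```python
# Sketch of a vLLM-backed generation call. This is vLLM's public API used
# directly; how Cortex's Python Engine wraps it is an assumption, not spec.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.2-1B-Instruct")  # placeholder model id
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Explain KV caching in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```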
Key Deliverables:
- OS package manager integrations (apt, Homebrew, Chocolatey, etc.)
- Enhanced Python Engine with vLLM integration
- Support for additional GPU architectures including Intel
- Specialized Docker images for various deployment scenarios
- Performance optimization across platforms
- Cross-platform compatibility improvements
Mission: Transform Cortex from a standalone inference server into a robust AI development platform with advanced context capabilities.
During these months, we're building the foundation for more sophisticated AI applications. We'll focus on developing SDKs for multiple languages to make Cortex integration seamless across different development environments. Adding memory capabilities will allow for persistent conversations, while our RAG implementation will enable knowledge-grounded responses. A new syncing layer will keep distributed deployments consistent across environments.
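To make the RAG idea concrete, here is a minimal sketch of the retrieval step: embed a query, rank a handful of documents by cosine similarity, and prepend the best matches to the prompt. The hash-based "embedding" is a toy stand-in for a real embedding model, and none of the names here reflect Cortex's eventual API.

```python
# Toy retrieval-augmented generation sketch. The hash-based "embedding" is a
# stand-in for a real embedding model; all names are illustrative only.
import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Deterministic toy embedding: hash each token into a fixed-size vector."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

docs = [
    "Cortex runs GGUF models locally through llama.cpp.",
    "The Python Engine adds vLLM-backed inference.",
    "Milestones cover security, packaging, and SDKs.",
]
doc_vecs = np.stack([embed(d) for d in docs])

query = "Which engine uses vLLM?"
scores = doc_vecs @ embed(query)          # cosine similarity (unit vectors)
top = [docs[i] for i in np.argsort(scores)[::-1][:2]]

prompt = "Context:\n" + "\n".join(top) + f"\n\nQuestion: {query}"
print(prompt)  # this grounded prompt would then go to the inference server
```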
We could also incorporate:
- Workflow orchestration for multi-step AI processes
- Fine-tuning toolkit for model customization
- Vector database integrations for efficient knowledge retrieval
- Custom plugin architecture for extensibility
- Interactive playground for rapid prototyping
- Monitoring dashboard for inference tracking
Key Deliverables:
- JavaScript and Python SDKs
- Conversation memory system (see the sketch after this list)
- Retrieval Augmented Generation (RAG) framework
- Multi-environment syncing mechanism (optional)
- Metrics Endpoint
- Workflow orchestration tools (optional)
- Plugin system for custom extensions
- Model fine-tuning capabilities (optional)
- Performance monitoring dashboard
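As a conceptual sketch of the conversation memory deliverable, the snippet below keeps a rolling message buffer trimmed to a rough token budget. The 4-characters-per-token heuristic and the class shape are illustrative assumptions, not Cortex's actual design.

```python
# Conceptual sketch of rolling conversation memory. The 4-chars-per-token
# heuristic and the class shape are assumptions, not Cortex's actual design.
class ConversationMemory:
    def __init__(self, max_tokens: int = 2048):
        self.max_tokens = max_tokens
        self.messages: list[dict] = []

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        self._trim()

    def _trim(self) -> None:
        """Drop the oldest messages once the rough token estimate overflows."""
        def estimate(msgs):
            return sum(len(m["content"]) // 4 for m in msgs)
        while self.messages and estimate(self.messages) > self.max_tokens:
            self.messages.pop(0)

memory = ConversationMemory(max_tokens=15)
memory.add("user", "Remember that my favorite engine is vLLM.")
memory.add("assistant", "Noted - vLLM it is.")
memory.add("user", "What did I just tell you?")
print(memory.messages)  # the first turn has been trimmed to fit the budget
```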
The Thinking Brick milestone represents our evolution from a fast inference engine to a complete AI development platform - solid as a brick but with the intelligence to adapt and learn.