A passionate student exploring the world of AI, ML, DL, and GPU programming, with a strong focus on PyTorch, CUDA C++, and Reinforcement Learning.

| Project | Description | Links | Tech Used |
|---|---|---|---|
| Flash Attention (CUDA) | Forward and backward passes of Flash Attention in CUDA | Forward, Backward | CUDA C++ |
| Conv1D & Conv2D (CUDA) | Tiled shared-memory implementations of Conv1D/2D | Link | CUDA C++ |
| HIP DL Algorithms | 14+ DL and linear algebra kernels using HIP/rocBLAS | Link | HIP C++ |
| LLaMA 2 (PyTorch) | LLaMA 2 from scratch with training and inference loops | Link | Python, PyTorch |
| AnyGrad | Minimal tensor library in C++ with Python bindings | Link | C++, Python |

- Learning CUDA in 100 Days: github.com/Ruhaan838/100Day-GPU
- Learning Reinforcement Learning in 100 Days (private repository)

| Python | C++ | CUDA | PyTorch |
|---|---|---|---|
| HIP / ROCm | CLion | PyCharm | Linux |

“Optimization is not just speed — it's elegance under pressure.”


