Ruhaan838/README.md

👋 Hi there! I'm Ruhaan

I'm a student exploring AI, ML, DL, and GPU programming, with a strong focus on PyTorch, CUDA C++, and reinforcement learning.

🌐 Visit My Portfolio


Featured Projects

| Project | Description | Links | Tech Used |
| --- | --- | --- | --- |
| Flash Attention (CUDA) | Forward & backward pass of Flash Attention in CUDA | Forward, Backward | CUDA C++ |
| Conv1D & Conv2D (CUDA) | Shared & tiled memory implementations of Conv1D/2D | Link | CUDA C++ |
| HIP DL Algorithms | 14+ DL and linear algebra kernels using HIP/rocBLAS | Link | HIP C++ |
| LLaMA 2 (PyTorch) | LLaMA2 from scratch with training & inference loops | Link | Python, PyTorch |
| AnyGrad | Minimal tensor library in C++ with Python bindings | Link | C++, Python |

Tech Stack

Python
C++
CUDA
PyTorch
HIP / ROCm
CLion
PyCharm
Linux


Fun Fact

“Optimization is not just speed — it's elegance under pressure.”


Pinned

  1. 100Day-GPU

    I am trying to learn CUDA in 100 days (inspired by @hkproj).

    Cuda 6

  2. AnyGrad

    A Tensor module that allows a deep learning framework to switch seamlessly between different engines.

    C++ 1

  3. StableDiffusionPytorch

    Trying to implement Stable Diffusion from scratch in PyTorch.

    Python

  4. LLaMA-2-pytorch

    An attempt to reproduce the LLaMA model, including training and inference.

    Python

  5. Mini_Dino

    A simplified implementation of DINOv2 (self-supervised Vision Transformers).

    Python

  6. DiverseXEngine

    An engine that lets users run code without writing a single line of code.

    C++ 1 4