CUDA-Accelerated McEliece KEM 🔑 | Post-Quantum Cryptography on GPU. Implementation of Classic McEliece key encapsulation, encryption, decryption, and decapsulation on CPU & GPU with CUDA, including benchmarking scripts and full FYP2 report.

DesmondJS/Cuda_McEliece_KEM

🚀 CUDA-Accelerated McEliece KEM (FYP2)

📖 Overview

This repository contains my Final Year Project 2 (FYP2) at Universiti Tunku Abdul Rahman (UTAR), focusing on GPU parallelization of the Classic McEliece post-quantum cryptosystem using CUDA.

The goal was to accelerate encryption, decryption, encapsulation, and decapsulation by offloading computationally intensive tasks (syndrome generation, error vector formation, FFT, Benes network, etc.) to NVIDIA GPUs.
Performance was benchmarked against an optimised, vectorized CPU implementation. At 32 blocks, the GPU reached about 6.7× the CPU's encryption throughput and 7.4× its decryption throughput.

✨ Features

  • ✅ Full KEM flow: Encapsulation, Decapsulation, Encryption, Decryption
  • ✅ CUDA kernels for FFT, Benes network, error vector generation
  • ✅ Benchmarking scripts – CPU vs GPU, multiple num_blocks configurations
  • ✅ CSV output for results & plotting
  • ✅ Full FYP2 Report (PDF) included

🖥️ Tested Environment

| Component | Specification |
|---|---|
| OS | Ubuntu 24.04.2 LTS |
| GPU | NVIDIA GeForce GTX 1650 (4 GB) |
| CUDA Toolkit | 12.4 |
| CPU | AMD Ryzen 5 3550H @ 2.10 GHz |
| RAM | 12 GB |
| Driver Version | 550.144.03 |

📊 Key Results (Throughput in Bytes/Second)

Here's a highlight of the performance improvement:

Encrypt:

| Num Blocks | CPU Encrypt (B/s) | GPU Encrypt (B/s) |
|---|---|---|
| 1 | 10,917,018 | 2,314,525 |
| 2 | 13,532,555 | 4,791,434 |
| 4 | 12,782,441 | 9,871,298 |
| 8 | 12,806,684 | 20,305,654 |
| 16 | 12,542,706 | 41,614,452 |
| 32 | 12,219,651 | 81,403,555 |

Decrypt:

| Num Blocks | CPU Decrypt (B/s) | GPU Decrypt (B/s) |
|---|---|---|
| 1 | 2,515,560 | 553,932 |
| 2 | 2,542,354 | 1,165,266 |
| 4 | 2,517,718 | 2,303,712 |
| 8 | 2,488,658 | 4,657,335 |
| 16 | 2,418,709 | 9,264,162 |
| 32 | 2,508,264 | 18,515,560 |

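📈 Observations:

  • CPU throughput stays roughly constant as num_blocks grows, because the CPU implementation has no parallelism (its loops run sequentially).
  • GPU throughput rises almost linearly with num_blocks, showing strong scaling thanks to parallel execution. The GPU is slower than the CPU up to 4 blocks and overtakes it from 8 blocks onward.
  • At 32 blocks, the GPU achieves about 6.7× the CPU's encryption throughput and 7.4× its decryption throughput.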
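The ratios quoted in these observations can be re-derived from the two tables. The snippet below is a minimal sketch for checking the arithmetic, not part of the repository: the function names (`speedups`, `crossover`, `scaling_efficiency`) are hypothetical, and the numbers are copied by hand from the tables rather than read from the project's CSV output.

```python
# Throughput figures (bytes/second) copied from the Encrypt and Decrypt tables.
NUM_BLOCKS = [1, 2, 4, 8, 16, 32]
CPU_ENCRYPT = [10_917_018, 13_532_555, 12_782_441, 12_806_684, 12_542_706, 12_219_651]
GPU_ENCRYPT = [2_314_525, 4_791_434, 9_871_298, 20_305_654, 41_614_452, 81_403_555]
CPU_DECRYPT = [2_515_560, 2_542_354, 2_517_718, 2_488_658, 2_418_709, 2_508_264]
GPU_DECRYPT = [553_932, 1_165_266, 2_303_712, 4_657_335, 9_264_162, 18_515_560]


def speedups(cpu, gpu):
    """Return the GPU/CPU throughput ratio for each num_blocks setting."""
    return [g / c for c, g in zip(cpu, gpu)]


def crossover(cpu, gpu):
    """Return the smallest num_blocks at which the GPU beats the CPU, or None."""
    for n, c, g in zip(NUM_BLOCKS, cpu, gpu):
        if g > c:
            return n
    return None


def scaling_efficiency(gpu):
    """Return measured GPU throughput divided by ideal linear scaling.

    Ideal scaling is n times the single-block throughput, so a value
    close to 1.0 means throughput grows linearly with num_blocks.
    """
    return [g / (n * gpu[0]) for n, g in zip(NUM_BLOCKS, gpu)]


encrypt_speedup = speedups(CPU_ENCRYPT, GPU_ENCRYPT)
decrypt_speedup = speedups(CPU_DECRYPT, GPU_DECRYPT)

for n, e, d in zip(NUM_BLOCKS, encrypt_speedup, decrypt_speedup):
    print(f"{n:>2} blocks: encrypt {e:.2f}x, decrypt {d:.2f}x")

print("GPU overtakes CPU at num_blocks =", crossover(CPU_ENCRYPT, GPU_ENCRYPT))
```

A speedup below 1.0 at small block counts means the CPU is still faster there, while `scaling_efficiency` staying near 1.0 is what "almost linearly" means in numbers.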

βš™οΈ Installation & Usage

1️⃣ Clone the Repo

2️⃣ Build

  • make clean
  • make

3️⃣ Run

  • ./run_test

🙌 Acknowledgements

This project was completed as part of UTAR FYP under the supervision of Dr. Lee Wai Kong.

📜 License

This project is licensed under the MIT License – see the LICENSE file for details.
