This repository explores why Vision Transformers (ViTs) are vulnerable to frequency‑based adversarial attacks and introduces a simple modification to the standard sinusoidal positional encoding (PE). Our key idea is to concatenate the standard sin‑cos embedding with the learnable projection vector in a fixed ratio, rather than adding the two. We evaluate this new PE (“OurPE”) across multiple attack types and compare it against:
- Learnable PE
- Sinusoidal PE
- Rotary PE (RoPE)
- Algebraic PE (APE)
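The concatenation idea described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the code from the notebooks: it assumes that the "learnable projection vector" is the patch projection of width `proj_dim`, that the sin‑cos table fills the remaining `pe_dim` channels, and that `proj_dim : pe_dim` equals the chosen ratio. The names `sincos_table`, `split_dims`, and `concat_positional_encoding` are hypothetical.

```python
import numpy as np


def sincos_table(num_positions: int, dim: int) -> np.ndarray:
    """Fixed sinusoidal table of shape (num_positions, dim); dim must be even."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    positions = np.arange(num_positions)[:, None]                    # (N, 1)
    freqs = np.exp(-np.log(10000.0) * np.arange(0, dim, 2) / dim)    # (dim/2,)
    angles = positions * freqs                                       # (N, dim/2)
    table = np.zeros((num_positions, dim))
    table[:, 0::2] = np.sin(angles)   # even channels: sine
    table[:, 1::2] = np.cos(angles)   # odd channels: cosine
    return table


def split_dims(embed_dim: int, ratio: int) -> tuple:
    """Split embed_dim into (proj_dim, pe_dim) with proj_dim : pe_dim = ratio : 1."""
    pe_dim = embed_dim // (ratio + 1)
    if pe_dim * (ratio + 1) != embed_dim or pe_dim % 2 != 0:
        raise ValueError("embed_dim must split into an even pe_dim at this ratio")
    return embed_dim - pe_dim, pe_dim


def concat_positional_encoding(patches, proj_weight, embed_dim, ratio):
    """Concatenate projected patches with a sin-cos table instead of adding them.

    patches:     (num_patches, patch_dim) flattened image patches
    proj_weight: (patch_dim, proj_dim) stand-in for the learnable projection
    """
    proj_dim, pe_dim = split_dims(embed_dim, ratio)
    if proj_weight.shape[1] != proj_dim:
        raise ValueError("proj_weight width must equal proj_dim")
    projected = patches @ proj_weight                    # (N, proj_dim)
    pe = sincos_table(patches.shape[0], pe_dim)          # (N, pe_dim)
    return np.concatenate([projected, pe], axis=-1)      # (N, embed_dim)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed setup: 32x32 CIFAR-10 image, 4x4 patches -> 64 patches of 48 values.
    patches = rng.normal(size=(64, 48))
    weight = rng.normal(size=(48, 96))   # 3:1 split of a 128-wide embedding
    tokens = concat_positional_encoding(patches, weight, embed_dim=128, ratio=3)
    print(tokens.shape)
```

With concatenation, the positional channels stay separate from the content channels at the input, whereas the additive scheme mixes both signals in every channel.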
We swept over different projection‑to‑sinusoid ratios for OurPE (3:1 and 4:1) and measured CIFAR‑10 ViT accuracy under various black‑box frequency attacks:
| Embedding | Clean Acc. | Phase Weak | Phase Strong | Mag Weak | Mag Strong |
|---|---|---|---|---|---|
| SinCos | 76.98 % | 39.94 % | 31.43 % | 39.88 % | 31.31 % |
| RoPE | 76.25 % | 40.10 % | 33.31 % | 40.02 % | 33.03 % |
| OurPE (3:1) | 77.67 % | 42.61 % | 33.18 % | 42.33 % | 32.76 % |
| OurPE (4:1) | 78.12 % | 42.82 % | 35.25 % | 42.79 % | 34.46 % |
- Best gains under strong phase attacks: OurPE (4:1) boosts accuracy by ~3.8 pp over SinCos.
- Clean accuracy also increases by ~1.1 pp for OurPE (4:1) compared to SinCos.
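The percentage‑point gains quoted above can be rechecked directly from the table. The snippet below only copies the reported accuracies and subtracts them; the dictionary layout and the `gain_pp` helper are illustrative, not part of the repository.

```python
# Accuracy (%) copied from the results table above.
results = {
    "SinCos":      {"clean": 76.98, "phase_weak": 39.94, "phase_strong": 31.43,
                    "mag_weak": 39.88, "mag_strong": 31.31},
    "RoPE":        {"clean": 76.25, "phase_weak": 40.10, "phase_strong": 33.31,
                    "mag_weak": 40.02, "mag_strong": 33.03},
    "OurPE (3:1)": {"clean": 77.67, "phase_weak": 42.61, "phase_strong": 33.18,
                    "mag_weak": 42.33, "mag_strong": 32.76},
    "OurPE (4:1)": {"clean": 78.12, "phase_weak": 42.82, "phase_strong": 35.25,
                    "mag_weak": 42.79, "mag_strong": 34.46},
}


def gain_pp(metric, model="OurPE (4:1)", baseline="SinCos"):
    """Difference in percentage points between model and baseline on one metric."""
    return round(results[model][metric] - results[baseline][metric], 2)


if __name__ == "__main__":
    for metric in results["SinCos"]:
        print(f"{metric:>12}: {gain_pp(metric):+.2f} pp")
```

For OurPE (4:1) against SinCos this gives 3.82 pp under strong phase attacks and 1.14 pp on clean data, matching the rounded figures in the bullets.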
We also ran white‑box FGSM and PGD variants and observed a slight increase in vulnerability under those attacks; see the notebooks for full breakdowns.
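To make the phase and magnitude columns of the table more concrete, here is one way a black‑box frequency perturbation could look. It is a sketch under assumptions: the uniform noise model, the `eps` budget, and the function name `perturb_spectrum` are all hypothetical, and the actual attack modules in this repository may work differently.

```python
import numpy as np


def perturb_spectrum(image, eps, target="phase", seed=0):
    """Perturb the phase or magnitude of an image's 2-D Fourier spectrum.

    image:  float array of shape (H, W) or (H, W, C) with values in [0, 1]
    eps:    noise budget (radians for phase, relative scale for magnitude);
            a larger eps would correspond to a "strong" setting (assumption)
    target: "phase" or "magnitude"
    """
    rng = np.random.default_rng(seed)
    spectrum = np.fft.fft2(image, axes=(0, 1))
    magnitude, phase = np.abs(spectrum), np.angle(spectrum)
    noise = rng.uniform(-eps, eps, size=spectrum.shape)

    if target == "phase":
        phase = phase + noise                    # additive phase jitter
    elif target == "magnitude":
        magnitude = magnitude * (1.0 + noise)    # multiplicative magnitude jitter
    else:
        raise ValueError("target must be 'phase' or 'magnitude'")

    perturbed = magnitude * np.exp(1j * phase)
    # Independent noise breaks conjugate symmetry, so the inverse transform is
    # complex in general; keep the real part and clip to the valid pixel range.
    restored = np.fft.ifft2(perturbed, axes=(0, 1)).real
    return np.clip(restored, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    image = rng.uniform(size=(32, 32, 3))        # stand-in for a CIFAR-10 image
    weak = perturb_spectrum(image, eps=0.1, target="phase")
    strong = perturb_spectrum(image, eps=0.5, target="phase")
    print(np.abs(weak - image).mean(), np.abs(strong - image).mean())
```

The perturbation needs no gradients or model access, which is what makes it black‑box in this sketch; the perturbed images would then be passed to the trained ViT to measure accuracy.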
.
├── APE/ # Algebraic PE code & experiments
├── Fourier Projection/ # Fourier‑based attack modules
├── STFT/ # Short‑Time Fourier Transform experiments
├── ViT_Our_PE_Hyperparam_search/ # Hyperparameter sweeps for OurPE ratios
├── experimentation_pipeline/ # End‑to‑end training & evaluation scripts
├── wavelets/ # Wavelet‑based perturbation experiments
├── EDA.ipynb # Exploratory data analysis (original)
├── EDA copy.ipynb # EDA with recent runs
├── attack-on-vit.ipynb # Base ViT attack notebook (original)
├── attack-on-vit copy.ipynb # Updated attack experiments
├── eureka-sincos (Our PE).ipynb # Core OurPE implementation & results
├── eureka-sincos (Our PE) (1:3).ipynb # OurPE with 3:1 ratio (augmented, adv)
├── eureka-sincos (Our PE) (1:4).ipynb # OurPE with 4:1 ratio (augmented, adv)
├── eureka-sincos (Our PE) (1:7)... # Other ratio experiments
├── rope-vit.ipynb # RoPE baseline experiments
├── ES667__Project_Proposal.pdf # Original project proposal
├── ES_667__Deep_learning_Project_Presentation.pdf
└── .gitignore
- Clone the repo: run `git clone https://github.com/yourusername/Adversarial-Robustness-ViTs.git`, then `cd Adversarial-Robustness-ViTs`.
- Run a notebook: open `eureka-sincos (Our PE).ipynb` for the core OurPE experiments, or launch `experimentation_pipeline/train_and_eval.py` for scripted runs.
- View results: check the `results/` folder (generated by the scripts) for CSVs and plots.
- Authors: Karan Gandhi, Anurag Singh, Arjun Dikshit, Aarsh Wankar
- Advisor: Prof. Shanmuganathan Raman (IITGN)