This code tests the division capabilities of small transformers. It is based on Andrej Karpathy's NanoGPT and on https://github.com/lee-ny/teaching_arithmetic.