oelachqar changed the title from "[Feature]: TinyZero reproduction of R1-Zero: experience the Ahah moment yourself for < $30" to "[Feature] TinyZero reproduction of R1-Zero: experience the Ahah moment yourself for < $30" on Feb 4, 2025
Feature request
From: https://x.com/karpathy/status/1884678601704169965
TinyZero reproduction of R1-Zero
"experience the Ahah moment yourself for < $30"
Given a base model, RL fine-tuning can be relatively cheap and quite accessible.
From: https://x.com/jiayi_pirate/status/1882839370505621655
We reproduced DeepSeek R1-Zero in the CountDown game, and it just works
Through RL, the 3B base LM develops self-verification and search abilities all on its own
You can experience the Ahah moment yourself for < $30
Code: http://github.com/Jiayi-Pan/TinyZero
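For anyone picking this up, here is a rough sketch of what a rule-based reward for the CountDown game could look like in a notebook. It is not taken from the TinyZero code: the function name `countdown_reward`, the 0/1 scoring, and the "use each number exactly once" rule are assumptions made for illustration, and a real setup would probably also score the output format.

```python
import ast
import operator
from collections import Counter

# Binary operators allowed in a CountDown answer (assumed rule set).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Evaluate an arithmetic AST node, recording every integer literal in `used`."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    # Anything else (names, calls, unary minus, ...) is rejected.
    raise ValueError("unsupported expression")


def countdown_reward(expression, numbers, target):
    """Hypothetical reward: 1.0 if `expression` uses each number exactly once
    and evaluates to `target`, otherwise 0.0."""
    try:
        tree = ast.parse(expression.strip(), mode="eval")
        used = []
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0
    if Counter(used) != Counter(numbers):
        return 0.0
    return 1.0 if abs(value - target) < 1e-6 else 0.0
```

Calling `countdown_reward("(25 - 5) * 3", [3, 5, 25], 60)` gives the full reward, while an expression that misses the target or skips a number gets zero. Parsing with `ast` instead of calling `eval` keeps model output from running arbitrary code.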
Motivation / references
Is it possible for us to have quick notebooks that show this kind of work? See https://x.com/karpathy/status/1884678601704169965. My guess is that many folks are going to try this, and if we can make it very easy, that will create a lot of buzz.
Your contribution
linguistic