Could you release the weights of PRM? #4
Comments
We have illustrated how to train the PRM. Specifically, you can download [$D_{V_0}$] and put them in …
Great work, but is it possible to just release the model weights?
I also trained with PRM/train_VM_mistral.py, and the accuracy is 0.1530 after two epochs.
Thank you for your contributions! I'm currently stuck with training the VM. Below are the statistics of the training data, showing each label and its corresponding number of samples: Counter({'0.0': 240594, '1.0': 48953, '0.1': 20901, '0.8': 20614, '0.5': 18341, '0.2': 17034, '0.3': 16688, '0.7': 14310, '0.6': 13104, '0.4': 9448, '0.9': 6462}) If the RM predicts only label …
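The distribution in the comment above looks like a `collections.Counter` over the string-valued step labels in the training data. A minimal sketch of how such a tally is produced (the `labels` list here is a toy stand-in, not the actual dataset):

```python
from collections import Counter

# Toy stand-in for the per-step value labels extracted from the
# PRM/VM training data (the real data has ~426k labels, see above).
labels = ["0.0", "1.0", "0.0", "0.5", "0.0"]

distribution = Counter(labels)
print(distribution.most_common())
```

A skew this strong toward '0.0' (about 56% of samples) is one plausible reason a value model collapses to predicting a single label.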
The experimental settings are as follows: for ChatGLM3-6B, the learning rate (lr) is 2e-5, the number of epochs is 2 or 3, and the batch size is 3; for Mistral, the learning rate (lr) is 3e-6, the number of epochs is 2 or 3, and the batch size is 3.
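For reference, the settings stated above can be collected into a small config dict (a sketch only; the key names are my own, not the repo's actual training arguments):

```python
# Hyperparameters as reported in this thread; key names are hypothetical.
vm_train_configs = {
    "chatglm3-6b": {"lr": 2e-5, "epochs": (2, 3), "batch_size": 3},
    "mistral":     {"lr": 3e-6, "epochs": (2, 3), "batch_size": 3},
}

for model, cfg in vm_train_configs.items():
    print(model, cfg)
```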
Can anyone let me know approximately how long it takes to run the 2 epochs for Mistral on an A100? It shows me around 35 hours!
Thanks for your contribution! Could you release the weights of the PRM? Or maybe there is something I missed?