Issues: THUDM/ReST-MCTS
Issues list
Issue: FileNotFoundError when running self_train_dpo.py
Labels: about dataset (datasets of PRM and policy model)
#15, opened Dec 18, 2024 by KaiyuHu2001
About the synthetic data generation
Labels: about dataset (datasets of PRM and policy model)
#14, opened Dec 8, 2024 by yeppp27
Coding style & details of README
Labels: about readme (Improvements or additions to documentation)
#11, opened Dec 1, 2024 by Wloner0809
The difference between 'vm' and 'prm' in 'reward_model_type'?
Labels: about PRM
#10, opened Nov 28, 2024 by Aurora-slz
Hope for a more detailed README!
Labels: about readme (Improvements or additions to documentation)
#6, opened Oct 11, 2024 by PKUfreshman
What is VALUE_MODEL_STATE_DICT?
Labels: about readme (Improvements or additions to documentation)
#5, opened Oct 11, 2024 by dszpr
Could you release the weights of PRM?
Labels: about dataset (datasets of PRM and policy model), about PRM
#4, opened Sep 30, 2024 by cybisolated
Utilization of negative samples
Labels: about dataset (datasets of PRM and policy model)
#2, opened Jun 25, 2024 by HillZhang1999