forked from huggingface/trl
Issues: August-murr/trl
Issues list
Allow more flexibility in the new PPOTrainer (aka PPOTrainerv2)
🧒 good approach
❓ question
Seeking clarification or more information
#11
opened Jan 16, 2025 by
qgallouedec
dpo_vlm.py
❓ question
Seeking clarification or more information
#10
opened Jan 12, 2025 by
qgallouedec
5 of 9 tasks
Dataset type conversion utilities
🏋 DDPO
Related to DDPO
🏋 DPO
Related to DPO
🏋 GKD
Related to GKD
🏋 Iterative SFT
Related to Iterative SFT
🏋 KTO
Related to KTO
🎯 optimal import sentence
❓ question
Seeking clarification or more information
#8
opened Jan 6, 2025 by
August-murr
9 tasks done
onlinedpo error when use deepspeed zero3
👖 action-adventure
✨ enhancement
New feature or request
🏋 Iterative SFT
Related to Iterative SFT
🎯 optimal import sentence
❓ question
Seeking clarification or more information
🏋 RLOO
Related to RLOO
#7
opened Jan 6, 2025 by
August-murr
5 of 9 tasks
Dataset type conversion utilities
🐛 bug
Something isn't working
📚 documentation
Improvements or additions to documentation
#6
opened Jan 6, 2025 by
qgallouedec
9 tasks done
Dummy question
❓ question
Seeking clarification or more information
#5
opened Jan 4, 2025 by
August-murr