Commit
fix loss
Малахов Алексей Павлович committed Nov 22, 2024
1 parent a9d9e6f commit 941dbe9
Showing 1 changed file with 0 additions and 1 deletion.
1 change: 0 additions & 1 deletion turbo_alignment/trainers/dpo.py
@@ -452,7 +452,6 @@ def compute_loss(
 rejected_rewards = self.beta * (policy_rejected_logps - reference_rejected_logps).detach()

 loss = -F.logsigmoid(self.beta * (logits - penalty_term))
-loss = policy_chosen_logps

 return loss, chosen_rewards, rejected_rewards
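
The removed assignment appears to have overwritten the computed loss with the raw chosen log-probabilities, so the returned value no longer depended on `beta`, the rejected samples, or `penalty_term`. The sketch below is a scalar, pure-Python illustration of the expression the commit keeps. It is not the code from `dpo.py`: the helper names are hypothetical, and the definition of `logits` is an assumption (the usual DPO log-ratio difference), since the diff does not show how `logits` or `penalty_term` are computed.

```python
import math


def logsigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x)) for a scalar input."""
    # log(sigmoid(x)) = -softplus(-x), with softplus(z) = max(z, 0) + log1p(exp(-|z|))
    return -(max(-x, 0.0) + math.log1p(math.exp(-abs(x))))


def example_logits(policy_chosen: float, policy_rejected: float,
                   ref_chosen: float, ref_rejected: float) -> float:
    # Assumption: the standard DPO log-ratio difference; not shown in the diff.
    return (policy_chosen - policy_rejected) - (ref_chosen - ref_rejected)


def dpo_style_loss(beta: float, logits: float, penalty_term: float = 0.0) -> float:
    # Same form as the line kept by the commit:
    # loss = -logsigmoid(beta * (logits - penalty_term))
    return -logsigmoid(beta * (logits - penalty_term))


if __name__ == "__main__":
    logits = example_logits(-12.0, -15.0, -13.0, -14.0)  # made-up log-probs
    print(dpo_style_loss(beta=0.1, logits=logits))
```

With this form the loss is always positive and shrinks as the policy prefers the chosen response more strongly than the reference does. A loss equal to `policy_chosen_logps` alone would have neither property.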
