The scorer computes the scores with sequence_cross_entropy_with_logits(). I notice that the start index used for the logits and targets is different from the implementation in EPR.
in UDR: loss_list = sequence_cross_entropy_with_logits(logits=output.logits[:, :-1].contiguous(), targets=entry.input_ids[:, 1:].contiguous(), weights=pad_mask, average=None)
in EPR: loss_list = sequence_cross_entropy_with_logits(logits=output.logits, targets=entry.input_ids[:,1:], weights=pad_mask, average=None)
So I wondered what the input actually is, and found this in scorer_dsr.py:
tokenized_example = self.tokenizer.encode_plus(enc_text, truncation=True, add_special_tokens=False, return_tensors='pt')
tokenized_labels = self.tokenizer.encode_plus(test_answer, truncation=True, add_special_tokens=False, return_tensors='pt')
Since special tokens aren't added to the inputs, why do we need to exclude the first token of the inputs and the last position of the logits?
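For context, here is a minimal self-contained sketch (plain PyTorch, not code from either repo) of the alignment that the UDR call implements: with a decoder-only LM, the logits at position i are the prediction for token i+1, so the last logit position and the first input token have nothing to pair with, regardless of whether the tokenizer added special tokens.

```python
# Minimal sketch (not from either repo) of the one-token shift used when
# scoring with a decoder-only LM: logits[:, :-1] predicts input_ids[:, 1:].
import torch
import torch.nn.functional as F

batch, seq_len, vocab = 2, 6, 50257
input_ids = torch.randint(0, vocab, (batch, seq_len))   # placeholder token ids
logits = torch.randn(batch, seq_len, vocab)             # placeholder model output
pad_mask = torch.ones(batch, seq_len - 1)               # 1 = real target, 0 = pad

# Align predictions with the tokens they predict.
shift_logits = logits[:, :-1].contiguous()
shift_targets = input_ids[:, 1:].contiguous()

token_nll = F.cross_entropy(
    shift_logits.view(-1, vocab),
    shift_targets.view(-1),
    reduction="none",
).view(batch, seq_len - 1)

# Per-sequence score: mean negative log-likelihood over non-pad positions.
seq_nll = (token_nll * pad_mask).sum(dim=1) / pad_mask.sum(dim=1).clamp(min=1)
print(seq_nll)
```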
In UDR's implementation, we score examples batch by batch, whereas EPR's code scores examples one by one. To implement this batch-parallel example scoring, we adjust the examples' pad positions during tokenization, which leads to the difference you mention.
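To picture the batch-parallel scoring, here is a rough, hypothetical sketch; the padding side, pad_token_id, and helper names are my assumptions for illustration, not UDR's actual code. Variable-length examples are padded to a common length so one forward pass scores the whole batch, and a pad mask keeps the padded positions out of each example's score.

```python
# Hypothetical sketch of batch-parallel scoring (assumes right padding and
# pad_id = 0; not taken from the UDR codebase).
import torch
import torch.nn.functional as F

def batch_score(token_id_lists, logits_fn, vocab_size, pad_id=0):
    # Pad variable-length examples to a common length so they can be
    # scored in one forward pass instead of one by one (as in EPR).
    max_len = max(len(ids) for ids in token_id_lists)
    input_ids = torch.full((len(token_id_lists), max_len), pad_id, dtype=torch.long)
    for i, ids in enumerate(token_id_lists):
        input_ids[i, :len(ids)] = torch.tensor(ids)

    logits = logits_fn(input_ids)                    # (batch, max_len, vocab)

    # Shifted alignment: logits[:, :-1] predicts input_ids[:, 1:].
    shift_logits = logits[:, :-1].contiguous()
    shift_targets = input_ids[:, 1:].contiguous()

    # Mask out padded target positions so they don't affect the score.
    pad_mask = torch.zeros_like(shift_targets, dtype=torch.float)
    for i, ids in enumerate(token_id_lists):
        pad_mask[i, :len(ids) - 1] = 1.0

    token_nll = F.cross_entropy(
        shift_logits.view(-1, vocab_size),
        shift_targets.view(-1),
        reduction="none",
    ).view(shift_targets.shape)
    return (token_nll * pad_mask).sum(dim=1) / pad_mask.sum(dim=1).clamp(min=1)

# Toy usage with a random "model" standing in for the real LM.
vocab = 100
scores = batch_score(
    [[5, 7, 9], [3, 4, 8, 2, 6]],
    logits_fn=lambda ids: torch.randn(ids.shape[0], ids.shape[1], vocab),
    vocab_size=vocab,
)
print(scores)
```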