Support Pytorch MaxP Feature/ptmaxp #184

Merged
merged 51 commits into from
Aug 6, 2022

@crystina-z crystina-z commented Sep 16, 2021

  • added PyTorch MaxP
  • updated the output of the bertpassage id2vec function so that it is compatible with both tf-maxp and pt-maxp
  • updated the other extractors accordingly
  • updated the test case and repro docs

crystina-z and others added 30 commits September 2, 2021 05:02
MSMARCO reproduction logs - nima
@crystina-z crystina-z force-pushed the feature/eval+ptmaxp branch from 56ccaeb to db5e1ee Compare May 11, 2022 22:52
# # REF-TODO: save scheduler state along with optimizer
# self.lr_scheduler.step()
# hacky: use step instead the internally calculated epoch to support step-wise lr update
self.lr_scheduler.step(epoch=cur_step)
Collaborator Author

This is a bit hacky: by default, lr_scheduler.step takes the epoch. I changed it to take the step because, when we pass epoch=0 into our lr_multiplier and warmupiter is also 1, the lr would be almost 0 for the entire first epoch.
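
For illustration only, here is a minimal pure-Python sketch of the problem described in this comment. It does not use the project's actual `lr_multiplier`; `linear_warmup_multiplier` and `multipliers_seen_by_optimizer` are hypothetical names, and the linear warmup shape is an assumption (it yields exactly 0 at index 0, where the real multiplier is reported as "almost 0"). It only shows why indexing the warmup by epoch freezes the multiplier for the whole first epoch, while indexing it by step lets it ramp up inside that epoch.

```python
def linear_warmup_multiplier(t, warmup):
    """Hypothetical linear warmup: ramps from 0 to 1 as t goes from 0 to `warmup`."""
    if warmup <= 0:
        return 1.0
    return min(1.0, t / warmup)


def multipliers_seen_by_optimizer(n_epochs, steps_per_epoch, warmup, by_step):
    """Return the multiplier applied at every training step.

    by_step=False: the scheduler is indexed by the epoch number.
    by_step=True:  the scheduler is indexed by the global step, as in
                   lr_scheduler.step(epoch=cur_step).
    `warmup` is measured in the same unit as the index.
    """
    seen = []
    for epoch in range(n_epochs):
        for i in range(steps_per_epoch):
            cur_step = epoch * steps_per_epoch + i
            t = cur_step if by_step else epoch
            seen.append(linear_warmup_multiplier(t, warmup))
    return seen


# Two epochs of four steps each.
epoch_indexed = multipliers_seen_by_optimizer(2, 4, warmup=1, by_step=False)
step_indexed = multipliers_seen_by_optimizer(2, 4, warmup=4, by_step=True)

print(epoch_indexed)  # [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
print(step_indexed)   # [0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0]
```

With the epoch index, every step of the first epoch trains with a multiplier of 0; with the step index, the multiplier grows from the first update onward.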

@@ -222,6 +207,30 @@ def parse_label_tensor(x):
label = tf.map_fn(parse_label_tensor, parsed_example["label"], dtype=tf.float32)

return (pos_bert_input, pos_mask, pos_seg, neg_bert_input, neg_mask, neg_seg), label

def _filter_inputs(self, bert_inputs, bert_masks, bert_segs, n_valid_psg):
Collaborator Author

This function is used only for training: it randomly selects one passage from the n passages. This is now done in the extractor so that both the PyTorch and TensorFlow trainers can use it.
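
As a rough sketch of the selection step described here (not the extractor's actual implementation, which operates on tensors), the idea can be written with plain Python lists. `pick_one_passage` and its `rng` parameter are hypothetical; the sketch assumes the first `n_valid_psg` entries are real passages and the rest are padding.

```python
import random


def pick_one_passage(bert_inputs, bert_masks, bert_segs, n_valid_psg, rng=random):
    """Pick one valid passage at random and return its aligned input, mask, and segment ids.

    Assumption: entries at positions >= n_valid_psg are padding and must never be chosen.
    """
    if n_valid_psg < 1:
        raise ValueError("need at least one valid passage")
    idx = rng.randrange(n_valid_psg)  # uniform over the valid passages only
    return bert_inputs[idx], bert_masks[idx], bert_segs[idx]


# Three passage slots, of which only the first two are valid.
inputs = [[101, 7, 102], [101, 8, 0], [0, 0, 0]]
masks = [[1, 1, 1], [1, 1, 0], [0, 0, 0]]
segs = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]

chosen_input, chosen_mask, chosen_seg = pick_one_passage(
    inputs, masks, segs, n_valid_psg=2, rng=random.Random(0)
)
```

Using a single index for all three lists keeps the token ids, mask, and segment ids of the chosen passage aligned.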

…ain_feature into two MixIn (depends when they generate list of passage or single passage per query at training time), so that they can be shared by each extractor as needed
lgtm-com bot commented May 12, 2022

This pull request introduces 9 alerts when merging db0e405 into a568304 - view on LGTM.com

new alerts:

  • 3 for Unused local variable
  • 3 for Unused import
  • 2 for Conflicting attributes in base classes
  • 1 for Module is imported with 'import' and 'import from'



@Extractor.register
class BirchBertPassage(MultipleTrainingPassagesMixin, BertPassage):
Collaborator Author

@crystina-z crystina-z May 12, 2022

This class inherits create_train_features and parse_train_features from MultipleTrainingPassagesMixin, and the other functions from BertPassage.
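
A small sketch of why this works, relying only on Python's method resolution order: the class listed first among the bases wins when both define the same method. The class names come from this PR, but the method bodies are placeholders, whether BertPassage itself defines the two training methods is an assumption, and the `@Extractor.register` decorator is omitted so the snippet runs on its own.

```python
class BertPassage:
    # Placeholder bodies; the real methods build and parse training features.
    def create_train_features(self):
        return "BertPassage.create_train_features"

    def parse_train_features(self):
        return "BertPassage.parse_train_features"

    def id2vec(self):
        return "BertPassage.id2vec"


class MultipleTrainingPassagesMixin:
    # Overrides only the two training-feature methods.
    def create_train_features(self):
        return "MultipleTrainingPassagesMixin.create_train_features"

    def parse_train_features(self):
        return "MultipleTrainingPassagesMixin.parse_train_features"


class BirchBertPassage(MultipleTrainingPassagesMixin, BertPassage):
    pass


extractor = BirchBertPassage()
print([cls.__name__ for cls in BirchBertPassage.__mro__])
# ['BirchBertPassage', 'MultipleTrainingPassagesMixin', 'BertPassage', 'object']
print(extractor.create_train_features())  # resolved on the mixin
print(extractor.id2vec())                 # falls through to BertPassage
```

Listing the mixin before BertPassage is what makes its two methods take precedence; reversing the base order would silently pick the BertPassage versions instead.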

@crystina-z crystina-z changed the title [WIP] Feature/eval+ptmaxp Support Pytorch MaxP Feature/ptmaxp May 12, 2022
@capreolus-ir capreolus-ir deleted a comment from lgtm-com bot May 15, 2022
@andrewyates andrewyates self-requested a review August 6, 2022 09:28
@andrewyates
Member

I got a reasonable dev MRR with pytorch: 0.3548

@andrewyates andrewyates merged commit 5946640 into capreolus-ir:master Aug 6, 2022
5 participants