This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

Update gpt2 to use wte if no lm_head #362

Merged: 3 commits into rustformers:main on Jul 11, 2023

Conversation

steventrouble
Contributor

@steventrouble steventrouble commented Jul 9, 2023

Closes #338

Based off of #343

Fixes the segfault issue mentioned in that bug, and adds a check that could help catch those errors faster.

@philpax
Collaborator

philpax commented Jul 9, 2023

Looks good! Can you add the comments from the original PR about why the tensor's optional and why we substitute with wte?

@philpax philpax added issue:enhancement New feature or request model:gpt-2 GPT-2 model labels Jul 9, 2023
@steventrouble
Contributor Author

👍 done, thanks!

@philpax
Collaborator

philpax commented Jul 11, 2023

Brilliant, thanks 🚀

@philpax philpax merged commit cf6086c into rustformers:main Jul 11, 2023
13 checks passed
@hhamud hhamud mentioned this pull request Aug 7, 2023
Linked issue closed by this pull request: GPT-2 doesn't always have an lm_head (#338)