This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

Update gpt2 to use wte if no lm_head #362

Merged: 3 commits into rustformers:main on Jul 11, 2023

Conversation

steventrouble
Contributor

@steventrouble steventrouble commented Jul 9, 2023

Closes #338

Based off of #343

Fixes the segfault issue mentioned in that bug, and adds a check that could help catch those errors faster.

@philpax
Collaborator

philpax commented Jul 9, 2023

Looks good! Can you add the comments from the original PR about why the tensor's optional and why we substitute with wte?

@philpax philpax added issue:enhancement New feature or request model:gpt-2 GPT-2 model labels Jul 9, 2023
@steventrouble
Contributor Author

👍 done, thanks!

@philpax
Collaborator

philpax commented Jul 11, 2023

Brilliant, thanks 🚀

@philpax philpax merged commit cf6086c into rustformers:main Jul 11, 2023
13 checks passed
@hhamud hhamud mentioned this pull request Aug 7, 2023
Linked issue closed by this pull request: GPT-2 doesn't always have an lm_head (#338)