Training GPT first time, Encountering Error Opening Model file and error reading model header #1

Open
albertpurnama opened this issue May 4, 2024 · 3 comments

Comments

@albertpurnama
Contributor

albertpurnama commented May 4, 2024

Hi,

Thanks for doing all the initial work to port functionality from llm.c to Go. I'm not familiar with model training in general, but I have built a couple of web servers in Go before, and I'd love to contribute to the llm.go project.

I'm running into an issue where make train fails with this error:

$ make train     
go run ./cmd/traingpt2
2024/05/04 11:33:40 Error opening model file: open ./gpt2_124M.bin: no such file or directory
exit status 1
make: *** [train] Error 1

This happens because gpt2_124M.bin does not exist. Even when I create an empty model binary with touch ./gpt2_124M.bin, I get a different error about the file header:

go run ./cmd/traingpt2
2024/05/04 11:36:45 error reading model header: EOF
exit status 1
make: *** [train] Error 1

How can I resolve this?

Thanks in advance!

@albertpurnama
Contributor Author

After reading a few more issues in the original llm.c repo, it seems that we need to generate the initial model weights.

https://github.com/karpathy/llm.c/pull/288/files

Basically, we need to run the script that generates the initial checkpoint file.

@albertpurnama
Contributor Author

I figured it out.

You need to run python train_gpt2.py first. It creates the following files:

  • gpt2_124M.bin
  • gpt2_124M_debug_state.bin
  • gpt2_tokenizer.bin

I found that the tokenizer file is created by https://github.com/albertpurnama/llm.go/blob/ffb034ebbd7792f1f1a9ba6766e5a940bc9084e8/train_gpt2.py#L346-L352

@joshcarp
Owner

joshcarp commented May 8, 2024

Hey, I didn't see this until now, but I think this presents a nice opportunity to remove Python from the repo entirely. It would be nice to download the weights directly from Hugging Face and not need any of the llm.c binary files.

@joshcarp joshcarp reopened this May 8, 2024