-
I trained my custom model (basically a small GPT-2-style Transformer with custom tokenization) in PyTorch (and HuggingFace). Now I'd like to use this model with ggml, but I'm struggling to make it work. What I did so far is copy-paste the GPT-2 example from this repo (the example is awesome), and:
It runs, but I get nonsense predictions (and sometimes NaN), so obviously something is going wrong somewhere. So I started debugging: viewing the tensors' shapes and contents, and comparing them to the outputs in Python. But even the most basic operations give odd results. For example, when printing the weights and the resulting tensor of the very first operation (embedding the input tokens):
And after the operation (Line 596 in 6b846cb) I get:
Which... doesn't make sense? The first token is

In this situation, any tips on how to debug and fix the issues?
-
Do you get the correct values if you make the graph have only the `ggml_get_rows` operation? The reason I'm asking is that further operations can overwrite the results of previous ops, so if you are looking at the `ggml_get_rows` memory after computing the graph, you could be looking at new values generated by the next operations in the graph. So the first step is to remove all other ops and make sure that result is as expected.
-
Yeah, @ggerganov is right. How you actually print the tensor values when you debug is critical, because memory is reused and this can lead to wrong debug output, so please show us how you do it. To actually get the correct output, one way to go about it is to set the tensor name:
Then if you are using the CPU backend:
where I define `print_t_f16`, e.g., as: