converter does not work with the current ggml #23

Open
yunghoy opened this issue Jul 19, 2023 · 4 comments

yunghoy commented Jul 19, 2023

Tried to convert https://huggingface.co/intfloat/e5-large-v2 to ggml with the current commit d9f04e609fb7f7e5fb3b20a77d4d685219971009. However, running the converted f32, f16, q4_0, and q4_1 models shows the "not enough space in the context's memory pool" message. Maybe it is related to ggerganov/ggml#158?

dranger003 (Contributor) commented

I ran into the same issue, but after making these changes it works fine. 007f063

yunghoy commented Aug 9, 2023

Thanks! I think the latest ggml, together with your increased memory sizes, can be used to convert the models.
I believe the code in this repository should be updated accordingly.

-        model_mem_req += (5 + 16 * n_layer) * 256; // object overhead
+        model_mem_req += (5 + 16 * n_layer) * 512; // object overhead
-        new_bert->buf_compute.resize(16 * 1024 * 1024);
+        new_bert->buf_compute.resize(32 * 1024 * 1024);
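
For context, a minimal, self-contained sketch of the two sizes this diff touches; only model_mem_req, n_layer, and buf_compute come from the diff itself, the surrounding code is illustrative and not copied from bert.cpp:

```cpp
// Sketch only, not the actual bert.cpp source.
#include <cstddef>
#include <vector>

int main() {
    const size_t n_layer = 12; // e.g. the 12-layer model in the log further down

    // Per-object overhead added when estimating the ggml context size for the
    // model weights; the multiplier is doubled from 256 to 512.
    size_t model_mem_req = 0;
    model_mem_req += (5 + 16 * n_layer) * 512;

    // Scratch buffer used while evaluating the compute graph, doubled from
    // 16 MB to 32 MB.
    std::vector<char> buf_compute(32 * 1024 * 1024);

    (void)model_mem_req;
    (void)buf_compute;
    return 0;
}
```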

@redthing1

I see that this has been updated: https://github.com/skeskinen/bert.cpp/blob/master/bert.cpp#L461
But I am still seeing this error.

@redthing1

bert_load_from_file: n_vocab = 30522
bert_load_from_file: n_max_tokens   = 512
bert_load_from_file: n_embd  = 384
bert_load_from_file: n_intermediate  = 1536
bert_load_from_file: n_head  = 12
bert_load_from_file: n_layer = 12
bert_load_from_file: f16     = 0
bert_load_from_file: ggml ctx size = 126.80 MB
bert_load_from_file: ........................ done
bert_load_from_file: model size =   126.69 MB / num tensors = 197
bert_load_from_file: mem_per_token 898 KB, mem_per_input 538 MB
ggml_new_object: not enough space in the context's memory pool (needed 565700096, available 565174272)
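
Reading the numbers in that log: the graph needed 565,700,096 bytes but the pool only had 565,174,272, so the buffer derived from mem_per_input (538 MB) falls short by roughly 0.5 MB. A minimal sketch of one possible workaround, assuming the per-evaluation buffer is sized from mem_per_input as the log suggests; the headroom constant is an arbitrary assumption, not code from the repository:

```cpp
// Sketch only: pad the per-evaluation compute buffer with some headroom so a
// small under-estimate (~0.5 MB in the log above) no longer aborts the run.
#include <cstddef>
#include <vector>

int main() {
    const size_t mem_per_input = 538; // MB, from the log above
    const size_t headroom      = 16;  // MB, arbitrary safety margin

    std::vector<char> buf_compute((mem_per_input + headroom) * 1024 * 1024);
    (void)buf_compute;
    return 0;
}
```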
