Option to use CoreML only on iOS without CPU fallback #93
Comments
I agree with this. I noticed the CoreML .mlmodelc files are smaller than the ggml .bin files, so if we only used the CoreML files on iOS we could ship a smaller app bundle.
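As a rough illustration of what an app currently has to bundle, here is a minimal Swift sketch; the file names follow whisper.cpp's usual naming convention and are assumptions, not paths this library guarantees:

```swift
import Foundation

// Both models must ship today: CoreML runs the encoder, ggml runs the decoder.
// File names are assumptions based on whisper.cpp's naming convention.
let coreMLEncoder = Bundle.main.path(forResource: "ggml-base-encoder", ofType: "mlmodelc")
let ggmlModel = Bundle.main.path(forResource: "ggml-base", ofType: "bin")

if coreMLEncoder == nil || ggmlModel == nil {
    print("Missing a model file; both are currently required.")
}
```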
At this time the ggml model is still in use as the decoder. This is not ideal as it takes up more memory; we can look into how to improve this.
By default we use …. For easier debugging, we may want to have a field like the one sketched below.
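A minimal Swift sketch of what such a debug field could look like; the names `WhisperContextOptions`, `preferCoreML`, and `coreMLComputeUnits` are hypothetical illustrations, not this library's actual API:

```swift
import CoreML

// Hypothetical context options; none of these names come from the library itself.
struct WhisperContextOptions {
    // Prefer the bundled CoreML encoder when it is available.
    var preferCoreML: Bool = true
    // Exposed for debugging: restrict which hardware CoreML may use.
    // .all lets CoreML pick; .cpuOnly forces the CPU path for comparison.
    var coreMLComputeUnits: MLComputeUnits = .all
}

// Example: force CPU execution to compare timings during debugging.
var options = WhisperContextOptions()
options.coreMLComputeUnits = .cpuOnly
```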
What does PR #123 add? I am not very familiar with all this ML stuff. I thought it might have something to do with this, but now I think maybe not. Is it just a different (faster) way to allocate memory on Apple devices so the context init will be faster? Or is it something different? Sorry for not understanding, but I want to learn.
ggml-alloc basically reduces the memory usage of the model compared to before. ggml-metal allows GGML to access GPU resources on Apple devices. If you use it, you don't need to load the CoreML model separately and you can get similar performance, but this depends on the performance difference between the GPU and the Neural Engine on the device; the results on Mac / iPhone will be different. Currently it's not enabled yet, as I mentioned in #123 (comment).
Thanks for explaining! Do you know any good or interesting resources for reading and learning more about this stuff?
Do you mean ML? I'm not an expert, so I'm afraid I can't provide helpful resources. If you want to learn about GGML, I would recommend watching ggml / llama.cpp / whisper.cpp and the community projects that use GGML. Especially llama.cpp; most things happen in that repo.
Is there a reason why we cannot use CoreML exclusively on iOS, for example? Right now we still need to bundle a regular CPU model, but it is unclear whether it would ever be used.
The documentation states that the library might fall back to using the CPU when you try to use CoreML. It would be nice if the docs also explained why this happens. When would it fall back to CPU mode? Or does that only happen on Android? It is not really clear to me from the current documentation.
Thanks for the great work on this library!
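For what it's worth, the fallback is CoreML's own behavior rather than something a wrapper library adds: the `MLComputeUnits` setting only constrains which hardware CoreML may use, and CoreML silently runs layers on the CPU when the Neural Engine or GPU cannot execute them. A minimal Swift sketch of loading a compiled encoder with an explicit compute-unit restriction (the file name is an assumption following whisper.cpp's naming convention):

```swift
import CoreML

// Restrict CoreML to CPU + Neural Engine; CoreML may still fall back to the
// CPU for any layer the Neural Engine cannot run (this is CoreML behavior,
// not a choice made by this library).
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine  // requires iOS 16 / macOS 13

// File name is an assumption based on whisper.cpp's convention.
let url = URL(fileURLWithPath: "ggml-base-encoder.mlmodelc")

do {
    let model = try MLModel(contentsOf: url, configuration: config)
    print("Loaded encoder: \(model.modelDescription)")
} catch {
    print("CoreML failed to load the model: \(error)")
}
```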