Hi everyone.
I'm not up to date, and also inexperienced, so please forgive my ignorance.
Llama.cpp is a really impressive and useful system, but it does not use all of the available cores, even on a very large system (100+ cores).
I'm asking you all: why is this the case? Have you evaluated or considered using the SequenceL interpreting compiler to generate massively threaded C++, or is the real bottleneck memory bandwidth?
Thanks everyone and thanks Grigory!
-Richard