Enable multiple CPU from arguments #680

Open
lij55 opened this issue Aug 13, 2024 · 6 comments
Labels
new feature New feature or request

Comments

@lij55

lij55 commented Aug 13, 2024

I have a 32-core AMD CPU and no GPU.
mistral.rs only uses two of the cores, which is too few. Would it be possible to set the number of cores through a command-line argument? Ollama uses half of the available cores by default.

Thanks!

@lij55 lij55 added the new feature New feature or request label Aug 13, 2024
@EricLBuehler
Owner

Hi @lij55, can you please let me know which command you are running?

@lij55
Author

lij55 commented Aug 17, 2024

Sorry for the late reply.
The command is target/release/mistralrs-server -i plain -m microsoft/Phi-3-mini-128k-instruct -a phi3

@mert-kurttutan

mert-kurttutan commented Sep 3, 2024

It also depends on which backend you are using: the default backend or mkl? That determines which gemm implementation is used.

Each backend has its own setting for the number of CPU threads. For instance, mkl is controlled by the OMP_NUM_THREADS or MKL_NUM_THREADS environment variable.
IIRC, the candle default backend is controlled by RAYON_NUM_THREADS. Try changing these environment variables to see whether anything changes.

Still, it is odd that only 2 cores are being used.

Again, we need to know which CPU backend is in use to solve the issue.
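
A minimal sketch of the suggestion above: export the three thread-count variables in the shell that will launch the server. The choice of half the online cores is only an assumption that mirrors the Ollama default mentioned earlier in this thread, and whether the default backend actually honors RAYON_NUM_THREADS is the commenter's recollection, not something confirmed here.

```shell
# Assumption: use half of the online logical cores (mirrors the Ollama
# default mentioned in this thread); pick any value that suits the machine.
cores=$(getconf _NPROCESSORS_ONLN)
threads=$(( cores / 2 ))
if [ "$threads" -lt 1 ]; then threads=1; fi

# Export so that processes started from this shell inherit the values.
export RAYON_NUM_THREADS="$threads"   # read by Rayon's global thread pool
export OMP_NUM_THREADS="$threads"     # read by OpenMP runtimes (mkl backend)
export MKL_NUM_THREADS="$threads"     # read by MKL (mkl backend)

echo "cores=$cores threads=$threads"

# Then launch the server from the same shell (command taken from this thread):
# target/release/mistralrs-server -i plain -m microsoft/Phi-3-mini-128k-instruct -a phi3
```

Setting all three at once is just a convenience for testing; once the backend in use is known, only the matching variable should matter.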

@lij55
Author

lij55 commented Sep 3, 2024

It is the default backend. I will try the OMP_NUM_THREADS environment variable; I hadn't noticed it. Thanks in advance!

@EricLBuehler
Owner

@lij55 did this work?

@lij55
Author

lij55 commented Sep 19, 2024

Sorry for the late reply. I tried both OMP_NUM_THREADS and MKL_NUM_THREADS, but neither had any effect. It still uses 2 cores.
Is it related to the model? I used phi3 via target/release/mistralrs-server -i plain -m microsoft/Phi-3-mini-128k-instruct -a phi3

Now I'm using ollama.
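
One thing that may be worth ruling out when an environment variable appears to have no effect is whether it reached the child process at all. The report above mentions OMP_NUM_THREADS and MKL_NUM_THREADS only, so RAYON_NUM_THREADS is used here purely as an illustration, and the value 16 is an arbitrary assumption. The sketch shows that a plain shell assignment is not inherited by a child process, while an exported one is.

```shell
# Start from a clean state so an earlier export does not mask the effect.
unset RAYON_NUM_THREADS

# Plain assignment: visible in this shell only.
RAYON_NUM_THREADS=16
not_exported=$(sh -c 'echo "${RAYON_NUM_THREADS:-unset}"')

# After export, child processes inherit the variable.
export RAYON_NUM_THREADS
exported=$(sh -c 'echo "${RAYON_NUM_THREADS:-unset}"')

echo "without export: $not_exported"
echo "with export:    $exported"
```

Prefixing the variable on the command line (VAR=value followed by the command) has the same effect as exporting it for that single invocation.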
