## Describe the bug
Running on a MacBook (M2 Pro) with the Metal backend is very slow, and it becomes much slower when the prompt is even slightly more complex.
It can even appear to freeze: GPU usage stays high, but no completion result is returned.
Model: https://huggingface.co/cleatherbury/Phi-3-mini-128k-instruct-Q4_K_M-GGUF
```
[2024-09-15 09:11:00.088] [info] Sampling method: penalties -> temperature -> topk -> topp -> minp -> multinomial
[2024-09-15 09:11:00.088] [info] Model kind is: quantized from gguf (no adapters)
general.architecture: phi3
general.basename: Phi-3
general.file_type: 7
general.finetune: 128k-instruct
general.languages: en
general.license: mit
general.license.link: https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/resolve/main/LICENSE
general.name: Phi 3 Mini 128k Instruct
general.quantization_version: 2
general.size_label: mini
general.tags: nlp, code, text-generation
general.type: model
phi3.attention.head_count: 32
phi3.attention.head_count_kv: 32
phi3.attention.layer_norm_rms_epsilon: 0.00001
phi3.attention.sliding_window: 262144
phi3.block_count: 32
phi3.context_length: 131072
phi3.embedding_length: 3072
phi3.feed_forward_length: 8192
phi3.rope.dimension_count: 96
phi3.rope.freq_base: 10000
phi3.rope.scaling.attn_factor: 1.1902381
phi3.rope.scaling.original_context_length: 4096
quantize.imatrix.chunks_count: 151
quantize.imatrix.dataset: /training_dir/calibration_datav3.txt
quantize.imatrix.entries_count: 128
quantize.imatrix.file: /models_out/Phi-3.1-mini-128k-instruct-GGUF/Phi-3.1-mini-128k-instruct.imatrix
[2024-09-15 09:11:01.563] [info] Model loaded.
[2024-09-15 09:11:01.564] [info] Serving on http://localhost:19161.
[2024-09-15 09:11:29.590] [info] completions request parse...{"model":"default","prompt":"who are you","best_of":1,"echo":false,"presence_penalty":null,"frequency_penalty":null,"logit_bias":null,"logprobs":null,"max_tokens":null,"n":1,"stop":null,"stream":null,"temperature":null,"top_p":null,"suffix":null,"user":null,"tools":null,"tool_choice":null,"top_k":null,"grammar":null,"adapters":null,"min_p":null,"dry_multiplier":null,"dry_base":null,"dry_allowed_length":null,"dry_sequence_breakers":null}
```
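The logged request was sent with `"max_tokens": null`. To help tell slow token throughput apart from an unbounded generation, the request can be replayed with an explicit token cap and timed. The sketch below is only a reproduction aid, not a diagnosis. The route `/v1/completions` and the cap of 64 tokens are assumptions (neither appears in the log); the host, port, model name, and prompt are taken from the log above.

```python
import json
import time
import urllib.request

# Host and port come from the "Serving on" log line.
BASE_URL = "http://localhost:19161"
# Assumption: an OpenAI-style completions route. The log does not show the
# actual path, so adjust this to whatever the server exposes.
COMPLETIONS_PATH = "/v1/completions"


def build_payload(prompt, max_tokens=64):
    """Same model and prompt fields as the logged request, plus a token cap.

    The default cap of 64 is an arbitrary value chosen for this sketch.
    """
    return {"model": "default", "prompt": prompt, "max_tokens": max_tokens}


def timed_completion(prompt, max_tokens=64, timeout_s=120.0):
    """POST one completion request and return (elapsed_seconds, parsed_json)."""
    body = json.dumps(build_payload(prompt, max_tokens)).encode("utf-8")
    request = urllib.request.Request(
        BASE_URL + COMPLETIONS_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.perf_counter()
    # Raises an error if the server does not answer within timeout_s.
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        result = json.loads(response.read().decode("utf-8"))
    return time.perf_counter() - start, result
```

With the server running, `elapsed, result = timed_completion("who are you")` gives the wall-clock time for a capped request. Comparing that against a longer prompt with the same cap would show whether the slowdown scales with prompt length.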
## Latest commit or version
Which commit or version you ran with.
mistralrs = { git = "https://github.com/EricLBuehler/mistral.rs.git", features = ["metal"], version = "0.3.0", rev = "ae71578" }