Request: DeepSeek Coder V2 and Coder Lite V2 #63
2 comments · 7 replies
-
I'm glad to hear that you like this project! For now, 1m doesn't support multi-GPU (orz). We're considering adding multi-GPU support to 1m in the next release. Regarding the issue with the coder, could you let us know how you ran DeepSeek Coder V2? Please share the run command and the optimised YAML file you used. Keep in mind that the Lite version has only 28 layers, while the Coder version has 60 layers, which is important when writing your optimised YAML.
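For example, here is a minimal, hypothetical sketch of the kind of layer-matching rule this affects, loosely modeled on the example optimize rule files in the repo; the regex range has to cover the model's actual depth, and the exact class/kwargs should be checked against the YAMLs shipped with ktransformers:

```yaml
# Sketch only: pin decoder layers to a device by matching the layer index.
# The index range in the regex must match the model's depth.

# Lite (layers 0-27):
- match:
    name: "^model\\.layers\\.([0-9]|1[0-9]|2[0-7])\\."
  replace:
    class: "default"            # assumed generic placement rule, as in the example files
    kwargs:
      generate_device: "cuda:0"
      prefill_device: "cuda:0"

# Coder (layers 0-59) needs the wider range instead, e.g.:
#   name: "^model\\.layers\\.([0-9]|[1-5][0-9])\\."
```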
-
Thank you. That was an oversight on my part. It definitely does work with the Coder models, and it works great. I'll be looking at whether I can modify the YAML to better optimize the larger DeepSeek model for my particular environment. Once the base functionality is there, doing that optimization automatically, without human direction, seems like it would be a great addition if it's possible. This is the command I ran:

```
python -m ktransformers.local_chat --model_path deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct --gguf_path /mnt/models/sv-llm:deepseek-coder-v2:16b-lite-instruct-q8_0
```
-
I tried this with DeepSeek Lite V2 and the resource improvement was really great. Of course I tried the same optimization files on the Coder models, but that failed. This looks really promising for the usability of MoE models.
I'd really like to request CPU, 1-GPU, and 2-GPU (24 GB per GPU) versions of these that support their full 128K context. Even with your examples I'm not sure I'd get this right.
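As a rough starting point for the 2-GPU case, here is a hypothetical sketch of a layer split for the 60-layer Coder model (layers 0-29 on cuda:0, 30-59 on cuda:1, MoE experts on CPU). It is modeled on the example rule files in the ktransformers repo; the class paths and kwargs are assumptions, and it says nothing about whether a 128K context actually fits in 2x24 GB:

```yaml
# Hypothetical 2-GPU split for a 60-layer DeepSeek-Coder-V2 model.
# Verify class names and kwargs against the rule files shipped with
# your ktransformers version before using.

# Layers 0-29 on the first GPU
- match:
    name: "^model\\.layers\\.([0-9]|[12][0-9])\\."
  replace:
    class: "default"
    kwargs:
      generate_device: "cuda:0"
      prefill_device: "cuda:0"

# Layers 30-59 on the second GPU
- match:
    name: "^model\\.layers\\.([3-5][0-9])\\."
  replace:
    class: "default"
    kwargs:
      generate_device: "cuda:1"
      prefill_device: "cuda:1"

# Keep the routed experts' weights in CPU RAM (the memory saving that
# makes these MoE models usable at all)
- match:
    name: "^model\\.layers\\..*\\.mlp\\.experts$"
  replace:
    class: ktransformers.operators.experts.KTransformersExperts
    kwargs:
      generate_device: "cpu"
      generate_op: "KExpertsCPU"
      out_device: "cuda"
  recursive: False
```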