Releases: kvcache-ai/ktransformers

v0.1.4

30 Aug 13:52
022b893

Bug fixes

  1. Fix a bug where ktransformers could not offload a whole layer to the CPU.
  2. Update DeepseekV2's multi-GPU YAML examples to allocate layers evenly.
  3. Update the Dockerfile.
  4. Fix a bug where Qwen2-57B could not be loaded.
  5. Fix #66 by adding uvicorn to the requirements.

v0.1.3

29 Aug 01:36
233bbb8
  1. Support InternLM2.5 with a 1M-token prompt under 24 GB VRAM and 150 GB DRAM (local_chat only).
  2. Decrease DeepseekV2's required VRAM from 20 GB to 10 GB.
  3. Fix bugs reported in #51, #52, and #56.

v0.1.2

15 Aug 17:39
77a34c2
  1. Support native Windows (#4).
  2. Support multiple GPUs (#8).
  3. Support llamafile as a linear backend.
  4. Support new models: Mixtral 8x7B and 8x22B.
  5. Support Q2_K, Q3_K, and Q5_K dequantization on the GPU (#16).
  6. Add a GitHub Action to build precompiled packages.
  7. Support shared memory across different operators.
  8. Fix some bugs when building from source (#23).

v0.1.1

01 Aug 04:40
5e83bc0
  1. Provide precompiled wheel packages for multiple CPU architectures.
  2. Build the precompiled wheel packages for multiple CUDA architectures via TORCH_CUDA_ARCH_LIST, e.g. "8.0;8.6;8.7;8.9" (see the sketch after this list).
  3. Test and support Python 3.10.
  4. Add a Dockerfile to build the Docker image.
  5. Update README.md with Docker instructions (in progress: upload the Docker image).
  6. Bump the version to 0.1.1.
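
Item 2 means the prebuilt wheels target several CUDA compute capabilities in one build. A minimal sketch, assuming it is run from the repository root with pip, CUDA, and a build toolchain available, of setting the same TORCH_CUDA_ARCH_LIST when building from source:

```python
import os
import subprocess
import sys

# Target several CUDA compute capabilities at once, matching the list used
# for the prebuilt wheels (8.0 = A100, 8.6 = RTX 30xx, 8.7 = Orin, 8.9 = RTX 40xx).
env = dict(os.environ, TORCH_CUDA_ARCH_LIST="8.0;8.6;8.7;8.9")

# Build and install ktransformers from the current checkout.
subprocess.check_call([sys.executable, "-m", "pip", "install", "."], env=env)
```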

0.1.0

29 Jul 13:19
2562082
  1. Complete the package metadata for submission to PyPI.
  2. Dynamically detect the client's environment at install time: if a matching precompiled package is available, download and install it (approach adapted from flash-attn); see the sketch after this list.
  3. Update the installation instructions in the README.
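
For context on item 2, a minimal sketch of that kind of environment detection, in the spirit of flash-attn's setup.py; the wheel URL pattern and helper names below are illustrative assumptions, not the project's actual release layout:

```python
import platform
import sys
import urllib.error
import urllib.request

import torch

# NOTE: hypothetical URL pattern for illustration only; the real release
# assets follow the project's own naming scheme.
WHEEL_URL_TEMPLATE = (
    "https://github.com/kvcache-ai/ktransformers/releases/download/"
    "v0.1.0/ktransformers-0.1.0+cu{cuda}torch{torch}-{py}-{py}-{plat}.whl"
)


def guess_prebuilt_wheel_url():
    """Build a candidate wheel URL from the client's Python/torch/CUDA setup."""
    if torch.version.cuda is None:
        return None  # CPU-only torch: fall back to building from source
    cuda = torch.version.cuda.replace(".", "")
    torch_ver = torch.__version__.split("+")[0]
    py = f"cp{sys.version_info.major}{sys.version_info.minor}"
    plat = "win_amd64" if platform.system() == "Windows" else "linux_x86_64"
    return WHEEL_URL_TEMPLATE.format(cuda=cuda, torch=torch_ver, py=py, plat=plat)


def prebuilt_wheel_exists(url):
    """HEAD-check the candidate wheel so only an existing asset gets downloaded."""
    try:
        with urllib.request.urlopen(urllib.request.Request(url, method="HEAD")):
            return True
    except (urllib.error.HTTPError, urllib.error.URLError):
        return False


if __name__ == "__main__":
    url = guess_prebuilt_wheel_url()
    if url and prebuilt_wheel_exists(url):
        print(f"Would download and install the prebuilt wheel: {url}")
    else:
        print("No matching prebuilt wheel; would build from source instead.")
```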