Releases: kvcache-ai/ktransformers

v0.1.4

30 Aug 13:52
022b893

Bug fixes

  1. Fix a bug where ktransformers could not offload a whole layer to the CPU.
  2. Update DeepseekV2's multi-GPU YAML examples to allocate layers evenly.
  3. Update the Dockerfile.
  4. Fix a bug where Qwen2-57B could not be loaded.
  5. Fix #66 by adding uvicorn to the requirements.

v0.1.3

29 Aug 01:36
233bbb8
  1. Support InternLM2.5 with a 1M-token prompt under 24 GB VRAM and 150 GB DRAM (local_chat only).
  2. Decrease DeepseekV2's required VRAM from 20 GB to 10 GB.
  3. Fix bugs reported in #51, #52, and #56.

v0.1.2

15 Aug 17:39
77a34c2
  1. Support native Windows (#4).
  2. Support multiple GPUs (#8).
  3. Support llamafile as a linear backend.
  4. Support new models: Mixtral 8x7B and 8x22B.
  5. Support Q2_K, Q3_K, and Q5_K dequantization on the GPU (#16).
  6. Add a GitHub Action to build precompiled packages.
  7. Support shared memory across different operators.
  8. Fix some bugs when building from source (#23).

v0.1.1

01 Aug 04:40
5e83bc0
  1. Provide precompiled wheel packages for multiple CPU architectures.
  2. Build the precompiled wheel packages for multiple CUDA architectures via TORCH_CUDA_ARCH_LIST, e.g. "8.0;8.6;8.7;8.9" (see the sketch after this list).
  3. Test and support Python 3.10.
  4. Add a Dockerfile to build the Docker image.
  5. Update README.md with Docker instructions (in progress: upload the Docker image).
  6. Bump the version to 0.1.1.
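
Item 2 means the prebuilt wheels target several CUDA compute capabilities in one build. A minimal sketch, assuming it is run from the repository root with pip, CUDA, and a build toolchain available, of setting the same TORCH_CUDA_ARCH_LIST when building from source:

```python
import os
import subprocess
import sys

# Target several CUDA compute capabilities at once, matching the list used
# for the prebuilt wheels (8.0 = A100, 8.6 = RTX 30xx, 8.7 = Orin, 8.9 = RTX 40xx).
env = dict(os.environ, TORCH_CUDA_ARCH_LIST="8.0;8.6;8.7;8.9")

# Build and install ktransformers from the current checkout.
subprocess.check_call([sys.executable, "-m", "pip", "install", "."], env=env)
```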

0.1.0

29 Jul 13:19
2562082
  1. Complete the package metadata for submission to PyPI.
  2. Dynamically detect the client's environment at install time: if a matching precompiled package is available, download and install it (approach adapted from flash-attn); see the sketch after this list.
  3. Update the installation instructions in the README.
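
For context on item 2, a minimal sketch of that kind of environment detection, in the spirit of flash-attn's setup.py; the wheel URL pattern and helper names below are illustrative assumptions, not the project's actual release layout:

```python
import platform
import sys
import urllib.error
import urllib.request

import torch

# NOTE: hypothetical URL pattern for illustration only; the real release
# assets follow the project's own naming scheme.
WHEEL_URL_TEMPLATE = (
    "https://github.com/kvcache-ai/ktransformers/releases/download/"
    "v0.1.0/ktransformers-0.1.0+cu{cuda}torch{torch}-{py}-{py}-{plat}.whl"
)


def guess_prebuilt_wheel_url():
    """Build a candidate wheel URL from the client's Python/torch/CUDA setup."""
    if torch.version.cuda is None:
        return None  # CPU-only torch: fall back to building from source
    cuda = torch.version.cuda.replace(".", "")
    torch_ver = torch.__version__.split("+")[0]
    py = f"cp{sys.version_info.major}{sys.version_info.minor}"
    plat = "win_amd64" if platform.system() == "Windows" else "linux_x86_64"
    return WHEEL_URL_TEMPLATE.format(cuda=cuda, torch=torch_ver, py=py, plat=plat)


def prebuilt_wheel_exists(url):
    """HEAD-check the candidate wheel so only an existing asset gets downloaded."""
    try:
        with urllib.request.urlopen(urllib.request.Request(url, method="HEAD")):
            return True
    except (urllib.error.HTTPError, urllib.error.URLError):
        return False


if __name__ == "__main__":
    url = guess_prebuilt_wheel_url()
    if url and prebuilt_wheel_exists(url):
        print(f"Would download and install the prebuilt wheel: {url}")
    else:
        print("No matching prebuilt wheel; would build from source instead.")
```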