Releases: kp-forks/llama.cpp

b4388

24 Dec 16:55
30caac3
llama : the WPM vocabs use the CLS token as BOS (#10930)

* llama : the WPM vocabs use the CLS token as BOS

ggml-ci

* llama : add comment
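The change above means that for WPM (BERT-style) vocabularies, which have no dedicated BOS token, the CLS token is reused as BOS. A minimal sketch of that fallback, assuming illustrative token IDs and a hypothetical helper (not llama.cpp's actual API):

```python
# Hedged sketch: WPM (BERT-style) vocabs start sequences with [CLS] rather
# than a dedicated BOS token, so CLS doubles as BOS. IDs are illustrative.
def bos_token_id(vocab_type: str, special: dict) -> int:
    """Return the token ID to prepend at the start of a sequence."""
    if vocab_type == "wpm":
        # No BOS in WPM vocabs; fall back to the CLS token.
        return special["cls"]
    return special["bos"]

wpm_special = {"cls": 101, "sep": 102}  # BERT-style special-token IDs
bpe_special = {"bos": 1, "eos": 2}

print(bos_token_id("wpm", wpm_special))  # 101
print(bos_token_id("bpe", bpe_special))  # 1
```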

b4387

24 Dec 05:55
60cfa72
ggml : use wstring for backend search paths (#10960)

ggml-ci

b4385

23 Dec 21:49
32d6ee6
ggml : fix const usage in SSE path (#10962)

b4384

23 Dec 14:26
14b699e
server : fix missing model id in /model endpoint (#10957)

* server : fix missing model id in /model endpoint

* fix ci

b4375

22 Dec 13:41
ebdee94
vulkan: build fixes for 32b (#10927)

* vulkan: build fixes for 32b

Should fix #10923

* vulkan: initialize some buffer/offset variables

b4372

21 Dec 05:35
e34c5af
ggml-cpu: replace NEON asm with intrinsics in ggml_gemv_q4_0_4x8_q8_0…

b4371

20 Dec 21:58
eb5c3dc
SYCL: Migrate away from deprecated ggml_tensor->backend (#10840)

* Migrate to tensor->buffer for checking backend buffer type: 1

* SYCL: common.cpp try to migrate away from tensor->backend

* SYCL: fix assertions and add proper comments

* SYCL: remove extra space

* SYCL: Add back static to ggml_backend_buffer_is_sycl_split function

* SYCL: Add pragma directive to suppress warning spam

* SYCL: Integrate debug logs with GGML_LOG and other fixes

* Revert "SYCL: Integrate debug logs with GGML_LOG and other fixes"

This reverts commit 2607b7de0f0d2f4f1f690226f86fa861aa39cb97.
Let's keep the current SYCL-specific logging mechanism for now

* SYCL: Use GGML_SYCL_DEBUG after reverting

* SYCL: update reg_get_proc_address to the current function signature

* SYCL: Refactor SYCL buffer checks in ggml_sycl_cpy_tensor_2d

b4369

20 Dec 13:48
21ae3b9
ggml : add test for SVE and disable when it fails (#10906)
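The entry above describes probing an optional CPU feature (SVE) with a self-test and disabling it when the test fails. A generic sketch of that probe-and-disable pattern under assumed names (not the actual ggml code; `fast_dot` stands in for an SVE kernel):

```python
# Hedged sketch of probe-and-disable: run a tiny self-test of an optional
# fast path at startup and fall back to the portable path on any failure.
def fast_dot(a, b):
    # Stand-in for an SVE-accelerated kernel; here it simply mirrors the
    # reference implementation. A real probe would exercise the SIMD path.
    return sum(x * y for x, y in zip(a, b))

def ref_dot(a, b):
    # Portable reference implementation.
    return sum(x * y for x, y in zip(a, b))

def probe_fast_path() -> bool:
    """Return True only if the fast kernel matches the reference result."""
    a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
    try:
        return abs(fast_dot(a, b) - ref_dot(a, b)) < 1e-9
    except Exception:
        return False  # crash or wrong result: disable the feature

# Select the kernel once, at startup.
dot = fast_dot if probe_fast_path() else ref_dot
print(dot([1.0, 2.0], [3.0, 4.0]))  # 11.0
```

Doing the probe once at startup keeps the per-call dispatch cost at zero while still protecting against broken hardware or toolchain paths.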

b4367

19 Dec 21:47
d408bb9
clip : disable GPU support (#10896)

ggml-ci

b4361

19 Dec 13:34
7585edb
convert : Add support for Microsoft Phi-4 model (#10817)

* convert : use GPT2 vocab for Phi-4 model

* convert : use null value of sliding_window to distinguish Phi-4 from other Phi-3-based models

* llama : do not use sliding window attention mask for Phi-4 model

---------

Co-authored-by: Stanisław Szymczyk <[email protected]>
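The second bullet above distinguishes Phi-4 from other Phi-3-based checkpoints by whether `sliding_window` is null in the model's config. A hedged sketch of that check (the helper and the sample window value are illustrative, not the converter's actual code):

```python
# Hedged sketch: Phi-4 reuses the Phi-3 architecture name but ships with
# "sliding_window": null in its config, so that field can serve as the
# discriminator. The helper below is illustrative only.
def is_phi4(config: dict) -> bool:
    """Heuristic: Phi-3 architecture with no sliding window => Phi-4."""
    return (config.get("architectures") == ["Phi3ForCausalLM"]
            and config.get("sliding_window") is None)

phi3_cfg = {"architectures": ["Phi3ForCausalLM"], "sliding_window": 2047}
phi4_cfg = {"architectures": ["Phi3ForCausalLM"], "sliding_window": None}

print(is_phi4(phi3_cfg))  # False
print(is_phi4(phi4_cfg))  # True
```

This matches the third bullet as well: when the window is null, a sliding-window attention mask should not be applied.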