Releases · kp-forks/llama.cpp

24 Dec 16:55

30caac3

b4388

llama : the WPM vocabs use the CLS token as BOS (#10930)

* llama : the WPM vocabs use the CLS token as BOS

ggml-ci

* llama : add comment
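
To illustrate the change named in this release title, here is a minimal sketch of choosing the beginning-of-sequence token by vocab type. The `Vocab` class, its fields, and `bos_token_id` are hypothetical names for illustration only and are not the llama.cpp API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vocab:
    """Hypothetical stand-in for a tokenizer vocab; not the llama.cpp struct."""
    kind: str                      # e.g. "wpm" (WordPiece), "spm", "bpe"
    bos_id: Optional[int] = None   # beginning-of-sequence token id, if any
    cls_id: Optional[int] = None   # classification token id, if any


def bos_token_id(vocab: Vocab) -> Optional[int]:
    """Return the token id to treat as BOS for this vocab."""
    # WPM vocabs start sequences with the CLS token, so it doubles as BOS.
    if vocab.kind == "wpm":
        return vocab.cls_id
    # All other vocab kinds keep their own BOS token.
    return vocab.bos_id
```

With this sketch, a WPM vocab that defines only a CLS token still yields a usable BOS id, while other vocab kinds are unaffected.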

24 Dec 05:55

github-actions

b4387

60cfa72

ggml : use wstring for backend search paths (#10960)

ggml-ci

23 Dec 21:49

github-actions

b4385

32d6ee6

ggml : fix const usage in SSE path (#10962)

23 Dec 14:26

github-actions

b4384

14b699e

server : fix missing model id in /model endpoint (#10957)

* server : fix missing model id in /model endpoint

* fix ci

22 Dec 13:41

github-actions

b4375

ebdee94

vulkan: build fixes for 32b (#10927)

* vulkan: build fixes for 32b

Should fix #10923

* vulkan: initialize some buffer/offset variables

21 Dec 05:35

github-actions

b4372

e34c5af

ggml-cpu: replace NEON asm with intrinsics in ggml_gemv_q4_0_4x8_q8_0…

20 Dec 21:58

github-actions

b4371

eb5c3dc

SYCL: Migrate away from deprecated ggml_tensor->backend (#10840)

* Migrate to tensor->buffer for checking backend buffer type: 1

* SYCL: common.cpp try to migrate away from tensor->backend

* SYCL: fix assertions and add proper comments

* SYCL: remove extra space

* SYCL: Add back static to ggml_backend_buffer_is_sycl_split function

* SYCL: Add pragma directive to suppress warning spam

* SYCL: Integrate debug logs with GGML_LOG and other fixes

* Revert "SYCL: Integrate debug logs with GGML_LOG and other fixes"

This reverts commit 2607b7de0f0d2f4f1f690226f86fa861aa39cb97.
Let's keep the current SYCL specific logging mechanism for now

* SYCL: Use GGML_SYCL_DEBUG after reverting

* SYCL: reg_get_proc_address func, update to the current func signature

* SYCL: Refactor SYCL buffer checks in ggml_sycl_cpy_tensor_2d

20 Dec 13:48

github-actions

b4369

21ae3b9

ggml : add test for SVE and disable when it fails (#10906)

19 Dec 21:47

github-actions

b4367

d408bb9

clip : disable GPU support (#10896)

ggml-ci

19 Dec 13:34

github-actions

b4361

7585edb

convert : Add support for Microsoft Phi-4 model (#10817)

* convert : use GPT2 vocab for Phi-4 model

* convert : use null value of sliding_window to distinguish Phi-4 from other PHI3-based models

* llama : do not use sliding window attention mask for Phi-4 model

---------

Co-authored-by: Stanisław Szymczyk
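
The detection rule described in the commit list above can be sketched roughly as follows. This is an assumption-laden illustration rather than the converter's actual code: `uses_sliding_window` and `pick_vocab` are hypothetical helpers, `hparams` stands for a model config loaded as a dictionary, and `"default"` is a placeholder for whatever vocab the other PHI3-based models use.

```python
from typing import Any, Dict


def uses_sliding_window(hparams: Dict[str, Any]) -> bool:
    """True when the config declares a non-null sliding window size."""
    # A missing key is treated the same as an explicit null here (assumption).
    return hparams.get("sliding_window") is not None


def pick_vocab(hparams: Dict[str, Any]) -> str:
    """Choose a vocab label from the config (hypothetical helper)."""
    # Assumption: a null sliding_window marks Phi-4, which takes the GPT2 vocab.
    if not uses_sliding_window(hparams):
        return "gpt2"
    # Placeholder label for other PHI3-based models.
    return "default"
```

The same predicate could also decide whether to build a sliding window attention mask, which the last commit in the list disables for Phi-4.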
