With PyTorch and the Hugging Face Transformers library, I can get the KV cache like this:
# model and input_ids come from the usual AutoModelForCausalLM / AutoTokenizer setup
generated = model.generate(input_ids, max_new_tokens=1, return_dict_in_generate=True)
kv = generated['past_key_values']
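
(For context: depending on the transformers version, past_key_values is either a legacy tuple or a Cache object, but both support per-layer indexing, so you can inspect it like this:)

# one entry per layer; each entry is a (key, value) pair of tensors
# shaped (batch, n_heads, seq_len, head_dim)
print(len(kv))
key, value = kv[0]
print(key.shape, value.shape)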
How can I get the corresponding KV cache under llama.cpp?
I'd really appreciate any answers, thanks a million!
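
As far as I know, llama.cpp does not expose the KV cache as per-layer tensors through its public API; the closest equivalent is serializing the whole context state, which includes the KV cache. Here is a minimal sketch using the llama-cpp-python bindings; the model path and prompt are placeholders:

from llama_cpp import Llama

llm = Llama(model_path="model.gguf")  # placeholder path to a local GGUF model

# evaluate a prompt so the context's KV cache is populated
llm.eval(llm.tokenize(b"Hello, world"))

# save_state() serializes the whole context, KV cache included;
# llama.cpp treats the cache as an opaque blob, not per-layer tensors
state = llm.save_state()

Restoring with llm.load_state(state) puts the cache back so generation can continue without re-evaluating the prompt; on the C API side, the state functions in llama.h (e.g. llama_state_get_data / llama_state_save_file in recent versions) serve the same purpose.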
Replies: 1 comment

- I want to know also.