
Support Falcon #293

Open
zcourts opened this issue Jun 2, 2023 · 10 comments · Fixed by #313
Labels
topic:model-support Support for new models

Comments

@zcourts commented Jun 2, 2023

Similar to MPT, Falcon is Apache licensed, weights and all!

  1. https://huggingface.co/tiiuae/falcon-40b
  2. https://huggingface.co/tiiuae/falcon-40b-instruct

And according to the Hugging Face leaderboard, it outperforms all current open-source models, including MPT.

It seems a GGML conversion of the model is a necessary precursor to having it included (rough sketch of what that involves below).

I don't think I have the expertise to do this, but we may be able to help (e.g. we can give access to a V100S to do the conversion).
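
For anyone wondering what the conversion step actually involves, here is a minimal sketch of its general shape (not this project's actual tooling; real GGML converters such as ggml's convert-h5-to-ggml.py also serialize the vocabulary and per-architecture hyperparameters, and support quantized tensor types):

```python
# A minimal, illustrative sketch of dumping HF weights into a
# GGML-style binary: a magic number, then name/shape/type-tagged tensors.
# Real converters also write the vocabulary and a per-architecture
# hyperparameter block, and loading falcon-40b like this needs
# hundreds of GB of RAM, so treat this as illustration only.
import struct
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-40b",
    trust_remote_code=True,  # Falcon shipped custom modeling code at the time
)

with open("falcon-40b-f32.bin", "wb") as f:
    f.write(struct.pack("<i", 0x67676D6C))  # magic used by legacy ggml files
    # ... hyperparameters and vocabulary would be written here ...
    for name, tensor in model.state_dict().items():
        data = tensor.float().cpu().numpy()
        name_bytes = name.encode("utf-8")
        # n_dims, name length, ftype (0 = f32)
        f.write(struct.pack("<iii", data.ndim, len(name_bytes), 0))
        for dim in reversed(data.shape):  # ggml stores dims innermost-first
            f.write(struct.pack("<i", dim))
        f.write(name_bytes)
        data.tofile(f)
```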

LLukas22 added the topic:model-support (Support for new models) label on Jun 2, 2023
@LLukas22 (Contributor) commented Jun 2, 2023

Already on it. I got it converted and quantized, but it produced gibberish. I'm waiting on ggerganov/llama.cpp#1602 to see how they will handle the Q, K, V weights. I don't want to create two separate falcon-ggml ecosystems, so I'm waiting for the upstream ggml implementation.
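
For context on why the Q, K, V weights are the sticking point: Falcon fuses them into a single query_key_value matrix, and on the 40B variant the rows are interleaved per KV group ([q_0 .. q_15, k, v] repeated for each of the 8 groups) rather than laid out as contiguous [all Q | all K | all V] blocks. A rough numpy sketch of the de-interleave, with shapes taken from the falcon-40b config (hidden size 8192, 128 query heads, 8 KV heads); the layout here is my reading of the HF modeling code, not gospel:

```python
# Rough sketch of splitting Falcon-40B's fused query_key_value weight.
# Layout assumption (from my reading of the HF modeling code): rows are
# grouped as [16 query heads, 1 key head, 1 value head] per KV group.
import numpy as np

hidden, n_head, n_head_kv = 8192, 128, 8
head_dim = hidden // n_head        # 64
q_per_kv = n_head // n_head_kv     # 16 query heads per KV group

# Fused weight: ((q_per_kv + 2) * n_head_kv * head_dim, hidden) = (9216, 8192)
fused = np.random.randn((q_per_kv + 2) * n_head_kv * head_dim, hidden)

grouped = fused.reshape(n_head_kv, q_per_kv + 2, head_dim, hidden)
q = grouped[:, :q_per_kv].reshape(-1, hidden)     # (8192, 8192)
k = grouped[:, q_per_kv].reshape(-1, hidden)      # (512, 8192)
v = grouped[:, q_per_kv + 1].reshape(-1, hidden)  # (512, 8192)
```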

@zcourts (Author) commented Jun 2, 2023

There's an ongoing discussion worth tracking for the GGML conversion: ggerganov/llama.cpp#1602

Found after posting this: an attempt at the conversion has already been made, see ggerganov/llama.cpp#1602 (comment).

@zcourts (Author) commented Jun 2, 2023

Looks like our posts overlapped! Great to hear. I've offered to provide GPU access to further the work being done in ggerganov/llama.cpp#1602 and will follow up as that progresses.

@KerfuffleV2 (Contributor)

There is now a working GGML example for 40B: ggerganov/ggml#231

@LLukas22 (Contributor)

That's great! Maybe I'll create a draft, but I would like to wait until it gets merged into ggml.

@iHaagcom

A working one is here: https://github.com/jploski/ggml/tree/falcon40b

@LLukas22 (Contributor)

Yeah, I noticed that. It would be great if someone could try porting it to Rust. I'm currently quite busy implementing GPU acceleration for all architectures.😬

@philpax (Collaborator) commented Jun 28, 2023

Damn, was hoping editing the description would cancel out the issue-closing.

Anyhow - I've merged in the Falcon 7B implementation, but it doesn't handle 40B, and it requires 32-bit memory tensors as the repeat operation it uses doesn't work with 16-bit tensors. Because of these caveats - and the continuing work on (one of) the original implementations in https://github.com/cmp-nct/ggllm.cpp - I've decided to merge it in, but disable it by default.
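
To put the 32-bit memory-tensor caveat in numbers, here's a back-of-the-envelope KV cache calculation; shapes are assumed from the falcon-7b config (32 layers, multi-query attention with a single 64-dim KV head, 2048-token context), so the figures are illustrative:

```python
# Back-of-the-envelope KV cache sizing for falcon-7b, comparing the
# f16 memory tensors we'd like with the f32 ones the repeat op forces.
# Assumed shapes: 32 layers, one 64-dim KV head (multi-query), 2048 ctx.
n_layer, head_dim, n_ctx = 32, 64, 2048
elems = 2 * n_layer * n_ctx * head_dim  # K and V, per layer, per token

for dtype, bytes_per_elem in [("f16", 2), ("f32", 4)]:
    print(f"{dtype}: {elems * bytes_per_elem / 1024**2:.0f} MiB")
# f16: 16 MiB, f32: 32 MiB -- multi-query attention keeps either small
```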

I'll keep this issue open until Falcon is truly ready to fly.

@philpax (Collaborator) commented Jul 27, 2023

@LLukas22 should we close this or wait until the model format has stabilised?

@LLukas22 (Contributor)

We should wait until GGUF is implemented and we have all the necessary fields in the model file.
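
For reference, a rough sketch of what "the necessary fields" could look like once GGUF lands: a typed key/value metadata block right after the header. The key names follow the general.*/falcon.* convention from the GGUF draft spec, the concrete values are from the falcon-7b config, and the type ids are my reading of the spec, so treat all details as illustrative:

```python
# Hedged sketch of a GGUF header plus metadata KV block for Falcon.
# Values assume falcon-7b (hidden 4544, 32 blocks, 71 heads, MQA).
import struct

GGUF_MAGIC = b"GGUF"
T_UINT32, T_STRING = 4, 8  # GGUF metadata value-type ids (per draft spec)

def write_string(f, s: str):
    data = s.encode("utf-8")
    f.write(struct.pack("<Q", len(data)))
    f.write(data)

def write_kv_u32(f, key: str, value: int):
    write_string(f, key)
    f.write(struct.pack("<I", T_UINT32))
    f.write(struct.pack("<I", value))

def write_kv_str(f, key: str, value: str):
    write_string(f, key)
    f.write(struct.pack("<I", T_STRING))
    write_string(f, value)

metadata = {
    "general.architecture": "falcon",
    "falcon.embedding_length": 4544,
    "falcon.block_count": 32,
    "falcon.attention.head_count": 71,
    "falcon.attention.head_count_kv": 1,  # multi-query: one KV head on 7B
}

with open("falcon-7b-metadata-only.gguf", "wb") as f:
    f.write(GGUF_MAGIC)
    f.write(struct.pack("<I", 2))              # GGUF version 2
    f.write(struct.pack("<Q", 0))              # tensor count (none in sketch)
    f.write(struct.pack("<Q", len(metadata)))  # metadata KV count
    for key, value in metadata.items():
        if isinstance(value, str):
            write_kv_str(f, key, value)
        else:
            write_kv_u32(f, key, value)
```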
