Support Falcon #293
Comments
Already on it. I got it converted and quantized, but it produced gibberish. I'm waiting on ggerganov/llama.cpp#1602 to see how they will handle the Q, K, V weights. I don't want to create two separate falcon-ggml ecosystems, so I'm waiting for the upstream ggml implementation.
There is an ongoing discussion about GGML conversion worth tracking: ggerganov/llama.cpp#1602 (found after posting this here). An attempt to convert has been made: ggerganov/llama.cpp#1602 (comment)
Looks like our posts overlapped! Great to hear. I've offered to provide GPU access to further the work being done in ggerganov/llama.cpp#1602 and will follow up as that progresses.
There is now a working GGML example for 40B: ggerganov/ggml#231
That's great! Maybe I will create a draft, but I would like to wait until it gets merged into ggml.
Working one here: https://github.com/jploski/ggml/tree/falcon40b
Yeah, I noticed that. It would be great if someone could try porting it to Rust. I'm currently quite busy implementing GPU acceleration for all architectures. 😬
Damn, I was hoping editing the description would cancel out the issue-closing. Anyhow, I've merged in the Falcon 7B implementation, but it doesn't handle 40B, and it requires 32-bit memory tensors as the [...] I'll keep this issue open until Falcon is truly ready to fly.
@LLukas22 should we close this or wait until the model format has stabilised?
We should wait until GGUF is implemented and we have all the necessary fields in the model file.
Similar to MPT, Falcon is Apache licensed, weights and all!
And according to the HuggingFace leaderboard, it outperforms all current open-source models, including MPT.
It seems that a GGML conversion of the model is a necessary precursor to having it included.
I don't think I have the expertise to do this, but we may be able to help (e.g. we can give access to a V100S or V100S to do the conversion).