Reading papers may help you understand the mathematical theory, but it likely won't help you understand the infrastructure here or the focus on performance optimization. I recommend building main.exe in debug mode and stepping through the call flow with gdb. There was also a discussion where someone shared their learnings.
-
If you are looking to contribute, the best way to start is by actually using it as a "daily driver" tool. While "Attention Is All You Need" is a good start, you will find that it's very hard to apply its ideas directly to the code. There are many other parts of the project that you need to understand, for example the backend (CPU/GPU) implementations, the KV cache, the GGUF format, ... For that reason, it's better to just use it, maybe also read all the ... And remember, only make contributions when they are good for the community, not just good for you. This video pretty much sums that up: https://www.youtube.com/watch?v=5nY_cy8zcO4
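To make the gap between the paper and the code a bit more concrete, here is a minimal pure-Python sketch of the KV cache idea mentioned above. It is only an illustration under simplifying assumptions (one attention head, no learned projections, toy vectors) and is not taken from the project's source; the names `attend`, `KVCache`, and `causal_attention_full` are hypothetical. It shows that appending each new token's key and value to a cache gives the same result as recomputing causal attention from scratch at every step.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(q, keys, values):
    """Scaled dot-product attention for one query over the given keys/values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]


class KVCache:
    """Toy cache (hypothetical): keeps the key and value of every token seen so far."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, q, k, v):
        # Store the new token's key/value, then attend over everything cached.
        self.keys.append(k)
        self.values.append(v)
        return attend(q, self.keys, self.values)


def causal_attention_full(qs, ks, vs):
    """Reference version: recompute attention for each position from scratch."""
    return [attend(qs[t], ks[: t + 1], vs[: t + 1]) for t in range(len(qs))]


# Toy per-token query/key/value vectors (made-up numbers).
qs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ks = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
vs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

cache = KVCache()
incremental = [cache.step(q, k, v) for q, k, v in zip(qs, ks, vs)]
full = causal_attention_full(qs, ks, vs)
```

A real implementation also has to deal with multiple heads and layers, memory layout, quantization, and backend-specific kernels, which is why stepping through the actual code is more instructive than the paper alone.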
-
Hi, I am looking to contribute to this project. I am just getting started, and I am looking for papers that explain the logic and concepts behind this open source project. Could you share anything and everything that I should look into before I jump in?
I know that "Attention Is All You Need" is a good start.
Thank you in advance.