llama : refactor model loading code #1991

Closed
ggerganov opened this issue Jun 25, 2023 · 3 comments
Labels: good first issue, refactoring

Comments

@ggerganov
Owner

In llama.cpp we have logic for supporting some very old model formats and features, such as sharded models, which makes the code unnecessarily complicated and difficult to maintain. We should simplify it and remove support for old formats and features that are no longer used.

Additionally, with the upcoming unified file format (ggerganov/ggml#220), we will have to reimplement the loading code to use it and add support for loading non-LLaMA models as well. This will be an important step towards supporting inference of new models such as MPT and Falcon. Simplifying the logic as much as possible now will therefore make it easier to adopt the new unified file format when it is ready.

@ggerganov added the good first issue and refactoring labels on Jun 25, 2023
@hetiejun

A gentler solution might be to provide a tool that converts old formats, and even other formats, into the new format, I suppose.

@howard0su
Collaborator

Remove shards support. #2000

@ggerganov
Owner Author

Closed via #2398

This was referenced Oct 28, 2023
Projects
Status: Done