
Add an API endpoint to load the last-used model #5516

Open
anon-contributor-0 wants to merge 1 commit into dev
Conversation

@anon-contributor-0 commented Feb 16, 2024

Adds an internal API endpoint to the OpenAI API that allows loading of the last used model.

This new endpoint would be particularly helpful for scenarios where VRAM management is necessary - a third-party application can ask text-generation-webui to vacate VRAM (e.g. with /v1/internal/model/unload), then quickly reload the model that was just active once some other task is done (for example, image generation). That technique is employed in the sd_api_pictures extension with AUTOMATIC1111's Web UI. This PR would allow other applications to perform the same technique with text-generation-webui.
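The unload-then-reload flow described above can be sketched from the client side. The endpoint paths come from this PR; the base URL, port, and the `post()` helper are assumptions for illustration only.

```python
# Sketch of a third-party client using the unload / load-last endpoints.
# Endpoint paths are from the PR; host/port and helper are assumptions.
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # assumed default API address

UNLOAD_ENDPOINT = "/v1/internal/model/unload"
LOAD_LAST_ENDPOINT = "/v1/internal/model/loadlast"

def post(endpoint, body=None):
    """Build a POST request for one of the internal endpoints."""
    data = json.dumps(body or {}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Typical flow: free VRAM, run the other task (e.g. image generation),
# then restore the previously loaded model.
unload_req = post(UNLOAD_ENDPOINT)
# ... run the VRAM-hungry task, then send the request below ...
reload_req = post(LOAD_LAST_ENDPOINT)
```

In a real client these requests would be sent with `urllib.request.urlopen()` (or any HTTP library) once the API server is running.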

POSTing to /v1/internal/model/loadlast triggers the new models.load_last_model() function.

As a bonus, this also fixes a bug in models.reload_model(): models.unload_model() sets shared.model_name to None, so reload_model() would then attempt to load None. Reading from the newly added shared.last_model_name variable instead should fix that issue.
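A minimal sketch of the fix, using a stand-in `shared` namespace; the real functions live in the project's modules and their bodies here are assumptions, not the actual implementation.

```python
# Stand-in for the project's shared state and model functions,
# illustrating the last_model_name fix described above.
from types import SimpleNamespace

shared = SimpleNamespace(model=None, model_name=None, last_model_name=None)

def load_model(name):
    """Stand-in loader: records the name of the active model."""
    shared.model = f"<model:{name}>"   # placeholder for the real loader
    shared.model_name = name
    shared.last_model_name = name      # remembered for later reloads

def unload_model():
    """Free the model; last_model_name survives so it can be restored."""
    shared.model = None
    shared.model_name = None           # this is what broke reload_model()

def reload_model():
    """Reload using last_model_name rather than the now-None model_name."""
    unload_model()
    load_model(shared.last_model_name)

def load_last_model():
    """What the new /v1/internal/model/loadlast endpoint would invoke."""
    if shared.last_model_name is not None:
        load_model(shared.last_model_name)
```

With this, calling unload_model() followed by load_last_model() restores the previously active model instead of trying to load None.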

@anon-contributor-0 anon-contributor-0 changed the title Add an API endpoint to reload the last-used model Add an API endpoint to load the last-used model Feb 16, 2024
@oobabooga oobabooga deleted the branch oobabooga:dev February 17, 2024 21:53
@oobabooga oobabooga closed this Feb 17, 2024
@oobabooga oobabooga reopened this Feb 17, 2024
@anon-contributor-0 (Author) commented

@oobabooga, any chance you could have a look at this? It's a relatively quick PR, and it could allow for better flexibility for users on lower-end hardware.

@anon-contributor-0 (Author) commented May 20, 2024

@oobabooga Checking in again - any chance you'd have a cycle to take a look here?


2 participants