
Server stops working after a while #108

Open
nklsbckmnn opened this issue Sep 1, 2024 · 3 comments
@nklsbckmnn

Using the local server in 0.3.2, I regularly hit a problem where the server stops working after a few hundred requests, resulting in timeouts. Restarting the server within the app does not help; only restarting the app does. In the server log, after "Received POST request [...]" I get "Running chat completion on conversation with 1 messages." Then, 20 minutes later, instead of the usual "Generated prediction", I get "Client disconnected. Stopping generation... if the model is busy processing the prompt, it will finish first", immediately followed by "Client disconnected. Stopping generation...".
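For anyone trying to reproduce this, a minimal stress-test sketch against LM Studio's OpenAI-compatible endpoint could look like the following. The port (1234 is LM Studio's default), request count, and model name are assumptions, not details from the report; adjust them to your setup.

```python
# Hypothetical reproduction sketch: send repeated non-streaming chat
# completions to the local server and stop at the first timeout/failure.
# Endpoint, port, and model name are assumptions -- adjust to your setup.
import json
import urllib.error
import urllib.request

ENDPOINT = "http://localhost:1234/v1/chat/completions"  # assumed default port


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming chat completion request (OpenAI-compatible)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def main() -> None:
    for i in range(1, 501):  # "a few hundred requests"
        req = build_request("gemma-2-9b-it-q8_0-f16", f"Request {i}: say OK.")
        try:
            with urllib.request.urlopen(req, timeout=120) as resp:
                json.load(resp)
        except (urllib.error.URLError, TimeoutError) as exc:
            print(f"Request {i} failed: {exc}")
            break


if __name__ == "__main__":
    main()
```

Running this in a loop and watching the server log should show whether the failure correlates with a fixed request count or with elapsed time.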

@yagil (Member) commented Sep 1, 2024

@nklsbckmnn thanks for the bug report, we'll investigate. Can you please check whether anything related shows up in the app logs? You can find them by clicking the button at the bottom left of the screen and then -> open app logs

[screenshot attached]

@nklsbckmnn (Author)

Nothing out of the ordinary and nothing around the time in question.

@nklsbckmnn (Author) commented Sep 4, 2024

I can reliably reproduce this. I noticed that the status indicator in the list of loaded models was stuck on "Processing". I also noticed that, at least in cases where "Running chat completion" was not followed by "Accumulating tokens ... (stream = false)", entries like this appeared in the log after "Client disconnected. Stopping generation...":

[LM STUDIO SERVER] [gemma-2-9b-it-q8_0-f16] Generated prediction: {
  "id": "chatcmpl-nc8n345tu5ohg4fz3lc5",
  "object": "chat.completion",
  "created": 1725422160,
  "model": "gemma-2-9b-it-q8_0-f16",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": ""
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  },
  "system_fingerprint": "gemma-2-9b-it-q8_0-f16"
}

When I unloaded the model I got "[ERROR] Model unloaded.. Error Data: n/a, Additional Data: n/a" in the server log. I'm using "bartowski/gemma-2-9b-it-GGUF/gemma-2-9b-it-Q8_0-f16.gguf". Unlike the last time I encountered this, I was able to return to a working state just by reloading the model, with no app restart needed. I will run with verbose logging next. Thanks for looking into this.
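The degraded responses logged above are recognizable client-side by their empty assistant content and all-zero token usage. As a hedged sketch (this is not an LM Studio feature, just a workaround idea), a client could flag such responses and reload the model before the server stops responding entirely:

```python
# Hypothetical client-side check: flag the stalled responses seen above,
# which carry an empty assistant message and all-zero token usage.

def is_stalled_response(resp: dict) -> bool:
    """Return True if a chat.completion response looks like the empty
    predictions in the log above: no message content and zero total tokens."""
    choices = resp.get("choices", [])
    usage = resp.get("usage", {})
    no_content = all(
        not c.get("message", {}).get("content") for c in choices
    )
    no_tokens = usage.get("total_tokens", 0) == 0
    return no_content and no_tokens
```

Applied to the "Generated prediction" payload quoted above, this returns True; a normal completion with text and non-zero usage returns False.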
