From 727d88e9a94e2e51369d1a429583b82c46c20401 Mon Sep 17 00:00:00 2001
From: Lianmin Zheng
Date: Sat, 26 Oct 2024 04:42:14 -0700
Subject: [PATCH] Update 2024-10-26 04:42:14

---
 README.html                  | 4 ++--
 _sources/troubleshooting.md  | 4 ++--
 backend.html                 | 4 ++--
 benchmark_and_profiling.html | 4 ++--
 choices_methods.html         | 4 ++--
 contributor_guide.html       | 4 ++--
 custom_chat_template.html    | 4 ++--
 embedding_model.html         | 4 ++--
 frontend.html                | 4 ++--
 hyperparameter_tuning.html   | 4 ++--
 index.html                   | 4 ++--
 install.html                 | 4 ++--
 model_support.html           | 4 ++--
 release_process.html         | 4 ++--
 sampling_params.html         | 4 ++--
 searchindex.js               | 2 +-
 send_request.html            | 4 ++--
 setup_github_runner.html     | 4 ++--
 troubleshooting.html         | 8 ++++----
 19 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/README.html b/README.html
index 236f6e5..b966fde 100644
--- a/README.html
+++ b/README.html
@@ -253,7 +253,7 @@ -
  • CUDA error: an illegal memory access was encountered
    This error may be due to kernel errors or out-of-memory issues.

    • If it is a kernel error, it is not easy to fix.

-   • If it is out-of-memory, sometimes it will report this error instead of “Out-of-memory.” In this case, try setting a smaller value for --mem-fraction-static. The default value of --mem-fraction-static is around 0.8 - 0.9. https://github.com/sgl-project/sglang/blob/1edd4e07d6ad52f4f63e7f6beaa5987c1e1cf621/python/sglang/srt/server_args.py#L92-L102

+   • If it is out-of-memory, sometimes it will report this error instead of “Out-of-memory.” In this case, try setting a smaller value for --mem-fraction-static. The default value of --mem-fraction-static is around 0.8 - 0.9.

    @@ -439,7 +439,7 @@

    The server hangs
  • Add --disable-cuda-graph.

- • Add --disable-flashinfer-sampling.

+ • Add --sampling-backend pytorch.