Skip to content

Commit

Permalink
Merge branch 'main' into flex_scheduler
Browse files Browse the repository at this point in the history
  • Loading branch information
yukavio authored Aug 30, 2024
2 parents f101839 + b7f8341 commit 8d5c8d9
Show file tree
Hide file tree
Showing 185 changed files with 7,368 additions and 4,279 deletions.
3 changes: 2 additions & 1 deletion .github/ISSUE_TEMPLATE/1-bug-report.yml
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ body:
- label: 2. The bug has not been fixed in the latest version.
- label: 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
- label: 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- label: 5. Please use English, otherwise it will be closed.
- type: textarea
attributes:
label: Describe the bug
Expand All @@ -31,7 +32,7 @@ body:
attributes:
label: Environment
description: |
Please provide necessary environment information here with `python3 -m sglang.check_env`.
Please provide necessary environment information here with `python3 -m sglang.check_env`. Otherwise the issue will be closed.
placeholder: Environment here.
validations:
required: true
6 changes: 6 additions & 0 deletions .github/ISSUE_TEMPLATE/2-feature-request.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,12 @@ description: Suggest an idea for this project
title: "[Feature] "

body:
- type: checkboxes
attributes:
label: Checklist
options:
- label: 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- label: 2. Please use English, otherwise it will be closed.
- type: textarea
attributes:
label: Motivation
Expand Down
14 changes: 7 additions & 7 deletions .github/pull_request_template.md
Original file line number Diff line number Diff line change
@@ -1,15 +1,15 @@
Thank you for your contribution, we really appreciate it. The following instructions will help improve your pull request and make it easier to receive feedback. If there are any items you don't understand, don't worry. Just submit the pull request and ask the maintainers for help.
<!-- Thank you for your contribution! We appreciate it. The following guidelines will help improve your pull request and facilitate feedback. If anything is unclear, don't hesitate to submit your pull request and ask the maintainers for assistance. -->

## Motivation

Please explain the motivation behind this PR and the goal you aim to achieve with it.
<!-- Explain the purpose of this PR and the goals it aims to achieve. -->

## Modification
## Modifications

Briefly describe the changes made in this PR.
<!-- Describe the changes made in this PR. -->

## Checklist

1. Ensure pre-commit `pre-commit run --all-files` or other linting tools are used to fix potential lint issues.
2. Confirm that modifications are covered by complete unit tests. If not, please add more unit tests for correctness.
3. Modify documentation as needed, such as docstrings or example tutorials.
- [ ] Format your code according to the [Contributor Guide](https://github.com/sgl-project/sglang/blob/main/docs/en/contributor_guide.md).
- [ ] Add unit tests as outlined in the [Contributor Guide](https://github.com/sgl-project/sglang/blob/main/docs/en/contributor_guide.md).
- [ ] Update documentation as needed, including docstrings or example tutorials.
43 changes: 43 additions & 0 deletions .github/workflows/accuracy-test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
name: Accuracy Test

on:
push:
branches: [ main ]
paths:
- "python/sglang/**"
- "test/**"
pull_request:
branches: [ main ]
paths:
- "python/sglang/**"
- "test/**"
workflow_dispatch:

concurrency:
group: accuracy-test-${{ github.ref }}
cancel-in-progress: true

jobs:
accuracy-test:
if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
runs-on: 1-gpu-runner

steps:
- name: Checkout code
uses: actions/checkout@v3

- name: Install dependencies
run: |
pip install --upgrade pip
pip install -e "python[all]"
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall
git clone https://github.com/merrymercy/human-eval.git
cd human-eval
pip install -e .
- name: Evaluate Accuracy
timeout-minutes: 20
run: |
cd test/srt
python3 test_eval_accuracy_large.py
22 changes: 22 additions & 0 deletions .github/workflows/cancel-pr-workflow.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
name: Cancel PR Workflows on Merge

on:
pull_request_target:
types:
- closed

permissions:
actions: write

jobs:
cancel:
if: github.event.pull_request.merged == true
runs-on: ubuntu-latest
steps:
- name: Cancel Previous Runs
uses: styfle/[email protected]
with:
workflow_id: all
access_token: ${{ secrets.GITHUB_TOKEN }}
ignore_sha: true
pr_number: ${{ github.event.pull_request.number }}
60 changes: 32 additions & 28 deletions .github/workflows/e2e-test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -20,33 +20,37 @@ concurrency:
jobs:
e2e-test:
if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
runs-on: bench
runs-on: 1-gpu-runner

steps:
- name: Checkout code
uses: actions/checkout@v3

- name: Install dependencies
run: |
source $HOME/venv/bin/activate
echo "$HOME/venv/bin" >> $GITHUB_PATH
pip install --upgrade pip
pip install -e "python[all]"
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall
- name: Benchmark Serving Throughput
run: |
cd test/srt
python3 -m unittest test_serving_throughput.TestServingThroughput.test_default
- name: Benchmark Serving Throughput (w/o RadixAttention)
run: |
cd test/srt
python3 -m unittest test_serving_throughput.TestServingThroughput.test_default_without_radix_cache
- name: Benchmark Serving Throughput (w/o FlashInfer)
run: |
cd test/srt
python3 -m unittest test_serving_throughput.TestServingThroughput.test_default_without_flashinfer
- name: Checkout code
uses: actions/checkout@v3

- name: Install dependencies
run: |
pip install --upgrade pip
pip install -e "python[all]"
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall
- name: Benchmark Serving Throughput
timeout-minutes: 10
run: |
cd test/srt
python3 -m unittest test_serving_throughput.TestServingThroughput.test_default
- name: Benchmark Serving Latency
timeout-minutes: 10
run: |
python3 -m sglang.bench_latency --model meta-llama/Meta-Llama-3.1-8B-Instruct --batch-size 1 --input 128 --output 8
- name: Benchmark Serving Throughput (w/o RadixAttention)
timeout-minutes: 10
run: |
cd test/srt
python3 -m unittest test_serving_throughput.TestServingThroughput.test_default_without_radix_cache
- name: Benchmark Serving Throughput (w/o ChunkedPrefill)
timeout-minutes: 10
run: |
cd test/srt
python3 -m unittest test_serving_throughput.TestServingThroughput.test_default_without_chunked_prefill
45 changes: 45 additions & 0 deletions .github/workflows/moe-test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
name: MoE Test

on:
push:
branches: [ main ]
paths:
- "python/sglang/**"
- "test/**"
pull_request:
branches: [ main ]
paths:
- "python/sglang/**"
- "test/**"
workflow_dispatch:

concurrency:
group: moe-test-${{ github.ref }}
cancel-in-progress: true

jobs:
moe-test:
if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
runs-on: 2-gpu-runner

steps:
- name: Checkout code
uses: actions/checkout@v3

- name: Install dependencies
run: |
pip install --upgrade pip
pip install -e "python[all]"
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall
- name: Benchmark MoE Serving Throughput
timeout-minutes: 10
run: |
cd test/srt
python3 -m unittest test_moe_serving_throughput.TestServingThroughput.test_default
- name: Benchmark MoE Serving Throughput (w/o RadixAttention)
timeout-minutes: 10
run: |
cd test/srt
python3 -m unittest test_moe_serving_throughput.TestServingThroughput.test_default_without_radix_cache
58 changes: 33 additions & 25 deletions .github/workflows/unit-test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -18,31 +18,39 @@ concurrency:
cancel-in-progress: true

jobs:
unit-test:
unit-test-jobs:
if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
runs-on: unit

runs-on: 1-gpu-runner
strategy:
matrix:
test_type: ['backend-0', 'backend-1', 'frontend']
steps:
- name: Checkout code
uses: actions/checkout@v3

- name: Install dependencies
run: |
source $HOME/venv/bin/activate
echo "$HOME/venv/bin" >> $GITHUB_PATH
pip install --upgrade pip
pip install -e "python[all]"
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall
pip install accelerate
pip install sentence_transformers
- name: Checkout code
uses: actions/checkout@v3

- name: Install dependencies
run: |
pip install --upgrade pip
pip install -e "python[dev]"
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall
- name: Run test
timeout-minutes: 20
run: |
if [ "${{ matrix.test_type }}" = "frontend" ]; then
cd test/lang
python3 run_suite.py --suite minimal
elif [ "${{ matrix.test_type }}" = "backend-0" ]; then
cd test/srt
python3 run_suite.py --suite minimal --range-begin 0 --range-end 8
elif [ "${{ matrix.test_type }}" = "backend-1" ]; then
cd test/srt
python3 run_suite.py --suite minimal --range-begin 8
fi
- name: Test Backend Runtime
run: |
cd test/srt
python3 run_suite.py --suite minimal
- name: Test Frontend Language
run: |
cd test/lang
python3 run_suite.py --suite minimal
unit-test:
needs: unit-test-jobs
runs-on: ubuntu-latest
steps:
- name: Merge step
run: echo "This is an empty merge step"
Loading

0 comments on commit 8d5c8d9

Please sign in to comment.