Setup "debugging" and misc cleanup #695

Merged 1 commit on Sep 26, 2024
30 changes: 21 additions & 9 deletions .github/actions/install-deps-action/action.yml
@@ -4,23 +4,35 @@ runs:
using: 'composite'
steps:
### OTHER REPOS ####

# Hard turn-off interactive mode
- run: echo 'debconf debconf/frontend select Noninteractive' | sudo debconf-set-selections
shell: bash

# Refresh packages list
- run: sudo apt update
# turn off interactive, refresh pkgs, use apt fast
- run: |
echo 'debconf debconf/frontend select Noninteractive' | sudo debconf-set-selections
sudo rm /var/lib/man-db/auto-update
echo "deb [signed-by=/etc/apt/keyrings/apt-fast.gpg] http://ppa.launchpad.net/apt-fast/stable/ubuntu $(source /etc/os-release && echo $UBUNTU_CODENAME) main" | sudo tee /etc/apt/sources.list.d/apt-fast.list
wget -q -O- "https://keyserver.ubuntu.com/pks/lookup?op=get&search=0xBC5934FD3DEBD4DAEA544F791E2824A7F22B44BD" | sudo gpg --dearmor -o /etc/apt/keyrings/apt-fast.gpg
sudo apt-get update -y
sudo apt-get install -y apt-fast aria2 tasksel
echo 'debconf apt-fast/maxdownloads string 100' | sudo debconf-set-selections
echo 'debconf apt-fast/dlflag boolean true' | sudo debconf-set-selections
echo 'debconf apt-fast/aptmanager string apt-get' | sudo debconf-set-selections
sudo tasksel remove ubuntu-desktop
shell: bash

### DOWNLOAD AND INSTALL DEPENDENCIES ###

# Download dependencies packaged by Ubuntu
- run: sudo apt -y install bison busybox-static cargo cmake coreutils cpio elfutils file flex gcc gcc-multilib git iproute2 jq kbd kmod libcap-dev libelf-dev libunwind-dev libvirt-clients libzstd-dev linux-headers-generic linux-tools-common linux-tools-generic make ninja-build pahole pkg-config python3-dev python3-pip python3-requests qemu-kvm rsync rustc stress-ng udev zstd libseccomp-dev libcap-ng-dev llvm clang python3-full pipx curl meson
- run: |
sudo apt-fast install -f -y bison busybox-static cmake coreutils \
cpio elfutils file flex gcc gcc-multilib git iproute2 jq kbd kmod \
libcap-dev libelf-dev libunwind-dev libvirt-clients libzstd-dev \
linux-headers-generic linux-tools-common linux-tools-generic make \
ninja-build pahole pkg-config python3-dev python3-pip python3-requests \
qemu-kvm rsync stress-ng udev zstd libseccomp-dev libcap-ng-dev \
llvm clang python3-full curl meson bpftrace cargo rustc dwarves
shell: bash

# virtme-ng
- run: pip3 install virtme-ng --break-system-packages
- run: sudo pip3 install virtme-ng --break-system-packages
shell: bash

# Setup KVM support
107 changes: 64 additions & 43 deletions .github/workflows/caching-build.yml
@@ -6,7 +6,7 @@ on:
- cron: "0 * * * *"
push:
pull_request:

jobs:
lint:
runs-on: ubuntu-24.04
@@ -32,8 +32,8 @@ jobs:
- run: sudo chown root /usr/bin/tar && sudo chmod u+s /usr/bin/tar
# redundancy to exit fast
- run: echo 'debconf debconf/frontend select Noninteractive' | sudo debconf-set-selections
- run: sudo apt update
- run: sudo apt install -y git --no-install-recommends
- run: sudo apt-get update
- run: sudo apt-get install -y git --no-install-recommends
# get latest head commit of sched_ext for-next
- run: echo "SCHED_EXT_KERNEL_COMMIT=$(git ls-remote https://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git heads/for-next | awk '{print $1}')" >> $GITHUB_ENV

@@ -45,8 +45,10 @@ jobs:
uses: actions/cache@v4
with:
path: |
linux
key: kernel-build-${{ env.SCHED_EXT_KERNEL_COMMIT }}
linux/arch/x86/boot/bzImage
linux/usr/include
linux/**/*.h
key: kernel-build-${{ env.SCHED_EXT_KERNEL_COMMIT }}-4

- if: ${{ steps.cache-kernel.outputs.cache-hit != 'true' }}
uses: ./.github/actions/install-deps-action
@@ -62,14 +64,6 @@ jobs:
- if: ${{ steps.cache-virtiofsd.outputs.cache-hit != 'true' && steps.cache-kernel.outputs.cache-hit != 'true' }}
run: cargo install virtiofsd && sudo cp -a ~/.cargo/bin/virtiofsd /usr/lib/

# cache bzImage alone for rust tests (disk space limit workaround)
- name: Cache bzImage
id: cache-bzImage
uses: actions/cache@v4
with:
path: |
linux/arch/x86/boot/bzImage
key: kernel-bzImage-${{ env.SCHED_EXT_KERNEL_COMMIT }}

- if: ${{ steps.cache-kernel.outputs.cache-hit != 'true' }}
name: Clone Kernel
@@ -96,14 +90,18 @@ jobs:
integration-test:
runs-on: ubuntu-24.04
needs: build-kernel
continue-on-error: true
strategy:
matrix:
scheduler: [ scx_bpfland, scx_lavd, scx_layered, scx_rlfifo, scx_rustland, scx_rusty ]
fail-fast: false
steps:
# prevent cache permission errors
- run: sudo chown root /usr/bin/tar && sudo chmod u+s /usr/bin/tar
- uses: actions/checkout@v4
- uses: Swatinem/rust-cache@v2
with:
key: ${{ matrix.scheduler }}
prefix-key: "4"
- uses: ./.github/actions/install-deps-action
# cache virtiofsd (goes away w/ 24.04)
- name: Cache virtiofsd
@@ -125,8 +123,10 @@ jobs:
uses: actions/cache@v4
with:
path: |
linux
key: kernel-build-${{ env.SCHED_EXT_KERNEL_COMMIT }}
linux/arch/x86/boot/bzImage
linux/usr/include
linux/**/*.h
key: kernel-build-${{ env.SCHED_EXT_KERNEL_COMMIT }}-4

# need to re-run job when kernel head changes between build and test running.
- if: ${{ steps.cache-kernel.outputs.cache-hit != 'true' }}
@@ -139,7 +139,7 @@ jobs:
- run: sudo chmod +x /usr/bin/veristat && sudo chmod 755 /usr/bin/veristat

# The actual build:
- run: meson setup build -Dkernel=$(pwd)/linux -Dkernel_headers=./linux/usr/include -Denable_stress=true
- run: meson setup build -Dkernel=../linux/arch/x86/boot/bzImage -Dkernel_headers=../linux -Denable_stress=true -Dvng_rw_mount=true
- run: meson compile -C build ${{ matrix.scheduler }}

# Print CPU model before running the tests (this can be useful for
@@ -148,13 +148,35 @@ jobs:

# Test schedulers
- run: meson compile -C build test_sched_${{ matrix.scheduler }}
# errors we want logs for start occurring here, so always generate debug info and save logs
if: always()
# Stress schedulers
- uses: cytopia/[email protected]
name: stress test
if: always()
with:
retries: 3
command: meson compile -C build stress_tests_${{ matrix.scheduler }}
- run: meson compile -C build veristat_${{ matrix.scheduler }}
if: always()
- run: sudo cat /var/log/dmesg > host-dmesg.ci.log
if: always()
- run: echo "NICE_REF=${{ github.event.pull_request && github.head_ref || github.ref_name }}" >> $GITHUB_ENV
if: always()
- run: mkdir -p ./log_save/
if: always()
# no symlink following here (to avoid cycles)
- run: sudo find '/home/runner/' -iname '*.ci.log' -exec mv {} ./log_save/ \;
if: always()
- name: upload debug logs, bpftrace, veristat, dmesg, etc.
if: always()
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.scheduler }}_logs_${{ env.NICE_REF }}_${{ github.run_id }}_${{ github.run_attempt }}
path: ./log_save/*.ci.log
# it's all txt files w/ 90 day retention, let's be nice.
compression-level: 9
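
The artifact name above is assembled from the scheduler, `NICE_REF`, the run id and the run attempt, where `NICE_REF` uses the `a && b || c` expression idiom to pick the PR head branch when there is one and the ref name otherwise. The sketch below restates that logic in Python. It additionally sanitizes the ref, on the assumption (not verified against this workflow) that characters such as `/` in a branch name may be rejected in artifact names. The function names and sample values are hypothetical.

```python
import re


def nice_ref(is_pull_request: bool, head_ref: str, ref_name: str) -> str:
    """Mirror `github.event.pull_request && github.head_ref || github.ref_name`."""
    # `a && b || c` yields b when a and b are truthy, otherwise c.
    return (head_ref if is_pull_request else "") or ref_name


def artifact_name(scheduler: str, ref: str, run_id: str, attempt: str) -> str:
    """Build `<scheduler>_logs_<ref>_<run_id>_<attempt>` with a sanitized ref."""
    # Assumption: keep only a conservative character set so that refs like
    # "feature/debug" cannot produce an invalid artifact name.
    safe_ref = re.sub(r"[^A-Za-z0-9._-]", "_", ref)
    return f"{scheduler}_logs_{safe_ref}_{run_id}_{attempt}"


# Hypothetical values for illustration only.
name = artifact_name(
    "scx_rusty", nice_ref(True, "feature/debug", "695/merge"), "123", "1"
)
```

If the sanitizing step turns out to be needed, the equivalent in the workflow would be a small shell substitution before writing `NICE_REF` to `$GITHUB_ENV`.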


rust-test-core:
runs-on: ubuntu-24.04
@@ -166,6 +188,10 @@ jobs:
# prevent cache permission errors
- run: sudo chown root /usr/bin/tar && sudo chmod u+s /usr/bin/tar
- uses: actions/checkout@v4
- uses: Swatinem/rust-cache@v2
with:
key: ${{ matrix.component }}
prefix-key: "4"
- uses: ./.github/actions/install-deps-action
# cache virtiofsd (goes away w/ 24.04)
- name: Cache virtiofsd
@@ -180,28 +206,24 @@ jobs:

# get latest head commit of sched_ext for-next
- run: echo "SCHED_EXT_KERNEL_COMMIT=$(git ls-remote https://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git heads/for-next | awk '{print $1}')" >> $GITHUB_ENV
# cache bzImage alone for rust tests
- name: Cache bzImage
id: cache-bzImage

- name: Cache Kernel
id: cache-kernel
uses: actions/cache@v4
with:
path: |
linux/arch/x86/boot/bzImage
key: kernel-bzImage-${{ env.SCHED_EXT_KERNEL_COMMIT }}
linux/usr/include
linux/**/*.h
key: kernel-build-${{ env.SCHED_EXT_KERNEL_COMMIT }}-4

# need to re-run job when kernel head changes between build and test running.
- if: ${{ steps.cache-bzImage.outputs.cache-hit != 'true' }}
- if: ${{ steps.cache-kernel.outputs.cache-hit != 'true' }}
name: exit if cache stale
run: exit -1

- uses: Swatinem/rust-cache@v2
with:
workspaces: rust
key: ${{ matrix.component }}
prefix-key: "1"
- run: cargo build --manifest-path rust/${{ matrix.component }}/Cargo.toml
- run: cargo test --manifest-path rust/${{ matrix.component }}/Cargo.toml --no-run
- run: vng -v --memory 10G --cpu 8 -r linux/arch/x86/boot/bzImage --net user -- cargo test --manifest-path rust/${{ matrix.component }}/Cargo.toml
- run: vng -v --rw --memory 10G --cpu 8 -r linux/arch/x86/boot/bzImage --net user -- cargo test --manifest-path rust/${{ matrix.component }}/Cargo.toml

rust-test-schedulers:
runs-on: ubuntu-24.04
@@ -213,6 +235,10 @@ jobs:
# prevent cache permission errors
- run: sudo chown root /usr/bin/tar && sudo chmod u+s /usr/bin/tar
- uses: actions/checkout@v4
- uses: Swatinem/rust-cache@v2
with:
key: ${{ matrix.scheduler }}
prefix-key: "4"
- uses: ./.github/actions/install-deps-action
# cache virtiofsd (goes away w/ 24.04)
- name: Cache virtiofsd
@@ -227,28 +253,24 @@ jobs:

# get latest head commit of sched_ext for-next
- run: echo "SCHED_EXT_KERNEL_COMMIT=$(git ls-remote https://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git heads/for-next | awk '{print $1}')" >> $GITHUB_ENV
# cache bzImage alone for rust tests
- name: Cache bzImage
id: cache-bzImage
# Cache Kernel alone for rust tests
- name: Cache Kernel
id: cache-kernel
uses: actions/cache@v4
with:
path: |
linux/arch/x86/boot/bzImage
key: kernel-bzImage-${{ env.SCHED_EXT_KERNEL_COMMIT }}
linux/usr/include
linux/**/*.h
key: kernel-build-${{ env.SCHED_EXT_KERNEL_COMMIT }}-4

# need to re-run job when kernel head changes between build and test running.
- if: ${{ steps.cache-bzImage.outputs.cache-hit != 'true' }}
- if: ${{ steps.cache-kernel.outputs.cache-hit != 'true' }}
name: exit if cache stale
run: exit -1

- uses: Swatinem/rust-cache@v2
with:
workspaces: scheds/rust
key: ${{ matrix.scheduler }}
prefix-key: "1"
- run: cargo build --manifest-path scheds/rust/${{ matrix.scheduler }}/Cargo.toml
- run: cargo test --manifest-path scheds/rust/${{ matrix.scheduler }}/Cargo.toml --no-run
- run: vng -v --memory 10G --cpu 8 -r linux/arch/x86/boot/bzImage --net user -- cargo test --manifest-path scheds/rust/${{ matrix.scheduler }}/Cargo.toml
- run: vng -v --rw --memory 10G --cpu 8 -r linux/arch/x86/boot/bzImage --net user -- cargo test --manifest-path scheds/rust/${{ matrix.scheduler }}/Cargo.toml

pages:
runs-on: ubuntu-24.04
@@ -270,8 +292,7 @@ jobs:
rustup install nightly
export PATH="~/.cargo/bin:$PATH"
RUSTDOCFLAGS="--enable-index-page -Zunstable-options" ~/.cargo/bin/cargo +nightly doc --workspace --no-deps --bins --lib --examples --document-private-items --all-features
sudo apt update
sudo apt install build-essential graphviz sphinx-doc python3-sphinx-rtd-theme texlive-latex-recommended python3-yaml -y
sudo apt-fast install build-essential graphviz sphinx-doc python3-sphinx-rtd-theme texlive-latex-recommended python3-yaml -y
cargo install htmlq
git clone --single-branch -b for-next --depth 1 https://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git linux
cd linux
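
The cache steps in this workflow key the kernel cache on the head commit of sched_ext `for-next` plus a manually bumped suffix (`-4` in this PR). A rough Python restatement of that derivation follows, assuming `git ls-remote` prints `<sha><TAB><ref>` lines. The function names and the sample hash are made up for illustration.

```python
def head_commit(ls_remote_output: str) -> str:
    """Return the first field of the first line, as `awk '{print $1}'`
    does for single-line output."""
    first_line = ls_remote_output.strip().splitlines()[0]
    return first_line.split()[0]


def kernel_cache_key(commit: str, version: str = "4") -> str:
    """Build the key used by the cache steps: kernel-build-<commit>-<version>."""
    return f"kernel-build-{commit}-{version}"


# Made-up sample output; a real hash is 40 hex characters.
sample = "0123abcd\trefs/heads/for-next\n"
key = kernel_cache_key(head_commit(sample))
```

Because the key contains the commit, a test job that starts after `for-next` has moved will miss the cache, which is why those jobs stop at the `exit if cache stale` step and need a re-run.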
20 changes: 20 additions & 0 deletions .github/workflows/sched-ext.config
@@ -32,3 +32,23 @@ CONFIG_PREEMPT_RCU=y
#
CONFIG_DEBUG_LOCKDEP=y
CONFIG_DEBUG_ATOMIC_SLEEP=y

# Bpftrace headers (for additional debug info)
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_FUNCTION_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_HAVE_KPROBES=y
CONFIG_KPROBES=y
CONFIG_KPROBE_EVENTS=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_UPROBES=y
CONFIG_UPROBE_EVENTS=y
CONFIG_DEBUG_FS=y
# also needed by bpftrace: exposes kernel headers at /sys/kernel/kheaders.tar.xz
CONFIG_IKHEADERS=y
27 changes: 25 additions & 2 deletions meson-scripts/run_stress_tests
@@ -40,21 +40,36 @@ def run_stress_test(
vng_path: str,
kernel: str,
verbose: bool,
rw: bool,
headers: str,
) -> int:
scheduler_args = config.get('scheduler_args')
stress_cmd = config.get('stress_cmd')
s_path = sched_path(build_dir, config.get('sched'))
sched_cmd = s_path + " " + config.get('sched_args')
timeout_sec = int(config.get("timeout_sec"))
if vng_path:
cmd = [vng_path, "--user", "root", "-v", "-r", kernel]
if config.get("qemu_opts"):
cmd += ['--qemu-opts']
cmd += [f"'{config.get('qemu_opts')}'"]
vm_input = f"{stress_cmd} & timeout --foreground --preserve-status {timeout_sec} {sched_cmd}"
if bpftrace_scripts := config.get('bpftrace_scripts'):
vm_input = f"\"{build_dir}/bpftrace_stress_wrapper.sh\" '{stress_cmd}' '{sched_cmd}' '{timeout_sec}' '{bpftrace_scripts}'"
cmd = [vng_path, "--user", "root", "-v", "--", vm_input]
if headers:
vm_input += f" '{headers}'"
if rw and os.getenv('CI'):
print('mounting VNG as RW because CI')
cmd += ["--rw"]
elif rw:
print('not mounting VNG as RW because not CI')
cmd += ["--"]
cmd += [vm_input]
err = sys.stderr if output == "-" else open(output, "w")
out = sys.stdout if output == "-" else err
print(f"vng cmd is {cmd}")
proc = subprocess.Popen(
cmd, env=os.environ, cwd=kernel, shell=False, stdout=out,
cmd, env=os.environ, shell=False, stdout=out,
stderr=err, stdin=subprocess.PIPE, text=True)
proc.wait()
return proc.returncode
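
For reference, the command assembly this hunk introduces can be read as a small pure function. The sketch below is a simplified restatement, not the script itself: `build_vng_cmd` and the `in_ci` parameter are hypothetical names (the script checks the `CI` environment variable inline), and the `qemu_opts` and bpftrace branches are left out.

```python
from typing import List


def build_vng_cmd(vng_path: str, kernel: str, vm_input: str,
                  rw: bool, in_ci: bool) -> List[str]:
    """Hypothetical helper mirroring the argument order used in the hunk."""
    # vng options come first.
    cmd = [vng_path, "--user", "root", "-v", "-r", kernel]
    # --rw is only added when it is both requested and we are on CI.
    if rw and in_ci:
        cmd.append("--rw")
    # "--" separates vng options from the command run inside the guest.
    cmd += ["--", vm_input]
    return cmd


ci_cmd = build_vng_cmd("vng", "bzImage", "true", rw=True, in_ci=True)
local_cmd = build_vng_cmd("vng", "bzImage", "true", rw=True, in_ci=False)
```

Keeping `--` and the guest command as the last two elements means any option appended later in the function has to be added before that point, which is the ordering the hunk follows.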
@@ -85,6 +100,8 @@ def stress_tests(args: Namespace) -> None:
vng_path,
args.kernel,
args.verbose,
args.rw,
args.headers
)
for test_name, ret in return_codes.items():
if ret not in (143, 0):
@@ -114,6 +131,12 @@ if __name__ == "__main__":
parser.add_argument(
'--sched', default='', help='Scheduler to test (default: all)'
)
parser.add_argument(
'--rw', default=False, help='Mount VNG Directories as RW (dangerous)'
)
parser.add_argument(
'--headers', default='', help='Kernel Headers Path'
)

args = parser.parse_args()
if args.verbose:
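
One thing to watch in the `--rw` option added above: with `default=False` and no `type`, argparse stores whatever string is passed, so `--rw false` yields the non-empty (truthy) string `'false'`. How the meson target actually invokes the script is not visible in this diff, so the following is only a minimal sketch of a stricter parser, assuming callers pass values such as `true`/`false`. The helper name `str_to_bool` is hypothetical and not part of this PR.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Hypothetical helper: map common true/false spellings to a bool."""
    lowered = value.strip().lower()
    if lowered in ("1", "true", "yes", "on"):
        return True
    if lowered in ("0", "false", "no", "off", ""):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


parser = argparse.ArgumentParser()
parser.add_argument("--rw", type=str_to_bool, default=False,
                    help="Mount VNG directories as RW (dangerous)")

rw_off = parser.parse_args(["--rw", "false"]).rw
rw_on = parser.parse_args(["--rw", "true"]).rw
rw_default = parser.parse_args([]).rw
```

An `action='store_true'` flag would be simpler, but it would not accept an explicit value, so it only fits if the caller can omit the flag entirely.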
13 changes: 12 additions & 1 deletion meson-scripts/test_sched
@@ -17,6 +17,16 @@ GUEST_TIMEOUT=60
#
declare -A SCHEDS

VNG_RW=''

# Enable vng rw when running on CI.
if [ $# -ge 3 ] ; then
if [ "$3" == "VNG_RW=true" ]; then
echo 'setting vng to mount rw'
VNG_RW=' --rw '
fi
fi

# enable running tests on individual schedulers
if [ $# -ge 2 ] ; then
SCHEDS[$2]=""
@@ -63,7 +73,7 @@ for sched in ${!SCHEDS[@]}; do

rm -f /tmp/output
timeout --preserve-status ${GUEST_TIMEOUT} \
vng -m 2G -v -r ${kernel} -- \
vng --user root -m 10G --cpu 8 $VNG_RW -v -r ${kernel} -- \
"timeout --foreground --preserve-status ${TEST_TIMEOUT} ${sched_path} ${args}" \
2> >(tee /tmp/output) </dev/null
grep -v " Speculative Return Stack Overflow" /tmp/output | \
@@ -79,4 +89,5 @@ for sched in ${!SCHEDS[@]}; do
else
echo "OK: ${sched}"
fi
cp /tmp/output test_log.ci.log
done
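
Both test scripts in this PR wrap the scheduler in `timeout --preserve-status`, and `run_stress_tests` treats return code 143 as success alongside 0. Under the usual shell convention a process killed by signal N is reported as 128 + N, so 143 corresponds to SIGTERM, which is the signal `timeout` sends by default when the limit expires. A small illustration (the helper name is hypothetical):

```python
import signal


def is_expected_exit(returncode: int) -> bool:
    """Accept a clean exit or termination by SIGTERM (128 + 15 = 143)."""
    sigterm_status = 128 + int(signal.SIGTERM)
    return returncode in (0, sigterm_status)


# 137 would be 128 + SIGKILL, which is not treated as a normal stop.
results = {code: is_expected_exit(code) for code in (0, 143, 1, 137)}
```

In other words, a scheduler that runs until the timeout and is then terminated counts as a pass, while any other non-zero status is reported as a failure.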