Add dockerfile and update README with instructions.
eagarvey-amd committed Sep 20, 2024
1 parent 72b5c0a commit 33f4261
Showing 2 changed files with 76 additions and 3 deletions.
26 changes: 23 additions & 3 deletions models/turbine_models/custom_models/torchbench/README.md
This directory serves as a place for scripts and utilities to run a suite of benchmarks.

Eventually, we want this process to be a plug-in to the upstream torchbench process, and this will be accomplished by exposing the IREE methodology shown here as a compile/runtime backend for the torch benchmark classes. For now, it is set up for developers as a way to get preliminary results and achieve blanket functionality for the models listed in export.py.

### Setup
The setup instructions provided here use "gfx942" as the IREE/LLVM HIP target in a few cases. This target corresponds to MI300X accelerators; you can find a mapping of AMD GPUs to their LLVM target architectures [here](https://llvm.org/docs/AMDGPUUsage.html#amdgpu-architecture-table), and replace "gfx942" in the following documentation with your desired target.
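If you are not sure which target your accelerator reports, `rocminfo` includes the gfx architecture name in its output. The snippet below runs the extraction on a sample line so it is runnable anywhere; on a machine with the ROCm stack installed, pipe real `rocminfo` output instead:

```shell
# Extract the LLVM gfx target from rocminfo-style output.
# On a real ROCm machine, run instead:
#   rocminfo | grep -oE 'gfx[0-9a-f]+' | head -n1
sample='  Name:                    gfx942'
echo "$sample" | grep -oE 'gfx[0-9a-f]+' | head -n1
```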

## Setup (docker)

Use the provided dockerfile with the following build and run commands to execute the benchmark suite in Docker.
These commands make a few assumptions about your machine and distro, so please read them and confirm they do what you want.

```shell
docker build --platform linux/amd64 --tag shark_torchbench --file shark_torchbench.dockerfile .
```
```shell
docker run -it --network=host --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v shark_torchbench:/SHARK-Turbine/models/turbine_models/custom_models/torchbench/outputs -w /SHARK-Turbine/models/turbine_models/custom_models/torchbench shark_torchbench:latest
```
```shell
python3 ./export.py --target=gfx942 --device=rocm --compile_to=vmfb --performance --inference --precision=fp16 --float16 --external_weights=safetensors --external_weights_dir=./torchbench_weights/ --output_csv=./outputs/torchbench_results_SHARK.csv
```
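The run above writes its results CSV into the mounted `outputs` volume. A quick way to sort such a CSV by a numeric column from the shell is sketched below; the column layout here is hypothetical, so check the actual header row of `torchbench_results_SHARK.csv` first and adjust the field index:

```shell
# Sort benchmark rows by a numeric column, skipping the header row.
# Hypothetical layout (model,mean_ms); the real columns may differ.
printf 'model,mean_ms\nhf_Albert,12.5\nresnet50,7.9\n' > /tmp/sample_results.csv
tail -n +2 /tmp/sample_results.csv | sort -t, -k2 -n
```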


## Setup (source)

### Setup source code and prerequisites

- pip install torch+rocm packages:
```shell
cd ..
python ./export.py --target=gfx942 --device=rocm --compile_to=vmfb --performance --inference --precision=fp16 --float16 --external_weights=safetensors --external_weights_dir=./torchbench_weights/
```

### Example of manual benchmark using export and IREE runtime CLI (hf_Albert)

```shell
python ./export.py --target=gfx942 --device=rocm --compile_to=vmfb --performance --inference --precision=fp16 --float16 --external_weights=safetensors --external_weights_dir=./torchbench_weights/ --model_id=hf_Albert

iree-benchmark-module --module=generated/hf_Albert_32_fp16_gfx942.vmfb --input=@generated/hf_Albert_input0.npy --parameters=model=./torchbench_weights/hf_Albert_fp16.irpa --device=hip://0 --device_allocator=caching --function=main --benchmark_repetitions=10
```
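`iree-benchmark-module` prints Google-Benchmark-style result lines. A small `awk` filter can pull out the timing fields; the sample line below is illustrative rather than captured from a real run, so verify the field positions against your actual output:

```shell
# Extract the wall-time figure from a benchmark result line.
# With a real run: iree-benchmark-module ... | awk '/real_time/ {print $2, $3}'
echo 'BM_main/process_time/real_time  5.21 ms  10.4 ms  10' \
  | awk '/real_time/ {print $2, $3}'
```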
53 changes: 53 additions & 0 deletions models/turbine_models/custom_models/torchbench/shark_torchbench.dockerfile
FROM rocm/dev-ubuntu-22.04:6.1.2

# ######################################################
# # Install MLPerf+Shark reference implementation
# ######################################################
ENV DEBIAN_FRONTEND=noninteractive

# apt dependencies
RUN apt-get update && apt-get install -y \
ffmpeg libsm6 libxext6 git wget unzip \
software-properties-common \
build-essential curl cmake ninja-build clang lld vim nano python3.10-dev python3.10-venv && \
apt-get clean && rm -rf /var/lib/apt/lists/*
RUN pip install --upgrade pip setuptools wheel && \
pip install pybind11 'nanobind<2' numpy==1.* pandas && \
pip install hip-python hip-python-as-cuda -i https://test.pypi.org/simple

# Rust requirements
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"

SHELL ["/bin/bash", "-c"]

# Disable apt-key parse warning
ARG APT_KEY_DONT_WARN_ON_DANGEROUS_USAGE=1

######################################################
# Install SHARK-Turbine
######################################################
RUN pip3 install torch==2.4.0+rocm6.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.1
RUN pip3 install --pre iree-compiler==20240920.1022 iree-runtime==20240920.1022 -f https://iree.dev/pip-release-links.html

# Run as root in the container: no sudo (not installed) or chown needed, and apt
# needs an update (lists were removed above) plus -y for non-interactive install.
RUN apt-get update && apt-get install -y amd-smi-lib && python3 -m pip install /opt/rocm/share/amd_smi && rm -rf /var/lib/apt/lists/*
# Install turbine-models, where the export is implemented.

ENV TB_SHARK_DIR=/SHARK-Turbine/models/turbine_models/custom_models/torchbench

RUN git clone https://github.com/nod-ai/SHARK-Turbine -b torchbench \
&& cd SHARK-Turbine \
&& pip install --pre --upgrade -e models -r models/requirements.txt \
&& cd $TB_SHARK_DIR \
&& git clone https://github.com/pytorch/pytorch \
&& cd pytorch/benchmarks \
&& touch __init__.py && cd ../.. \
&& git clone https://github.com/pytorch/benchmark && cd benchmark \
&& python3 install.py --models BERT_pytorch Background_Matting LearningToPaint alexnet dcgan densenet121 hf_Albert hf_Bart hf_Bert hf_GPT2 hf_T5 mnasnet1_0 mobilenet_v2 mobilenet_v3_large nvidia_deeprecommender pytorch_unet resnet18 resnet50 resnet50_32x4d shufflenet_v2_x1_0 squeezenet1_1 timm_nfnet timm_efficientnet timm_regnet timm_resnest timm_vision_transformer timm_vovnet vgg16 \
&& pip install -e .

ENV HF_HOME=/models/huggingface/

# initialization settings for CPX mode
ENV HSA_USE_SVM=0
ENV HSA_XNACK=0
