MLP benchmarks #152

Merged: 18 commits, Jul 30, 2024
2 changes: 2 additions & 0 deletions cmake/tpp-mlir.cmake
@@ -36,6 +36,8 @@ if (TPP_MLIR_DIR)
-Wl,--no-as-needed
-L${TPP_MLIR_DIR}/lib
-ltpp_xsmm_runner_utils
-L${LLVM_LIBRARY_DIR}
-lmlir_c_runner_utils
-Wl,--as-needed
)
#FIXME: Provide platform-independent way of doing that:
72 changes: 72 additions & 0 deletions tools/mlir_bench/README.md
@@ -0,0 +1,72 @@
# MLP benchmarks

Various MLP benchmarks.
This document describes usage of the `*_bench.sh` scripts.

## LIBXSMM
- F32:
```bash
libxsmm_bench.sh
```
- BF16:
```bash
libxsmm_bench.sh -B
```

## Pure MLIR
- F32:
```bash
tpp_mlir_bench.sh -t f32
```
- BF16:
```bash
tpp_mlir_bench.sh -t bf16
```

## OV - no MLIR
Default model:\
`matmul_transpose_b + bias broadcast`

Alternative model - script flag `-b mlp`:\
`matmul + bias (no broadcast)`
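The difference between the two model variants can be sketched in plain Python (a hypothetical illustration only; the real OpenVINO models are built by `ov_model_gen.py`):

```python
def matmul(a, b):
    # Naive matmul: a is [m][k], b is [k][n].
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def default_model(x, w, bias_vec):
    # matmul_transpose_b + bias broadcast: weights stored as [out][in]
    # and transposed inside the matmul; the [out] bias vector is
    # broadcast over the batch dimension.
    y = matmul(x, transpose(w))
    return [[v + bias_vec[j] for j, v in enumerate(row)] for row in y]

def mlp_model(x, w, bias_full):
    # matmul + bias (no broadcast): weights stored as [in][out]; the
    # bias already has the full [batch][out] shape.
    y = matmul(x, w)
    return [[v + bias_full[i][j] for j, v in enumerate(row)]
            for i, row in enumerate(y)]
```

With matching weights and biases, both variants compute the same result; only the weight layout and bias shape differ.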

- F32:
```bash
OV_MLIR=0 mlp_bench.sh -t f32
```
- BF16:
```bash
OV_MLIR=0 mlp_bench.sh -t bf16
```

## OV + MLIR - full
Default model:\
`matmul_transpose_b + bias broadcast`

Alternative model - script flag `-b mlp`:\
`matmul + bias (no broadcast)`

- F32:
```bash
OV_MLIR=1 mlp_bench.sh -t f32
```
- BF16:
```bash
OV_MLIR=1 mlp_bench.sh -t bf16
```

## OV + MLIR - kernel only
Default model:\
`matmul_transpose_b + bias broadcast`

Alternative model - script flag `-b mlp`:\
`matmul + bias (no broadcast)`

- F32:
```bash
ov_raw_mlir_bench.sh -t f32
```
- BF16:
```bash
ov_raw_mlir_bench.sh -t bf16
```
70 changes: 70 additions & 0 deletions tools/mlir_bench/libxsmm_bench.sh
@@ -0,0 +1,70 @@
#!/bin/bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# Runs MLP benchmarks using libxsmm.

die_syntax() {
  echo "Syntax: $0 [-B] [-D]"
  echo ""
  echo "  -B: Use bf16 data type"
  echo "  -D: Set model shapes to dynamic"
  exit 1
}

# Cmd-line opts
while getopts "BD" arg; do
  case ${arg} in
    B)
      DATA_TYPE="bf16"
      ;;
    D)
      IS_DYNAMIC=true
      ;;
    ?)
      echo "Invalid option: ${OPTARG}"
      die_syntax
      ;;
  esac
done

BENCH_RUNNER=xsmm_dnn_mlp

# Initial validation.
if ! [ "$(command -v ${BENCH_RUNNER})" ]; then
  echo "Missing benchmark runner ${BENCH_RUNNER}"
  exit 1
fi
if [ ${IS_DYNAMIC} ]; then
  echo "Dynamic shapes are not supported by ${BENCH_RUNNER}"
  exit 1
fi

# Kernel config.
INPUT_SIZES=( 1024 2048 4096 8192 )
OUTPUT_SIZES=( 128 256 512 )
if [ ! "${DATA_TYPE}" ]; then
  DATA_TYPE="f32"
fi

echo "Result type: GFLOPS"
for OUT_SIZE in "${OUTPUT_SIZES[@]}"; do
  echo "MLP - OUT: ${OUT_SIZE} INS: ${INPUT_SIZES[@]}"
  for IN_SIZE in "${INPUT_SIZES[@]}"; do
    # Run benchmark.
    NUM_ITER=10000
    FUSE_TYPE=5
    TYPE=F
    TILES=(64 64 64)
    LAYOUT=(0 0)
    if [ "${DATA_TYPE}" = "bf16" ]; then
      LAYOUT=(1 1)
    fi
    # Disable parallelism.
    ENV_FLAGS=OMP_NUM_THREADS=1
    # No 'exec' here: exec would replace the shell and end the loop
    # after the first benchmark run.
    env ${ENV_FLAGS} ${BENCH_RUNNER} ${NUM_ITER} ${OUT_SIZE} ${FUSE_TYPE} ${TYPE} ${TILES[@]} \
      ${LAYOUT[@]} ${IN_SIZE} ${OUT_SIZE} \
      | sed -nE "s/.*GFLOPS\s+=\s*([0-9.]+).*/\\1/p"
  done
done
109 changes: 109 additions & 0 deletions tools/mlir_bench/mlp_bench.sh
@@ -0,0 +1,109 @@
#!/bin/bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# Runs OV MLP benchmarks.

die_syntax() {
  echo "Syntax: $0 [-t (f32|f16|bf16|...)] [-b (mlp)] [-D]"
  echo ""
  echo "  -t: Optional data type"
  echo "  -b: Optional baseline model"
  echo "  -D: Set model shapes to dynamic"
  exit 1
}

# Cmd-line opts
while getopts "t:b:D" arg; do
  case ${arg} in
    t)
      DATA_TYPE=${OPTARG}
      ;;
    b)
      BASELINE_MODEL=${OPTARG}
      ;;
    D)
      IS_DYNAMIC=true
      ;;
    ?)
      echo "Invalid option: ${OPTARG}"
      die_syntax
      ;;
  esac
done

OV_ROOT=$(git rev-parse --show-toplevel)
BENCH_ROOT=$(realpath ${OV_ROOT}/tools/mlir_bench)

MODEL_GEN=$(realpath ${BENCH_ROOT}/ov_model_gen.py)
BENCH_RUNNER=benchmark_app

# Initial validation.
if ! [ -d ${OV_ROOT} ]; then
  echo "Missing OV repo"
  exit 1
fi
if ! [ -d ${BENCH_ROOT} ]; then
  echo "Missing MLIR benchmark directory"
  exit 1
fi
if ! [ -f ${MODEL_GEN} ]; then
  echo "Missing model generator"
  exit 1
fi
if ! [ "$(command -v ${BENCH_RUNNER})" ]; then
  echo "Missing benchmark runner ${BENCH_RUNNER}"
  exit 1
fi
if [ "${BASELINE_MODEL}" ] && [ ${IS_DYNAMIC} ]; then
  echo "Baseline models with dynamic shapes not supported"
  exit 1
fi

# Kernel config.
INPUT_SIZES=( 1024 2048 4096 8192 )
OUTPUT_SIZES=( 128 256 512 )
if [ ! "${DATA_TYPE}" ]; then
  DATA_TYPE="f32"
fi
MODEL_NAME="MLIR_MLP_BENCH.xml"

echo "Result type: time [ms]"
for OUT_SIZE in "${OUTPUT_SIZES[@]}"; do
  echo "MLP - OUT: ${OUT_SIZE} INS: ${INPUT_SIZES[@]}"
  for IN_SIZE in "${INPUT_SIZES[@]}"; do
    # Generate model.
    if [ "${BASELINE_MODEL}" ]; then
      # Enable baseline model flag.
      MODEL_CONFIG=(-b="${BASELINE_MODEL}[${OUT_SIZE},${OUT_SIZE},${IN_SIZE}]")
    else
      # Generate default PyTorch MLP.
      MODEL_CONFIG=(-l="linear[${IN_SIZE},${OUT_SIZE}] relu[]")
    fi
    GEN_FLAGS=(-t ${DATA_TYPE} -n ${MODEL_NAME})
    if [ ${IS_DYNAMIC} ]; then
      GEN_FLAGS+=(--dynamic)
    fi
    python3 ${MODEL_GEN} "${MODEL_CONFIG[@]}" "${GEN_FLAGS[@]}"
    if [ $? != 0 ]; then
      echo "Failed to generate model"
      exit 1
    fi
    # Run benchmark.
    PRECISION=${DATA_TYPE}
    if [ "${DATA_TYPE}" = "bf16" ]; then
      # No native support for bf16, use simple f16 instead.
      PRECISION="f16"
    fi
    if [ ${IS_DYNAMIC} ]; then
      DATA_SHAPE=(-data_shape [${OUT_SIZE},${IN_SIZE}])
    fi
    # Benchmark config. Disable parallelism.
    PERF_FLAGS="-niter 10000 -hint none -nstreams 1 -nthreads 1"
    BENCH_FLAGS="-m ${MODEL_NAME} -d CPU \
      -ip ${PRECISION} ${DATA_SHAPE[@]} ${PERF_FLAGS}"
    ${BENCH_RUNNER} ${BENCH_FLAGS} 2>/dev/null | \
      sed -nE "s/.*\[ INFO \]\s*Median:\s*([0-9.]+).*/\\1/p"
  done
done