Replies: 2 comments
-
Resnet50 performance (tuned with fusion in place): Direction = fwd, Chip = gfx90a, InputLayout = nchwg, OutputLayout = nkhwg, N = 1, DilationH = 1, DilationW = 1
[Summary and Details tables]
-
Resnet50 performance (tuned with conv): Direction = fwd, Chip = gfx90a, InputLayout = nchwg, OutputLayout = nkhwg, N = 1, DilationH = 1, DilationW = 1
[Summary and Details tables]
-
We want to track the performance of the fusion test cases at fusion/e2e/resnet50 and xmir/e2e/bert-torch-tosa.
Experiments
To find the best perf_config for fusion, I tried two methods:
1. Tune the fusion test cases directly,
e.g.
./bin/rocmlir-driver -host-pipeline partition,highlevel -targets gfx90a ../mlir/test/xmir/e2e/bert-torch-tosa/bert_part_0.torch-tosa.mlir| ./bin/rocmlir-tuning-driver
The script for tuning all fusion test cases with this method is in this PR:
python3 ./bin/tuningRunner.py --op=fusion --test_dir=../mlir/test/xmir/e2e/bert-torch-tosa/ --verify-mode clone --tuning_db=bert-tuned-fusion.tsv
2. Extract the gemm or conv configurations from the fusion test cases and tune them in isolation,
e.g. the gemm configurations for the bert test cases are:
python3 ./bin/tuningRunner.py --op gemm -configs_file=./bert-fusion -o bert-tuned-gemm.tsv
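As a rough illustration of method 1, the per-file pipeline shown above could be generated for every test in a directory. This is only a sketch: the helper name `fusion_tuning_commands` and its parameters are hypothetical, and the only flags used are the ones that appear in the command above.

```python
import shlex
from pathlib import Path


def fusion_tuning_commands(test_dir, driver_dir="./bin", target="gfx90a"):
    """Return one shell pipeline string per .mlir file directly under test_dir.

    Hypothetical helper: it only rebuilds the rocmlir-driver |
    rocmlir-tuning-driver pipeline quoted in the post and does not run it.
    """
    commands = []
    for mlir_file in sorted(Path(test_dir).glob("*.mlir")):
        # Lower the fusion test case through the host pipeline.
        lower = [
            f"{driver_dir}/rocmlir-driver",
            "-host-pipeline", "partition,highlevel",
            "-targets", target,
            str(mlir_file),
        ]
        # Feed the lowered module to the tuning driver.
        tune = [f"{driver_dir}/rocmlir-tuning-driver"]
        commands.append(f"{shlex.join(lower)} | {shlex.join(tune)}")
    return commands
```

Each returned string can be passed to a shell, for example with `subprocess.run(cmd, shell=True, check=True)`.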
The experiments show that the best perf-configs obtained from these two methods differ for some cases.
I then benchmarked the fusion test cases using the perf-configs obtained from each method.
The resulting performance is similar: individual test cases differ by less than 10%, and the overall performance is almost identical. It therefore makes sense to obtain the perf-configs by isolating and tuning the gemm and conv operations.
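One way such a comparison could be computed is sketched below. It assumes each method's benchmark results are available as a dict mapping test name to a time. Using the geometric mean of per-test ratios as the "overall" figure is my assumption, not necessarily how the numbers above were aggregated, and the function names are hypothetical.

```python
import math


def relative_differences(times_a, times_b):
    """Per-test relative difference |a - b| / min(a, b), keyed by test name."""
    common = times_a.keys() & times_b.keys()
    return {
        name: abs(times_a[name] - times_b[name]) / min(times_a[name], times_b[name])
        for name in sorted(common)
    }


def geomean_ratio(times_a, times_b):
    """Geometric mean of times_a / times_b over the tests present in both."""
    common = sorted(times_a.keys() & times_b.keys())
    if not common:
        raise ValueError("no common test cases")
    log_sum = sum(math.log(times_a[n] / times_b[n]) for n in common)
    return math.exp(log_sum / len(common))
```

A geometric-mean ratio close to 1.0 together with every per-test difference below 0.10 would correspond to the observation above.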
The conv configurations from fusion e2e resnet50 tests look like:
Proposal:
My proposal is to extract the above configurations from the fusion tests during tuning, rather than hardcoding them in a config file. Then, we can run the tuningRunner using the generated configs. This approach eliminates the need to manually update the config file when fusion test cases are modified, making maintenance much easier.
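A minimal sketch of the generate-then-tune step might look like this. How the configurations are extracted from the fusion tests is left open here. The one-config-per-line file format and both helper names are assumptions; the tuningRunner.py flags are copied verbatim from the command earlier in this post.

```python
from pathlib import Path


def write_configs_file(configs, path):
    """Write unique, non-empty config lines to path, keeping first-seen order.

    Assumes one configuration per line (an assumption about the file format).
    """
    unique = list(dict.fromkeys(c.strip() for c in configs if c.strip()))
    Path(path).write_text("".join(line + "\n" for line in unique))
    return unique


def tuning_runner_command(op, configs_file, output_tsv):
    """Build the tuningRunner.py argv, with flags as written in the post."""
    return [
        "python3", "./bin/tuningRunner.py",
        "--op", op,
        f"-configs_file={configs_file}",
        "-o", output_tsv,
    ]
```

Deduplicating before tuning avoids tuning the same gemm or conv shape more than once when several fusion tests share it.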