To identify the key optimizations adopted by high-performance compilers, we propose a hotspot-driven, semi-automatic framework that compares the binaries generated by two different compilers from the same source code. First, the framework uses a performance analysis tool (Linux perf) to collect the execution time and hotspot distribution of both binaries, and then automatically selects the identified hotspots that cause the performance difference between them. Next, a DynamoRIO client analyzes the instruction distribution characteristics of the selected hotspots, which helps narrow the scope of the binary analysis. All of the above steps are performed automatically by the framework. The subsequent binary analysis, which is based on the instruction distribution characteristics, must be done manually. Our work has been published at the CC23 conference.
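The selection step is not spelled out in this README. As a rough illustration only, the sketch below assumes that hotspots are ranked by their share of execution time and kept until their cumulative share reaches the `hotspot_selection_threshold` parameter (default 90%). The function name `select_hotspots` and the sample profile are hypothetical, and the framework's actual algorithm may differ.

```python
def select_hotspots(time_share, threshold=0.90):
    """Pick the hottest functions until their cumulative share reaches threshold.

    time_share maps a function name to its fraction of total execution time.
    This is an assumed reading of the threshold, not the framework's own code.
    """
    selected = []
    covered = 0.0
    # Rank functions from the largest share of execution time to the smallest.
    ranked = sorted(time_share.items(), key=lambda item: item[1], reverse=True)
    for name, share in ranked:
        if covered >= threshold:
            break
        selected.append(name)
        covered += share
    return selected


if __name__ == "__main__":
    # Hypothetical profile: fractions of total execution time per function.
    profile = {"f_a": 0.5, "f_b": 0.25, "f_c": 0.125, "f_d": 0.125}
    print(select_hotspots(profile, threshold=0.75))
```

Running the same selection on both binaries would give two hotspot lists that can then be compared function by function.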
- DynamoRIO
- Linux perf
- screen
- CMake
- python3 (pandas/numpy)
- The DynamoRIO client does not support Linux systems with glibc 2.34+, so use a Linux system configured with glibc 2.34 or below.
- Run the following command before using Linux perf:
sudo sysctl -w kernel.perf_event_paranoid=-1
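Both prerequisites can be checked before starting a long run. The following is a minimal sketch, assuming a Linux host where the setting is exposed at `/proc/sys/kernel/perf_event_paranoid`. The helper names are hypothetical, and the version bound simply mirrors the "glibc 2.34 or below" note above.

```python
import platform
from pathlib import Path

# Upper bound taken from the note above ("glibc 2.34 or below").
MAX_GLIBC = (2, 34)
PARANOID_FILE = Path("/proc/sys/kernel/perf_event_paranoid")


def parse_version(text):
    """Turn a version string such as '2.31' into a comparable tuple (2, 31)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_ok(version_text, max_version=MAX_GLIBC):
    """Return True when the glibc version does not exceed max_version."""
    version = parse_version(version_text)
    return bool(version) and version <= max_version


def perf_paranoid_ok(value_text):
    """Return True when kernel.perf_event_paranoid is -1, as set by sysctl."""
    return int(value_text.strip()) == -1


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    print(f"C library: {lib or 'unknown'} {version or 'unknown'}")
    if lib == "glibc":
        print("glibc version accepted:", glibc_ok(version))
    try:
        print("perf_event_paranoid is -1:",
              perf_paranoid_ok(PARANOID_FILE.read_text()))
    except (OSError, ValueError):
        print("could not read", PARANOID_FILE)
```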
Check the Linux kernel version:
uname -a
>Linux taishan-200 5.15.0-41-generic #44-Ubuntu SMP Thu Jun 23 11:20:13 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
Replace line 19 in Dockerfile-ubuntu-20.04 with the specified kernel version, for example:
linux-tools-*\ -> linux-tools-5.15.0-41-generic\
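The substitution above can also be scripted. This is a minimal sketch, assuming the Dockerfile line contains the literal text `linux-tools-*` as in the example. The helper name `pin_linux_tools` and the sample line are hypothetical.

```python
import platform


def pin_linux_tools(dockerfile_text, kernel_release):
    """Replace the 'linux-tools-*' wildcard with the package for one kernel release."""
    return dockerfile_text.replace("linux-tools-*", f"linux-tools-{kernel_release}")


if __name__ == "__main__":
    # platform.release() reports the same release string as `uname -r`.
    release = platform.release()
    sample_line = "linux-tools-*\\"  # hypothetical copy of the Dockerfile line
    print(pin_linux_tools(sample_line, release))
```

The release string should come from the host that will run the container, since perf has to match the host kernel.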
Build the Docker image
make build
Run the Docker image
make run
Download and unpack DynamoRIO
make tools
cd ./Samples && make
All parameters required by the framework are set in a JSON file. The parameters are described as follows.
- paths: The set of file paths used by the framework.
  - benchpath: Path to the application binaries.
  - dynamoriolibpath: Path to the DynamoRIO client (X64 platform: Dynamorio_lib/X64, AArch64 platform: Dynamorio_lib/AArch64).
  - outpath: Output path (the default path is Output).
  - dynamoriopath: Path to DynamoRIO (this path is DynamoRIO after running the make tools command).
- Application: The applications generated by two different compilers from the same source code.
  - application1: The name of application1.
  - application2: The name of application2.
  - run_command1: The command arguments of application1.
  - run_command2: The command arguments of application2.
- subfile: The detailed output path under outpath.
- hotspot_selection_threshold: The threshold of the identified hotspot selection algorithm (the default is 90%).
- runmode: The run mode.
  - function_name: Binary instrumentation of the identified hotspots.
  - logical_address: Binary instrumentation of specific logical address ranges.
- logical_address
  - application1:
    - instrumentation: Whether to enable the logical_address mode (the default is False).
    - start: The start logical address.
    - end: The end logical address.
  - application2:
    - instrumentation: Whether to enable the logical_address mode (the default is False).
    - start: The start logical address.
    - end: The end logical address.
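To make the parameter list concrete, here is a minimal sketch that assembles such a file from Python. The key names come from the list above, but the nesting, the value types (for example the threshold written as `0.9` and the flag as the string `"False"`), and every sample value such as `./bench` and `example_run` are assumptions. Compare the result with the config.json shipped in the repository before using it.

```python
import json


def build_config(app1, app2, args1="", args2="", platform_dir="X64"):
    """Assemble a config dictionary from the documented parameters.

    The layout and value types are assumptions for illustration only.
    """
    # Address-range instrumentation is switched off unless a range is given.
    no_range = {"instrumentation": "False", "start": "", "end": ""}
    return {
        "paths": {
            "benchpath": "./bench",  # hypothetical location of the binaries
            "dynamoriolibpath": f"Dynamorio_lib/{platform_dir}",
            "outpath": "Output",
            "dynamoriopath": "DynamoRIO",
        },
        "Application": {
            "application1": app1,
            "application2": app2,
            "run_command1": args1,
            "run_command2": args2,
        },
        "subfile": "example_run",  # hypothetical sub-folder name
        "hotspot_selection_threshold": 0.9,  # assumed encoding of 90%
        "runmode": "function_name",
        "logical_address": {
            "application1": dict(no_range),
            "application2": dict(no_range),
        },
    }


if __name__ == "__main__":
    config = build_config("app_compiler_a", "app_compiler_b")
    print(json.dumps(config, indent=2))
```

Generating the file this way keeps the X64 and AArch64 variants in sync, since only `platform_dir` changes between them.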
$ ./run-framework.sh config.json
The ./run-framework.sh config.json command starts the framework, and the final results are written to the ./outpath/subfile folder. To run a different application, pass a different JSON file.
- disassemble-data: The assembly files of the two applications generated by two different compilers from the same source code.
- instrument-data: The instrumentation data for the identified hotspots.
- instrument-data-logical-address: The instrumentation data for specific logical address ranges.
- application1_identified_hotspots.json: The list of identified hotspots of application1.
- application2_identified_hotspots.json: The list of identified hotspots of application2.
- report.out: The final summary report of the instrumentation data and profiling data for application1 and application2.
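After a run, the result folder can be checked against the entries listed above. This is a minimal sketch, assuming the entries appear under the result folder with exactly these names. In practice the `application1` and `application2` prefixes are presumably replaced by the configured application names, and `instrument-data-logical-address` may only exist in logical_address mode, so adjust the list accordingly. The helper name `missing_outputs` is hypothetical.

```python
import tempfile
from pathlib import Path

# Entry names copied from the list above; treat them as placeholders.
EXPECTED_OUTPUTS = [
    "disassemble-data",
    "instrument-data",
    "instrument-data-logical-address",
    "application1_identified_hotspots.json",
    "application2_identified_hotspots.json",
    "report.out",
]


def missing_outputs(result_dir, expected=EXPECTED_OUTPUTS):
    """Return the expected entries that are not present in result_dir."""
    root = Path(result_dir)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    # Demonstration on a temporary folder instead of a real result folder.
    with tempfile.TemporaryDirectory() as demo_dir:
        (Path(demo_dir) / "report.out").write_text("demo")
        print("missing:", missing_outputs(demo_dir))
```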
We use the method described in Step 1 to start the Docker container. Then, following the rules in Step 3, we modify config_omnetpp.json and config_exchange.json to set the corresponding configuration. Finally, we run the following commands on the two platforms.
Start the framework for the 620.omnetpp_s benchmark on the X64 platform (the framework runs for about 30 minutes):
$ ./run-framework.sh config_omnetpp.json
Start the framework for the 648.exchange2_s benchmark on the AArch64 platform (the framework runs for about 2 hours):
$ ./run-framework.sh config_exchange.json