kndonetm/lbyarch-mco2

LBYARCH MCO2

Introduction

Benchmarks for a simple vector dot product implementation written in C and x86-64 assembly.

Usage

To run the benchmarks, clone the repository and open the project in Visual Studio. The preprocessor definitions in main.c determine the size of the benchmark being run. Set TEST_SIZE to TEST_1_SIZE to benchmark vectors of size $2^{20}$, to TEST_2_SIZE for size $2^{24}$, or to TEST_3_SIZE for size $2^{29}$.
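As a rough illustration of how the size selection could look, here is a minimal sketch. The macro names come from this README, but their exact definitions in main.c are an assumption, and `vector_bytes` is a hypothetical helper added only to show the memory cost of each size.

```c
#include <stddef.h>

/* Macro names are from the README; these definitions are assumed,
   not copied from main.c. */
#define TEST_1_SIZE ((size_t)1 << 20)  /* 2^20 elements */
#define TEST_2_SIZE ((size_t)1 << 24)  /* 2^24 elements */
#define TEST_3_SIZE ((size_t)1 << 29)  /* 2^29 elements */

/* Change this line to pick the benchmark size. */
#define TEST_SIZE TEST_1_SIZE

/* Hypothetical helper: bytes needed for one vector of TEST_SIZE elements,
   given the size in bytes of a single element. */
static size_t vector_bytes(size_t elem_size) {
    return (size_t)TEST_SIZE * elem_size;
}
```

If the elements were 4-byte floats (an assumption), a single vector at TEST_3_SIZE would take 2 GiB, and a dot product needs two such vectors.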

Benchmarking Results

Below are the benchmarking results for the dot product operation on vectors of size $2^{20}$, $2^{24}$, and $2^{29}$. The largest size tested was $2^{29}$ because the test hardware could not support vectors of size $2^{30}$ or larger.

The columns of each benchmark result refer to the following:

  • C Time: Time (in seconds) to run the dot product operation, written in C.
  • ASM Time: Time (in seconds) to run the dot product operation, written in x86-64 Assembly.
  • C Answer: Result of the C dot product operation. This column is used as a reference to check the correctness of the corresponding Assembly operation.
  • ASM Answer: Result of the Assembly dot product operation.
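The four columns above could be produced by a harness along these lines. This is only a sketch: the `dot_fn` signature, the `float` element type, and the use of `clock()` are assumptions, not the repository's actual code. The Assembly kernel would be passed in through the same function pointer type.

```c
#include <stddef.h>
#include <time.h>

/* Signature assumed for illustration; the repository's kernels may differ. */
typedef float (*dot_fn)(const float *a, const float *b, size_t n);

/* Plain C reference kernel: sum of element-wise products. */
static float dot_c(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Runs one kernel, stores its result in *answer, and returns the
   elapsed time in seconds as measured by clock(). */
static double time_dot(dot_fn fn, const float *a, const float *b,
                       size_t n, float *answer) {
    clock_t start = clock();
    *answer = fn(a, b, n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_dot` once with the C kernel and once with the Assembly kernel gives the two time columns, and the two `answer` values can then be compared to check correctness.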

Debug Mode

Benchmarking results for vectors of size $2^{20}$ in debug mode

[Image: Debug $2^{20}$ results]

Benchmarking results for vectors of size $2^{24}$ in debug mode

[Image: Debug $2^{24}$ results]

Benchmarking results for vectors of size $2^{29}$ in debug mode

[Image: Debug $2^{29}$ results]

Release Mode

Benchmarking results for vectors of size $2^{20}$ in release mode

[Image: Release $2^{20}$ results]

Benchmarking results for vectors of size $2^{24}$ in release mode

[Image: Release $2^{24}$ results]

Benchmarking results for vectors of size $2^{29}$ in release mode

[Image: Release $2^{29}$ results]

Analysis

In all three debug-mode benchmarks, the Assembly program performed significantly better than the C program, while the opposite was true in all three release-mode benchmarks. Without optimizations and with debug hooks in place, the C program incurred significant overhead, most likely because its values were repeatedly stored to and loaded from memory. In release mode, however, the compiler was allowed to optimize the C code, which then outperformed the handwritten Assembly code. This may be because the compiler could use vector SIMD operations, among other optimizations, to reduce the number of instructions needed to compute the final result. The Assembly code, on the other hand, is limited to scalar SIMD operations, which process one floating-point value per instruction.
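To make the scalar versus vector distinction concrete, the sketch below writes the same dot product both ways using SSE intrinsics. It assumes `float` elements and an x86-64 target. It is not the code the compiler generated for this project, and the function names are hypothetical.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics, part of the x86-64 baseline */

/* Scalar form: one multiply and one add per element. */
static float dot_scalar(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Packed form: each instruction handles four elements at once. */
static float dot_packed(const float *a, const float *b, size_t n) {
    __m128 acc = _mm_setzero_ps();
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);  /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }
    /* Combine the four partial sums held in the accumulator. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    float sum = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
    /* Handle any leftover elements when n is not a multiple of 4. */
    for (; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The packed version adds the products in a different order than the scalar loop, so for large inputs the two results can differ in their last digits. This is worth keeping in mind when comparing the C Answer and ASM Answer columns.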
