Files from the course Computer Architecture @UMinho.
A study of optimization techniques applied to a matrix multiplication algorithm (the final version, with all algorithms, is in P11):
- gemm1 - Original Algorithm
- gemm2 - Change Loop Order to Increase Spatial Locality
- gemm3 - Local Variable to Avoid Aliasing
- gemm4 - Manual Unrolling
- gemm5 - Forcing Automatic Unrolling
- gemm6 - Compiler Autovectorization
- gemm7 - Compiler Vectorization and Loop Unrolling
- gemm8 - Vectorization with Intrinsics
- gemm9 - Vectorization with Intrinsics and Compiler Unrolling
- gemm10 - OpenMP Parallelization
- gemm11 - OpenMP Parallelization and Tree-vectorize
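
The first three steps in the list can be illustrated with a minimal sketch. This is not the code from the course files: the function names (`gemm_ijk`, `gemm_ikj`, `gemm_local_acc`, `gemm_variants_agree`), the row-major `float` layout, and the square `n x n` shape are all assumptions made for illustration, and the actual gemm1 to gemm3 sources may differ in signature and in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Baseline (i, j, k) order, in the spirit of gemm1 (hypothetical name).
 * The inner loop reads b with stride n, so consecutive iterations touch
 * different cache lines once n is large. */
void gemm_ijk(size_t n, const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            c[i * n + j] = 0.0f;
            for (size_t k = 0; k < n; k++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
        }
}

/* Reordered (i, k, j) loops, in the spirit of gemm2 (hypothetical name).
 * The inner loop now walks b and c with stride 1, which improves spatial
 * locality. a[i][k] is loop-invariant in the inner loop, so it is read once
 * into a local variable. */
void gemm_ikj(size_t n, const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0f;
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++) {
            const float a_ik = a[i * n + k];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += a_ik * b[k * n + j];
        }
}

/* Local accumulator, one possible reading of gemm3 (hypothetical name).
 * Writing through c inside the inner loop forces the compiler to assume the
 * store may alias a or b. Accumulating in a local lets the sum stay in a
 * register and stores to c only once per element. */
void gemm_local_acc(size_t n, const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

/* Helper for this sketch only: returns 1 if all three variants produce the
 * same result on small integer-valued inputs, 0 otherwise (or on allocation
 * failure). Integer-valued floats keep the arithmetic exact. */
int gemm_variants_agree(size_t n)
{
    assert(n > 0);
    float *buf = malloc(5 * n * n * sizeof *buf);
    if (buf == NULL)
        return 0;
    float *a = buf, *b = buf + n * n;
    float *c1 = buf + 2 * n * n, *c2 = buf + 3 * n * n, *c3 = buf + 4 * n * n;
    for (size_t i = 0; i < n * n; i++) {
        a[i] = (float)(i % 7);
        b[i] = (float)(i % 5);
    }
    gemm_ijk(n, a, b, c1);
    gemm_ikj(n, a, b, c2);
    gemm_local_acc(n, a, b, c3);
    int same = 1;
    for (size_t i = 0; i < n * n; i++)
        if (c1[i] != c2[i] || c1[i] != c3[i])
            same = 0;
    free(buf);
    return same;
}
```

Calling `gemm_variants_agree(8)` from a `main` function is a quick way to confirm that the variants compute the same product before timing them. Any speedup from the reordering or the local accumulator depends on the matrix size, the compiler, and the optimization flags, so it should be measured rather than assumed.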