
Profiling basic linear algebra operations in various languages - with data analysis!


hmcalister/Linear-Algebra-Profiling


Linear Algebra Profiling

Hayden McAlister

This project compares different implementations of a very simple program. For a varying number of vectors, each implementation multiplies every vector by a matrix a varying number of times, applying a nonlinear transformation (the Heaviside function) after each multiplication. By altering the number of vectors, the number of multiplications, and other factors such as the number of threads, we can build a dataset that shows how each implementation handles this task.
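As a rough illustration of the task described above, here is a minimal pure-Python sketch. It is not taken from any of the implementations in this repo: the function names, the random matrix, the use of one vector per trial, and the convention that heaviside(0) = 0 are all assumptions made for the example.

```python
import random
import time


def heaviside(x):
    """Heaviside step function. The value at exactly 0 is taken as 0 (an assumption)."""
    return 1.0 if x > 0 else 0.0


def step(matrix, vector):
    """One update: multiply the vector by the matrix, then apply Heaviside elementwise."""
    return [heaviside(sum(m * v for m, v in zip(row, vector))) for row in matrix]


def run_trial(matrix, vector, multiplications):
    """Apply `multiplications` successive updates to a single vector."""
    for _ in range(multiplications):
        vector = step(matrix, vector)
    return vector


def profile(dimension, trials, multiplications, seed=0):
    """Time `trials` vectors (one per trial, an assumption) against a hypothetical random matrix."""
    rng = random.Random(seed)
    matrix = [[rng.uniform(-1, 1) for _ in range(dimension)] for _ in range(dimension)]
    start = time.perf_counter()
    for _ in range(trials):
        vector = [rng.choice([0.0, 1.0]) for _ in range(dimension)]
        run_trial(matrix, vector, multiplications)
    return time.perf_counter() - start  # elapsed wall-clock seconds
```

Calling `profile(dimension=64, trials=100, multiplications=10)` returns the elapsed time in seconds for that configuration. A real implementation would normally hand the matrix product to an optimized linear algebra library, since this pure-Python loop is only meant to show the shape of the computation.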

The overall goal is to analyze this data to select the best language for this task, taking into account the run times measured here as well as factors such as ease of development.

An example plot from the dashboard, showing all implementations colored by thread count and batch size

Structure

Dashboard

The main visualization tool, developed using Plotly and Dash. To view the dashboard, clone this repo, install the requirements, and run the server using:

cd Dashboard
pip install -r dash-requirements.txt
python main.py

Open the URL printed by the script in your web browser to view the dashboard.

An example plot from the dashboard, showing the best-performing implementations

Implementations

The Implementations directory holds subdirectories, each of which is a different implementation of the task described above. For consistency, all implementations accept the same command line arguments:

  • trials: Number of trials to attempt

  • multiplications: Number of multiplications per trial

  • threads: Number of threads to use

Note that the TensorFlow implementation also accepts a batchsize argument, and does not tend to play nicely with threading (especially on the GPU).
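To make the shared interface concrete, the three common arguments could be parsed in Python roughly as follows. This is only a sketch: the `--flag` spelling, the defaults, and the help strings are assumptions, and each implementation in the repo may spell or parse its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser for the three common arguments (flag names and defaults are hypothetical)."""
    parser = argparse.ArgumentParser(description="Linear algebra profiling run (illustrative sketch)")
    parser.add_argument("--trials", type=int, default=1,
                        help="Number of trials to attempt")
    parser.add_argument("--multiplications", type=int, default=1,
                        help="Number of multiplications per trial")
    parser.add_argument("--threads", type=int, default=1,
                        help="Number of threads to use")
    return parser
```

With this sketch, `build_parser().parse_args(["--trials", "100", "--multiplications", "10", "--threads", "4"])` yields an object with integer `trials`, `multiplications`, and `threads` attributes.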

Each implementation subdirectory also holds a .csv file that contains data on the profiling from that implementation. This data is collected, cleaned, and collated in the Data Cleaning directory.

Data Cleaning

The Data Cleaning directory contains main.ipynb - a Jupyter notebook that reads each implementation's .csv file, checks that the data is valid, combines the data from all implementations into a single dataframe, and performs some basic analysis on it to investigate relationships. This notebook also holds a lot of visualizations of the data along with some discussion - if you just want to have a peek at what's involved, start here!

Motivation

My Master's project looks into properties of the Hopfield network. The task defined above, heaviside(matrix * vector), is effectively the simplest Hopfield network implementation, so in effect we are profiling the Hopfield network!
