
Performance loss when matrix size becomes a significant fraction of physical RAM #5

Open
alugowski opened this issue Jul 13, 2023 · 1 comment

Comments

@alugowski

I've been benchmarking a few matrix loaders and noticed a performance degradation in PIGO when the matrix being loaded occupies a large fraction of the available RAM.

See https://github.com/alugowski/sparse-matrix-io-comparison

The machine has 16 GiB of RAM (it's a laptop). The 1 GiB file shows amazing read and write performance from PIGO, but the 10 GiB file is about an order of magnitude slower in both. While experimenting I noticed that the performance drop is gradual and depends on the memory fraction: an 8 GiB file shows less degradation than a 10 GiB one, but more than a 6 GiB one.

(The generated MatrixMarket files and code are defined such that the file size is roughly equal to the memory requirement of the matrix.)

I noticed the PIGO paper used a 1 TB machine to load files of at most ~30 GiB, so this may or may not be important.

@alugowski (Author)

My first suspicion was that PIGO reads the mmapped region twice, with an access pattern such that, if the entire file does not fit in RAM, the second pass will not find its data in the page cache left by the first pass.

There must be more going on, though, because that alone would explain a ~2x slowdown, not a ~10x one. Perhaps OS caching behavior comes into play as well.
