Improves IO performance for the report readers #104

sergiorg-hpc · 2020-10-09T14:50:14Z

Changes

IO:

* Loop through rows (timesteps) instead of columns (node_ids).
  - Reduces the number of reads from tsteps * node_ids to tsteps (i.e., the
    number of IO operations no longer depends on how many node_ids are requested).
* At every timestep, read all node_ids between the min and max offsets of
  the requested node_ids into a buffer.
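
The row-wise strategy above can be sketched as follows. This is a minimal illustration using only the standard library, not the actual reader code: `Dataset`, `read_row_slice` and `read_columns` are hypothetical names, and the in-memory `values` vector stands in for the HDF5 dataset. The `reads` counter shows that the number of simulated reads equals the number of timesteps, regardless of how many columns are requested.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for the 2D 'data' dataset
// (rows = timesteps, columns = elements), stored row-major.
struct Dataset {
    std::size_t n_rows;
    std::size_t n_cols;
    std::vector<float> values;
    std::size_t reads;  // counts simulated IO operations

    Dataset(std::size_t rows, std::size_t cols, std::vector<float> v)
        : n_rows(rows), n_cols(cols), values(std::move(v)), reads(0) {
        assert(values.size() == rows * cols);
    }

    // Simulates one contiguous read of columns [col_begin, col_end) of a row
    // into a buffer that the caller has already allocated.
    void read_row_slice(std::size_t row, std::size_t col_begin, std::size_t col_end,
                        std::vector<float>& buffer) {
        assert(row < n_rows && col_begin <= col_end && col_end <= n_cols);
        assert(buffer.size() >= col_end - col_begin);
        ++reads;
        const std::size_t start = row * n_cols + col_begin;
        for (std::size_t i = 0; i < col_end - col_begin; ++i) {
            buffer[i] = values[start + i];
        }
    }
};

// Returns the requested columns for every timestep (row-major),
// issuing exactly one read per timestep.
std::vector<float> read_columns(Dataset& ds, const std::vector<std::size_t>& columns) {
    std::vector<float> result;
    if (columns.empty()) {
        return result;
    }
    const auto min_max = std::minmax_element(columns.begin(), columns.end());
    const std::size_t begin = *min_max.first;
    const std::size_t end = *min_max.second + 1;

    std::vector<float> buffer(end - begin);  // allocated once, reused for every row
    result.reserve(ds.n_rows * columns.size());
    for (std::size_t row = 0; row < ds.n_rows; ++row) {
        ds.read_row_slice(row, begin, end, buffer);
        for (std::size_t col : columns) {
            result.push_back(buffer[col - begin]);  // pick only the requested columns
        }
    }
    return result;
}
```

The trade-off in this sketch is that every element between the min and max offsets is read, even if only a few of them were requested.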

Memory:

* Allocate the buffer before passing it to HighFive to avoid extra,
  unnecessary memory allocations before reading.
* Change the data structure of node_ids/offsets from vector to map
  for faster search (from O(n^2) to O(n log n)).
* In soma reports, assign values directly to the return buffer instead
  of using std::copy/memcpy.
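
As a rough illustration of the vector-to-map change, the sketch below builds a `std::map` from node_id to its offset range and looks up each requested id in logarithmic time. The names `NodeID`, `build_index` and `find_ranges`, and the assumption that the offsets arrive as an array with one more entry than there are node_ids, are hypothetical and not taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using NodeID = std::uint64_t;
// [begin, end) offsets of a node's elements (hypothetical layout).
using Range = std::pair<std::uint64_t, std::uint64_t>;

// Builds the node_id -> range map. Assumes index_pointers holds one more
// entry than node_ids, so that node i owns [pointers[i], pointers[i + 1]).
std::map<NodeID, Range> build_index(const std::vector<NodeID>& node_ids,
                                    const std::vector<std::uint64_t>& index_pointers) {
    assert(index_pointers.size() == node_ids.size() + 1);
    std::map<NodeID, Range> index;
    for (std::size_t i = 0; i < node_ids.size(); ++i) {
        index.emplace(node_ids[i], Range(index_pointers[i], index_pointers[i + 1]));
    }
    return index;
}

// Each lookup is O(log n), so m requested ids cost O(m log n) in total,
// instead of a linear scan of the vector for every requested id.
// Ids that are not present in the index are skipped.
std::vector<Range> find_ranges(const std::map<NodeID, Range>& index,
                               const std::vector<NodeID>& requested) {
    std::vector<Range> ranges;
    ranges.reserve(requested.size());
    for (NodeID id : requested) {
        const auto it = index.find(id);
        if (it != index.end()) {
            ranges.push_back(it->second);
        }
    }
    return ranges;
}
```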

New features and others:

* Add support for strided reads, which reduce the number of timesteps
  read (e.g., 1 by default, 2 for every 2 timesteps, etc.).
* Eliminate duplicated code, avoid querying HDF5 metadata in every
  iteration, update unit tests, and make other minor changes.
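
To make the stride semantics concrete, here is a small sketch of how the timestep rows to read could be selected. The function name `strided_rows` and its parameters are hypothetical; the only behaviour taken from the description above is that a stride of 1 is the default and a stride of 2 keeps every second timestep.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the row (timestep) indices in [first, last], keeping every
// stride-th row starting at first. A stride of 1 keeps every row.
std::vector<std::size_t> strided_rows(std::size_t first, std::size_t last,
                                      std::size_t stride = 1) {
    assert(stride > 0 && first <= last);
    std::vector<std::size_t> rows;
    rows.reserve((last - first) / stride + 1);
    for (std::size_t row = first; row <= last; row += stride) {
        rows.push_back(row);
    }
    return rows;
}
```

With this sketch, `strided_rows(0, 4)` selects rows 0 to 4, while `strided_rows(0, 4, 2)` selects rows 0, 2 and 4.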

Important note

Performance evaluations and other related discussions can be found here:
https://bbpteam.epfl.ch/project/issues/browse/REP-60
https://bbpteam.epfl.ch/project/issues/browse/BLPY-217

src/report_reader.cpp

alkino · 2020-10-09T16:14:08Z

Really nice way to improve things

matz-e

Great! Some nitpicks about the docs… if that's intentional, please ignore.

include/bbp/sonata/report_reader.h

src/report_reader.cpp

mgeplf

Very nice addition.

src/report_reader.cpp

sergiorg-hpc · 2020-10-19T14:13:52Z

@jorblancoa and I went through all the comments and suggestions and integrated everything (especially Jorge, thanks for the effort). Thanks for all the great feedback!

If no one has any additional comments, we are ready to close the PR and merge the changes!

include/bbp/sonata/report_reader.h

python/tests/test.py

mgeplf

LGTM, nice work!

sergiorg-hpc requested review from alkino, matz-e and mgeplf October 9, 2020 14:50

sergiorg-hpc assigned sergiorg-hpc and jorblancoa Oct 9, 2020

Updated docstrings and clang format

2b4e03f

alkino previously approved these changes Oct 9, 2020

src/report_reader.cpp (outdated)

src/report_reader.cpp (outdated)

Change min/max calculations to one-liners

fc0da88

jorblancoa dismissed alkino’s stale review via fc0da88 October 9, 2020 17:03

matz-e previously approved these changes Oct 9, 2020

include/bbp/sonata/report_reader.h

include/bbp/sonata/report_reader.h (outdated)

src/report_reader.cpp

GianlucaFicarelli reviewed Oct 12, 2020

src/report_reader.cpp

src/report_reader.cpp

mgeplf reviewed Oct 17, 2020

Address comments and suggestions

24d294e

jorblancoa dismissed matz-e’s stale review via 24d294e October 19, 2020 13:58

alkino previously approved these changes Oct 19, 2020

mgeplf reviewed Oct 19, 2020

include/bbp/sonata/report_reader.h (outdated)

python/tests/test.py

Moves the Range/Ranges types definition locally

a69e779

sergiorg-hpc dismissed alkino’s stale review via a69e779 October 19, 2020 16:37

Throw exception when datatype of dataset 'data' is not Float32

9300a81

mgeplf self-requested a review October 20, 2020 11:20

mgeplf approved these changes Oct 20, 2020

sergiorg-hpc merged commit 3bb0906 into master Oct 20, 2020

sergiorg-hpc deleted the iofix_readers branch October 20, 2020 12:29

sergiorg-hpc mentioned this pull request Jan 4, 2022

Changes how the ids and positions are retrieved #174

Merged
