Changes how the ids and positions are retrieved #174

Merged
merged 8 commits into master from sandbox/srivas/ids_perf_improvements on Feb 9, 2022

Conversation

@sergiorg-hpc (Contributor) commented Dec 22, 2021

This PR changes how the node IDs and positions are retrieved from storage. The node pointers are now filtered into a sub-map and used directly for the different purposes, instead of splitting the IDs and positions as in the original code. The IO pattern for the IDs also changes to avoid multiple hyperslabs, following the approach introduced for fetching the actual data from storage (#104). In addition, the changes simplify and unify parts of the code (e.g., by removing lambda functions).

Using a reference report of around 60TB, fetching a single timestep from the file for all the GIDs implies the following:

  • In the former implementation, getting the metadata of each of the 4,234,929 GIDs of the report triggers a hyperslab selection plus a read of around 300 values per GID (one per compartment), i.e., 4,234,929 reads of around 300 values each.
  • In the proposed implementation, a single hyperslab is triggered to fetch the data for all 4,234,929 IDs at once; this data is then conveniently filtered in memory (see the sketch after this list).
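
To make the contrast concrete, here is a minimal, hypothetical sketch in plain Python/h5py of the two IO patterns (the function names and dataset layout are illustrative only, not libsonata's actual code):

def fetch_old(dataset, rows):
    # Former pattern: one hyperslab selection + read per requested row,
    # i.e., millions of small IO requests on a big report.
    return [dataset[r] for r in rows]

def fetch_new(dataset, rows):
    # Proposed pattern: a single hyperslab covering the [min, max] range of
    # the requested rows, read once and then filtered in memory.
    lo, hi = min(rows), max(rows)
    block = dataset[lo:hi + 1]
    return [block[r - lo] for r in rows]

Here dataset stands for the on-disk mapping dataset (e.g., an h5py.Dataset) and rows for the row indices of the requested GIDs.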

These millions of small reads in the original implementation were recently causing problems in the GPFS file system of BB5, especially when scientists launched multiple processes over multiple nodes issuing similar requests on big reports.

@sergiorg-hpc changed the title from "Temp. commit - Changes how the ids and positions are retrieved" to "Changes how the ids and positions are retrieved" on Dec 22, 2021
@sergiorg-hpc self-assigned this on Dec 22, 2021
@sergiorg-hpc force-pushed the sandbox/srivas/ids_perf_improvements branch from f754003 to 1920499 on December 22, 2021 02:10
@mgeplf added the "WIP work in progress" label on Dec 22, 2021
@sergiorg-hpc force-pushed the sandbox/srivas/ids_perf_improvements branch from 02f77c6 to 1630710 on December 22, 2021 16:01
@sergiorg-hpc force-pushed the sandbox/srivas/ids_perf_improvements branch from 1630710 to 8f26d4f on January 4, 2022 09:55
@sergiorg-hpc removed the "WIP work in progress" label on Jan 10, 2022
@sergiorg-hpc force-pushed the sandbox/srivas/ids_perf_improvements branch from 8f26d4f to d2b9b61 on January 10, 2022 13:41
@sergiorg-hpc marked this pull request as ready for review on January 10, 2022 13:58
@mgeplf (Contributor) commented Jan 11, 2022

Using a reference report of around 60TB with a single process, fetching a single timestep from the file for all the GIDs takes the following:

How does the change impact fetching multiple timesteps w/ a subset of GIDs?

@sergiorg-hpc (Contributor, Author) commented

How does the change impact fetching multiple timesteps w/ a subset of GIDs?

@mgeplf If the GIDs are far apart (e.g., 2 GIDs, one at the very beginning and one at the very end), then performance can be worse if the report is extremely large. However, the data itself is always fetched with this same technique, meaning that even if we leave the implementation as it is, the cost of fetching the data will be just as bad in this particular edge case. This is the trade-off we made last year when we switched from millions of hyperslabs to a single hyperslab per timestep; overall, it provides large performance benefits in comparison.

But just to clarify, my next work item goes exactly in that direction! My plan is to introduce a threshold that creates multiple ranges of data selection, instead of the single range used now (see the sketch below). Following the previous example, if the gap between the GIDs is large, we would create two (or more) separate hyperslabs, each covering as much data as possible. This would still be efficient for the file system, because we keep the number of requests low, while handling the edge cases that can occur.
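
As a rough illustration of that plan, here is a minimal, hypothetical Python sketch (the function name and the gap_threshold parameter are placeholders, not a committed API):

def coalesce_ranges(sorted_ids, gap_threshold):
    # Hypothetical sketch: split the selection into contiguous ranges
    # whenever two consecutive IDs are further apart than the threshold,
    # instead of always using a single [min, max] range.
    ranges = []
    start = prev = sorted_ids[0]
    for current in sorted_ids[1:]:
        if current - prev > gap_threshold:
            ranges.append((start, prev + 1))  # half-open [start, end)
            start = current
        prev = current
    ranges.append((start, prev + 1))
    return ranges

# For example, coalesce_ranges([1, 2, 3, 4000000], 1000) yields
# [(1, 4), (4000000, 4000001)]: two hyperslabs instead of one huge one.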

However, this is a separate task. For now, I wanted to unify both implementations and merge this into libsonata; the next step would then be to add that extra optimization. Both Joseph and Andras have tried this patch and obtained clear performance benefits.

What do you think?

@mgeplf (Contributor) commented Jan 11, 2022

But just to clarify, my next work item would be in that direction actually!

Cool.

Can we at least run the "fetching multiple timesteps w/ a subset of GIDs" case with the old code and the new code, and put the results in this ticket?

@sergiorg-hpc (Contributor, Author) commented

Yup, I will run a test throughout the morning and add it here, no problem at all.

Just an additional note that I forgot to mention: keeping the old implementation is heavy on GPFS when multiple processes open reports. For instance, Joseph was involuntarily blocking GPFS for everybody with his simulations, and most of the load came right at the beginning, from fetching the IDs and positions. Regardless of whether we merge this or not, the data itself is already fetched over the min/max range anyway. So, fetching the IDs the same way is equivalent to adding one extra timestep to the report, if that makes sense (i.e., 10001 timesteps instead of 10000).

Nonetheless, I 100% agree; let me do a couple of runs so we understand where we are, and we can take it from there.

@sergiorg-hpc force-pushed the sandbox/srivas/ids_perf_improvements branch from d2b9b61 to 4d794dd on January 18, 2022 08:32
@sergiorg-hpc (Contributor, Author) commented Jan 21, 2022

After rebasing on master to fix the compilation-mode bug (see #175), here are some results using a single process to fetch one timestep with the current version from master (i.e., v0.1.11) versus the same version with the changes from this PR (i.e., v0.1.11+Patch):
[Figure: execution time (s) vs. number of GIDs requested, v0.1.11 vs. v0.1.11+Patch, with a table of per-test times below the plot]

The figure shows execution time in seconds on the y-axis and the number of GIDs requested on the x-axis. The test script uses the reference ~60TB report with 4.2M GIDs and randomly selects GIDs from the input, mimicking some sort of spatial binning. Each test requests a single timestep, which internally first triggers the fetching of the GIDs and positions (i.e., the part changed by this PR) and then accesses the data to retrieve the timestep (i.e., a part that is identical in both versions and was previously optimized in #104*). The table at the bottom shows the exact execution time for each test, with the 4M test corresponding to all 4.2M GIDs in the report.

I must admit that, after the compilation fix, the original code doesn't look as bad as it used to. For instance, selecting chunks of sequential GIDs is not that much worse. However, I would like to kindly point out that we cannot keep the code as it is; otherwise it will put considerable stress on GPFS when scientists run hundreds of processes on big reports. Moreover, this PR is a prerequisite for the next optimization step, which reads in chunks up to a certain threshold, as I mentioned in an earlier comment. That step would improve performance when requesting a small number of GIDs located far apart, as in the figure above.

For reference purposes, here is the Python script used for the tests:

import libsonata
import time
import numpy as np

# Open the ~60TB reference report (path shortened here)
elements = libsonata.ElementReportReader('/gpfs/.../002/AllCurrents.h5')
population_elements = elements['All']
print("File loaded!")

# Time a single-timestep fetch for 10^0 .. 10^6 randomly selected GIDs
for i in range(0, 7):
    count = pow(10, i)
    node_ids = population_elements.get_node_ids()

    # Fixed seed, so every run samples the same GIDs
    np.random.seed(21)
    node_ids = list(np.random.choice(node_ids, count, replace=False))

    start = time.time()
    data_frame = population_elements.get(node_ids, tstart=5000.0, tstop=5000.1)
    end = time.time()

    print(str(count) + " / " + str(end - start) + "\n-----")

* Note that fetching more than one timestep in the tests would not show any significant difference, as that part of the code is identical.

@matz-e (Member) previously approved these changes on Jan 24, 2022 and left a comment:

Looks good to me; some top-level abstract documentation would be good, though.

[Review threads on include/bbp/sonata/report_reader.h and src/report_reader.cpp — outdated/resolved]
@sergiorg-hpc (Contributor, Author) commented

Thank you very much for the comments and suggestions, @mgeplf. @jorblancoa and I have addressed most of them and also verified that performance remains equivalent.

If you like this version, let's go ahead and merge it to create a new libsonata release for deployment.

Have a nice weekend and thank you once again!

[Review threads on include/bbp/sonata/report_reader.h and src/report_reader.cpp — outdated/resolved]
@sergiorg-hpc (Contributor, Author) commented Feb 1, 2022

@mgeplf With all due respect, I think we are missing the point here. Once again: the current libsonata already reads the report data with the min/max optimization. That part is no different here, and the old implementation is penalized just the same when fetching the first and the last GID.

This PR unifies the IO pattern used for the metadata that is required to obtain the data. Without this change, the code triggers millions (I insist, millions) of small IO requests on big reports, just before fetching a single timestep of data. The performance numbers listed above are for fetching a single timestep, just one! Then, you have to add the time to fetch the other 9999 timesteps of the report, which will have an identical cost in both implementations.

I will now upload some of your suggested changes, ok?

@mgeplf (Contributor) commented Feb 1, 2022

This PR unifies the IO pattern used for the metadata that is required to obtain the data.

I think this actually pessimizes some use cases, though; for instance, if I use get_node_id_element_id_mapping(sel) on a selection of the large report, the time to get the result goes from about 6s to almost 10s.

@sergiorg-hpc (Contributor, Author) commented

This PR unifies the IO pattern used for the metadata that is required to obtain the data.

I think this actually pessimizes some use cases, though; for instance, if I use get_node_id_element_id_mapping(sel) on a selection of the large report, the time to get the result goes from about 6s to almost 10s.

Can you provide the scripts and the report you used? Also, I have repeatedly mentioned that this PR is needed in order to add an optimization for the edge cases while still handling the big reports.

We have already optimized the code across 4-5 contributions, and this one builds on those.

@mgeplf (Contributor) commented Feb 2, 2022

The test case is:

import libsonata

# Fetch the element mapping for two widely separated node IDs
p = '/gpfs/bbp.cscs.ch/data/scratch/proj85/lotsofSims/baseline/633cbccb-f6b2-4e10-8f83-0a2655dcbcc7/000/AllCurrents.h5'
r = libsonata.ElementReportReader(p)['All']
mapping = r.get_node_id_element_id_mapping(libsonata.Selection([4228232, 15872]))
print(len(mapping))
print(mapping[0], mapping[-1])

The reason this is slower is likely that getNodeIdElementIdMapping calls getElementIds, which now does more work than before.

The behavior has also changed (which we have to be careful about because of Hyrum's law): since a map is now used internally, the returned values come back in a different order.
Before, the second print gave [4228232, 0] [15872, 176], and now it gives [15872, 0] [4228232, 579]; so we should mention that the order of the returned values isn't guaranteed to be stable.
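
For callers that depended on the old ordering, an illustrative workaround (assuming, as the prints above suggest, that the mapping behaves like a sequence of [node_id, element_id] pairs) is to re-sort on their side:

# Illustrative only: impose a deterministic order on the caller's side,
# since the order returned by get_node_id_element_id_mapping is not stable.
stable = sorted(mapping, key=lambda pair: pair[0])  # sort by node ID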

@matz-e (Member) commented Feb 2, 2022

Seems to me that the test case is fetching two well-separated IDs, which is a known regression to be followed up and optimized on. It just seemed that doing it all in one go would lead to a more complex PR and breaking the work up would be better.

As for the return values… re-sorting according to the user input could be costly. Seems doable, though.

@mgeplf (Contributor) commented Feb 4, 2022

Seems to me that the test case is fetching two well-separated IDs, which is a known regression to be followed up and optimized on.

Yeah, fair enough, we just need to be clear on what is improving, and this seems like a reasonable tradeoff.

It just seemed that doing it all in one go would lead to a more complex PR and breaking the work up would be better.

Yup, for sure.

mgeplf and others added 2 commits February 4, 2022 10:44
* struct ElementIdsData -> NodeIdElementLayout
* getElementIds -> getNodeIdElementLayout
commit 79ce37c
Author: Mike Gevaert <[email protected]>
Date:   Mon Feb 7 08:16:36 2022 +0100

    Try using a vector instead of a map

    * make NodeIdElementLayout::node_ranges a vector instead of a map; the
      key of the map wasn't used in the calling functions, and only
      iteration was performed
    * rename NodeIdElementLayout::range -> min_max_range
    * only fill .times if there is a data payload
@sergiorg-hpc force-pushed the sandbox/srivas/ids_perf_improvements branch from 492bab1 to d6e7e5b on February 8, 2022 14:01
@sergiorg-hpc force-pushed the sandbox/srivas/ids_perf_improvements branch from d6e7e5b to 5ea2a3b on February 8, 2022 14:02
[Review thread on src/report_reader.cpp — resolved]
@mgeplf (Contributor) commented Feb 8, 2022

Looks good to me; I asked @NadirRoGue about the usage patterns on the viz side, and it sounds like the number of node IDs examined usually goes from 100 to 200k; so I think it's fine to pessimize the low end for the gains in the larger views.

I can merge, or you can do the honours, @sergiorg-hpc. Thanks for the contribution!

@sergiorg-hpc (Contributor, Author) commented

Looks good to me; I asked @NadirRoGue about the usage patterns on the viz side, and it sounds like the number of node IDs examined usually goes from 100 to 200k; so I think it's fine to pessimize the low end for the gains in the larger views.

I can merge, or you can do the honours, @sergiorg-hpc. Thanks for the contribution!

Do not thank me, this was a team effort (props to @jorblancoa, @matz-e and others). Also, thanks to you and your team, @mgeplf.

I cannot merge because I do not have permissions, so please go ahead. Also, if you don't mind, create a new tag v0.1.11 so that I can open a PR in Spack to release the Debug->Release patch and this one on BB5 as soon as possible, ok? I will add you as a reviewer of that PR as well.

Once again, thank you for the insight and let's hope this fix can further help our users.

@mgeplf merged commit 2863397 into master on Feb 9, 2022
@mgeplf deleted the sandbox/srivas/ids_perf_improvements branch on February 9, 2022 07:24