CP013: Add initial proposal for `this_system::discover_topology` #103

AerialMantis · 2019-09-28T23:32:28Z

Add terms of art for system topology, system resource and topology discovery policy.
Add minimal design forsystem_topology
Add free function this_system::discover_topology.
Introduce minimal system_resource class.
Introduce free function traverse_topology for traversing a
system_topology according to a topology traversal policy.
Update the changelog.

* Add initial wording for system topology term of art. * Add initial proposal for `system_topology` * Add initial proposal for `this_system::discover_topology`.

* Introduce terms of art for system resource and topology discovery policy. * Introduce minimal `system_resource` class. * Introduce free function `traverse_topology` for traversing a `system_topology` according to a topology traversal policy. * Update the changelog.

mhoemmen

Thanks for working on this! A few comments.

affinity/cpp-23/d1795r1.md

mhoemmen · 2019-10-06T03:08:14Z

affinity/cpp-23/d1795r1.md

+std::chrono::time_point<std::chrono::system_clock> timestamp() const noexcept;
+```
+
+*Returns:* A `std::chrono::time_point<std::chrono::system_clock>` object representing the time at which the runtime discovery of the system topology performed to construct the `system_topology` object was completed.


If we permit dynamic topologies, then does it make sense to talk about a timestamp for a whole topology? It's not really an atomic thing -- you have to traverse subtrees to discover things, so some subtrees may have gotten added after you started the process. Thus, you can't use the timestamp of discovering the first thing, since other things might not have existed yet at that time. You can't use the timestamp of discovering the last thing, since other things might have disappeared since then.

Applications might want to use a timestamp to decide whether they need to refresh the topology, but applications could just as easily compute their own timestamps.

You make a good point here. My initial thinking was that discover_topology would perform topology discovery for the entire system and then if something changed you could then discover again and compare the new topology with the previous. But perhaps we do want the interface to be more fine grained than that.

I can take the timestamp member function out until we decide how granular the interface should be.

@AerialMantis I think it's a good idea to take it out, if that's OK with you. My worry is that there's no good way to define a single timestamp for a whole topology, since discovery is not atomic. Providing a timestamp might give users the false impression that it is atomic.

I agree FWIW, it's something the user can do if they want to that the library can't really do a better job of than they can. It also might incur costs that would otherwise be unnecessary if we ever get to looking at compile-time visible versions or subsets.

These are good points, I agree we should allow the user flexibility to choose when they discover different parts of the topology and timestamp in a way that is appropriate to their use case.

I have removed the timestamp member function from this revision.

mhoemmen · 2019-10-06T03:52:41Z

affinity/cpp-23/d1795r1.md

+### `system_resource` constructors
+
+```cpp
+system_resource() = delete;


I'm pretty sure you can't have a std::vector of objects that aren't default constructible.

That's a good point, thanks for catching that! I believe we ran into the same issue in an earlier revision of the paper.

Though now that C++20 has ranges we can have this return a ranges::view, this also allows us to do range adaptations on top for further filtering the topology results.

We did have a debate earlier about whether getting the topology should return a copy (vector) or a view (span etc.). Both have issues: the former implies semiregularity of system_resource (and thus implies either reference counting or some way of assigning unique IDs), while the latter constrains the topology's lifetime in ways we don't want (if the topology is no longer valid, what happens to the view?).

I like the "assign unique IDs" approach, since it better matches how a lot of systems distinguish resources. (Affinity regions, discrete GPUs, MPI processes, threads, etc. all have IDs.) That would let users do convenient things like stuff resources into containers and run algorithms over them. The IDs would really need to be unique (and not like, say, Unix process IDs, that can get reused) so that users could compare two topologies. That suggests a debate about resource identity (e.g., how much do I care whether a reappeared remote resource is the same as before?) that will occupy people's time longer than it deserves ;-) (Theseus would care more about what his ship does than about its sameness).

You make a good point here. I can see value in both of these approaches, I think having lazy views over the system_resources within a system_topology could be a nice way to compose topology traversal routines, but I also agree that it's important that a system_resource has a unique identifier for the reasons you mentioned and that we ultimately want to be able to store the resulting system_resource(s) without worrying about their lifetimes being tied to the system_topology object.

Perhaps we could have the best of both here, we could have the system_resource be semiregular and provide a unique identifier, and then have traverse_topology return a ranges::view that is temporarily tied to the lifetime of the system_topology object but capable of being assigned to a vector storing a copy of each resulting system_resource. I believe this should work, though I am not an expert in ranges and will have to look into this some more.

Though I also think this is tied to some interesting questions about the lifetime of the system topology and the objects associated with it (system_resources, but also at some point executors, allocators, etc). I believe we made a lot of progress in this area in P0796 when defining the semantics of the execution_resource, so we should revisit that wording and look at incorporating that into the new paper for the next revision.

For now, I will leave the specific type returned from traverse_topology as to be decided with a note describing the two options.

All true. For me, I'd lean toward the view option with a lifetime tied to the system_topology if only because I can't think of a way to work with the result if it can be randomly invalidated asynchronously with my code and would prefer not to incur an extra copy to get that safety.

affinity/cpp-23/d1795r1.md

* Remove `timestamp` member function until we can decide on the granularity of `discover_topology`. * Update link to P0443 to point to latest revision. * Change return value of `traverse_topology` to a ranges::view<system_resource>`.

mhoemmen · 2019-10-07T10:11:27Z

affinity/cpp-23/d1795r1.md

+/* traverse_topology */
+
+template <class T>
+ranges::view<system_resource> traverse_topology(const system_topology &, const T &) noexcept;


See my previous comments. It might be good to lay out the copy vs. view approaches as alternatives for voting.

We could discuss it, but given that you can trivially collect the view into a vector if you want to keep a copy, is there a strong argument to be made for defaulting to copy?

@trws The issue with views is their lifetime. If the system topology changes, then what happens to the view? Do we make the implementation store a snapshot of the "last accessed topology"?

As part of the system_topology object? I'd say yes.

trws · 2019-10-07T21:04:40Z

I like the updates @AerialMantis, thanks for all your hard work on this!

* Make the return type of `traverse_topology` be to be defined and add note about potential options. * Add more detail to open questions section.

AerialMantis · 2019-11-23T17:22:11Z

I've merged this MR but created an issue - #107 to continue the discussion around the return type of traverse_topology for the next revision.

Add initial proposal for this_system::discover_topology

7057fa5

* Add initial wording for system topology term of art. * Add initial proposal for `system_topology` * Add initial proposal for `this_system::discover_topology`.

AerialMantis requested review from Ruyk and mhoemmen September 28, 2019 23:32

AerialMantis changed the title ~~Add initial proposal for this_system::discover_topology~~ CP013: Add initial proposal for this_system::discover_topology Sep 28, 2019

mhoemmen reviewed Oct 6, 2019

View reviewed changes

Make changes based on feedback.

0e2cd7f

* Remove `timestamp` member function until we can decide on the granularity of `discover_topology`. * Update link to P0443 to point to latest revision. * Change return value of `traverse_topology` to a ranges::view<system_resource>`.

mhoemmen reviewed Oct 7, 2019

View reviewed changes

Gordon Brown added 2 commits October 8, 2019 02:28

Add further improvements based on feedback.

01c03eb

* Make the return type of `traverse_topology` be to be defined and add note about potential options. * Add more detail to open questions section.

Merge branch 'master' into d1795r1-add-initial-proposal

55458f4

AerialMantis merged commit 9413386 into codeplaysoftware:master Nov 23, 2019

AerialMantis mentioned this pull request Nov 23, 2019

CP013: Return type for traverse_topology (P1795) #107

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CP013: Add initial proposal for `this_system::discover_topology` #103

CP013: Add initial proposal for `this_system::discover_topology` #103

AerialMantis commented Sep 28, 2019 •

edited

Loading

mhoemmen left a comment

mhoemmen Oct 6, 2019

mhoemmen Oct 6, 2019

AerialMantis Oct 6, 2019

mhoemmen Oct 7, 2019

trws Oct 7, 2019

AerialMantis Oct 8, 2019

mhoemmen Oct 6, 2019

AerialMantis Oct 7, 2019

mhoemmen Oct 7, 2019 •

edited

Loading

AerialMantis Oct 8, 2019

trws Oct 8, 2019

mhoemmen Oct 7, 2019

trws Oct 7, 2019

mhoemmen Oct 8, 2019 •

edited

Loading

trws Oct 8, 2019

trws commented Oct 7, 2019

AerialMantis commented Nov 23, 2019

CP013: Add initial proposal for this_system::discover_topology #103

CP013: Add initial proposal for this_system::discover_topology #103

Conversation

AerialMantis commented Sep 28, 2019 • edited Loading

mhoemmen left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mhoemmen Oct 7, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mhoemmen Oct 8, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

trws commented Oct 7, 2019

AerialMantis commented Nov 23, 2019

CP013: Add initial proposal for `this_system::discover_topology` #103

CP013: Add initial proposal for `this_system::discover_topology` #103

AerialMantis commented Sep 28, 2019 •

edited

Loading

mhoemmen Oct 7, 2019 •

edited

Loading

mhoemmen Oct 8, 2019 •

edited

Loading