
Updates to split execution and memory resources, reduce scope #52

Open
wants to merge 3 commits into base: master

Conversation

Collaborator
@hcedwar hcedwar commented May 4, 2018

Split execution and memory resources within topology.
Make execution resources iterable.
Reduce scope by removing specific execution context, which was not needed for affinity exploration.
Tied current naming and discussion to an execution resource capable of executing std::thread.


affinity_query(execution_resource &&, execution_resource &&) noexcept;
affinity_query(const execution_resource &, const std::pmr::memory_resource &) noexcept;
Contributor

#49 suggests we should get rid of pmr::memory_resource; should we simply use memory_resource here? Or should we do a separate pull request for that change?

Collaborator Author

Let's do a follow-up discussion on #49 and a follow-up PR.

Contributor

Makes sense, let's keep your change until #49 is sorted out.

Contributor

I agree. We decided that for the purposes of affinity-based allocation it didn't make sense to use the pmr memory resource. However, after separating the execution and memory topologies, I think the pmr memory resource could be a good choice to represent a node within the memory topology, even if that is then passed to a concrete affinity-aware allocator type. Though I think we should (as @hcedwar suggested in #41) make it a type derived from the pmr memory resource so that it can be made iterable. I think we need to discuss this further.
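For illustration, a minimal sketch of that idea, assuming a hypothetical node type that derives from `std::pmr::memory_resource` and exposes the child resources it is composed of (none of these names are proposed wording):

```cpp
#include <cstddef>
#include <memory_resource>
#include <string>
#include <vector>

// Hypothetical sketch only: a memory-topology node that derives from
// std::pmr::memory_resource so it can be used with pmr allocators, while
// also exposing its member resources, making the topology iterable.
class hierarchical_memory_resource : public std::pmr::memory_resource {
public:
  const std::string &name() const noexcept { return name_; }

  // Child memory resources this node is composed of.
  const std::vector<hierarchical_memory_resource *> &resources() const noexcept {
    return children_;
  }

protected:
  // For illustration, allocation simply delegates to an upstream resource.
  void *do_allocate(std::size_t bytes, std::size_t alignment) override {
    return upstream_->allocate(bytes, alignment);
  }
  void do_deallocate(void *p, std::size_t bytes, std::size_t alignment) override {
    upstream_->deallocate(p, bytes, alignment);
  }
  bool do_is_equal(const std::pmr::memory_resource &other) const noexcept override {
    return this == &other;
  }

private:
  std::string name_;
  std::vector<hierarchical_memory_resource *> children_;
  std::pmr::memory_resource *upstream_ = std::pmr::get_default_resource();
};
```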

@@ -474,7 +402,7 @@ The `affinity_query` class template provides an abstraction for a relative affin

### `affinity_query` constructors

affinity_query(const execution_resource &, const execution_resource &) noexcept;
affinity_query(const execution_resource &, const std::pmr::memory_resource &) noexcept;
Contributor

see my comment above about #49

@hcedwar hcedwar mentioned this pull request May 5, 2018

The capability of querying underlying *execution resources* of a given *system* is particularly important towards supporting affinity control in C++. The current proposal for executors [[22]][p0443r4] leaves the *execution resource* largely unspecified. This is intentional: *execution resources* will vary greatly between one implementation and another, and it is out of the scope of the current executors proposal to define those. There is current work [[23]][p0737r0] on extending the executors proposal to describe a typical interface for an *execution context*. In this paper a typical *execution context* is defined with an interface for construction and comparison, and for retrieving an *executor*, waiting on submitted work to complete and querying the underlying *execution resource*. Extending the executors interface to provide topology information can serve as a basis for providing a unified interface to expose affinity. This interface cannot mandate a specific architectural definition, and must be generic enough that future architectural evolutions can still be expressed.

Two important considerations when defining a unified interface for querying the *resource topology* of a *system*, are (a) what level of abstraction such an interface should have, and (b) at what granularity it should describe the topology's *execution resources*. As both the level of abstraction of an *execution resource* and the granularity that it is described in will vary greatly from one implementation to another, it’s important for the interface to be generic enough to support any level of abstraction. To achieve this we propose a generic hierarchical structure of *execution resources*, each *execution resource* being composed of other *execution resources* recursively. Each *execution resource* within this hierarchy can be used to place memory (i.e., allocate memory within the *execution resource’s* memory region), place execution (i.e. bind an execution to an *execution resource’s execution agents*), or both.
Two important considerations when defining a unified interface for querying the *resource topology* of a *system*, are (a) what level of abstraction such an interface should have, and (b) at what granularity it should describe the topology's *execution resources* and *memory resources*. As both the level of abstraction of resources and the granularity that they describe in will vary greatly from one implementation to another, it’s important for the interface to be generic enough to support any level of abstraction. To achieve this we propose generic hierarchical structures for *execution resources* and *memory resources*, where each *resource* being composed of other *resources* recursively.
Contributor
@AerialMantis AerialMantis May 5, 2018

I think we should state here that each execution resource should be made up of other execution resources and each memory resource should be made up of other memory resources:

"where each resource being composed of other resources recursively" -> "where each is recursively composed of other execution resources and memory resources respectively"

Collaborator Author

I started down this path, and the problem I ran into was having a purely hierarchical memory resource. Even without a GPU in the picture you will have main memory, high-bandwidth memory, and non-volatile memory. A hierarchical memory resource would require a top-level allocator that spans all of them. The path I followed was instead for each execution resource to have a set of memory resources.
I am uneasy rushing this scope; a little more design-discussion time would have been helpful, but we are up against the pre-meeting mailing clock.
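A rough synopsis of that shape, purely for illustration (the `memory_resources` member and the types shown are assumptions, not proposed wording):

```cpp
#include <cstddef>
#include <vector>

class memory_resource;  // e.g. main memory, high-bandwidth memory, NV memory

// Hypothetical sketch only: each execution resource carries a flat set of
// memory resources it can allocate from, rather than the memory resources
// forming a single pure hierarchy with a top-level allocator.
class execution_resource {
public:
  // Execution resources this one is partitioned into (its members).
  const std::vector<execution_resource> &resources() const noexcept;

  // The set of memory resources accessible from this execution resource;
  // several execution resources may refer to the same memory resource.
  const std::vector<memory_resource *> &memory_resources() const noexcept;

  std::size_t concurrency() const noexcept;
};
```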

Contributor

That's a good point; having a separate set of memory resources for each execution resource could work, though this could lead to many execution resources pointing to the same memory resource, which could make it confusing for users to figure out which memory resources are the same and which are different. Perhaps we could instead maintain a hierarchy of memory resources, but have each execution resource point to the highest-level memory resource in which it can allocate memory; the lower-level memory resources could then be traversed from there, which would be consistent enough to avoid confusion.

One way we could get around the problem of needing a top-level allocator could be to have a memory resource which represents the top of the topology and is either not capable of allocating (though that would likely mean throwing an exception) or is free to allocate anywhere within the topology.

I agree that the separation of execution and memory topologies has revealed some difficult design points which we will need further discussion to resolve.
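A minimal sketch of the first of those options, assuming a hypothetical root node that exists only to structure the topology and throws if asked to allocate:

```cpp
#include <cstddef>
#include <memory_resource>
#include <new>

// Hypothetical sketch only: a memory resource representing the top of the
// topology that cannot itself allocate; it serves purely as the root from
// which lower-level memory resources can be traversed.
class system_memory_resource : public std::pmr::memory_resource {
protected:
  void *do_allocate(std::size_t, std::size_t) override {
    throw std::bad_alloc();  // the root node only structures the topology
  }
  void do_deallocate(void *, std::size_t, std::size_t) override {}
  bool do_is_equal(const std::pmr::memory_resource &other) const noexcept override {
    return this == &other;
  }
};
```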


The interface for querying the *resource topology* of a *system* must be flexible enough to allow querying all *execution resources* available under an *execution context*, querying the *execution resources* available to the entire system, and constructing an *execution context* for a particular *execution resource*. This is important, as many standards such as OpenCL [[6]][opencl-2-2] and HSA [[7]][hsa] require the ability to query the *resource topology* available in a *system* before constructing an *execution context* for executing work.
The interface for querying the *resource topology* of a *system*
must be flexible enough to allow
Contributor

This sentence reads a bit strange; perhaps it's best to split it up now that it's gotten quite long, maybe even turn it into bullet points?


An `execution_resource` is a lightweight structure which acts as an identifier for a particular piece of hardware within a system. It can be queried for whether it can allocate memory via `can_place_memory`, for whether it can execute work via `can_place_agents`, and for its name via `name`. An `execution_resource` can also represent other `execution_resource`s; these are referred to as being *members of* that `execution_resource` and can be queried via `resources`. Additionally, the `execution_resource` which another is a *member of* can be queried via `member_of`. An `execution_resource` can also be queried for the concurrency it can provide: the total number of *threads of execution* supported by that *execution_resource* and all resources it represents.
An **execution agent** executes work, typically implemented by a *callable*,
Contributor

Is this taken from #47? If so the same comments apply here.
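For reference, an illustrative synopsis of the interface described in the paragraph above, using the member names it mentions (a sketch only, not the proposed wording):

```cpp
#include <cstddef>
#include <string>
#include <vector>

class execution_resource {
public:
  // Whether memory can be allocated in this resource's memory region.
  bool can_place_memory() const noexcept;

  // Whether execution agents can be bound to this resource.
  bool can_place_agents() const noexcept;

  // Implementation-defined name of the resource.
  std::string name() const;

  // The execution resources that are members of this one.
  const std::vector<execution_resource> &resources() const noexcept;

  // The execution resource this one is a member of.
  const execution_resource &member_of() const noexcept;

  // Total number of threads of execution supported by this resource
  // and all of the resources it represents.
  std::size_t concurrency() const noexcept;
};
```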

Below *(Listing 2)* is an example of iterating over the system-level resources and printing out their capabilities.
A *system* includes execution resources and memory resources.
Execution resources are organized hierarchically.
A particular execution resource may be a partitioned into
Contributor

"may be a partitioned into" -> "may be partitioned into"

A particular execution resource may be a partitioned into
a collection of execution resources referred to as *members of*
that execution resource.
The partitioning of execution resources implies a locality relationship;
Contributor
@AerialMantis AerialMantis May 5, 2018

Is it always possible to make this guarantee? Could you, for example, have an architecture which is represented with a hierarchy of {{A, B, C, D},{E, F, G, H}} but where D is more local to E than to A?

Collaborator Author

I haven't encountered such a thing. If I did see such a situation I'd say the topology definition was in error.


An execution resource has one or more memory resources
which can allocate memory accessible to that execution resource.
When an execution resource hierarchy is traversed the associated
Contributor

Does this imply a relationship between the levels of the execution topology with the equivalent levels of the memory topology?

Collaborator Author

... prior comment about problem with a pure hierarchy for memory resources

size_t count = 0 ;
for (auto member : execution::this_system::execution_resource()) {
count += member.concurrency();
std::cout << member.name() `\n`;
Contributor

Missing "<<".
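For reference, a self-contained version of the loop with the missing `<<` added; it assumes the proposed `<execution>` header and a namespace alias for it:

```cpp
#include <cstddef>
#include <iostream>
// Assumes the experimental header proposed in this paper and:
//   namespace execution = std::experimental::execution;

int main() {
  std::size_t count = 0;
  for (auto member : execution::this_system::execution_resource()) {
    count += member.concurrency();
    std::cout << member.name() << "\n";  // "<<" added here
  }
  std::cout << "total concurrency: " << count << "\n";
}
```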


> [*Note:* Note that an execution resource is not limited to resources which execute work, but also a general resource where no execution can take place but memory can be allocated such as off-chip memory. *--end note*]
Contributor

I think this note has been lost in the changes.

Collaborator Author

Defines a memory resource vs. an execution resource.

Contributor

That's right, ignore me :)

```
*Listing 3: Example of querying affinity between two `execution_resource`s.*

> [*Note:* This interface for querying relative affinity is a very low-level interface designed to be abstracted by libraries and later affinity policies. *--end note*]

### Execution context
Contributor

I don't think we should remove the wording for the execution_context; this is quite important for the design of being able to construct an execution context which represents part of the system topology.

Collaborator Author

The challenge is what kind of execution context? Static thread pool? Dynamic thread pool? etc.

Contributor
@AerialMantis AerialMantis May 7, 2018

This is a tough question which I think will require a lot more design discussion. The intention of the current execution_context was to provide the minimum required interface in order to allow constructing a context around an execution resource. The idea was that this could then be built on to provide more customization, likely through the use of properties.

Perhaps we should move more towards the execution context we define being a concept which requires a particular interface, i.e. one which allows it to be constructed from an execution resource and to provide an executor and allocator. This approach may actually make it easier for concrete execution context types, such as the static_thread_pool, to support this interface by simply satisfying the requirements of the concept.

Though I'm not keen to make such a large change to the paper so close to the mailing deadline, I think we should raise this as an open question for Rapperswil and then discuss it further for the following revision. Maybe if we can resolve this in time we can present the idea at Rapperswil.
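To make the idea concrete, a purely illustrative sketch of such a concept-style requirement (the names `ExecutionContext`, `executor()` and `allocator()` are assumptions, not proposed wording):

```cpp
#include <concepts>

// Hypothetical sketch only: an execution context is anything that can be
// constructed from an execution resource and can provide an executor and
// an allocator. A concrete type such as static_thread_pool would satisfy
// this simply by providing those members.
template <class C, class Resource>
concept ExecutionContext =
    std::constructible_from<C, const Resource &> &&
    requires(C &ctx) {
      ctx.executor();   // provides an executor
      ctx.allocator();  // provides an allocator
    };
```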


## Header `<execution>` synopsis

namespace std {
namespace experimental {
namespace execution {

/* Execution resource */
/* Execution resource capable of executing std::thread */
Contributor

I don't think we should couple the execution resource type with std::thread, as this could restrict us from using other kinds of threads of execution.

Collaborator Author

Such as co-routines?
Given the limited time I thought it best to start with the current-standard foundation of std::thread and just associate thread_execution_resource with it. We can branch out into other architectures later.

Contributor

I was thinking of GPU threads, though co-routines would be another example of non-std::thread threads of execution. As I understand it, SG1 is moving away from a tight coupling between std::thread and threads of execution and instead having std::thread just be one of many ways to invoke threads of execution. I feel we should try to follow this direction in our design.

```
*Listing 3: Example of querying affinity between two `execution_resource`s.*

> [*Note:* This interface for querying relative affinity is a very low-level interface designed to be abstracted by libraries and later affinity policies. *--end note*]

### Execution context
Contributor

It seems you are removing the Execution context entirely (I thought it was just moved elsewhere at first). Is that intentional?
The text refers to the concept of Execution context, so I think we should keep it well defined, even if we make changes to the interface itself later.

Collaborator Author

The discussion of execution context is still in the paper, just no immediate proposal for a concrete execution context such as static or dynamic thread pools.

execution_resource &operator=(execution_resource &&);
~execution_resource();
~thread_execution_resource();
thread_execution_resource() = delete;
Contributor

I think we decided to make the execution resource type copyable and moveable so that it would be easier to use them in standard algorithms.

Collaborator Author

I thought this was for the std::vector<execution_context> interface, and with an iterable thread_execution_resource it was no longer necessary.

Contributor

That's true, but we decided that we should keep resources copyable and moveable so that they could be used in standard algorithms, in order to, for example, sort by affinity or find particular resources. Although this is something we will need to investigate further. Something like this:

std::sort(execRes.begin(), execRes.end(), [&memRes](const execution_resource &lhs, const execution_resource &rhs){
  return affinity_query<read, latency>(lhs, memRes) > affinity_query<read, latency>(rhs, memRes);
});

## Level of abstraction

The current proposal provides an interface for querying whether an `execution_resource` can allocate and/or execute work; it can provide the concurrency it supports and it can provide a name. We also provide the `affinity_query` structure for querying the relative affinity metrics between two `execution_resource`s. However, this may not be enough information for users to take full advantage of the system; they may also want to know what kind of memory is available or the properties by which work is executed. It was decided that attempting to enumerate the various hardware components would not be ideal, as that would make it harder for implementers to support new hardware. It has been discussed that a better approach would be to parameterise the additional properties of hardware such that hardware queries could be much more generic.
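As a purely hypothetical illustration of what such parameterised hardware properties might look like (none of these property tags or the `query` entry point are proposed in the paper):

```cpp
class execution_resource;  // as described earlier in the paper

namespace property {
// Example property tags; implementations could add tags for new hardware
// characteristics without the interface enumerating every component.
struct memory_bandwidth_t {};
inline constexpr memory_bandwidth_t memory_bandwidth{};
struct cache_line_size_t {};
inline constexpr cache_line_size_t cache_line_size{};
}  // namespace property

// Generic query entry point, returning an implementation-defined value
// for the requested property of the given resource.
template <class Property>
auto query(const execution_resource &res, Property p);
```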
## Dynamic topology discovery - deferred
Contributor

I think we should leave in the discussion of dynamic topology discovery, as it's an important topic to SG1, we want to show it's still on the roadmap but we just haven't gotten to it yet.

Collaborator Author

Still in the intro material. Seemed prudent to mention but not spend any SG1 time discussing.

Collaborator Author
@hcedwar hcedwar commented May 6, 2018

Rebased pull request to eliminate conflicts.
