Implement simple sequenced keys constraint #273

devreal · 2024-02-27T23:25:27Z

Here is a first implementation of constraints. The flow is as follows:

The application creates a constraint that it can pass to one or more TTs. Currently, all TTs must have the same key (like BcastA/B). We have to supply different mappers for BcastA/B because the keys contain different information (for A, k is in key[1], for B it's in key[0]). The mapper extracts the sequence number from the key and returns it as an ordinal to the constraint. The mapper can be either provided to the constraint directly or to the TT, which then passes it on to the constraint. That (and handling void :/) makes the code somewhat more complex than needed. Users probably will implement only one version for the configuration they need.
Once a task has all its inputs, it will query the constraint whether the key can be scheduled for execution. If not, the TT will store the task in a hash table that is used once the key is released.
When a task completes, the TT reports to constraint that a given key has completed. It is up to the constraint what to do with that information. In the example constraint, we decrement a counter and once it's zero we release the next wave of tasks.
TTs register themselves as a listener and are informed about a new set of tasks that is released. Releasing tasks in bulk gives us some room for optimization. Ultimately, that's why I diverted from the release callback per task that we talked about in Knoxville.

User-defined constraints should inherit from ConstraintBase to get the listener handling and have to provide the proper check and complete callbacks.

I still need to remove the control flow we have today in SPMM and check that it works as expected. Overall, the user-facing interface is quite clean and the internal interface is not terrible (a win already).

therault · 2024-02-29T15:27:39Z

ttg/ttg/parsec/ttg.h

@@ -1272,6 +1274,9 @@ namespace ttg_parsec {

    bool m_defer_writer = TTG_PARSEC_DEFER_WRITER;

+    std::vector<ttg::meta::detail::constraint_callback_t<keyT>> constraints_check;


I'm not sure I see the difference between constraints_check and constraints_complete. It looks like we always push_back a callback to both vectors (or to none), and sometimes we iterate over constraints_check only, but since we appear to never remove the callback from either, I'm not sure I follow why we need both (not saying I found a problem, just wanting more explanations :) )

I'll put some comments there. These are two different callbacks:

constraints_check hands the task over to the constraint, which will either tell us directly that it is not constrained (i.e., can be executed) or stores the key for later release through a callback (the listener action) and tells us that the key is indeed constrained.

constraints_complete tells the constraint that the task has executed (so that the constraint can check for new tasks to release through the listener action)

therault · 2024-02-29T15:31:48Z

examples/spmm/spmm.cc

@@ -292,11 +292,16 @@ class SpMM25D {
      , parallel_bcasts_(parallel_bcasts) {
    Edge<Key<2>, void> a_ctl, b_ctl;
    Edge<Key<2>, int> a_rowctl, b_colctl; // TODO: can we have multiple control inputs per TT?
+    auto constraint = ttg::make_shared_constraint<ttg::SequencedKeysConstraint<Key<2>>>();


That looks neat indeed.

I'm slightly worried about the limitation that the constraint needs to work on a single key type.

We don't need it in GEMM, because by constraining the broadcast tasks, we indirectly constrain the GEMM tasks, but let's say that I wanted to explicitly constrain the GEMM tasks too: I would need to define a new shared constraint with SequencedKeysConstraint<Key<3>> and I would have to release the constraint on both shared constraints to make them progress simultaneously?

Also, I don't see in the GEMM code any explicit release of constraints? Who makes them progress?

I agree. I have not found a good way to type pun the keys. Today, we couldn't include the gemm task in the constraint because the key types are different (Key<2> and Key<3>). I'm sure there is way to do that, I'm not sure about the complexity...

therault · 2024-02-29T15:37:27Z

ttg/ttg/constraint.h

+    /**
+     * Used for external key mapper.
+     */
+    SequencedKeysConstraint()


This is not what I had in mind. Might work too, but let's go over it during the next meeting, to make sure it does what we want.

AFAICT, the SequencedKeyConstraint makes sure tasks are released in the order of the integer returned by the mapping function; release of tasks can be paused (calling the stop() method) and resumed (calling the start() method), but I'm not sure what calls stop or stat in the GEMM example. It looks to me that with this implementation, if we discover enough BCastA / BCastB tasks, we'll push them in the right order in the scheduler (which is already a progress), but there is no mechanism to actually prevent a BCastA with a high 'k' to be scheduled if it is discovered first.

In the SPMM example we attach the mapping function to the TT (the keys for BcastA/B are the same type but we need different information from them). We cannot pass one mapping here. The different check functions below are conditionally enabled on whether a mapper was provided to the constraint or whether the TT has to provide the ordinal.

devreal · 2024-06-06T16:25:47Z

The constraints need to be ported to the MADNESS backend. I'd like to refactor the common parts out into the TT base class to avoid duplication yet another bunch of code.

Signed-off-by: Joseph Schuchart <[email protected]>

Constraints should be checked in the order in which they were added. This avoids deadlocks if all constraints ensure progress (e.g., they don't wait for a specific task to appear) but may lead to different outcomes (execution orders) if the order of constraints changes. Signed-off-by: Joseph Schuchart <[email protected]>

Signed-off-by: Joseph Schuchart <[email protected]>

Replaced by constraints. Signed-off-by: Joseph Schuchart <[email protected]>

Auto-release makes sure there are no deadlocks by enabling the next set of keys once the current ordinal is done. Without auto-release applications must release the next set explitly and ensure there are no deadlocks. Signed-off-by: Joseph Schuchart <[email protected]>

The constructor of the SequencedKeysConstraint takes a Boolean argument that determines whether tasks are automatically released or whether we depend on the application to release the next wave of tasks. Signed-off-by: Joseph Schuchart <[email protected]>

Signed-off-by: Joseph Schuchart <[email protected]>

Constraints will be ignored in the MADNESS backend for now. Signed-off-by: Joseph Schuchart <[email protected]>

…nable moves

…s::Tensor ops

…self

Signed-off-by: Joseph Schuchart <[email protected]>

…onstraints

…ledarray#472

Constraints are a no-op in the madness backend so don't try to execute with them. Signed-off-by: Joseph Schuchart <[email protected]>

most importantly pulls in ValeevGroup/tiledarray#483 which solves linking errors due to missing fmt instantiations

…straints

We should consider this a hint and fall back to non-persistent values if the type is not derived from TTValue or we are running with madness. Signed-off-by: Joseph Schuchart <[email protected]>

Signed-off-by: Joseph Schuchart <[email protected]>

…onstraints

Signed-off-by: Joseph Schuchart <[email protected]>

We cannot fully serialize the DeviceTensor and we don't need the DeviceTensor in host execution. Signed-off-by: Joseph Schuchart <[email protected]>

Fixes issues in TA device header. Signed-off-by: Joseph Schuchart <[email protected]>

We should only call that when we run on the PaRSEC backend. Otherwise we'll pull the rug from under the madness backend. Signed-off-by: Joseph Schuchart <[email protected]>

Signed-off-by: Joseph Schuchart <[email protected]>

devreal · 2024-11-01T11:41:15Z

Somehow this breaks the element-wise SPMM. Not sure why, and how to dig into that. Will need some time or volunteers :)

Signed-off-by: Joseph Schuchart <[email protected]>

We have to provide our own check for equal based on the provided comparator. Signed-off-by: Joseph Schuchart <[email protected]>

Signed-off-by: Joseph Schuchart <[email protected]>

devreal · 2024-11-14T19:33:50Z

@evaleev Something is off with the Boosts in the Ubuntu CI:

CMake Error at CMakeLists.txt:4 (find_package):
  Found package configuration file:

    /home/runner/work/ttg/ttg/install/lib/cmake/ttg/ttg-config.cmake

  but it set ttg_FOUND to FALSE so package "ttg" is considered to be NOT
  FOUND.  Reason given by package:

  The following imported targets are referenced, but are missing:
  Boost::callable_traits Boost::serialization Boost::iostreams

I've seen similar issues on one of our machines. This PR bumps the minimum required to 1.81. What was the reason for that?

evaleev · 2024-11-14T22:00:02Z

@evaleev Something is off with the Boosts in the Ubuntu CI:

Ubuntu provides too old of a Boost, which wreaks unseen levels of havoc (build succeeds but metadata in ttg-config.cmake is somehow all screwed up).

I've seen similar issues on one of our machines. This PR bumps the minimum required to 1.81. What was the reason for that?

tracking issues across the entirety of the stack is difficult. From what I recall some parts of boost 1.80 used std::unary_function which was removed in C++17, but most libs still shipped it ... but not Apple clang 15, so I bumped to boost to 1.81 across my stack without trying to fine-grain it. Building from source provides Boost 1.81 as it is a good-enough implementation of modularized Boost build, so it makes sense to demand at least that version to avoid diffs between system-provided boost and from-source boost.

Boost is hell. But reimplementing it is hell^2.

…ut,Error}.log

devreal requested review from evaleev and therault February 27, 2024 23:25

devreal force-pushed the spmm_constraints branch from 9d0fd20 to f3bd891 Compare February 27, 2024 23:39

therault reviewed Feb 29, 2024

View reviewed changes

devreal and others added 8 commits June 20, 2024 22:09

Implement simple sequenced keys constraint

e642a59

Signed-off-by: Joseph Schuchart <[email protected]>

Constraints: Add first test and improve impl details

403350f

Signed-off-by: Joseph Schuchart <[email protected]>

SPMM: Remove control flow from BcastA/B

46de344

Replaced by constraints. Signed-off-by: Joseph Schuchart <[email protected]>

SPMM: release next bcast once all prior gemms have executed

e789a90

Signed-off-by: Joseph Schuchart <[email protected]>

Replace std::span with ttg::span

99f03b6

Signed-off-by: Joseph Schuchart <[email protected]>

devreal force-pushed the spmm_constraints branch from 5d02164 to 99f03b6 Compare June 21, 2024 02:12

devreal and others added 12 commits July 11, 2024 14:33

SPMM: Adjust maxTs if only minTs is provided

ef8bd3b

Signed-off-by: Joseph Schuchart <[email protected]>

SPMM: Fix up-front release of k tasks

d916ae0

Signed-off-by: Joseph Schuchart <[email protected]>

SPMM: second try at initial release of k

95df973

Signed-off-by: Joseph Schuchart <[email protected]>

SPMM: Fix initial constraint release of k

fbfdf0c

Signed-off-by: Joseph Schuchart <[email protected]>

Add stubs for constraints to MADNESS backend

bebd5d8

Constraints will be ignored in the MADNESS backend for now. Signed-off-by: Joseph Schuchart <[email protected]>

Merge branch 'refs/heads/master' into spmm_constraints

f3050d5

typo

c12a88c

bump TA+MAD tags

4c6cc15

introduced ttg::matrix::Triplet (replacement for Eigen::Triplet) to e…

2a8621c

…nable moves

fixup spmm_cuda.cc

36a309e

fixup DeviceTensor's move ctor/assignment to avoid use of copying bta…

67606e5

…s::Tensor ops

ttg::matrix::Triplet is read-once by default

a36a88b

evaleev force-pushed the spmm_constraints branch from b22fd8d to a36a88b Compare July 24, 2024 18:04

evaleev added 2 commits July 24, 2024 15:17

DeviceTensor's move ctro/assignment can just move buffer (?)

c8c2b95

ttg_parsec_data_wrapper_t's move assignment always cleans up after it…

2dd5905

…self

devreal mentioned this pull request Jul 26, 2024

Add devicemap to spmm benchmark #294

Merged

devreal and others added 17 commits September 10, 2024 15:01

SPMM: remove warmup since it's too expensive and useless on the device

7362231

Signed-off-by: Joseph Schuchart <[email protected]>

Merge branch 'spmm_constraints' of github.com:devreal/ttg into spmm_c…

fc98c69

…onstraints

bump BTAS, MADNESS, and TA tag; most important pull is ValeevGroup/ti…

84c9c13

…ledarray#472

TiledArray/device/allocators.h -> TiledArray/external/device.h

318ea20

Limit execution of constraints test to PaRSEC backend

8f94781

Constraints are a no-op in the madness backend so don't try to execute with them. Signed-off-by: Joseph Schuchart <[email protected]>

bump TA tag + associated MAD tag

45af6ec

most importantly pulls in ValeevGroup/tiledarray#483 which solves linking errors due to missing fmt instantiations

Merge remote-tracking branch 'devreal/spmm_constraints' into spmm_con…

ea4c690

…straints

Fix ttg::persistent

319cd6b

We should consider this a hint and fall back to non-persistent values if the type is not derived from TTValue or we are running with madness. Signed-off-by: Joseph Schuchart <[email protected]>

SPMM: Restrict empty() calls to block-sparse execution

3d4632c

Signed-off-by: Joseph Schuchart <[email protected]>

Merge branch 'spmm_constraints' of github.com:devreal/ttg into spmm_c…

fd343eb

…onstraints

Fix change in TiledArray include path

ce80a38

Signed-off-by: Joseph Schuchart <[email protected]>

Only use TiledArray allocator if available

d405b7e

Signed-off-by: Joseph Schuchart <[email protected]>

Fix Allocator type for MatrixTile

c3df91c

Signed-off-by: Joseph Schuchart <[email protected]>

SPMM: Only use DeviceTensor for device execution

5d06eec

We cannot fully serialize the DeviceTensor and we don't need the DeviceTensor in host execution. Signed-off-by: Joseph Schuchart <[email protected]>

Bump TA tag

635cdbc

Fixes issues in TA device header. Signed-off-by: Joseph Schuchart <[email protected]>

Protect user-level madness init/finalize against the madness backend

8fd0f17

We should only call that when we run on the PaRSEC backend. Otherwise we'll pull the rug from under the madness backend. Signed-off-by: Joseph Schuchart <[email protected]>

Fix constraints test

4f0f89a

Signed-off-by: Joseph Schuchart <[email protected]>

evaleev and others added 4 commits November 13, 2024 22:44

amended 1948ee7 to make scalar spmm work

595f650

Fix manual release constraint test

a64c8d3

Signed-off-by: Joseph Schuchart <[email protected]>

Sequence: Comparator must provide a strict ordering

393b4ee

We have to provide our own check for equal based on the provided comparator. Signed-off-by: Joseph Schuchart <[email protected]>

Static assert that comparator provides a strict ordering

6a6fa70

Signed-off-by: Joseph Schuchart <[email protected]>

evaleev added 2 commits November 14, 2024 15:48

[ci] debug install-tree tests

df2e124

[cmake] bump VRG cmake kit version to avoid using flaky FindBoost module

448edcf

evaleev added 2 commits November 14, 2024 22:27

[cmake] default TTG_FETCH_BOOST to ON

62bc42b

[ci skip] if dump CMakeConfigureLog.yaml instead of legacy CMake{Outp…

7048ab5

…ut,Error}.log

evaleev approved these changes Nov 15, 2024

View reviewed changes

evaleev merged commit 6f7c82d into TESSEorg:master Nov 15, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement simple sequenced keys constraint #273

Implement simple sequenced keys constraint #273

devreal commented Feb 27, 2024

therault Feb 29, 2024

devreal Feb 29, 2024

therault Feb 29, 2024

devreal Feb 29, 2024

therault Feb 29, 2024

devreal Feb 29, 2024

devreal commented Jun 6, 2024

devreal commented Nov 1, 2024

devreal commented Nov 14, 2024

evaleev commented Nov 14, 2024

		@@ -1272,6 +1274,9 @@ namespace ttg_parsec {

		bool m_defer_writer = TTG_PARSEC_DEFER_WRITER;

		std::vector<ttg::meta::detail::constraint_callback_t<keyT>> constraints_check;

Implement simple sequenced keys constraint #273

Implement simple sequenced keys constraint #273

Conversation

devreal commented Feb 27, 2024

therault Feb 29, 2024

Choose a reason for hiding this comment

devreal Feb 29, 2024

Choose a reason for hiding this comment

therault Feb 29, 2024

Choose a reason for hiding this comment

devreal Feb 29, 2024

Choose a reason for hiding this comment

therault Feb 29, 2024

Choose a reason for hiding this comment

devreal Feb 29, 2024

Choose a reason for hiding this comment

devreal commented Jun 6, 2024

devreal commented Nov 1, 2024

devreal commented Nov 14, 2024

evaleev commented Nov 14, 2024