Matrix Multiplication example using MdSpan #2317
Conversation
# Copyright 2023 Erik Zenker, Benjamin Worpitz, Jan Stephan
Needs to be modified.
OK, changed to 2024 and `SPDX-License-Identifier: MPL-2.0`.
// Needed for running example for all backends available; one by one
#include <alpaka/example/ExecuteForEachAccTag.hpp>

#include <experimental/mdspan>
Should this be wrapped in a check on the C++ standard, so that if we are using C++23 we take `<mdspan>`, and otherwise, if we are using the version that comes with alpaka, we take `<experimental/mdspan>`?
I disagree, because the version shipped with alpaka differs from the standard implementation in an important point: the alpaka version uses `operator()` [1] to access elements, while the standard implementation uses `operator[]`, since C++23 allows it to take more than one argument. Therefore we would need to add a preprocessor if-else or a wrapper function each time the memory is accessed.

[1] The experimental implementation also provides `operator[]` if we use C++23, but we also support C++17 and 20.
Mhm, I see.

On the other hand, alpaka (the library) does not use the `mdspan` objects, it only returns them. I think that if somebody uses alpaka in a C++23 environment, they should get the official `mdspan`, not the experimental version (like we do for `boost::atomic_ref` vs `std::atomic_ref`).

Can we `#if __cplusplus` the examples to use `()` for C++17/C++20, and `[]` for C++23?
I agree, but I think this can be a little more work.

The implementation we use defines `MDSPAN_USE_BRACKET_OPERATOR`. Then it looks like this: https://github.com/SimeonEhrig/MyCpp/blob/28d20632657697533184f95c883739718f8397ad/features/23/mdspan/hello_mspan/main.cpp#L44

I expect that this define will not be created by the standard library, so we would need to create it ourselves.

Another point is that our implementation annotates the functions with `__host__` and `__device__` to make them compatible with CUDA and HIP. At the moment, Nvidia does not provide an STL implementation, and I'm not sure if AMD does: https://en.cppreference.com/w/cpp/compiler_support#cpp23

But Clang already provides an implementation. Therefore, in the worst case we detect the STL implementation of Clang, which is maybe not annotated, and try to compile it with the CUDA or HIP backend. In this case, we need a workaround selection mechanism for when the mdspan implementation is not annotated.

Long story short, I agree that we should be forward compatible, but I think we need to solve this in an extra PR.
> Long story short, I agree that we should be forward compatible but I think we need to solve this in an extra PR.

OK for me.
OK, I will open a related issue so we don't forget. Or @SimeonEhrig, could you open it, since you know the details?
I opened it already: #2319
//! \param C Output matrix where the result of A * B will be stored
//! \param K The shared dimension between A and B
template<typename TAcc, typename MdSpan>
ALPAKA_FN_ACC void operator()(TAcc const& acc, MdSpan A, MdSpan B, MdSpan C, Idx K) const
Shouldn't `K` be derived from `A` and `B`, with a check that the dimensions match?
Good catch. Thanks, done.
// Wait for the kernel to finish
alpaka::wait(queue);
Not needed.
I will keep this and have added a comment saying "if the queue is blocking this is not needed."
No, this `wait` is never needed, even if the queue is not blocking.
The `wait` below (after the `memcpy`) is indeed needed, and it should always be there.
I meant the last `wait` (after the `memcpy`): if the queue is blocking, the host is blocked by the `memcpy`, hence there is no need to `wait` after the `memcpy`? (And the queue is a blocking queue: `using Queue = alpaka::Queue<Acc, alpaka::Blocking>;`.)
Yes.

If a queue is non-blocking, it is necessary to call `alpaka::wait(queue)` after the `memcpy` and before accessing the data on the host. If a queue is blocking, it is not necessary.

However, calling `alpaka::wait(queue)` should be cheap, though not completely free. So maybe we can keep it anyway?
OK, I now use `wait` only after `memcpy` statements and added no comments. Although the queue is blocking, as you said, it is not expensive.
I like the idea of having examples with

> By "Explicitly" you mean while passing to the kernel and instead of

No, I only meant that instead of

```cpp
template<typename TAcc, typename MdSpan>
ALPAKA_FN_ACC void operator()(TAcc const& acc, MdSpan A, MdSpan B, MdSpan C) const
{ ... }
```

I would prefer something like

```cpp
template<
    typename TAcc,
    typename T,
    typename Extents,
    typename LayoutPolicy,
    typename AccessorPolicy>
ALPAKA_FN_ACC void operator()(
    TAcc const& acc,
    mdspan<T, Extents, LayoutPolicy, AccessorPolicy> A,
    mdspan<T, Extents, LayoutPolicy, AccessorPolicy> B,
    mdspan<T, Extents, LayoutPolicy, AccessorPolicy> C) const
{ ... }
```

However, I acknowledge that this is a lot more verbose, and maybe easier to write once we enable C++20 and concepts.
//! Specialization for mdspan with four template arguments
template<typename ElementType, typename Extents, typename LayoutPolicy, typename AccessorPolicy>
struct isMdspan<std::experimental::mdspan<ElementType, Extents, LayoutPolicy, AccessorPolicy>> : std::true_type
Can you rename this to `IsMdspan`?

And add

```cpp
template<typename T>
inline constexpr bool is_mdspan = IsMdspan<T>::value;
```

?
Done, thanks. Your suggestion is good and much more expressive, but for simplicity I pass `TMdSpan` and then check whether it is exactly `std::experimental::mdspan`.
Good idea, that's a good compromise.
Please rebase to use `getValidWorkDivKernel`.
auto mdDevC = alpaka::experimental::getMdSpan(bufDevC);

// Let alpaka calculate good block and grid sizes given our full problem extent.
auto const workDiv = alpaka::getValidWorkDiv<Acc>(
Please use `getValidWorkDivKernel`.
Rebased, done, thanks.
Discussed offline with @mehmetyusufoglu: we will remove the add example, because it is 99.9% equal to the multiplication and IMO the code duplication is not required.
std::cout << "Multiplication of matrices of size " << M << "x" << K << " and " << K << "x" << N << " using mdspan "
          << (success ? "succeeded" : "failed") << "!" << std::endl;
return EXIT_SUCCESS;
If `success` is false, you should end with `EXIT_FAILURE`.
OK, added. Thanks a lot. (Tested with `ctest` for both the true-positive and true-negative cases.) Other examples like convolution1d, 2d, 2dWithMdSpan are correct in this sense.
This matrix multiplication example uses the `mdspan` data structure for passing data to/from the kernel and for conveniently filling the data on the host. By using mdspan, passing the data size and pitch values to the kernel for each 2D array is not needed. Filling the host data is not difficult either.

If these workshop examples are found to be useful for the users of alpaka in general, this PR can be merged as well.