Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that CCL output is contiguous on modules #666

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

stephenswat
Copy link
Member

Right now we are using a sorting algorithm to ensure that the entire array of measurements is contiguous in global memory, but this isn't strictly necessary. This commit alters the CCL algorithm slightly to guarantee that the output is always contiguous.

Sorting small arrays is a relatively common problem in GPGPU
programming. Many useful algorithms exist, and some are provided by
libraries like CUB. An algorithm close to my heart is odd-even sort
because it is exceedingly simply, relatively efficient for small arrays
and, importantly, it uses O(1) space. This commit adds new
implementations of this sorting algorithm for block-wide odd-even sort
in a portable way.
The current CCL kernels have so many parameters that it's a real pain in
the rear to maintain them and to make changes to them. This commit
reduces the number of parameters a little bit by taking all
statically-known shared memory data and unifying it into a single struct
which can be passed around more easily.
Right now we are using a sorting algorithm to ensure that the entire
array of measurements is contiguous in global memory, but this isn't
strictly necessary. This commit alters the CCL algorithm slightly to
guarantee that the output is always contiguous.
@stephenswat stephenswat added improvement Improve an existing feature shared Changes related to shared code labels Aug 2, 2024
@stephenswat
Copy link
Member Author

Depends on #552, #665. I will do some performance testing on this.

Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the actual CCL code I'll go with your judgement here. Nothing jumps out to me as code that I wouldn't like, but I also didn't try to understand exactly what is being done. 😦

It doesn't have to be this PR, but I guess we'll want to remove the sorting algorithm(s) from the example applications then. I'm fine with that happening in a separate PR, though you should definitely test at least locally that track finding/fitting would take this sorting well. 🤔

Comment on lines +141 to +145
#ifndef NDEBUG
TRACCC_CUDA_ERROR_CHECK(cudaStreamSynchronize(stream));
assert(is_contiguous_on(measurement_module_projection(), m_mr.main, m_copy,
m_stream, measurements));
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really not an important point, but would it not be possible to express this with something like:

assert([&]() -> bool {
   TRACCC_CUDA_ERROR_CHECK(cudaStreamSynchronize(stream));
   return is_contiguous_on(measurement_module_projection(), m_mr.main, m_copy,
                            m_stream, measurements); }() == true);

? I'd just prefer to use NDEBUG directly if I can help it...

@stephenswat
Copy link
Member Author

It doesn't have to be this PR, but I guess we'll want to remove the sorting algorithm(s) from the example applications then. I'm fine with that happening in a separate PR, though you should definitely test at least locally that track finding/fitting would take this sorting well. 🤔

Indeed, we should disable the sorting application. I was thinking of adding some boolean flag to the output to indicate whether or not the output is contiguous, but I am not sure if it's worth the hassle.

@krasznaa
Copy link
Member

krasznaa commented Aug 2, 2024

No, there's no need to represent this in the EDM. As long as the documentation on the algorithms is clear, it's fine to just know whether extra sorting is needed on something or not.

@beomki-yeo
Copy link
Contributor

Once this works. could you also remove the measurement_sorting_algorithm?

@krasznaa
Copy link
Member

krasznaa commented Aug 9, 2024

Once this works. could you also remove the measurement_sorting_algorithm?

I would not want to do that! Yes, we will need to remove that / those algorithm(s) from the standard algorithm chain. But having an algorithm in the repository that is not actively used at a specific time (but is otherwise functional), is not a bad thing. I want to keep those algorithms around. We may make use of them in some other way later on, who knows. 🤔

@beomki-yeo
Copy link
Contributor

OK that makes sense

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improve an existing feature shared Changes related to shared code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants