Ensure that CCL output is contiguous on modules #666

Draft
wants to merge 4 commits into main

Commits on Aug 1, 2024

  1. Implement block-wide odd-even sort

    Sorting small arrays is a relatively common problem in GPGPU
    programming. Many useful algorithms exist, and some are provided by
    libraries like CUB. An algorithm close to my heart is odd-even sort
    because it is exceedingly simple, relatively efficient for small arrays
    and, importantly, it uses O(1) space. This commit adds portable
    implementations of block-wide odd-even sort.
    stephenswat committed Aug 1, 2024
    e32233f
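
For readers unfamiliar with odd-even sort, here is a minimal sequential sketch in C++. It is not the code from this commit: the function name `odd_even_sort` is hypothetical, and the mapping to GPU threads and barriers described in the comments is an assumption about how a block-wide version would typically be arranged.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sequential simulation of an odd-even (transposition) sort.
// Assumption of this sketch: in a block-wide GPU version, each iteration of
// the inner loop would be handled by one thread, and every pass would end
// with a block-level barrier.
template <typename T, typename Compare>
void odd_even_sort(std::vector<T>& keys, Compare less) {
    const std::size_t n = keys.size();
    // n passes are sufficient to sort n elements.
    for (std::size_t pass = 0; pass < n; ++pass) {
        // Even passes compare pairs (0,1), (2,3), ...
        // Odd passes compare pairs (1,2), (3,4), ...
        for (std::size_t i = pass % 2; i + 1 < n; i += 2) {
            if (less(keys[i + 1], keys[i])) {
                std::swap(keys[i], keys[i + 1]);
            }
        }
        // A real kernel would synchronise the block here.
    }
}
```

The sort works in place, which is where the O(1) extra space comes from: the only temporary is the one used by the swap.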

Commits on Aug 2, 2024

  1. Move shared CCL variables into single struct

    The current CCL kernels have so many parameters that it's a real pain in
    the rear to maintain them and to make changes to them. This commit
    reduces the number of parameters a little bit by taking all
    statically-known shared memory data and unifying it into a single struct
    which can be passed around more easily.
    stephenswat committed Aug 2, 2024
    3c9c613
  2. 3b7ce40
  3. Ensure that CCL output is contiguous on modules

    Right now we are using a sorting algorithm to ensure that the entire
    array of measurements is contiguous in global memory, but this isn't
    strictly necessary. This commit alters the CCL algorithm slightly to
    guarantee that the output is always contiguous.
    stephenswat committed Aug 2, 2024
    9b6e5da
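
The parameter bundling described in "Move shared CCL variables into single struct" can be pictured with a small C++ sketch. Everything named here (`ccl_shared_state`, its fields, and `partition_size`) is hypothetical and chosen for illustration only; the actual struct in the commit may look quite different.

```cpp
#include <cstddef>

// Hypothetical bundle of per-block shared state. The field names are
// illustrative assumptions, not the ones used in the pull request.
struct ccl_shared_state {
    unsigned int partition_start;  // first cell index handled by the block
    unsigned int partition_end;    // one past the last cell index
    unsigned int output_offset;    // where the block writes its output
};

// A helper that would otherwise take several separate parameters now takes
// a single reference to the bundle.
inline unsigned int partition_size(const ccl_shared_state& s) {
    return s.partition_end - s.partition_start;
}
```

With this arrangement, adding another shared variable means adding one field rather than touching the signature of every function that passes the state along.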