GPU implementation of hamming distance #541
Hi @felixpetschko, what's the status here? Do you need anything from me or Severin? I've seen you switched to CuPy. Could you elaborate on how that compares to the numba implementation?
Hi @grst! I am mostly done with my implementation here. On my laptop, the ir_dist function with the hamming metric on 1 million cells currently runs about 10x faster (45 vs. 480 seconds) than the new fast numba CPU implementation (and probably >100x faster than the original CPU implementation). I think this is also the maximum speedup I would aim for right now: ir_dist still has some sequential (single-CPU) parts besides the hamming GPU kernel, and the upstream processing for reading and preparing the data already takes longer anyway, so further optimization of the hamming kernel wouldn't be very effective. My plan is to prepare a pull request that is ready for review over the next few days. The reasons for switching to CuPy were the following:
Makes sense, thanks!
Hey @felixpetschko, can you send me a larger dataset to test this? I have some ideas and want to see if they work.
@felixpetschko, the outcome of this discussion was that the function stays here, and we'll set up a GPU CI for scirpy. @ilan-gold or @flying-sheep can help with that once this PR is ready.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #541      +/-   ##
==========================================
- Coverage   81.43%   79.44%   -2.00%
==========================================
  Files          49       50       +1
  Lines        4213     4525     +312
==========================================
+ Hits         3431     3595     +164
- Misses        782      930     +148
☔ View full report in Codecov by Sentry.
@Intron7, it appears Phil has written documentation for the GPU CI (https://github.com/scverse/governance/blob/main/developer/gpu_ci.md). Maybe this helps?
Hamming distance implementation with numba.cuda for GPU support.
This is built on top of the changes in Hamming distance implementation with Numba #512
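For readers unfamiliar with the metric, here is a minimal pure-Python CPU sketch of what a pairwise Hamming distance computation with a cutoff looks like. It is not the code from this PR: the function names `hamming_distance` and `pairwise_hamming`, the `cutoff` parameter, the choice to skip sequences of unequal length, and the dictionary result format are all assumptions made for illustration. A GPU kernel would typically evaluate many such pairs in parallel instead of looping over them.

```python
from typing import Dict, List, Optional, Tuple


def hamming_distance(a: str, b: str, cutoff: int) -> Optional[int]:
    """Count mismatching positions between two equal-length sequences.

    Returns None if the lengths differ or the count exceeds `cutoff`
    (both behaviours are assumptions of this sketch).
    """
    if len(a) != len(b):
        return None
    dist = 0
    for x, y in zip(a, b):
        if x != y:
            dist += 1
            if dist > cutoff:
                # Stop early: this pair can no longer be within the cutoff.
                return None
    return dist


def pairwise_hamming(seqs: List[str], cutoff: int) -> Dict[Tuple[int, int], int]:
    """Return {(i, j): distance} for all pairs i < j within `cutoff`."""
    result = {}
    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            d = hamming_distance(seqs[i], seqs[j], cutoff)
            if d is not None:
                result[(i, j)] = d
    return result


if __name__ == "__main__":
    # Hypothetical example sequences, not taken from any dataset in this PR.
    example = ["CASSL", "CASSF", "CATTF", "CAS"]
    print(pairwise_hamming(example, cutoff=2))
```

The double loop performs on the order of n² comparisons for n sequences, and each comparison is independent of the others, which is why this kind of workload maps well onto one GPU thread per pair or per row.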