Adjust kNumThreads for bounds_check_indices_kernel (#3299)
Summary:
X-link: facebookresearch/FBGEMM#398

Pull Request resolved: #3299

Increase the thread block size of bounds_check_indices_kernel from 256 to
1024 threads to increase the number of warps per SM. This results in a
shorter kernel time.

Reviewed By: spcyppt

Differential Revision: D65071345

fbshipit-source-id: 0196eadf38278373fe36edb470221c12f219510b
sryap authored and facebook-github-bot committed Nov 4, 2024
1 parent 21e86af commit cb99183
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion fbgemm_gpu/codegen/utils/embedding_bounds_check_v2.cu
@@ -251,7 +251,7 @@ void _bounds_check_indices_cuda_v2(
" is not equal to indices size " + std::to_string(num_indices));
}

- constexpr size_t kNumThreads = 256;
+ constexpr size_t kNumThreads = 1024;
const auto max_B_ = vbe ? max_B : B;

const int32_t vbe_bound = max_B_ * T;
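
The arithmetic behind the block-size change can be sketched as follows. This is a minimal illustration, not code from the commit: `kWarpSize`, `warps_per_block`, `blocks_needed`, and the one-thread-per-work-item mapping are hypothetical names and assumptions, and the sketch assumes 32 threads per warp. How many warps are actually resident per SM also depends on register and shared-memory usage and on hardware limits, which are not modeled here.

```cpp
#include <cstddef>

// Assumption: 32 threads per warp (true for current NVIDIA GPUs).
constexpr std::size_t kWarpSize = 32;

// Number of warps in one thread block of `num_threads` threads, rounded up.
constexpr std::size_t warps_per_block(std::size_t num_threads) {
  return (num_threads + kWarpSize - 1) / kWarpSize;
}

// Number of blocks needed if each work item gets one thread.
// Hypothetical mapping; the real kernel may assign work differently.
constexpr std::size_t blocks_needed(
    std::size_t work_items,
    std::size_t num_threads) {
  return (work_items + num_threads - 1) / num_threads;
}

// Block sizes before and after the change.
constexpr std::size_t kOldNumThreads = 256;
constexpr std::size_t kNewNumThreads = 1024;

// Each block now carries four times as many warps.
static_assert(warps_per_block(kOldNumThreads) == 8, "256 threads = 8 warps");
static_assert(warps_per_block(kNewNumThreads) == 32, "1024 threads = 32 warps");

// For a hypothetical 100000 work items, fewer and larger blocks are launched.
static_assert(blocks_needed(100000, kOldNumThreads) == 391, "ceil(100000/256)");
static_assert(blocks_needed(100000, kNewNumThreads) == 98, "ceil(100000/1024)");
```

Note that 1024 is the maximum number of threads per block on current NVIDIA GPUs, so this value cannot be raised further.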