[CUDA] Fix NumericLimits #22738

Merged
merged 2 commits into from
Nov 6, 2024

Conversation

tianleiwu
Contributor

@tianleiwu tianleiwu commented Nov 5, 2024

Description

  • Fix NumericLimits<float>, which used infinity as its max value; that is inconsistent with std::numeric_limits<float>::max().
    On Windows, (float)(1e+300) is used for INFINITY, which causes a compiler error in Visual Studio 2022 v17.12 Preview 5.
  • Rename NumericLimits<T>::Min to Lowest to be consistent with std::numeric_limits
  • Fix the topk implementation: use NumericLimits<CudaT> instead of NumericLimits<T> in the kernel. This avoids a confusing definition of NumericLimits<MLFloat16> that returns half instead of MLFloat16.
  • Use CUDART_MAX_NORMAL_FP16 where possible. It sets the bit pattern directly, which is faster than converting float to half.

Note that NumericLimits does not support __nv_bfloat16, __nv_fp8_e4m3, or __nv_fp8_e5m2 right now.
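
For readers who want to see the intended semantics, here is a minimal host-only C++ sketch. It is not the code from this PR: the struct below is a hypothetical stand-in that omits the `__host__ __device__` qualifiers, and `HalfBitsToFloat` is a helper invented for illustration. It assumes the FP16 max-normal constant corresponds to the IEEE 754 binary16 bit pattern 0x7BFF.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the CUDA helper discussed above (host-only here).
template <typename T>
struct NumericLimits;

template <>
struct NumericLimits<float> {
  // Lowest() is the most negative finite value, matching std::numeric_limits<float>::lowest().
  static constexpr float Lowest() { return std::numeric_limits<float>::lowest(); }
  // Max() is the largest finite value, not infinity.
  static constexpr float Max() { return std::numeric_limits<float>::max(); }
};

// Illustrative helper (not part of the PR): decode IEEE 754 binary16 bits into a float.
inline float HalfBitsToFloat(std::uint16_t bits) {
  const int sign = (bits >> 15) & 0x1;
  const int exponent = (bits >> 10) & 0x1F;
  const int mantissa = bits & 0x3FF;
  float value;
  if (exponent == 0) {
    // Zero or subnormal: mantissa * 2^-24.
    value = std::ldexp(static_cast<float>(mantissa), -24);
  } else if (exponent == 31) {
    // All-ones exponent encodes infinity (mantissa == 0) or NaN.
    value = mantissa == 0 ? std::numeric_limits<float>::infinity()
                          : std::numeric_limits<float>::quiet_NaN();
  } else {
    // Normal: (1024 + mantissa) * 2^(exponent - 25).
    value = std::ldexp(static_cast<float>(mantissa + 1024), exponent - 25);
  }
  return sign ? -value : value;
}
```

Under these assumptions, `NumericLimits<float>::Max()` stays finite, and decoding 0x7BFF gives the largest finite half value, so a constant built from the bit pattern needs no float-to-half conversion at runtime.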

Motivation and Context

#22728

@tianleiwu tianleiwu marked this pull request as draft November 5, 2024 22:16
@tianleiwu tianleiwu marked this pull request as ready for review November 6, 2024 00:27
@tianleiwu tianleiwu merged commit d993ec3 into main Nov 6, 2024
91 checks passed
@tianleiwu tianleiwu deleted the tlwu/fix_numeric_limits branch November 6, 2024 17:53
2 participants