Enable FP16 Clip and Handle Bias in FP16 Depthwise Conv #21493

Merged: 8 commits from yilyu/fix-fp16-dw-conv-bias into main on Jul 30, 2024

Conversation

@yihonglyu (Contributor) commented Jul 25, 2024

@yihonglyu requested a review from a team as a code owner on July 25, 2024 02:31
Review comment on onnxruntime/core/mlas/inc/mlas.h (resolved)
Review comment on onnxruntime/core/mlas/lib/dwconv.cpp (outdated, resolved)
@yihonglyu changed the title from "Enable FP16 Clip and Fix Bias bug in FP16 Depthwise Conv" to "Enable FP16 Clip and Handle Bias in FP16 Depthwise Conv" on Jul 25, 2024
@yihonglyu requested a review from edgchen1 on July 27, 2024 05:28
@yihonglyu requested a review from edgchen1 on July 29, 2024 01:30
@chenfucn (Contributor) left a comment:

Looks good to me.

Review comment on onnxruntime/core/mlas/lib/fp16_common.h (resolved)
@yihonglyu merged commit 530a2d7 into main on Jul 30, 2024
95 checks passed
@yihonglyu deleted the yilyu/fix-fp16-dw-conv-bias branch on July 30, 2024 10:49
tianleiwu added a commit that referenced this pull request Sep 26, 2024
### Description
* Add std::numeric_limits for MLFloat16 and BFloat16 (a sketch of such a specialization is shown below the list).
* Update some comments in csharp ORTFloat16.shared.cs.
* Add unit tests (including Clip)
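
Below is a minimal sketch of what a std::numeric_limits specialization for a half-precision type can look like. The MyFloat16 wrapper is a hypothetical stand-in, not the actual MLFloat16/BFloat16 API; the bit constants are the standard IEEE-754 binary16 encodings.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical binary16 wrapper, used only for illustration.
struct MyFloat16 {
  uint16_t bits;
  static constexpr MyFloat16 FromBits(uint16_t b) { return MyFloat16{b}; }
};

namespace std {
template <>
class numeric_limits<MyFloat16> {
 public:
  static constexpr bool is_specialized = true;
  static constexpr bool has_infinity = true;
  static constexpr bool has_quiet_NaN = true;
  // Smallest positive normal value: exponent 00001, mantissa 0.
  static constexpr MyFloat16 min() noexcept { return MyFloat16::FromBits(0x0400); }
  // Largest finite value (65504) and its negation.
  static constexpr MyFloat16 max() noexcept { return MyFloat16::FromBits(0x7BFF); }
  static constexpr MyFloat16 lowest() noexcept { return MyFloat16::FromBits(0xFBFF); }
  static constexpr MyFloat16 infinity() noexcept { return MyFloat16::FromBits(0x7C00); }
  // Positive quiet NaN (0x7E00), matching the C++ convention described below.
  static constexpr MyFloat16 quiet_NaN() noexcept { return MyFloat16::FromBits(0x7E00); }
};
}  // namespace std
```

With a specialization like this in place, generic code such as Clip's default min/max handling can use std::numeric_limits&lt;T&gt;::lowest() and std::numeric_limits&lt;T&gt;::max() without special-casing the half-precision types.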

Note that the canonical NaN is not consistent between C++ and C#: C# uses a
negative quiet NaN as its canonical NaN, while C++ uses a positive quiet NaN.
The C# Float16.NaN value was chosen to be consistent with System.Half.NaN.

FP16 data returned from CUDA might have 0x7FFF as NaN, while FP16 data from the
CPU provider might have 0x7E00 as NaN, so there is currently no single canonical
NaN in ORT. Because all of these NaN encodings conform to the IEEE 754 spec,
this should not cause issues downstream.
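
The practical consequence is that NaN checks should test the bit pattern rather than compare against a single canonical encoding. A small sketch, using the standard IEEE-754 binary16 bit masks (not a specific ORT helper):

```cpp
#include <cstdint>

// A binary16 value is NaN when all exponent bits (0x7C00) are set and the
// mantissa bits (0x03FF) are non-zero, regardless of sign or payload.
constexpr bool IsFp16NaN(uint16_t bits) {
  return (bits & 0x7C00) == 0x7C00 && (bits & 0x03FF) != 0;
}

static_assert(IsFp16NaN(0x7E00), "positive quiet NaN (C++ / CPU provider)");
static_assert(IsFp16NaN(0x7FFF), "NaN with a full payload (as seen from CUDA)");
static_assert(IsFp16NaN(0xFE00), "negative quiet NaN (C# System.Half.NaN)");
static_assert(!IsFp16NaN(0x7C00), "positive infinity is not NaN");
```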

### Motivation and Context
std::numeric_limits is used in the codebase but was not defined for MLFloat16
and BFloat16. This caused bugs such as
#21957, which was introduced by
#21493.
Labels: None yet
Projects: None yet
3 participants