Switch DirectX Target to use the Itanium ABI #111632

pow2clk · 2024-10-09T04:55:03Z

To consolidate behavior of function mangling and limit the number of places that ABI changes will need to be made, this switches the DirectX target used for HLSL to use the Itanium ABI from the Microsoft ABI. The Itanium ABI has greater flexibility in decisions regarding mangling of new types of which we have more than a few yet to add.

One effect of this will be that linking library shaders compiled with DXC will not be possible with shaders compiled with clang. That isn't considered a terribly interesting use case and one that would likely have been onerous to maintain anyway.

This involved adding a function to call all global destructors as the Microsoft ABI had done.

This requires a few changes to tests. Most notably the mangling style has changed which accounts for most of the changes. In making those changes, I took the opportunity to harmonize some very similar tests for greater consistency. I also shaved off some unneeded run flags that had probably been copied over from one test to another.

Other changes effected by using the new ABI include using different types when manipulating smaller bitfields, eliminating an unnecessary alloca in one instance in this-assignment.hlsl, changing the way static local initialization is guarded, and changing the order of inout parameters getting copied in and out. That last is a subtle change in functionality, but one where there was sufficient inconsistency in the past that standardizing is important, but the particular direction of the standardization is less important for the sake of existing shaders.

fixes #110736

To consolidate behavior of function mangling and limit the number of places that ABI changes will need to be made, this switches the DirectX target used for HLSL to use the Itanium ABI from the Microsoft ABI. The Itanium ABI has greater flexibility in decisions regarding mangling of new types of which we have more than a few yet to add. This required adding a function to call all global destructors as the Microsoft ABI had done. This requires a few changes to tests. Most notably the mangling style has changed which accounts for most of the changes. In making those changes, I took the opportunity to harmonize some very similar tests for greater consistency. I also shaved off some unneeded run flags that had probably been copied over from one test to another. Other changes effected by using the new ABI include using smaller types when possible in a few instances, eliminating an unnecessary alloca in one instance in this-assignment.hlsl, and changing the order of inout parameters getting copied in and out. That last is a subtle change in functionality, but one where there was sufficient inconsistency in the past that standardizing is important, but the particular direction of the standardization is less important for the sake of existing shaders. fixes llvm#110736

llvmbot · 2024-10-09T04:55:42Z

@llvm/pr-subscribers-backend-directx
@llvm/pr-subscribers-hlsl
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Greg Roth (pow2clk)

Changes

To consolidate behavior of function mangling and limit the number of places that ABI changes will need to be made, this switches the DirectX target used for HLSL to use the Itanium ABI from the Microsoft ABI. The Itanium ABI has greater flexibility in decisions regarding mangling of new types of which we have more than a few yet to add.

This required adding a function to call all global destructors as the Microsoft ABI had done.

This requires a few changes to tests. Most notably the mangling style has changed which accounts for most of the changes. In making those changes, I took the opportunity to harmonize some very similar tests for greater consistency. I also shaved off some unneeded run flags that had probably been copied over from one test to another.

Other changes effected by using the new ABI include using smaller types when possible in a few instances, eliminating an unnecessary alloca in one instance in this-assignment.hlsl, and changing the order of inout parameters getting copied in and out. That last is a subtle change in functionality, but one where there was sufficient inconsistency in the past that standardizing is important, but the particular direction of the standardization is less important for the sake of existing shaders.

fixes #110736

Patch is 142.96 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111632.diff

48 Files Affected:

(modified) clang/lib/Basic/Targets/DirectX.h (+1-1)
(modified) clang/lib/CodeGen/ItaniumCXXABI.cpp (+4)
(modified) clang/test/CodeGenHLSL/ArrayTemporary.hlsl (+4-4)
(modified) clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl (+14-12)
(modified) clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl (+4-4)
(modified) clang/test/CodeGenHLSL/GlobalConstructorLib.hlsl (+8-4)
(modified) clang/test/CodeGenHLSL/GlobalConstructors.hlsl (+1-1)
(modified) clang/test/CodeGenHLSL/GlobalDestructors.hlsl (+7-7)
(modified) clang/test/CodeGenHLSL/basic_types.hlsl (+32-32)
(modified) clang/test/CodeGenHLSL/builtins/RWBuffer-annotations.hlsl (+6-6)
(modified) clang/test/CodeGenHLSL/builtins/RWBuffer-elementtype.hlsl (+13-13)
(modified) clang/test/CodeGenHLSL/builtins/RasterizerOrderedBuffer-annotations.hlsl (+6-6)
(modified) clang/test/CodeGenHLSL/builtins/StructuredBuffer-annotations.hlsl (+6-6)
(modified) clang/test/CodeGenHLSL/builtins/StructuredBuffer-elementtype.hlsl (+13-13)
(modified) clang/test/CodeGenHLSL/builtins/abs.hlsl (+38-35)
(modified) clang/test/CodeGenHLSL/builtins/ceil.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/clamp.hlsl (+50-51)
(modified) clang/test/CodeGenHLSL/builtins/cos.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/exp.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/exp2.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/floor.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/hlsl_resource_t.hlsl (+2-2)
(modified) clang/test/CodeGenHLSL/builtins/log.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/log10.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/log2.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/max.hlsl (+50-51)
(modified) clang/test/CodeGenHLSL/builtins/min.hlsl (+50-51)
(modified) clang/test/CodeGenHLSL/builtins/pow.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/round.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/saturate.hlsl (+44-79)
(modified) clang/test/CodeGenHLSL/builtins/sin.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/sqrt.hlsl (+18-19)
(modified) clang/test/CodeGenHLSL/builtins/trunc.hlsl (+19-20)
(modified) clang/test/CodeGenHLSL/export.hlsl (+5-6)
(modified) clang/test/CodeGenHLSL/float3.hlsl (+1-1)
(modified) clang/test/CodeGenHLSL/group_shared.hlsl (+1-1)
(modified) clang/test/CodeGenHLSL/half.hlsl (+2-2)
(modified) clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl (+4-4)
(modified) clang/test/CodeGenHLSL/inline-constructors.hlsl (+2-2)
(modified) clang/test/CodeGenHLSL/inline-functions.hlsl (+5-5)
(modified) clang/test/CodeGenHLSL/semantics/GroupIndex-codegen.hlsl (+1-1)
(modified) clang/test/CodeGenHLSL/shift-mask.hlsl (+37-6)
(modified) clang/test/CodeGenHLSL/sret_output.hlsl (+3-4)
(modified) clang/test/CodeGenHLSL/static-local-ctor.hlsl (+7-7)
(modified) clang/test/CodeGenHLSL/static_global_and_function_in_cb.hlsl (+3-4)
(modified) clang/test/CodeGenHLSL/this-assignment-overload.hlsl (+4-4)
(modified) clang/test/CodeGenHLSL/this-assignment.hlsl (+2-5)
(modified) clang/test/CodeGenHLSL/this-reference.hlsl (+2-2)

diff --git a/clang/lib/Basic/Targets/DirectX.h b/clang/lib/Basic/Targets/DirectX.h
index cf7ea5e83503dc..19b61252409b09 100644
--- a/clang/lib/Basic/Targets/DirectX.h
+++ b/clang/lib/Basic/Targets/DirectX.h
@@ -62,7 +62,7 @@ class LLVM_LIBRARY_VISIBILITY DirectXTargetInfo : public TargetInfo {
     PlatformName = llvm::Triple::getOSTypeName(Triple.getOS());
     resetDataLayout("e-m:e-p:32:32-i1:32-i8:8-i16:16-i32:32-i64:64-f16:16-f32:"
                     "32-f64:64-n8:16:32:64");
-    TheCXXABI.set(TargetCXXABI::Microsoft);
+    TheCXXABI.set(TargetCXXABI::GenericItanium);
   }
   bool useFP16ConversionIntrinsics() const override { return false; }
   void getTargetDefines(const LangOptions &Opts,
diff --git a/clang/lib/CodeGen/ItaniumCXXABI.cpp b/clang/lib/CodeGen/ItaniumCXXABI.cpp
index 965e09a7a760ec..75dab596e1b2c4 100644
--- a/clang/lib/CodeGen/ItaniumCXXABI.cpp
+++ b/clang/lib/CodeGen/ItaniumCXXABI.cpp
@@ -2997,6 +2997,10 @@ void ItaniumCXXABI::registerGlobalDtor(CodeGenFunction &CGF, const VarDecl &D,
   if (D.isNoDestroy(CGM.getContext()))
     return;
 
+  // HLSL doesn't support atexit.
+  if (CGM.getLangOpts().HLSL)
+    return CGM.AddCXXDtorEntry(dtor, addr);
+
   // OpenMP offloading supports C++ constructors and destructors but we do not
   // always have 'atexit' available. Instead lower these to use the LLVM global
   // destructors which we can handle directly in the runtime. Note that this is
diff --git a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
index 63a30b61440eb5..7d77c0aff736cc 100644
--- a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
+++ b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
@@ -68,11 +68,11 @@ void call4(float Arr[2][2]) {
 // CHECK: [[Tmp2:%.*]] = alloca [4 x float]
 // CHECK: [[Tmp3:%.*]] = alloca [3 x i32]
 // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp1]], ptr align 4 [[FA2]], i32 8, i1 false)
-// CHECK: call void @"??$template_fn@$$BY01M@@YAXY01M@Z"(ptr noundef byval([2 x float]) align 4 [[Tmp1]])
+// CHECK: call void @_Z11template_fnIA2_fEvT_(ptr noundef byval([2 x float]) align 4 [[Tmp1]])
 // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp2]], ptr align 4 [[FA4]], i32 16, i1 false)
-// CHECK: call void @"??$template_fn@$$BY03M@@YAXY03M@Z"(ptr noundef byval([4 x float]) align 4 [[Tmp2]])
+// CHECK: call void @_Z11template_fnIA4_fEvT_(ptr noundef byval([4 x float]) align 4 [[Tmp2]])
 // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp3]], ptr align 4 [[IA3]], i32 12, i1 false)
-// CHECK: call void @"??$template_fn@$$BY02H@@YAXY02H@Z"(ptr noundef byval([3 x i32]) align 4 [[Tmp3]])
+// CHECK: call void @_Z11template_fnIA3_iEvT_(ptr noundef byval([3 x i32]) align 4 [[Tmp3]])
 
 template<typename T>
 void template_fn(T Val) {}
@@ -90,7 +90,7 @@ void template_call(float FA2[2], float FA4[4], int IA3[3]) {
 
 // CHECK: [[Addr:%.*]] = getelementptr inbounds [2 x float], ptr [[FA2]], i32 0, i32 0
 // CHECK: [[Tmp:%.*]] = load float, ptr [[Addr]]
-// CHECK: call void @"??$template_fn@M@@YAXM@Z"(float noundef [[Tmp]])
+// CHECK: call void @_Z11template_fnIfEvT_(float noundef [[Tmp]])
 
 // CHECK: [[Idx0:%.*]] = getelementptr inbounds [2 x float], ptr [[FA2]], i32 0, i32 0
 // CHECK: [[Val0:%.*]] = load float, ptr [[Idx0]]
diff --git a/clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl b/clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl
index 58237889db1dca..6afead4f233660 100644
--- a/clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl
+++ b/clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl
@@ -260,10 +260,10 @@ void order_matters(inout int X, inout int Y) {
 // CHECK: store i32 [[VVal]], ptr [[Tmp0]]
 // CHECK: [[VVal:%.*]] = load i32, ptr [[V]]
 // CHECK: store i32 [[VVal]], ptr [[Tmp1]]
-// CHECK: call void {{.*}}order_matters{{.*}}(ptr noalias noundef nonnull align 4 dereferenceable(4) [[Tmp1]], ptr noalias noundef nonnull align 4 dereferenceable(4) [[Tmp0]])
-// CHECK: [[Arg1Val:%.*]] = load i32, ptr [[Tmp1]]
+// CHECK: call void {{.*}}order_matters{{.*}}(ptr noalias noundef nonnull align 4 dereferenceable(4) [[Tmp0]], ptr noalias noundef nonnull align 4 dereferenceable(4) [[Tmp1]])
+// CHECK: [[Arg1Val:%.*]] = load i32, ptr [[Tmp0]]
 // CHECK: store i32 [[Arg1Val]], ptr [[V]]
-// CHECK: [[Arg2Val:%.*]] = load i32, ptr [[Tmp0]]
+// CHECK: [[Arg2Val:%.*]] = load i32, ptr [[Tmp1]]
 // CHECK: store i32 [[Arg2Val]], ptr [[V]]
 
 // OPT: ret i32 2
@@ -289,17 +289,19 @@ void setFour(inout int I) {
 // CHECK: [[B:%.*]] = alloca %struct.B
 // CHECK: [[Tmp:%.*]] = alloca i32
 
-// CHECK: [[BFLoad:%.*]] = load i32, ptr [[B]]
-// CHECK: [[BFshl:%.*]] = shl i32 [[BFLoad]], 24
-// CHECK: [[BFashr:%.*]] = ashr i32 [[BFshl]], 24
-// CHECK: store i32 [[BFashr]], ptr [[Tmp]]
+// CHECK: [[BFLoad:%.*]] = load i16, ptr [[B]]
+// CHECK: [[BFshl:%.*]] = shl i16 [[BFLoad]], 8
+// CHECK: [[BFashr:%.*]] = ashr i16 [[BFshl]], 8
+// CHECK: [[BFcast:%.*]] = sext i16 [[BFashr]] to i32
+// CHECK: store i32 [[BFcast]], ptr [[Tmp]]
 // CHECK: call void {{.*}}setFour{{.*}}(ptr noalias noundef nonnull align 4 dereferenceable(4) [[Tmp]])
 // CHECK: [[RetVal:%.*]] = load i32, ptr [[Tmp]]
-// CHECK: [[BFLoad:%.*]] = load i32, ptr [[B]]
-// CHECK: [[BFValue:%.*]] = and i32 [[RetVal]], 255
-// CHECK: [[ZerodField:%.*]] = and i32 [[BFLoad]], -256
-// CHECK: [[BFSet:%.*]] = or i32 [[ZerodField]], [[BFValue]]
-// CHECK: store i32 [[BFSet]], ptr [[B]]
+// CHECK: [[TruncVal:%.*]] = trunc i32 [[RetVal]] to i16
+// CHECK: [[BFLoad:%.*]] = load i16, ptr [[B]]
+// CHECK: [[BFValue:%.*]] = and i16 [[TruncVal]], 255
+// CHECK: [[ZerodField:%.*]] = and i16 [[BFLoad]], -256
+// CHECK: [[BFSet:%.*]] = or i16 [[ZerodField]], [[BFValue]]
+// CHECK: store i16 [[BFSet]], ptr [[B]]
 
 // OPT: ret i32 8
 export int case11() {
diff --git a/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl b/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
index b39311ad67cd62..c0eb1b138ed047 100644
--- a/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
+++ b/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
@@ -25,11 +25,11 @@ void main(unsigned GI : SV_GroupIndex) {}
 // CHECK: define void @main()
 // CHECK-NEXT: entry:
 // Verify function constructors are emitted
-// NOINLINE-NEXT:   call void @"?call_me_first@@YAXXZ"()
-// NOINLINE-NEXT:   call void @"?then_call_me@@YAXXZ"()
+// NOINLINE-NEXT:   call void @_Z13call_me_firstv()
+// NOINLINE-NEXT:   call void @_Z12then_call_mev()
 // NOINLINE-NEXT:   %0 = call i32 @llvm.dx.flattened.thread.id.in.group()
-// NOINLINE-NEXT:   call void @"?main@@YAXI@Z"(i32 %0)
-// NOINLINE-NEXT:   call void @"?call_me_last@@YAXXZ"(
+// NOINLINE-NEXT:   call void @_Z4mainj(i32 %0)
+// NOINLINE-NEXT:   call void @_Z12call_me_lastv(
 // NOINLINE-NEXT:   ret void
 
 // Verify constructor calls are inlined when AlwaysInline is run
diff --git a/clang/test/CodeGenHLSL/GlobalConstructorLib.hlsl b/clang/test/CodeGenHLSL/GlobalConstructorLib.hlsl
index 78f6475462bc47..09c44f6242c53c 100644
--- a/clang/test/CodeGenHLSL/GlobalConstructorLib.hlsl
+++ b/clang/test/CodeGenHLSL/GlobalConstructorLib.hlsl
@@ -13,7 +13,7 @@ void FirstEntry() {}
 // CHECK: define void @FirstEntry()
 // CHECK-NEXT: entry:
 // NOINLINE-NEXT:   call void @_GLOBAL__sub_I_GlobalConstructorLib.hlsl()
-// NOINLINE-NEXT:   call void @"?FirstEntry@@YAXXZ"()
+// NOINLINE-NEXT:   call void @_Z10FirstEntryv()
 // Verify inlining leaves only calls to "llvm." intrinsics
 // INLINE-NOT:   call {{[^@]*}} @{{[^l][^l][^v][^m][^\.]}}
 // CHECK: ret void
@@ -25,7 +25,7 @@ void SecondEntry() {}
 // CHECK: define void @SecondEntry()
 // CHECK-NEXT: entry:
 // NOINLINE-NEXT:   call void @_GLOBAL__sub_I_GlobalConstructorLib.hlsl()
-// NOINLINE-NEXT:   call void @"?SecondEntry@@YAXXZ"()
+// NOINLINE-NEXT:   call void @_Z11SecondEntryv()
 // Verify inlining leaves only calls to "llvm." intrinsics
 // INLINE-NOT:   call {{[^@]*}} @{{[^l][^l][^v][^m][^\.]}}
 // CHECK: ret void
@@ -33,6 +33,10 @@ void SecondEntry() {}
 
 // Verify the constructor is alwaysinline
 // NOINLINE: ; Function Attrs: {{.*}}alwaysinline
-// NOINLINE-NEXT: define internal void @_GLOBAL__sub_I_GlobalConstructorLib.hlsl() [[IntAttr:\#[0-9]+]]
+// NOINLINE-NEXT: define linkonce_odr void @_ZN4hlsl8RWBufferIfEC2Ev({{.*}} [[CtorAttr:\#[0-9]+]]
 
-// NOINLINE: attributes [[IntAttr]] = {{.*}} alwaysinline
+// NOINLINE: ; Function Attrs: {{.*}}alwaysinline
+// NOINLINE-NEXT: define internal void @_GLOBAL__sub_I_GlobalConstructorLib.hlsl() [[InitAttr:\#[0-9]+]]
+
+// NOINLINE-DAG: attributes [[InitAttr]] = {{.*}} alwaysinline
+// NOINLINE-DAG: attributes [[CtorAttr]] = {{.*}} alwaysinline
diff --git a/clang/test/CodeGenHLSL/GlobalConstructors.hlsl b/clang/test/CodeGenHLSL/GlobalConstructors.hlsl
index 7e2f288726c954..7b26dba0d19010 100644
--- a/clang/test/CodeGenHLSL/GlobalConstructors.hlsl
+++ b/clang/test/CodeGenHLSL/GlobalConstructors.hlsl
@@ -12,5 +12,5 @@ void main(unsigned GI : SV_GroupIndex) {}
 //CHECK-NEXT: entry:
 //CHECK-NEXT:   call void @_GLOBAL__sub_I_GlobalConstructors.hlsl()
 //CHECK-NEXT:   %0 = call i32 @llvm.dx.flattened.thread.id.in.group()
-//CHECK-NEXT:   call void @"?main@@YAXI@Z"(i32 %0)
+//CHECK-NEXT:   call void @_Z4mainj(i32 %0)
 //CHECK-NEXT:   ret void
diff --git a/clang/test/CodeGenHLSL/GlobalDestructors.hlsl b/clang/test/CodeGenHLSL/GlobalDestructors.hlsl
index ea28354222f885..f98318601134bb 100644
--- a/clang/test/CodeGenHLSL/GlobalDestructors.hlsl
+++ b/clang/test/CodeGenHLSL/GlobalDestructors.hlsl
@@ -1,7 +1,7 @@
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -std=hlsl202x -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CS,NOINLINE,CHECK
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -std=hlsl202x -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=LIB,NOINLINE,CHECK
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -std=hlsl202x -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -std=hlsl202x -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CS,NOINLINE,CHECK
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=LIB,NOINLINE,CHECK
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
 
 // Tests that constructors and destructors are appropriately generated for globals
 // and that their calls are inlined when AlwaysInline is run
@@ -59,7 +59,7 @@ void main(unsigned GI : SV_GroupIndex) {
 // Verify destructor is emitted
 // NOINLINE-NEXT:   call void @_GLOBAL__sub_I_GlobalDestructors.hlsl()
 // NOINLINE-NEXT:   %0 = call i32 @llvm.dx.flattened.thread.id.in.group()
-// NOINLINE-NEXT:   call void @"?main@@YAXI@Z"(i32 %0)
+// NOINLINE-NEXT:   call void @_Z4mainj(i32 %0)
 // NOINLINE-NEXT:   call void @_GLOBAL__D_a()
 // NOINLINE-NEXT:   ret void
 // Verify inlining leaves only calls to "llvm." intrinsics
@@ -71,8 +71,8 @@ void main(unsigned GI : SV_GroupIndex) {
 
 // NOINLINE: define internal void @_GLOBAL__D_a() [[IntAttr:\#[0-9]+]]
 // NOINLINE-NEXT: entry:
-// NOINLINE-NEXT:   call void @"??1Tail@@QAA@XZ"(ptr @"?T@?1??Wag@@YAXXZ@4UTail@@A")
-// NOINLINE-NEXT:   call void @"??1Pupper@@QAA@XZ"(ptr @"?GlobalPup@@3UPupper@@A")
+// NOINLINE-NEXT:   call void @_ZN4TailD1Ev(ptr @_ZZ3WagvE1T)
+// NOINLINE-NEXT:   call void @_ZN6PupperD1Ev(ptr @GlobalPup)
 // NOINLINE-NEXT:   ret void
 
 // NOINLINE: attributes [[IntAttr]] = {{.*}} alwaysinline
diff --git a/clang/test/CodeGenHLSL/basic_types.hlsl b/clang/test/CodeGenHLSL/basic_types.hlsl
index 15c963dfa666f4..d987af45a649fb 100644
--- a/clang/test/CodeGenHLSL/basic_types.hlsl
+++ b/clang/test/CodeGenHLSL/basic_types.hlsl
@@ -6,38 +6,38 @@
 // RUN:   -emit-llvm -disable-llvm-passes -o - -DNAMESPACED| FileCheck %s
 
 
-// CHECK:"?uint16_t_Val@@3GA" = global i16 0, align 2
-// CHECK:"?int16_t_Val@@3FA" = global i16 0, align 2
-// CHECK:"?uint_Val@@3IA" = global i32 0, align 4
-// CHECK:"?uint64_t_Val@@3KA" = global i64 0, align 8
-// CHECK:"?int64_t_Val@@3JA" = global i64 0, align 8
-// CHECK:"?int16_t2_Val@@3T?$__vector@F$01@__clang@@A" = global <2 x i16> zeroinitializer, align 4
-// CHECK:"?int16_t3_Val@@3T?$__vector@F$02@__clang@@A" = global <3 x i16> zeroinitializer, align 8
-// CHECK:"?int16_t4_Val@@3T?$__vector@F$03@__clang@@A" = global <4 x i16> zeroinitializer, align 8
-// CHECK:"?uint16_t2_Val@@3T?$__vector@G$01@__clang@@A" = global <2 x i16> zeroinitializer, align 4
-// CHECK:"?uint16_t3_Val@@3T?$__vector@G$02@__clang@@A" = global <3 x i16> zeroinitializer, align 8
-// CHECK:"?uint16_t4_Val@@3T?$__vector@G$03@__clang@@A" = global <4 x i16> zeroinitializer, align 8
-// CHECK:"?int2_Val@@3T?$__vector@H$01@__clang@@A" = global <2 x i32> zeroinitializer, align 8
-// CHECK:"?int3_Val@@3T?$__vector@H$02@__clang@@A" = global <3 x i32> zeroinitializer, align 16
-// CHECK:"?int4_Val@@3T?$__vector@H$03@__clang@@A" = global <4 x i32> zeroinitializer, align 16
-// CHECK:"?uint2_Val@@3T?$__vector@I$01@__clang@@A" = global <2 x i32> zeroinitializer, align 8
-// CHECK:"?uint3_Val@@3T?$__vector@I$02@__clang@@A" = global <3 x i32> zeroinitializer, align 16
-// CHECK:"?uint4_Val@@3T?$__vector@I$03@__clang@@A" = global <4 x i32> zeroinitializer, align 16
-// CHECK:"?int64_t2_Val@@3T?$__vector@J$01@__clang@@A" = global <2 x i64> zeroinitializer, align 16
-// CHECK:"?int64_t3_Val@@3T?$__vector@J$02@__clang@@A" = global <3 x i64> zeroinitializer, align 32
-// CHECK:"?int64_t4_Val@@3T?$__vector@J$03@__clang@@A" = global <4 x i64> zeroinitializer, align 32
-// CHECK:"?uint64_t2_Val@@3T?$__vector@K$01@__clang@@A" = global <2 x i64> zeroinitializer, align 16
-// CHECK:"?uint64_t3_Val@@3T?$__vector@K$02@__clang@@A" = global <3 x i64> zeroinitializer, align 32
-// CHECK:"?uint64_t4_Val@@3T?$__vector@K$03@__clang@@A" = global <4 x i64> zeroinitializer, align 32
-// CHECK:"?half2_Val@@3T?$__vector@$f16@$01@__clang@@A" = global <2 x half> zeroinitializer, align 4
-// CHECK:"?half3_Val@@3T?$__vector@$f16@$02@__clang@@A" = global <3 x half> zeroinitializer, align 8
-// CHECK:"?half4_Val@@3T?$__vector@$f16@$03@__clang@@A" = global <4 x half> zeroinitializer, align 8
-// CHECK:"?float2_Val@@3T?$__vector@M$01@__clang@@A" = global <2 x float> zeroinitializer, align 8
-// CHECK:"?float3_Val@@3T?$__vector@M$02@__clang@@A" = global <3 x float> zeroinitializer, align 16
-// CHECK:"?float4_Val@@3T?$__vector@M$03@__clang@@A" = global <4 x float> zeroinitializer, align 16
-// CHECK:"?double2_Val@@3T?$__vector@N$01@__clang@@A" = global <2 x double> zeroinitializer, align 16
-// CHECK:"?double3_Val@@3T?$__vector@N$02@__clang@@A" = global <3 x double> zeroinitializer, align 32
-// CHECK:"?double4_Val@@3T?$__vector@N$03@__clang@@A" = global <4 x double> zeroinitializer, align 32
+// CHECK: @uint16_t_Val = global i16 0, align 2
+// CHECK: @int16_t_Val = global i16 0, align 2
+// CHECK: @uint_Val = global i32 0, align 4
+// CHECK: @uint64_t_Val = global i64 0, align 8
+// CHECK: @int64_t_Val = global i64 0, align 8
+// CHECK: @int16_t2_Val = global <2 x i16> zeroinitializer, align 4
+// CHECK: @int16_t3_Val = global <3 x i16> zeroinitializer, align 8
+// CHECK: @int16_t4_Val = global <4 x i16> zeroinitializer, align 8
+// CHECK: @uint16_t2_Val = global <2 x i16> zeroinitializer, align 4
+// CHECK: @uint16_t3_Val = global <3 x i16> zeroinitializer, align 8
+// CHECK: @uint16_t4_Val = global <4 x i16> zeroinitializer, align 8
+// CHECK: @int2_Val = global <2 x i32> zeroinitializer, align 8
+// CHECK: @int3_Val = global <3 x i32> zeroinitializer, align 16
+// CHECK: @int4_Val = global <4 x i32> zeroinitializer, align 16
+// CHECK: @uint2_Val = global <2 x i32> zeroinitializer, align 8
+// CHECK: @uint3_Val = global <3 x i32> zeroinitializer, align 16
+// CHECK: @uint4_Val = global <4 x i32> zeroinitializer, align 16
+// CHECK: @int64_t2_Val = global <2 x i64> zeroinitializer, align 16
+// CHECK: @int64_t3_Val = global <3 x i64> zeroinitializer, align 32
+// CHECK: @int64_t4_Val = global <4 x i64> zeroinitializer, align 32
+// CHECK: @uint64_t2_Val = global <2 x i64> zeroinitializer, align 16
+// CHECK: @uint64_t3_Val = global <3 x i64> zeroinitializer, align 32
+// CHECK: @uint64_t4_Val = global <4 x i64> zeroinitializer, align 32
+// CHECK: @half2_Val = global <2 x half> zeroinitializer, align 4
+// CHECK: @half3_Val = global <3 x half> zeroinitializer, align 8
+// CHECK: @half4_Val = global <4 x half> zeroinitializer, align 8
+// CHECK: @float2_Val = global <2 x float> zeroinitializer, align 8
+// CHECK: @float3_Val = global <3 x float> zeroinitializer, align 16
+// CHECK: @float4_Val = global <4 x float> zeroinitializer, align 16
+// CHECK: @double2_Val = global <2 x double> zeroinitializer, align 16
+// CHECK: @double3_Val = global <3 x double> zeroinitializer, align 32
+// CHECK: @double4_Val = global <4 x double> zeroinitializer, align 32
 
 #ifdef NAMESPACED
 #define TYPE_DECL(T)  hlsl::T T##_Val
diff --git a/clang/test/CodeGenHLSL/builtins/RWBuffer-annotations.hlsl b/clang/test/CodeGenHLSL/builtins/RWBuffer-annotations.hlsl
index 7ca78e60fb9c59..e1e047485e4df0 100644
--- a/clang/test/CodeGenHLSL/builtins/RWBuffer-annotations.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/RWBuffer-annotations.hlsl
@@ -16,9 +16,9 @@ void main() {
 }
 
 // CHECK: !hlsl.uavs = !{![[Single:[0-9]+]], ![[Array:[0-9]+]], ![[SingleAllocated:[0-9]+]], ![[ArrayAllocated:[0-9]+]], ![[SingleSpace:[0-9]+]], ![[ArraySpace:[0-9]+]]}
-// CHECK-DAG: ![[Single]] = !{ptr @"?Buffer1@@3V?$RWBuffer@M@hlsl@@A", i32 10, i32 9, i1 false, i32 -1, i32 0}
-// CHECK-DAG: ![[Array]] = !{ptr @"?BufferArray@@3PAV?$RWBuffer@T?$__vector@M$03@__clang@@@hlsl@@A", i32 10, i32 9, i1 false, i32 -1, i32 0}
-// CHECK-DAG: ![[SingleAllocated]] = !{ptr @"?Buffer2@@3V?$RWBuffer@M@hlsl@@A", i32 10, i32 9, i1 false, i32 3, i32 0}
-// CHECK-DAG: ![[ArrayAllocated]] = !{ptr @"?BufferArray2@@3PAV?$RWBuffer@T?$__vector@M$03@__clang@@@hlsl@@A", i32 10, i32 9, i1 false, i32 4, i32 0}
-// CHECK-DAG: ![[SingleSpace]] = !{ptr @"?Buffer3@@3V?$RWBuffer@M@hlsl@@A", i32 10, i32 9, i1 false, i32 3, i32 1}
-// CHECK-DAG: ![[ArraySpace]] = !{ptr @"?BufferArray3@@3PAV?$RWBuffer@T?$__vector@M$03@__clang@@@hlsl@@A", i32 10, i32 9, i1 false, i32 4, i32 1}
+// CHECK-DAG: ![[Single]] = !{ptr @Buffer1, i32 10, i32 9, i1 false, i32 -1, i32 0}
+// CHECK-DAG: ![[Array]] = !{ptr @BufferArray, i32 10, i32 9, i1 false, i32 -1, i32 0}
+// CHECK-DAG: ![[SingleAllocated]] = !{ptr @Buffer2, i32 10, i32 9, i1 false, i32 3, i32 0}
+// CHECK-DAG: ![[ArrayAllocated]] = !{ptr @BufferArray2, i32 10, i32 9, i1 false, i32 4, i32 0}
+// CHECK-DAG: ![[SingleSpace]] = !{ptr @Buffer3, i32 10, i32 9, i1 false, i32 3, i32 1}
+// CHECK-DAG: ![[ArraySpace]] = !{ptr @BufferArray3, i32 10, i32 9, i1 false, i32 4, i32 1}
diff --git a/clang/test/CodeGenHLSL/builtins/RWBuffer-elementtype.hlsl b/clang/test/CodeGenHLSL/builtins/RWBuffer-elementtype.hlsl
index 036c9c28ef2779..eca4f1598fd658 100644
--- a/clang/test/CodeGenHLSL/builtins/RWBuffer-elementtype.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/RWBuffer-elementtype.hlsl
@@ -37,16 +37,16 @@ void main(int GI : SV_GroupIndex) {
   BufF32x3[GI] = 0;
 }
 
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufI16@@3V?$RWBuffer@F@hlsl@@A", i32 10, i32 2,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufU16@@3V?$RWBuffer@G@hlsl@@A", i32 10, i32 3,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufI32@@3V?$RWBuffer@H@hlsl@@A", i32 10, i32 4,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufU32@@3V?$RWBuffer@I@hlsl@@A", i32 10, i32 5,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufI64@@3V?$RWBuffer@J@hlsl@@A", i32 10, i32 6,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufU64@@3V?$RWBuffer@K@hlsl@@A", i32 10, i32 7,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufF16@@3V?$RWBuffer@$f16@@hlsl@@A", i32 10, i32 8,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufF32@@3V?$RWBuffer@M@hlsl@@A", i32 10, i32 9,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufF64@@3V?$RWBuffer@N@hlsl@@A", i32 10, i32 10,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufI16x4@@3V?$RWBuffer@T?$__vector@F$03@__clang@@@hlsl@@A", i32 10, i32 2,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufU32x3@@3V?$R...
[truncated]

pow2clk

Just a few comments to explain the changes that aren't just mangling style changes.

pow2clk · 2024-10-09T04:59:37Z

clang/test/CodeGenHLSL/shift-mask.hlsl

+
+// CHECK-LABEL: define noundef i64 @_Z6shru64mm(i64 noundef %V, i64 noundef %S) #0 {
+// CHECK-DAG:  %[[Masked:.*]] = and i64 %{{.*}}, 63
+// CHECK-DAG:  %{{.*}} = lshr i64 %{{.*}}, %[[Masked]]


This is all incidental, I just thought we should test some logical shifts.

pow2clk · 2024-10-09T05:00:46Z

clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl

 // CHECK: store i32 [[Arg1Val]], ptr [[V]]
-// CHECK: [[Arg2Val:%.*]] = load i32, ptr [[Tmp0]]
+// CHECK: [[Arg2Val:%.*]] = load i32, ptr [[Tmp1]]


This is where the new order of copy in copy out is reflected

pow2clk · 2024-10-09T05:01:34Z

clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl

+// CHECK: [[BFValue:%.*]] = and i16 [[TruncVal]], 255
+// CHECK: [[ZerodField:%.*]] = and i16 [[BFLoad]], -256
+// CHECK: [[BFSet:%.*]] = or i16 [[ZerodField]], [[BFValue]]
+// CHECK: store i16 [[BFSet]], ptr [[B]]


This is an instance of using smaller types for bitfields. Doing so introduced some additional commands.

pow2clk · 2024-10-09T05:03:30Z

clang/test/CodeGenHLSL/builtins/abs.hlsl


 using hlsl::abs;

 #ifdef __HLSL_ENABLE_16_BIT
-// NATIVE_HALF: define noundef i16 @
+// NATIVE_HALF-LABEL: define noundef i16 @_Z16test_abs_int16_t


There was a lot of inconsistency in these sorts of checks. Some included more and some less. I tried to include enough to unambiguously identify the function in question.

pow2clk · 2024-10-09T05:06:13Z

clang/test/CodeGenHLSL/this-assignment.hlsl

 // CHECK-NEXT: store ptr {{.*}}, ptr [[ThisPtrAddr]]
 // CHECK-NEXT: [[ThisPtr:%.*]] = load ptr, ptr [[ThisPtrAddr]]
 // CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[ThisPtr]], ptr align 4 [[Obj:%.*]], i32 8, i1 false)
 // CHECK-NEXT: [[FirstAddr:%.*]] = getelementptr inbounds nuw %struct.Pair, ptr [[ThisPtr]], i32 0, i32 0
 // CHECK-NEXT: [[First:%.*]] = load i32, ptr [[FirstAddr]]
 // CHECK-NEXT: [[FirstPlusTwo:%.*]] = add nsw i32 [[First]], 2
 // CHECK-NEXT: store i32 [[FirstPlusTwo]], ptr [[FirstAddr]]
-// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[AggRes]], ptr align 4 [[Obj]], i32 8, i1 false)
+// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 {{.*}}, ptr align 4 [[Obj]], i32 8, i1 false)


I can't explain exactly why the ResPtr alloca was removed, but it was clearly unnecessary. The wildcard here represents the same location as was previously captured in AggRes and copied into ResPtr, but then ResPtr was never used.

pow2clk · 2024-10-09T05:06:56Z

clang/test/CodeGenHLSL/GlobalDestructors.hlsl

+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CS,NOINLINE,CHECK
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=LIB,NOINLINE,CHECK
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK


hlsl202x is the default now, so this flag is redundant.

pow2clk · 2024-10-09T05:07:41Z

clang/test/CodeGenHLSL/builtins/abs.hlsl

+// RUN:  FileCheck %s --check-prefixes=CHECK,NATIVE_HALF
+// RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.3-library %s \
+// RUN:  -emit-llvm -disable-llvm-passes -o - | \
+// RUN:  FileCheck %s --check-prefixes=CHECK,NO_HALF


The file extension already has the same effect as the flag -x hlsl

pow2clk · 2024-10-09T05:10:21Z

clang/test/CodeGenHLSL/static-local-ctor.hlsl

-// CHECK-NEXT: = or i32 [[Tmp2]], 1
+// CHECK: init.check:
+// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIiEC1Ev
+// CHECK-NEXT: store i8 1, ptr @_ZGVZ4mainvE5mybuf


These changes represent a different, but equivalent way of protecting the one-time initialization of a static local variable

llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen backend:DirectX HLSL HLSL Language Support labels Oct 9, 2024

pow2clk commented Oct 9, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Switch DirectX Target to use the Itanium ABI #111632

Switch DirectX Target to use the Itanium ABI #111632

pow2clk commented Oct 9, 2024 •

edited

Loading

llvmbot commented Oct 9, 2024 •

edited

Loading

pow2clk left a comment

pow2clk Oct 9, 2024

pow2clk Oct 9, 2024 •

edited

Loading

pow2clk Oct 9, 2024 •

edited

Loading

pow2clk Oct 9, 2024

pow2clk Oct 9, 2024

pow2clk Oct 9, 2024

pow2clk Oct 9, 2024

pow2clk Oct 9, 2024

Switch DirectX Target to use the Itanium ABI #111632

Are you sure you want to change the base?

Switch DirectX Target to use the Itanium ABI #111632

Conversation

pow2clk commented Oct 9, 2024 • edited Loading

llvmbot commented Oct 9, 2024 • edited Loading

pow2clk left a comment

Choose a reason for hiding this comment

pow2clk Oct 9, 2024

Choose a reason for hiding this comment

pow2clk Oct 9, 2024 • edited Loading

Choose a reason for hiding this comment

pow2clk Oct 9, 2024 • edited Loading

Choose a reason for hiding this comment

pow2clk Oct 9, 2024

Choose a reason for hiding this comment

pow2clk Oct 9, 2024

Choose a reason for hiding this comment

pow2clk Oct 9, 2024

Choose a reason for hiding this comment

pow2clk Oct 9, 2024

Choose a reason for hiding this comment

pow2clk Oct 9, 2024

Choose a reason for hiding this comment

pow2clk commented Oct 9, 2024 •

edited

Loading

llvmbot commented Oct 9, 2024 •

edited

Loading

pow2clk Oct 9, 2024 •

edited

Loading

pow2clk Oct 9, 2024 •

edited

Loading