[wip] MSL: Fix dynamically indexed pull interpolants #2364

ncesario-lunarg · 2024-08-15T05:42:03Z

This is a first attempt at fixing the test failures mentioned in #1796 by copying scalarized/flattened arrays back into arrays in the fragment shader for each "unique" interpolate opration. This approach has as number of drawbacks such as negative impacts to runtime performance and shader size, as well as having limited coverage for dynamic offsets (InterpolateAtOffset) and samples (InterpolateAtSample).

This change currently fixes all the CTS tests mentioned in #1796 except dEQP-VK.pipeline.monolithic.multisample_interpolation.nonuniform_interpolant_indexing.offset, which requires some additional work (or running spirv-opt -O on the spirv before running spirv-cross also works), however, I'd like to get an idea of whether or not this approach would be acceptable in spirv-cross or an alternative is preferable.

This is a first attempt at fixing the test failures mentioned in KhronosGroup#1796 by copying scalarized/flattened arrays back into arrays in the fragment shader for each "unique" interpolate opration. This approach has as number of drawbacks such as negative impacts to runtime performance and shader size, as well as having limited coverage for dynamic offsets (InterpolateAtOffset) and samples (InterpolateAtSample).

billhollings · 2024-08-15T15:08:16Z

FWIW...

The original 24 CTS failures in #1796 have already been fixed.

The 3 additional pull-model interpolation CTS failures that @aitor-lunarg added to #1796, are still occurring, but report a different error from the original (see #1796-comment).

So, the fixes here should just address pull-model interpolation, which I see it mostly does, but any extra aimed at the original 24 errors, would not be needed. However, I have not reviewed the code here in enough detail to confirm any unnecessary code.

HansKristian-Work · 2024-08-20T08:20:44Z

It's very hard to reason from this PR what is being solved exactly. A solution that doesn't solve interpolateAtOffset and Samples is not good enough I think.

Do you have an example GLSL shader that demonstrates the problem? I barely remember what the pull model MSL stuff is and need a refresher.

cdavis5e · 2024-08-20T09:20:44Z

I barely remember what the pull model MSL stuff is and need a refresher.

In Metal, pull-model interpolants use a special wrapper template, interpolant<T>. The various interpolations are then obtained using methods called on the interpolant object. The interpolant object itself cannot be used directly--all accesses must go through one of the interpolant accessor methods.

As I recall, the problem is that Metal only allows certain types in a stage_in struct; interpolant<T> is one of those types, but an array of interpolant<T> is not. I don't know if this is still true.

ncesario-lunarg · 2024-08-20T21:11:25Z

A solution that doesn't solve interpolateAtOffset and Samples is not good enough I think.

I agree. This particular change is trying to address the CTS failures, and if this is the correct approach there should be a follow up change to make it more general.

Do you have an example GLSL shader that demonstrates the problem?

The fragment shader from external/vulkancts/data/vulkan/amber/pipeline/nonuniform_interpolant_indexing/centroid.amber is an example of one of the CTS tests hitting this. In short, something like the following:

#version 450

layout(location = 0) sample in vec2 inVar[2];
layout(location = 0) out vec4 fs_out_color;

void main (void)
{
    int i = int(gl_FragCoord.x) % 2;

    // This is where spirv-cross will throw at https://github.com/KhronosGroup/SPIRV-Cross/blob/d33a39045dd8f5b87eadc3d4cc88b32918d1c41a/spirv_msl.cpp#L8395 due to i not being a constant.
    const float x = interpolateAtCentroid(inVar[i].x);

    fs_out_color = vec4(x, 0, 0, 1);
}

reproduces the issue. Replacing interpolateAtCentroid with any of the other interpolateAt* functions will have the same effect. With this proposed change, the resulting metal shader would be something like:

// ...

struct main0_in
{
    interpolant<float2, interpolation::perspective> inVar_0 [[user(locn0)]];
    interpolant<float2, interpolation::perspective> inVar_1 [[user(locn1)]];
};

fragment main0_out main0(main0_in in [[stage_in]], uint gl_SampleID [[sample_id]], float4 gl_FragCoord [[position]])
{
    main0_out out = {};
    spvUnsafeArray<float2, 2> inVar = {};

    // One new array per interpolate function times unique value parameter passed to that function.
    // This would obviously increase the shader size by quite a bit, though realistically perhaps not very
    // much after optimization. This could also be done via setting up a large "if/else" branch; I'm not
    // sure what's more amenable to metal.
    spvUnsafeArray<float2, 2> inVar_centroid_108;
    inVar[0] = in.inVar_0.interpolate_at_sample(gl_SampleID);
    inVar_centroid_108[0] = in.inVar_0.interpolate_at_centroid();
    inVar[1] = in.inVar_1.interpolate_at_sample(gl_SampleID);
    inVar_centroid_108[1] = in.inVar_1.interpolate_at_centroid();

    gl_FragCoord.xy += get_sample_position(gl_SampleID) - 0.5;
    int i = int(gl_FragCoord.x) % 2;

    // This is the "solution"
    float x = inVar2_centroid_108[i].x;

    out.fs_out_color += float4(x, 0.0, 0.0, 1.0);
    return out;
}

This change will work for the interpolateAt{Sample,Offset} cases, but the reason I say "limited coverage" for those cases is if we have something like:

// ...
float acc = 0.0;
for (int i = 0; i < 4; ++i) {
    acc += interpolateAtSample(inVar[i].x, i);
}
// ...

This is perhaps doable with loop unrolling, but if it's not possible to unroll the loop for some reason, I'm not sure how to approach it.

HansKristian-Work · 2024-08-23T08:31:17Z

It seems like arrays should work, e.g. this compiles:

#pragma clang diagnostic ignored "-Wmissing-prototypes"

#include <metal_stdlib>
#include <simd/simd.h>

using namespace metal;

struct main0_out
{
    float4 FragColor [[color(0)]];
};

struct main0_in
{
    interpolant<float4, interpolation::perspective> Attr [[user(locn0)]];
    interpolant<float4, interpolation::perspective> Attr2 [[user(locn1)]];
	uint index [[user(locn2)]];
};

static inline __attribute__((always_inline))
float4 func(thread main0_in& in)
{
	thread interpolant<float4, interpolation::perspective> *Attrs[] = { &in.Attr, &in.Attr2 };
    return Attrs[in.index]->interpolate_at_sample(0);
}

fragment main0_out main0(main0_in in [[stage_in]])
{
    main0_out out = {};
    out.FragColor = func(in);
    return out;
}

define <{ <4 x float> }> @main0(%struct._interpolant_t addrspace(1)* %0, %struct._interpolant_t addrspace(1)* %1, i32 %2) local_unnamed_addr #0 {
  %4 = alloca [2 x %"struct.metal::interpolant"*], align 8
  %5 = alloca %struct.main0_in, align 8
  %6 = getelementptr inbounds %struct.main0_in, %struct.main0_in* %5, i64 0, i32 0, i32 0
  store %struct._interpolant_t addrspace(1)* %0, %struct._interpolant_t addrspace(1)** %6, align 8, !tbaa !20
  %7 = getelementptr inbounds %struct.main0_in, %struct.main0_in* %5, i64 0, i32 1, i32 0
  store %struct._interpolant_t addrspace(1)* %1, %struct._interpolant_t addrspace(1)** %7, align 8, !tbaa !27
  %8 = bitcast [2 x %"struct.metal::interpolant"*]* %4 to i8*
  call void @llvm.lifetime.start.p0i8(i64 16, i8* nonnull %8) #3
  %9 = getelementptr inbounds [2 x %"struct.metal::interpolant"*], [2 x %"struct.metal::interpolant"*]* %4, i64 0, i64 0
  %10 = getelementptr inbounds %struct.main0_in, %struct.main0_in* %5, i64 0, i32 0
  store %"struct.metal::interpolant"* %10, %"struct.metal::interpolant"** %9, align 8, !tbaa !28
  %11 = getelementptr inbounds [2 x %"struct.metal::interpolant"*], [2 x %"struct.metal::interpolant"*]* %4, i64 0, i64 1
  %12 = getelementptr inbounds %struct.main0_in, %struct.main0_in* %5, i64 0, i32 1
  store %"struct.metal::interpolant"* %12, %"struct.metal::interpolant"** %11, align 8, !tbaa !28
  %13 = zext i32 %2 to i64
  %14 = getelementptr inbounds [2 x %"struct.metal::interpolant"*], [2 x %"struct.metal::interpolant"*]* %4, i64 0, i64 %13
  %15 = load %"struct.metal::interpolant"*, %"struct.metal::interpolant"** %14, align 8, !tbaa !28
  %16 = getelementptr inbounds %"struct.metal::interpolant", %"struct.metal::interpolant"* %15, i64 0, i32 0
  %17 = load %struct._interpolant_t addrspace(1)*, %struct._interpolant_t addrspace(1)** %16, align 8, !tbaa !30
  %18 = call fast <4 x float> @air.interpolate_sample_perspective.v4f32(%struct._interpolant_t addrspace(1)* nocapture readonly %17, i32 0) #2
  call void @llvm.lifetime.end.p0i8(i64 16, i8* nonnull %8) #3
  %19 = insertvalue <{ <4 x float> }> undef, <4 x float> %18, 0
  ret <{ <4 x float> }> %19
}

HansKristian-Work · 2024-08-23T08:32:50Z

If the array can be placed in a wrapper struct that resolves the pointer stuff, this sounds like the way to go. I'm not overly concerned with performance. This is an extremely esoteric feature that only has to work, not be optimal.

ncesario-lunarg · 2024-08-24T00:01:24Z

It seems like arrays should work

Yep, that works and makes this so much easier; I should have tried that in metal first :/ I think I was making some erroneous assumptions based on

SPIRV-Cross/spirv_msl.cpp

Lines 8395 to 8396 in d33a390

    
           SPIRV_CROSS_THROW("Trying to dynamically index into an array interface variable in tessellation. " 
        
                             "This is currently unsupported.");

.

I've shifted to working on a different issue, but will try to find some cycles to wrap this up next week time permitting.

Thank you @HansKristian-Work!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[wip] MSL: Fix dynamically indexed pull interpolants #2364

[wip] MSL: Fix dynamically indexed pull interpolants #2364

ncesario-lunarg commented Aug 15, 2024

billhollings commented Aug 15, 2024 •

edited

Loading

HansKristian-Work commented Aug 20, 2024

cdavis5e commented Aug 20, 2024

ncesario-lunarg commented Aug 20, 2024 •

edited

Loading

HansKristian-Work commented Aug 23, 2024

HansKristian-Work commented Aug 23, 2024

ncesario-lunarg commented Aug 24, 2024

[wip] MSL: Fix dynamically indexed pull interpolants #2364

Are you sure you want to change the base?

[wip] MSL: Fix dynamically indexed pull interpolants #2364

Conversation

ncesario-lunarg commented Aug 15, 2024

billhollings commented Aug 15, 2024 • edited Loading

HansKristian-Work commented Aug 20, 2024

cdavis5e commented Aug 20, 2024

ncesario-lunarg commented Aug 20, 2024 • edited Loading

HansKristian-Work commented Aug 23, 2024

HansKristian-Work commented Aug 23, 2024

ncesario-lunarg commented Aug 24, 2024

billhollings commented Aug 15, 2024 •

edited

Loading

ncesario-lunarg commented Aug 20, 2024 •

edited

Loading