TAA blend shader redundant barrier #883

Open
Katzeee opened this issue Sep 1, 2024 · 2 comments

Katzeee commented Sep 1, 2024

In the TAA implementation, I noticed that the blend shader calls GroupMemoryBarrierWithGroupSync() twice after it finishes prefetching from the color buffer into groupshared memory. Nothing writes to groupshared memory between the two calls, so the second barrier should be redundant. Is that correct?

    for (uint i = GI; i < 45; i += 64)
    {
        uint X = (i % ldsHalfPitch) * 2;
        uint Y = (i / ldsHalfPitch) * 2;
        uint TopLeftIdx = X + Y * kLdsPitch;
        int2 TopLeftST = Gid.xy * uint2(8, 8) - 1 + uint2(X / 2, Y);
        float2 UV = RcpBufferDim * (TopLeftST * float2(2, 1) + float2(2, 1));

        float4 Depths = CurDepth.Gather(LinearSampler, UV);
        ldsDepth[TopLeftIdx + 0] = Depths.w;
        ldsDepth[TopLeftIdx + 1] = Depths.z;
        ldsDepth[TopLeftIdx + kLdsPitch] = Depths.x;
        ldsDepth[TopLeftIdx + 1 + kLdsPitch] = Depths.y;

        float4 R4 = InColor.GatherRed(LinearSampler, UV);
        float4 G4 = InColor.GatherGreen(LinearSampler, UV);
        float4 B4 = InColor.GatherBlue(LinearSampler, UV);
        StoreRGB(TopLeftIdx, float3(R4.w, G4.w, B4.w));
        StoreRGB(TopLeftIdx + 1, float3(R4.z, G4.z, B4.z));
        StoreRGB(TopLeftIdx + kLdsPitch, float3(R4.x, G4.x, B4.x));
        StoreRGB(TopLeftIdx + 1 + kLdsPitch, float3(R4.y, G4.y, B4.y));
    }

    GroupMemoryBarrierWithGroupSync();

    uint Idx0 = GTid.x * 2 + GTid.y * kLdsPitch + kLdsPitch + 1;
    uint Idx1 = Idx0 + 1;

    GroupMemoryBarrierWithGroupSync(); // <- redundant?

stanard commented Sep 1, 2024

That does look redundant. It's probably vestigial (left in from an earlier version of the code). However, it is benign because the compiler should remove unnecessary barriers.
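
The reasoning above can be illustrated with a minimal sketch. This is not the TAA blend shader from the repository; the names `TILE_SIZE`, `gsTile`, `Input`, and `Output` are hypothetical and chosen only for illustration. It shows the general pattern: a single barrier between the groupshared writes and the groupshared reads is sufficient, and register-only work in between does not call for another one.

```hlsl
// Hypothetical example, not taken from the repository.
// TILE_SIZE, gsTile, Input, and Output are illustrative names.
#define TILE_SIZE 64

StructuredBuffer<float>   Input  : register(t0);
RWStructuredBuffer<float> Output : register(u0);

groupshared float gsTile[TILE_SIZE];

[numthreads(TILE_SIZE, 1, 1)]
void main(uint GI : SV_GroupIndex, uint3 DTid : SV_DispatchThreadID)
{
    // Phase 1: each thread writes one element of groupshared memory.
    gsTile[GI] = Input[DTid.x];

    // Required: waits for every thread in the group to reach this point
    // and makes the groupshared writes above visible to all of them.
    GroupMemoryBarrierWithGroupSync();

    // Register-only index math; no groupshared memory is touched here.
    uint Neighbor = (GI + 1) % TILE_SIZE;

    // A second GroupMemoryBarrierWithGroupSync() at this point would not
    // order anything new, because no groupshared write has happened
    // since the previous barrier.

    // Phase 2: read values written by other threads in the group.
    Output[DTid.x] = 0.5f * (gsTile[GI] + gsTile[Neighbor]);
}
```

A later barrier would become necessary again only if the shader wrote to groupshared memory after these reads and other threads then consumed those new values.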

stanard commented Sep 1, 2024

PR approved
