Creating a compute shader

Starting with Ensoftener 3.0, you can create vertex and compute shaders. They work similarly to pixel shaders and are used by inheriting the PVShaderBase and ComputeShaderBase classes.

So far, we've used PVShaderBase for pixel shaders, but PVShaderBase also has a vertex shader stage that deforms the bitmap (or TEXCOORD, to be exact) before the pixel shader stage. PVShaderBase's vertex shader is a port of the SharpDX vertex shader sample, which in turn was a port of the MSUWP vertex shader sample. It's fully functional, but it requires extra setup and I haven't quite figured out the process of creating one yet. Instead, we will focus on ComputeShaderBase, which also requires extra setup, but isn't as problematic.

How to do it

A compute shader is a "better" version of a pixel shader - you can read from and write to any pixel from any point in the shader (the downside is that it's slower). The texture is split into groups, or "tiles", which are then processed in parallel. This is a default compute shader class:

public class ComputeTest : ComputeShaderBase
{
	static Guid sGuid = Guid.NewGuid();

	public ComputeTest() : base(sGuid) { }

	public override RawInt3 CalculateThreadgroups(RawRectangle outputRect) { return new RawInt3(8, 4, 1); }
}

Assuming you went through the pixel shader tutorial, you can see that the class works just like PVShaderBase, except that there's a new method called CalculateThreadgroups. This method determines how many tiles to draw; if the tile count multiplied by the tile size is smaller than the texture size, only the covered part of the texture will be drawn. This is only the tile count - the tile size itself is specified on the HLSL side (more on that later).
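
If you want the dispatched tiles to always cover the whole output no matter its size, you can derive the group count from the output rectangle instead of hard-coding it. Here's a minimal sketch, assuming the 24×16 tile size used by the [numthreads] attribute later on this page (the rounded-up division is my own addition, not something ComputeShaderBase does for you):

public override RawInt3 CalculateThreadgroups(RawRectangle outputRect)
{
	//must match the [numthreads(24, 16, 1)] attribute on the HLSL side
	const int tileWidth = 24, tileHeight = 16;
	int width = outputRect.Right - outputRect.Left, height = outputRect.Bottom - outputRect.Top;
	//round up so the partially covered tiles at the right and bottom edges still get dispatched
	return new RawInt3((width + tileWidth - 1) / tileWidth, (height + tileHeight - 1) / tileHeight, 1);
}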

Registering the shader, creating the effect and drawing it work the same way as for pixel shaders. Now let's create the compute shader in HLSL.

First, we register everything that's passed to the compute shader:

Texture2D InputTexture : register(t0);
SamplerState InputSampler : register(s0);
RWTexture2D<float4> OutputTexture; //change this to RWStructuredBuffer if under Shader Model 5

cbuffer systemConstants : register(b0)
{
    int4 resultRect;
    int2 outputOffset;
    float2 sceneToInput0X;
    float2 sceneToInput0Y;
};

As you can see, there is a new RWTexture2D type and a constant buffer, even though we never declared one ourselves. The RWTexture2D is the output texture that will show up on screen, and the constant buffer is generated automatically by Direct2D.

Value	Details
resultRect	The image rectangle. X and Y are the offset (usually 0), Z and W are the size of the texture, in pixels.
outputOffset	The offset of the output (usually 0). Add it to the coordinates when writing into OutputTexture.
sceneToInput0X	A horizontal transform of the texture, where X is (1 / texture width) and Y is the horizontal offset.
sceneToInput0Y	The same as sceneToInput0X, but vertical: X is (1 / texture height) and Y is the vertical offset.
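
To make the transform constants more concrete, here is the same math written out in C#. The numbers and names are hypothetical (a 1920-pixel-wide input with no offset), purely to show that multiplying a pixel coordinate by sceneToInput0X.x and adding sceneToInput0X.y gives a TEXCOORD-style 0-to-1 coordinate:

//hypothetical illustration of what sceneToInput0X encodes: (1 / texture width, horizontal offset)
static float ToTexcoordX(float pixelX, float oneOverWidth, float offsetX) => pixelX * oneOverWidth + offsetX;
//e.g. ToTexcoordX(960, 1f / 1920, 0) maps the middle pixel to roughly 0.5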

Then we add the main method, which has an attribute specifying the tile size in pixels. For the semantics, see the page "List of available semantics".

[numthreads(24, 16, 1)] //tile size is 24×16.
void main(uint3 dispatchThreadId : SV_DispatchThreadID, uint3 groupThreadId : SV_GroupThreadID, uint3 groupId : SV_GroupID, uint groupIndex : SV_GroupIndex)
{
	OutputTexture[dispatchThreadId.xy + outputOffset.xy + resultRect.xy] = float4(1, 0, 1, 1); //magenta
}

This example method sets every pixel to magenta. resultRect and outputOffset are usually 0, but Microsoft recommends including them anyway.

Getting pixels from textures

Sampling commands in compute shaders are slightly altered. Instead of SV_POSITION you use SV_DispatchThreadID, and there is no such thing as TEXCOORD. Instead, divide the thread ID by (resultRect.zw - resultRect.xy), or multiply it by float2(sceneToInput0X.x, sceneToInput0Y.x).

InputTexture.Sample() is replaced with InputTexture.SampleLevel(), which takes an additional third parameter, the mip level. Set it to 0.

float4 color = InputTexture.SampleLevel(InputSampler,
	(dispatchThreadId.xy + resultRect.xy)
	* float2(sceneToInput0X.x, sceneToInput0Y.x) //multiplying by (1 / size) results in a replica of TEXCOORD
	+ float2(sceneToInput0X.y, sceneToInput0Y.y), 0); //adding offsets at the end (just like matrix transformation)

InputTexture.Load() stays the same.

color = InputTexture.Load(int3(dispatchThreadId.xy
	+ float2(sceneToInput0X.y, sceneToInput0Y.y) //again, the offset
	+ resultRect.xy, 0));

Adding a constant buffer

Since the b0 buffer is already in use, the constants are passed to b1. C#-wise, everything stays the same as in pixel shaders, except that the constant buffer is managed by a ComputeInformation class instead of a DrawingInformation one.

cbuffer values : register(b1)
{
	float value1;
};

public class ComputeTest : ComputeShaderBase
{
	// ..the code from earlier

	[StructLayout(LayoutKind.Sequential)]
	public struct ComputeConstantBuffer
	{
		public float value1;
	}

	ComputeConstantBuffer constants;

	[PropertyBinding(0, "0", "0", "0")]
	public float Value1
	{
		get { return constants.value1; }
		set { constants.value1 = value; }
	}

	public override void PrepareForRender(ChangeType changeType) { cInfo?.SetConstantBuffer(ref constants); } //cInfo instead of dInfo
}

How does it work?

ComputeShaderBase is an altered version of PVShaderBase where SetDrawingInformation was replaced by SetComputeInformation. The rest was about me figuring out how the coordinate system in compute shaders works. Again, Microsoft tried to explain it with overcomplicated groups and threads, but in the end it's just a tiled texture.
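
If the groups-and-threads terminology still feels abstract, the relationship between the semantics from earlier is simple: each tile is one thread group, and SV_DispatchThreadID is just the group index scaled by the tile size plus the thread's position inside the tile. A hypothetical C# helper (the names are mine, not part of the library) that computes which pixel a given thread lands on:

//SV_DispatchThreadID = SV_GroupID * numthreads + SV_GroupThreadID
static (int x, int y) PixelForThread(int groupX, int groupY, int threadX, int threadY)
{
	const int tileWidth = 24, tileHeight = 16; //the [numthreads(24, 16, 1)] tile size from earlier
	return (groupX * tileWidth + threadX, groupY * tileHeight + threadY);
}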
