[WIP] sokol_spritebatch.h #534

Open
wants to merge 15 commits into
base: master

nyalloc commented Jun 30, 2021

A work-in-progress XNA-style spritebatch library on top of sokol_gfx.

Sample PR
Sample video

The library iterates over all submitted sprites and starts a new batch whenever the current sprite's texture differs from the previous sprite's. Keeping the same texture in use for as long as possible, via a texture atlas and/or appropriate sorting, is good for performance.
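
As a rough illustration of the batching rule described above, here is a minimal sketch in C. The `sprite_t` and `batch_t` types and the `build_batches` function are hypothetical names made up for this example, not part of the PR's API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sprite record: only the texture id matters for batching. */
typedef struct {
    uint32_t texture_id;
} sprite_t;

/* Hypothetical batch: a contiguous run of sprites sharing one texture. */
typedef struct {
    uint32_t texture_id;
    size_t first_sprite;
    size_t num_sprites;
} batch_t;

/* Walk the sprites in submission order and start a new batch whenever the
   texture differs from the previous sprite's texture. Returns the number of
   batches written (never more than max_batches). */
static size_t build_batches(const sprite_t* sprites, size_t num_sprites,
                            batch_t* batches, size_t max_batches) {
    size_t num_batches = 0;
    for (size_t i = 0; i < num_sprites; i++) {
        if (num_batches > 0 &&
            batches[num_batches - 1].texture_id == sprites[i].texture_id) {
            /* Same texture as the previous sprite: extend the current batch. */
            batches[num_batches - 1].num_sprites++;
            continue;
        }
        if (num_batches == max_batches) {
            break; /* out of batch slots */
        }
        batches[num_batches].texture_id = sprites[i].texture_id;
        batches[num_batches].first_sprite = i;
        batches[num_batches].num_sprites = 1;
        num_batches++;
    }
    return num_batches;
}
```

With this rule, a submission order of textures 1, 1, 2, 1 yields three batches, while the sorted order 1, 1, 1, 2 yields two, which is why atlasing or sorting by texture reduces draw calls.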

Shaders and pipeline are currently taken care of internally, but eventually an API will be provided to allow users to override the default pipeline with a custom one. This will enable users to change things like the blend state or shader being used, which will be advantageous for developing custom effects like additive sprite-based lighting or applying post-processing to a fullscreen quad.

Like XNA, this library relies on premultiplied alpha for alpha blending. Ideally this would be handled offline by your project's content pipeline. However, a function (sb_premultiply_alpha) is provided to post-process texture data before it is used to create an sg_image.
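
For readers unfamiliar with premultiplication, the following is a minimal sketch of what such a post-processing step could look like for tightly packed RGBA8 pixels. The function name `premultiply_alpha_rgba8` and its signature are assumptions for illustration; the actual sb_premultiply_alpha in the PR may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Premultiply tightly packed RGBA8 pixels in place: each color channel is
   scaled by alpha / 255, rounded to the nearest integer. Alpha is untouched.
   Hypothetical helper, not the PR's implementation. */
static void premultiply_alpha_rgba8(uint8_t* pixels, size_t num_pixels) {
    for (size_t i = 0; i < num_pixels; i++) {
        uint8_t* p = &pixels[i * 4];
        const uint32_t a = p[3];
        p[0] = (uint8_t)((p[0] * a + 127u) / 255u);
        p[1] = (uint8_t)((p[1] * a + 127u) / 255u);
        p[2] = (uint8_t)((p[2] * a + 127u) / 255u);
    }
}
```

Premultiplied data is normally blended as source * 1 + destination * (1 - source alpha), rather than the source * source alpha form used for straight alpha.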

An XNA-style spritebatch library on top of sokol_gfx. Relies on premultiplied alpha for blending.
Takes care of orthographic projection internally.
Sorts the sprites that have been submitted to ensure the fewest draw calls are made.
@lithiumtoast

Hey @nyalloc,

Coming from XNA/MonoGame, I would advise against having a batcher which sorts draw calls. I would consider sorting a separate concern to merging/batching draw calls for textured quads.

Christer Ericson has blogged about sorting draw calls: https://realtimecollisiondetection.net/blog/?p=86. As you can see, sorting draw calls is something that will be more or less unique to the needs of the game, i.e., it is likely to involve more than just sorting by texture/depth. XNA's SpriteBatch is a good beginner-friendly introductory API for getting 2D to "just work". However, developers often quickly get frustrated that the sorting options don't really meet their needs, and so they end up using only "Deferred" mode and fall back to doing their own sorting. Not having a sort feature and focusing solely on merging draw calls of textured quads simplifies the code and the API significantly.

Additionally, some ideas floating around for some improvements for a better API for XNA's SpriteBatch:

  1. Use a 3x2 matrix for a transform simplifying the data needed for a "batch item". This removes position, rotation, origin, scale, etc. This also solves the problem of easily having parent relationships of sprites since it's just a problem of constructing the correct matrix. It also simplifies the code for transforming the 4 vertices of the quad significantly; e.g. less "if" checks.
  2. Cache the internal vertex buffer for re-use in later frames. When drawing more or less static geometry like maps in 2D games, the geometry does not change from frame to frame, or at least is unlikely to. It is a waste to fill the vertex buffer every frame with data that rarely changes: the CPU wastes time crunching through the data to enqueue the sprites, and the GPU could be given a better usage hint (DYNAMIC vs STREAM).
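
To make the first suggestion more concrete, here is a minimal sketch of a 3x2 affine matrix in C. The type and function names are hypothetical, and the row-vector convention (x' = x*m11 + y*m21 + m31) is an assumption borrowed from XNA-style math libraries. It shows how a parent relationship reduces to one matrix multiply.

```c
/* Hypothetical 3x2 affine matrix using the row-vector convention:
   x' = x*m11 + y*m21 + m31,  y' = x*m12 + y*m22 + m32 */
typedef struct { float m11, m12, m21, m22, m31, m32; } mat3x2_t;
typedef struct { float x, y; } vec2_t;

static mat3x2_t mat3x2_translation(float x, float y) {
    mat3x2_t m = { 1.0f, 0.0f, 0.0f, 1.0f, x, y };
    return m;
}

static mat3x2_t mat3x2_scale(float sx, float sy) {
    mat3x2_t m = { sx, 0.0f, 0.0f, sy, 0.0f, 0.0f };
    return m;
}

/* Compose two transforms: the result applies a first, then b. */
static mat3x2_t mat3x2_mul(mat3x2_t a, mat3x2_t b) {
    mat3x2_t r;
    r.m11 = a.m11 * b.m11 + a.m12 * b.m21;
    r.m12 = a.m11 * b.m12 + a.m12 * b.m22;
    r.m21 = a.m21 * b.m11 + a.m22 * b.m21;
    r.m22 = a.m21 * b.m12 + a.m22 * b.m22;
    r.m31 = a.m31 * b.m11 + a.m32 * b.m21 + b.m31;
    r.m32 = a.m31 * b.m12 + a.m32 * b.m22 + b.m32;
    return r;
}

/* Transform a 2D point by the matrix. */
static vec2_t mat3x2_transform(mat3x2_t m, vec2_t p) {
    vec2_t r;
    r.x = p.x * m.m11 + p.y * m.m21 + m.m31;
    r.y = p.x * m.m12 + p.y * m.m22 + m.m32;
    return r;
}
```

Under these assumptions, a child sprite's world transform is `mat3x2_mul(child_local, parent_world)`, so position, rotation, origin and scale no longer need to be stored per batch item.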

Just my 2 cents.

kariem2k commented Jul 6, 2021

That looks great! Thanks for your work on this.

A user-defined sorting key would be great (for shaders and other states). Another example of a unique spritebatcher is

https://github.com/RandyGaul/cute_headers/blob/master/cute_spritebatch.h

It builds the sprite atlas as well.

Reworked the API to use contexts. Removed internal sorting of sprites (for now, at least). Adjusted naming. Introduced push_sprite_rect which lets you create a sprite to be rendered at a specific destination rectangle.

nyalloc commented Jul 23, 2021

Hello! Back on this now. As per @lithiumtoast's feedback I've removed the sorting API. For now, users can sort their sprites externally in whatever manner is appropriate for their application. I can always reintroduce this if there is ever demand for it, but removing it does simplify the API and the implementation quite a bit.

New in these changes is the ability to render to different render targets and use different sg_pipelines. I'm still working on these parts of the API and I think it will be a little while before I get them right. Luckily there are plenty of examples of this in the other sokol utility headers.

Regarding the 3x2 matrices, I think this is an interesting idea. @NoelFB uses a 3x2 matrix stack for his spritebatcher and this is something I can see being quite handy without adding very much overhead (3x2 matrix multiply against 2d points is pretty trivial).

I'll continue working on this and throw up a sample application when it is ready.

Unfortunately sokol's zero-initialise-to-default idiom does not play nicely with color data: a zeroed-out sg_color would ideally be replaced by a sensible default, white. However, a zeroed-out sg_color is also a valid color, transparent, which can result in undesirable effects if you are lerping a color to transparent.
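
The ambiguity can be shown with a small C sketch. All names here (`color_t`, `tint_t`, `resolve_zero_as_default`, `resolve_tint`) are hypothetical, and the explicit-flag variant is just one possible workaround, not something the PR is known to adopt.

```c
#include <stdbool.h>

typedef struct { float r, g, b, a; } color_t;

/* Linear interpolation between two colors. */
static color_t color_lerp(color_t a, color_t b, float t) {
    color_t c;
    c.r = a.r + (b.r - a.r) * t;
    c.g = a.g + (b.g - a.g) * t;
    c.b = a.b + (b.b - a.b) * t;
    c.a = a.a + (b.a - a.a) * t;
    return c;
}

/* "Zero means default" applied to a color: all-zero is replaced by white.
   A fade to transparent therefore snaps back to opaque white at the end. */
static color_t resolve_zero_as_default(color_t c) {
    if (c.r == 0.0f && c.g == 0.0f && c.b == 0.0f && c.a == 0.0f) {
        const color_t white = { 1.0f, 1.0f, 1.0f, 1.0f };
        return white;
    }
    return c;
}

/* One possible workaround (an assumption, not the PR's design): carry an
   explicit flag, so a zeroed struct still means "default" while transparent
   remains expressible. */
typedef struct {
    color_t color;
    bool has_color;
} tint_t;

static color_t resolve_tint(tint_t t) {
    const color_t white = { 1.0f, 1.0f, 1.0f, 1.0f };
    return t.has_color ? t.color : white;
}
```

The trade-off of the flag variant is a slightly noisier API, since callers who want a tint have to set two fields instead of one.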

lithiumtoast commented Jul 24, 2021

Btw, trying to do this myself with sokol-shdc. GLSL uses column major so it needs to be a mat2x3 not a mat3x2.

EDIT: OpenGL and DirectX use the same memory layout, however, so forget about row/column-major order after that.

Here's an image visually explaining the affine transforms possible for 2D. Notice the greyed-out last row, which is always the same. This is why a 3x2 matrix is a more compact representation for 2D than a full 3x3.

nyalloc commented Jul 24, 2021

It shouldn't matter if sokol-shdc struggles with 3x2 matrices, the library won't need them for the shaders. If I introduce a 3x2 matrix stack to this, then that multiplication will be done CPU-side when generating the vertex buffer. This is fine, because the 3x2 matrix multiplication is really trivial. NoelFB's batcher demonstrates this pretty well.
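
A minimal sketch of what that CPU-side step could look like in C is shown below. The `mat3x2_t` and `vertex_t` types, the `emit_quad` function, the corner order, and the row-vector convention are all assumptions for illustration, not the PR's implementation.

```c
/* Hypothetical 3x2 affine matrix (row-vector convention) and vertex layout. */
typedef struct { float m11, m12, m21, m22, m31, m32; } mat3x2_t;
typedef struct { float x, y, u, v; } vertex_t;

/* Write the four corners of a width x height sprite, transformed by m, into
   out[0..3]. Corner order: top-left, top-right, bottom-right, bottom-left
   (with y pointing down). Texture coordinates cover the whole texture. */
static void emit_quad(const mat3x2_t* m, float width, float height,
                      vertex_t out[4]) {
    static const float corners[4][2] = {
        { 0.0f, 0.0f }, { 1.0f, 0.0f }, { 1.0f, 1.0f }, { 0.0f, 1.0f }
    };
    for (int i = 0; i < 4; i++) {
        const float x = corners[i][0] * width;
        const float y = corners[i][1] * height;
        /* Four multiplies and four adds per corner: cheap to do per sprite. */
        out[i].x = x * m->m11 + y * m->m21 + m->m31;
        out[i].y = x * m->m12 + y * m->m22 + m->m32;
        out[i].u = corners[i][0];
        out[i].v = corners[i][1];
    }
}
```

Because the transform is applied while filling the vertex buffer, the shader only ever sees plain 2D positions and never needs a 3x2 matrix input.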

lithiumtoast commented Jul 24, 2021

Ah, I was going for doing it all in the vertex shader:

...
layout(location = 0) in mat2x3 vs_Matrix3x2; // SG_VERTEXFORMAT_FLOAT3
// location = 1 is used by the 3x2 matrix // SG_VERTEXFORMAT_FLOAT3
layout(location = 2) in vec4 vs_SourceRectangle; // SG_VERTEXFORMAT_USHORT4N
...
const vec2 corners[4] =
{
    vec2(0, 0),
    vec2(0, 1),
    vec2(1, 1),
    vec2(1, 0)
};
...
mat4x4 modelMatrix = mat4x4(
        vec4(vs_Matrix3x2[0][0], vs_Matrix3x2[0][1], 0, 0),
        vec4(vs_Matrix3x2[0][2], vs_Matrix3x2[1][0], 0, 0),
        vec4(0, 0, 1, 0),
        vec4(vs_Matrix3x2[1][1], vs_Matrix3x2[1][2], 0, 1)
    );
...
gl_Position = ViewProjectionMatrix * modelMatrix * vec4(corners[gl_VertexIndex] * vs_SourceRectangle.zw * 65535, 0, 1);    
...

nyalloc commented Jul 24, 2021

That is neat. The vertex buffer takes the form of an array of matrices that define the sprite dimensions / transform? I wonder how you would handle texture coordinates or other per-vertex state.

Right now it's not clear to me whether there would be a meaningful performance benefit to this approach, so I'm not going to change course. A worthwhile experiment though. Let me know how it goes.

lithiumtoast commented Jul 25, 2021

The vertex buffer takes the form of an array of matrices that define the sprite dimensions / transform?

By using instancing, the single vertex buffer is the "sprite batch items"; each vertex is the instance data for the textured quad.

I wonder how you would handle texture coordinates

It's done by using the source rectangle position and size along with a uniform for the texture size. However, the consequence is that sprite batching would have to be fragmented by texture. Thus, using a different texture requires stopping the current batch in process and starting a new batch. For me, this is acceptable because I would be using a texture atlas.
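
The texture-coordinate calculation described here can be written out on the CPU as a small C sketch. The function name and parameter names are hypothetical; corners are given in [0, 1] with a top-left origin, and the source rectangle is in texels.

```c
#include <stdint.h>

typedef struct { float u, v; } uv_t;

/* Compute the texture coordinate of one quad corner from the sprite's source
   rectangle (in texels) and the texture size (in texels).
   corner_u / corner_v are 0 or 1, with (0, 0) being the top-left corner. */
static uv_t sprite_corner_uv(uint16_t src_x, uint16_t src_y,
                             uint16_t src_w, uint16_t src_h,
                             float corner_u, float corner_v,
                             float tex_w, float tex_h) {
    uv_t uv;
    uv.u = ((float)src_x + corner_u * (float)src_w) / tex_w;
    uv.v = ((float)src_y + corner_v * (float)src_h) / tex_h;
    return uv;
}
```

Since the texture size is a uniform in this scheme, every sprite in one draw call has to come from the same texture, which is the fragmentation mentioned above.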

Here's the whole shader program I'm using with sokol-shdc. Note that I'm using bottom-left as origin for model/world positions. What's really nice is that by moving everything into the vertex shader, the code for the CPU is dead simple as just filling data into the "sprite batch items" array and then uploading it to the GPU via the vertex buffer.

@vs vs

layout(binding=0) uniform MatrixData
{
    uniform mat4x4 ViewProjectionMatrix; // [View-to-Projection]x[World-to-View]
}; 

layout(binding=1) uniform TextureData
{
    uniform vec2 TextureSize;
}; 

layout(location = 0) in mat2x3 vs_matrix3x2; // GLSL uses column-major meaning `a`x`b` is `a` columns by `b` rows. However, people usually think of a matrix as `a` rows by `b` columns when they say `a`x`b`.
// location = 1 is also used by the 3x2 matrix: sokol vertex attributes can only be vectors, so the matrix is passed as 2 x vec3.
layout(location = 2) in vec4 vs_source_rectangle;
layout(location = 3) in vec4 vs_color_tint;
layout(location = 4) in float vs_depth;

out vec4 fs_color_tint;
out vec2 fs_uv;

const vec2 corners[4] =
{
    /* Rectangle from two clockwise triangles:
        (0, 1, 2) and (0, 2, 3)
            1---------2
            |       / |
            |     /   |
            |   /     |
            | /       |
            0---------3
        below are the local space corner positions of the rectangle
        note that by using [-1, 1] the origin is (0, 0)
        this makes it so rotation by default is around the centre of the rectangle
     */
    vec2(-1, -1), // bottom-left
    vec2(-1, +1), // top-left
    vec2(+1, +1), // top-right
    vec2(+1, -1)  // bottom-right
};

void main()
{
    /*  the 3x2 matrix has the data layout: m11, m12, m21, m22, m31, m32
        however, they are copied from CPU to GPU like so into vectors
        ----0---1---2-
        0| m11 m12 m21
        1| m22 m31 m32
        we actually want to use the 3x2 matrix as a 3x3 matrix like so (column-major order)
        ----0---1---2-
        0| m11 m12  0
        1| m21 m22  0
        2| m31 m32  1
        so that we can build the 4x4 matrix from that 3x3 matrix like so (column-major order)
        ----0---1---2---3-
        0| m11 m12  0   0
        1| m21 m22  0   0
        2|  0   0   1   0
        3| m31 m32  0   1
        thus we have the following matrix in the code below for mapping between the blitted data
        with GLSL we index matrices by column then row starting at 0 not 1, e.g. m12 = m[1][0]
        ----------------------------------
        | d[0][0] d[0][1] 0______ 0______
        | d[0][2] d[1][0] 0______ 0______
        | 0______ 0______ 1______ 0______
        | d[1][1] d[1][2] 0______ 1______
    */

    mat4x4 model_matrix = mat4x4(
        vec4(vs_matrix3x2[0][0], vs_matrix3x2[0][1], 0, 0),
        vec4(vs_matrix3x2[0][2], vs_matrix3x2[1][0], 0, 0),
        vec4(0, 0, 1, 0),
        vec4(vs_matrix3x2[1][1], vs_matrix3x2[1][2], 0, 1)
    );

    // vs_source_rectangle is the subtexture region to use
    // however, when the values are fetched they are normalized from unsigned 16-bit values into floats in the range [0, 1]
    // thus, we need to multiply by 65535 to recover the original unsigned 16-bit values
    // the reason for using normalized values instead of plain integers is compatibility across all graphics APIs
    vec2 src_size = vs_source_rectangle.zw * 65535;
    vec2 src_position = vs_source_rectangle.xy * 65535;

    // the corner position of the rectangle in local space
    vec2 corner = corners[gl_VertexIndex];
    // the vertex position in traditional sprite rendering
    vec4 position = vec4(corner * src_size, vs_depth, 1);
    // by convention we are using column-major matrices, thus the matrix multiplication is post-multiplication:
    // clip position = [View-To-Projection matrix]x[World-to-View matrix]x[Model-to-World matrix]*[vertex position]
    gl_Position = ViewProjectionMatrix * model_matrix * position;

    // corner is remapped from [-1, 1] to [0, 1] to be in texture coordinates space, bottom-left is origin
    vec2 corner_uv = corner * 0.5 + 0.5;
    // however, we want texture coordinates to use top-left origin because that's how people traditionally think of pixel space of an image
    // so we need to flip the Y-axis
    corner_uv.y = 1 - corner_uv.y;
    
    // calculate and pass texture coordinates to the fragment shader
    fs_uv = (src_position + corner_uv * src_size) / TextureSize;

    // pass the color which to tint the sprite to the fragment shader
    fs_color_tint = vs_color_tint;
}

@end

@fs fs

uniform sampler2D Texture;

layout(location = 0) in vec4 fs_color_tint;
layout(location = 1) in vec2 fs_uv;

out vec4 color;

void main()
{
    vec4 color_texture = texture(Texture, fs_uv);
    color = color_texture * fs_color_tint;
}

@end

@program spritebatch vs fs
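
For the CPU side, a per-instance "sprite batch item" matching the vertex inputs above might look like the following C sketch. The struct and function names are hypothetical, and storing the color tint as four floats is an assumption, since the shader only shows it as a vec4.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-instance data matching the shader's vertex inputs. */
typedef struct {
    float matrix[6];      /* m11, m12, m21, m22, m31, m32: locations 0 and 1 (2 x float3) */
    uint16_t src_rect[4]; /* x, y, width, height in texels: location 2 (normalized ushort4) */
    float color_tint[4];  /* r, g, b, a: location 3 (assumed to be 4 floats) */
    float depth;          /* location 4 */
} sprite_instance_t;

/* Fill one instance with an untransformed, untinted sprite at depth 0. */
static sprite_instance_t sprite_instance_make(uint16_t x, uint16_t y,
                                              uint16_t w, uint16_t h) {
    sprite_instance_t s = {
        { 1.0f, 0.0f, 0.0f, 1.0f, 0.0f, 0.0f }, /* identity transform */
        { x, y, w, h },
        { 1.0f, 1.0f, 1.0f, 1.0f },             /* white tint */
        0.0f
    };
    return s;
}
```

With this layout the second float3 of the matrix starts at byte offset 12, the source rectangle at 24, the tint at 32 and the depth at 48, for a stride of 52 bytes. Those offsets are what a matching vertex layout description would need to state.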

the sbatch_pipeline helps users make a pipeline object that is correct and usable for the sbatch rendering API.

floooh left a comment

Some things I stumbled over when testing on macOS, also some minor things in sokol-samples (will comment on those now).

util/sokol_spritebatch.h
return _sbatch_pack_color_bytes(r, g, b, a);
}

static int _sg_image_slot_index(uint32_t id) {

Copy-paste bug? (shouldn't this be _sbatch_image_slot_index())

This doesn't appear to be fixed yet?

sg_draw(base_element, num_elements, 1);
}

static void _sbatch_matmul(sbatch_matrix* p, const sbatch_matrix* a, const sbatch_matrix* b) {

The function _sbatch_matmul() doesn't appear to be used anywhere?

Allow sprites to be optionally tilted 45 degrees for depth-based custom pipelines.