OpenGL on N64

This document highlights how to best use OpenGL on Nintendo 64 via the libdragon unstable branch. This branch features an OpenGL 1.1/1.2 implementation that includes several official extensions, plus some N64-specific extensions.

This is not an OpenGL guide. Reading this page requires some previous knowledge of classic OpenGL programming, as it only covers the specific optimizations and tricks required for maximum performance on Nintendo 64.

Implemented extensions

Currently, we implement most of OpenGL 1.1, plus some bits of OpenGL 1.2 (specifically, VBOs). We also implement the following extensions:

GL_ARB_multisample. Use it to activate the RDP/VI antialiasing, via glEnable(GL_MULTISAMPLE_ARB).
GL_EXT_packed_pixels
GL_ARB_vertex_buffer_object
GL_ARB_texture_mirrored_repeat. You can use glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_MIRRORED_REPEAT_ARB) to activate mirrored wrapping.
GL_ARB_texture_non_power_of_two. You can use textures of any size, including sizes which are not powers of two. Note that this is allowed only on clamped textures, because of a hardware limit of the RDP.
GL_ARB_vertex_array_object
GL_ARB_matrix_palette. This can be used to implement rigid skinning.

We also define two N64-specific extensions:

GL_N64_surface_image: this allows defining textures using libdragon's surface_t and sprite_t objects.
GL_N64_RDPQ_interop: this allows mixing OpenGL and the lower-level, native rdpq API in two specific areas (material definition and texturing).
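
As a sketch of how an application might test for one of these extensions at runtime: in standard OpenGL 1.x the supported extensions are reported as a single space-separated string by glGetString(GL_EXTENSIONS). Whether libdragon fills that string in the same way is an assumption here, not something this page confirms. The helper name has_extension is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Return true if `name` appears as a whole, space-delimited token in
// `ext_list`. A plain strstr() is not enough: it would also match a longer
// extension name that merely contains the requested one.
static bool has_extension(const char *ext_list, const char *name)
{
    if (ext_list == NULL || name == NULL)
        return false;

    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p++;   // keep searching after this partial match
    }
    return false;
}
```

In an application, the first argument would typically be the string returned by glGetString(GL_EXTENSIONS), queried once after gl_init().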

Overview of an OpenGL render loop

OpenGL itself does not define a way to initialize the context, and that is instead normally left to other libraries (like SDL, GLUT, etc.). In our case, we use the rest of libdragon itself to configure the context.

This is some pseudo-code to set up a working libdragon context. Please refer to the gldemo examples for a fully working sample.

    // Initialize 320x240x16, triple buffering.
    // NOTE: anti-alias will only be active if activated later in GL.
    display_init(RESOLUTION_320x240, DEPTH_16_BPP, 3, GAMMA_NONE, ANTIALIAS_RESAMPLE_FETCH_ALWAYS);
 
    // Allocate a buffer that will be used as Z-Buffer.
    surface_t zbuffer = surface_alloc(FMT_IA16, 320, 240);

    // Initialize OpenGL
    gl_init();

    // Main loop
    while (1) {
        // Acquire a framebuffer. Wait until it's available.
        surface_t *fb = display_get();

        // Attach RDP to the framebuffer and the Z-Buffer.
        // From now on, the RDP will draw to the specified buffer(s).
        rdpq_attach(fb, &zbuffer);

        // Just as an example, use rdpq to fill half of the screen with the green color.
        // This is just to show that you can now issue rdpq commands.
        rdpq_set_mode_fill(RGBA32(0, 255, 0, 255));
        rdpq_fill_rectangle(0, 0, 320, 120);

        // Enter the OpenGL context. From now on, you can start using OpenGL
        // and you must NOT use rdpq to avoid conflicts.
        gl_context_begin();
        
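        // Fill the other half of the screen, using OpenGL.
        // glScissor takes (x, y, width, height) with the origin in the lower-left
        // corner, and it only restricts glClear while GL_SCISSOR_TEST is enabled.
        glEnable(GL_SCISSOR_TEST);
        glScissor(0, 0, 320, 120);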
        glClearColor(0, 0, 1, 1);
        glClear(GL_COLOR_BUFFER_BIT);
        
        // Close the OpenGL context. You can open/close the context as many times as
        // required. Now that the context is closed, you can call rdpq again.
        gl_context_end();

        // Detach RDP from the current attached buffer *and* flip it on the screen
        // as soon as it's ready. This call is non-blocking, so the RSP/RDP might continue
        // processing the issued commands in background.
        rdpq_detach_show();
    }

How to efficiently draw a mesh

Object allocations

Classic OpenGL manages several "objects" like textures, vertex arrays, display lists, etc. These objects are referenced by IDs (of type GLuint) which are normally allocated via a family of glGen* functions. For instance, to create a vertex array, this is how it is normally done:

    GLuint sphere_array;

    // Generate one vertex array ID into sphere_array.
    glGenVertexArrays(1, &sphere_array);
    
    // Bind the vertex array (= make it "current")
    glBindVertexArray(sphere_array);

    // Proceed to configure it
    [...]

This is the standard code, which works on Nintendo 64 as well. Unfortunately, the OpenGL 1.1/1.2 specifications also allow using manually-allocated IDs for textures and display lists (as in OpenGL 1.0, when the glGen* functions did not exist yet):

    // Choose any ID I want
    GLuint texture_wall = 0x1234;

    // Bind the texture
    glBindTexture(GL_TEXTURE_2D, texture_wall);

    // Load the texture image
    glTexImage2D(...);

In other words, an application can have its own ID allocation mechanism. For instance, GLQuake has its own texture ID generation system, which is simply:

    int texture_id = 1;  // next ID to allocate

    int generate_texture_id(void) {
        return texture_id++;
    }

We decided not to support this style of ID self-allocation on Nintendo 64. To support it, we would have to slow down the implementation by adding hash table lookups to each OpenGL function that references an ID.

If you try to use an ID that was not allocated via the glGen* functions, you will get an error screen.
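
When porting code that allocates its own IDs, one possible workaround is a small translation table kept on the application side, so that the legacy IDs never reach OpenGL. The following is only a sketch under stated assumptions: all names (legacy_id_map_t, legacy_id_lookup, demo_generate, LEGACY_MAX_IDS) are hypothetical, tex_id_t stands in for GLuint, and the generator is passed as a function pointer so the snippet compiles without the GL headers. In a real port the generator would wrap glGenTextures(1, &id).

```c
#include <stddef.h>

#define LEGACY_MAX_IDS 256   // hypothetical upper bound on application-chosen IDs

typedef unsigned int tex_id_t;                 // stands in for GLuint
typedef tex_id_t (*tex_id_generator_t)(void);  // e.g. a wrapper around glGenTextures

// Maps application-chosen IDs (GLQuake style: 1, 2, 3, ...) to IDs produced by
// a generator. A slot value of 0 means "not allocated yet".
typedef struct {
    tex_id_t real_id[LEGACY_MAX_IDS];
    tex_id_generator_t generate;
} legacy_id_map_t;

static void legacy_id_map_init(legacy_id_map_t *map, tex_id_generator_t generate)
{
    for (size_t i = 0; i < LEGACY_MAX_IDS; i++)
        map->real_id[i] = 0;
    map->generate = generate;
}

// Translate a legacy ID into a generated one, allocating it on first use.
// Returns 0 if the legacy ID does not fit in the table.
static tex_id_t legacy_id_lookup(legacy_id_map_t *map, unsigned int legacy_id)
{
    if (legacy_id >= LEGACY_MAX_IDS)
        return 0;
    if (map->real_id[legacy_id] == 0)
        map->real_id[legacy_id] = map->generate();
    return map->real_id[legacy_id];
}

// Demonstration generator that hands out 100, 101, ... It exists only so the
// sketch can be exercised without OpenGL.
static tex_id_t demo_generate(void)
{
    static tex_id_t next = 100;
    return next++;
}
```

With such a table, existing calls like glBindTexture(GL_TEXTURE_2D, id) would become glBindTexture(GL_TEXTURE_2D, legacy_id_lookup(&map, id)). The lookup is a plain array access paid only by the ported application, rather than a hash lookup inside every OpenGL call.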

Texture limits

N64 supports both square and rectangular textures. Width and height can be of any size when the texture is clamped. When using wrapping, instead, the texture size must be a power of two in the direction(s) in which wrapping is active.

In general, each texture (including all mipmaps, if present) must fit into TMEM, which is only 4096 bytes. This table shows the limits for textures without mipmaps:

Format   Limit (square)   Limit (texels)   Description
RGBA16   44x44            2048             16-bit texels, only 1 bit of alpha.
RGBA32   32x32            1024             32-bit texels
CI8      44x44            2048             256 colors, with palette
CI4      64x64            4096             16 colors, with palette
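
The rules above can be turned into a quick pre-flight check for assets. This is a minimal sketch, not libdragon code: the enum and function names are hypothetical, and the per-format limits are copied from the "Limit (texels)" column. It only counts texels, so it ignores any row alignment inside TMEM and does not account for mipmaps; a texture very close to the limit may pass this check and still be rejected.

```c
#include <stdbool.h>

// Texture formats listed in the table (names are local to this sketch).
typedef enum { TEX_RGBA16, TEX_RGBA32, TEX_CI8, TEX_CI4 } tex_format_t;

// Maximum texel count per format, taken from the "Limit (texels)" column.
static int max_texels(tex_format_t fmt)
{
    switch (fmt) {
    case TEX_RGBA16: return 2048;
    case TEX_RGBA32: return 1024;
    case TEX_CI8:    return 2048;
    case TEX_CI4:    return 4096;
    }
    return 0;
}

static bool is_power_of_two(int n)
{
    return n > 0 && (n & (n - 1)) == 0;
}

// First-pass check for a texture without mipmaps.
// wrap_s / wrap_t tell whether wrapping is active in that direction; when it
// is, the size in that direction must be a power of two.
static bool texture_size_ok(tex_format_t fmt, int width, int height,
                            bool wrap_s, bool wrap_t)
{
    if (width <= 0 || height <= 0)
        return false;
    if (wrap_s && !is_power_of_two(width))
        return false;
    if (wrap_t && !is_power_of_two(height))
        return false;
    return width * height <= max_texels(fmt);
}
```

For example, a clamped 40x30 RGBA16 texture passes the check, while the same texture with horizontal wrapping does not, because 40 is not a power of two.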

Texture loading

In classic OpenGL, textures are managed via "texture objects":

At load time, you allocate one ID for each texture that you plan to use, then bind it (to make it current) and load the graphics for it, typically via glTexImage2D. This is also a good moment to configure texture attributes (like filtering or wrapping behavior).
At run time, you bind the texture again and draw triangles using it.

OpenGL was designed for an architecture where the GPU had its own video memory. Thus, glTexImage2D is defined to take a copy of the texture pixels. The idea is that the OpenGL implementation will copy the texture pixels into the GPU VRAM, and the CPU is then free to release the buffer right away.

Nintendo 64 has a UMA architecture, so there is no concept of video RAM reserved for the RDP. There is indeed a TMEM (texture memory), but it is more similar to a texture cache: it can contain just one texture and is basically the intermediate buffer into which a texture is loaded immediately before drawing it. Thus, textures must reside in RDRAM. Implementing the actual glTexImage2D semantics is indeed possible (and we did it, to simplify porting) but it is wasteful, because OpenGL has to allocate a new buffer and copy the texture pixels.

To implement more lightweight semantics, taking advantage of the UMA architecture, we introduced an extension that comprises two new functions: glSurfaceTexImageN64() and glSpriteTextureN64().
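
The difference between the two approaches can be illustrated in plain C, independently of OpenGL. This is only a conceptual sketch with hypothetical types (copied_texture_t, borrowed_texture_t); it does not reflect how libdragon stores textures internally.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Copy semantics (glTexImage2D style): the texture owns a private duplicate,
// so the caller may release or reuse the source buffer right away.
typedef struct {
    uint8_t *pixels;   // heap copy, released by copied_texture_free()
    size_t size;
} copied_texture_t;

// Reference semantics (glSurfaceTexImageN64 style): the texture only remembers
// where the caller's pixels live. No allocation, no copy.
typedef struct {
    const uint8_t *pixels;   // borrowed: must outlive the texture
    size_t size;
} borrowed_texture_t;

// Returns 0 on success, -1 if the copy could not be allocated.
static int copied_texture_init(copied_texture_t *tex, const uint8_t *src, size_t size)
{
    tex->pixels = malloc(size);
    if (tex->pixels == NULL)
        return -1;
    memcpy(tex->pixels, src, size);
    tex->size = size;
    return 0;
}

static void copied_texture_free(copied_texture_t *tex)
{
    free(tex->pixels);
    tex->pixels = NULL;
    tex->size = 0;
}

static void borrowed_texture_init(borrowed_texture_t *tex, const uint8_t *src, size_t size)
{
    tex->pixels = src;
    tex->size = size;
}
```

The copy costs memory and time at load, but decouples the lifetimes. The borrowed version is free to create, at the price that the caller must keep the pixel buffer alive and unchanged for as long as the texture is in use.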

`glSpriteTextureN64()`

    void glSpriteTextureN64(GLenum target, sprite_t *sprite, rdpq_texparms_t *texparms);

This is the highest level texture creation function, and the easiest to use. It uses a sprite_t which is the object created by loading a .sprite file, the native N64 image format generated by the mksprite tool. The easiest pipeline to import a texture from an image file is thus:

Prepare your texture in PNG format. Make sure it follows the limits described above in terms of texture size.
Convert your PNG texture into .sprite using mksprite. This can be tested manually by running mksprite, but it is normally run as part of the build system via the Makefile. mksprite supports automatic mipmap creation and color format conversion (e.g. it will quantize images to create a palettized version if asked to do so).
Load the sprite from ROM using sprite_load. This will allocate a sprite_t object.
Configure the OpenGL texture object specifying the sprite object via glSpriteTextureN64().

For instance, this is how to manually convert a texture to a .sprite:

    $ $N64_INST/bin/mksprite --verbose --compress --mipmap BOX --format CI4 circle0.png
    Converting: circle0.png -> ./circle0.sprite [fmt=CI4 tiles=0,0 mipmap=BOX dither=NONE]
    loading image: circle0.png
    loaded circle0.png (32x32, LCT_RGBA)
    mipmap: generated 16x16
    mipmap: generated 8x8
    mipmap: generated 4x4
    quantizing image(s) to 16 colors
    auto detected hslices: 2 (w=32/16)
    auto detected vslices: 2 (w=32/16)
    compressed: ./circle0.sprite (848 -> 280, ratio 33.0%)

In this case, we started from an RGBA PNG, and we asked to convert it to CI4 (16 colors with palette), generate mipmaps, and also compress the resulting file using libdragon's built-in compression support.

Then, at runtime, we can load the texture like this:

    // Load the sprite from ROM (decompressing it transparently if it is compressed)
    sprite_t *circle = sprite_load("rom:/circle0.sprite");
    
    // Allocate texture object ID
    GLuint tcircle;
    glGenTextures(1, &tcircle);

    // Configure the texture, including all mipmaps
    glBindTexture(GL_TEXTURE_2D, tcircle);
    glSpriteTextureN64(GL_TEXTURE_2D, circle, NULL);

See below for the usage of the third parameter (rdpq_texparms_t*) to configure the texture sampler.

`glSurfaceTexImageN64()`

    void glSurfaceTexImageN64(GLenum target, GLint level, surface_t *surface, rdpq_texparms_t *texparms);

While glSpriteTextureN64 configures the whole texture object in one go (all images for all mipmaps), glSurfaceTexImageN64 is more similar to glTexImage2D and configures one image at a time. Thus, it is a lower-level function which is probably more useful when porting existing code bases that do not use .sprite files.

These are the main differences compared to glTexImage2D:

The input buffer is passed as a surface_t, which is libdragon's data structure to define memory buffers used to store images.
No memory copy is performed, and there is no change of ownership. OpenGL expects the surface_t passed in to stay available at runtime. It is the responsibility of the caller not to dispose of the surface_t as long as the texture object is being used.
The function also accepts an optional rdpq_texparms_t structure, which can be used to configure the texture sampler parameters. See below for more information.

This function must be called once for each mipmap level, specifying the mipmap level in the level parameter.

Texture sampler

The RDP has a very peculiar and advanced texture sampler that is capable of effects not commonly found in other GPUs. For instance, it is possible to add a translation and a scale to all texture coordinates while sampling (similar to applying a texture matrix), and the sampler can be configured to first wrap (a finite, possibly fractional number of times) and then clamp. For example, you can request a texture to repeat two and a half times horizontally and then clamp the last pixel indefinitely.

To access all these features, both glSpriteTextureN64 and glSurfaceTexImageN64 also accept an optional rdpq_texparms_t structure. This structure is defined in rdpq (libdragon's native RDP library, upon which OpenGL is built), and exposes all the sampler functionalities:

    typedef struct rdpq_texparms_s {
        int tmem_addr;           ///< TMEM address where to load the texture (default: 0)
        int palette;             ///< Palette number where TLUT is stored (used only for CI4 textures)

        struct {
            float   translate;    ///< Translation of the texture (in pixels)
            int     scale_log;    ///< Power of 2 scale modifier of the texture (default: 0). Eg: -2 = make the texture 4 times smaller

            float   repeats;      ///< Number of repetitions before the texture clamps (default: 1). Use #REPEAT_INFINITE for infinite repetitions (wrapping)
            bool    mirror;       ///< Repetition mode (default: MIRROR_NONE). If true (MIRROR_REPEAT), the texture mirrors at each repetition
        } s, t; // S/T directions of texture parameters

    } rdpq_texparms_t;

The first two fields are for very specific cases and can be generally ignored when using OpenGL (leaving them to 0). The sampler parameters can be specified for both s and t (horizontal and vertical). This is an example of usage:

    glSpriteTextureN64(GL_TEXTURE_2D, sprite, &(rdpq_texparms_t){
        .s.translate = 8, .s.repeats = 2.5,
        .t.repeats = REPEAT_INFINITE, .t.mirror = true,
    });

In this case, we are configuring the s coordinate (horizontal) to repeat two and a half times before starting to clamp. Moreover, the texture will be translated by 8 texels to the left (just as if 8 were added to the s coordinate of every vertex). On the t coordinate (vertical), instead, the texture will repeat an infinite number of times, mirroring at each repetition.

Note the use of C99 designated initializers, which are more readable and guarantee that all other fields are set to 0 (which is designed to be a good default for all the fields).
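
To make the combination of translation, finite repetition, clamping and mirroring more concrete, here is a simplified integer model of how a coordinate along one axis could be resolved to a texel. It is a conceptual sketch only: it does not reproduce the RDP's fixed-point arithmetic or filtering, the function names are hypothetical, and MODEL_REPEAT_INFINITE is a local stand-in for REPEAT_INFINITE. The repetition count is expressed in texels (2.5 repeats of a 32-texel texture = 80) to keep the model in integers.

```c
#include <stdbool.h>

#define MODEL_REPEAT_INFINITE (-1)   // local stand-in for REPEAT_INFINITE

// Division and modulo that round towards negative infinity (b must be > 0),
// so that negative coordinates wrap consistently.
static int floor_div(int a, int b)
{
    int q = a / b;
    if ((a % b) != 0 && a < 0)
        q--;
    return q;
}

static int floor_mod(int a, int b)
{
    return a - floor_div(a, b) * b;
}

// Resolve a texel coordinate along one axis.
//   coord          texel coordinate coming from the vertex data
//   size           texture size along this axis, in texels
//   translate      texels added to the coordinate before sampling
//   repeat_texels  texels covered before clamping starts, or
//                  MODEL_REPEAT_INFINITE for endless wrapping
//   mirror         flip every other repetition
static int resolve_texel(int coord, int size, int translate,
                         int repeat_texels, bool mirror)
{
    int x = coord + translate;

    if (repeat_texels != MODEL_REPEAT_INFINITE) {
        // Finite repetitions: clamp to the covered range.
        if (x < 0)
            x = 0;
        if (x > repeat_texels - 1)
            x = repeat_texels - 1;
    }

    int repetition = floor_div(x, size);
    int pos = floor_mod(x, size);
    if (mirror && (repetition & 1))
        pos = size - 1 - pos;
    return pos;
}
```

With a 32-texel texture, a translation of 8 and 80 texels of repetition, coordinate 0 resolves to texel 8, and any coordinate far to the right resolves to texel 15, which is the last texel of the half repetition.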

NOTE: if you use glSurfaceTexImageN64 to upload mipmaps one by one, and provide your own rdpq_texparms_t parameters, make sure to also update the scaling factor (fields .s.scale_log and .t.scale_log), increasing it by 1 for each subsequent level.
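
The bookkeeping described in this note can be sketched as a small planning helper that lists, for each mipmap level, the image size and the scale_log value to use. The names (mip_level_t, build_mip_chain, mip_chain_texels) are hypothetical, and halving both dimensions at every level is an assumption that matches the 32x32, 16x16, 8x8, 4x4 sequence in the mksprite log above.

```c
#include <stddef.h>

typedef struct {
    int width;
    int height;
    int scale_log;   // value for both .s.scale_log and .t.scale_log
} mip_level_t;

// Fill `levels` with up to `max_levels` entries, halving the size at each
// step and stopping when a dimension would drop below 1 texel.
// Index i of the array corresponds to mipmap level i.
// Returns the number of levels written.
static size_t build_mip_chain(int width, int height, int base_scale_log,
                              mip_level_t *levels, size_t max_levels)
{
    size_t count = 0;
    while (count < max_levels && width >= 1 && height >= 1) {
        levels[count].width = width;
        levels[count].height = height;
        levels[count].scale_log = base_scale_log + (int)count;
        count++;
        width /= 2;
        height /= 2;
    }
    return count;
}

// Total texel count of the chain, useful to compare against the TMEM budget,
// since all mipmaps must fit into TMEM together.
static int mip_chain_texels(const mip_level_t *levels, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += levels[i].width * levels[i].height;
    return total;
}
```

Each entry would then drive one glSurfaceTexImageN64 call, passing the array index as level and a rdpq_texparms_t whose .s.scale_log and .t.scale_log are set from the entry.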

Texture sampler via mksprite

As an alternative to providing the texture sampler parameters in code, it is possible to embed the default sampler parameters in a .sprite file via mksprite. This is the relevant excerpt of the help:

    Sampling flags:
       --texparms <x,s,r,m>          Sampling parameters:
                                     x=translation, s=scale, r=repetitions, m=mirror
       --texparms <x,x,s,s,r,r,m,m>  Sampling parameters (different for S/T)

The four parameters x,s,r,m correspond respectively to the fields translate, scale_log, repeats, mirror of the rdpq_texparms_t structure. There are two accepted syntaxes: the first one can be used when the sampler must be configured in the same way both horizontally and vertically; the second can be used when the configuration is different.

This example embeds within the sprite file the same configuration shown in the above example:

    $ $N64_INST/bin/mksprite --texparms 8,0,0,0,2.5,inf,0,1

In fact, from left to right:

8,0: translation parameters (8 for s, 0 for t)
0,0: scale parameters. Being the log2, this means a scale factor of 1 (= no scale).
2.5,inf: repetitions. The texture will repeat 2.5 times horizontally, and infinite times vertically.
0,1: mirror. The texture will mirror vertically, and repeat normally horizontally.
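
The mapping between the comma-separated string and the per-axis fields can be sketched with a small parser. This is not mksprite's actual parsing code: the type axis_parms_t and the function parse_texparms are hypothetical, and the sketch relies on the C99 strtod() behavior of accepting "inf" to represent infinite repetitions.

```c
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

// Sampler parameters for one axis, mirroring the fields described above.
typedef struct {
    double translate;
    int scale_log;
    double repeats;    // INFINITY when the string says "inf"
    bool mirror;
} axis_parms_t;

// Parse "x,s,r,m" (same values for both axes) or "x,x,s,s,r,r,m,m"
// (S value first, then T value, for each parameter).
// Returns 0 on success, -1 if the string is not exactly 4 or 8 numbers.
static int parse_texparms(const char *text, axis_parms_t *s, axis_parms_t *t)
{
    double v[8];
    int n = 0;
    const char *p = text;

    while (n < 8) {
        char *end;
        v[n] = strtod(p, &end);
        if (end == p)
            return -1;              // not a number
        n++;
        p = end;
        if (*p != ',')
            break;
        p++;                        // skip the comma
    }
    if (*p != '\0' || (n != 4 && n != 8))
        return -1;

    // With 4 values, each one applies to both axes.
    int step = (n == 8) ? 2 : 1;
    int t_off = (n == 8) ? 1 : 0;

    if (!isfinite(v[step]) || !isfinite(v[step + t_off]))
        return -1;                  // the scale must be a finite number

    s->translate = v[0];              t->translate = v[t_off];
    s->scale_log = (int)v[step];      t->scale_log = (int)v[step + t_off];
    s->repeats   = v[2 * step];       t->repeats   = v[2 * step + t_off];
    s->mirror    = v[3 * step] != 0;  t->mirror    = v[3 * step + t_off] != 0;
    return 0;
}
```

Parsing the string from the example, "8,0,0,0,2.5,inf,0,1", yields a translation of 8 and 2.5 repetitions for s, and infinite mirrored repetitions for t.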

To use the values embedded in the sprite, just pass NULL to glSpriteTextureN64:

   glSpriteTextureN64(GL_TEXTURE_2D, sprite, NULL);

In fact, the semantics of passing NULL to glSpriteTextureN64 are as follows:

If the sprite contains embedded parameters, use those.
Otherwise, if the user called glTexParameteri with values GL_TEXTURE_WRAP_S/GL_TEXTURE_WRAP_T on the texture object before calling glSpriteTextureN64, use those configurations.
Otherwise, fall back to the OpenGL default, which is infinite non-mirrored wrapping on both axes.
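
The lookup order above can be summarized as a small decision function. The names are hypothetical, and the first case (a non-NULL rdpq_texparms_t taking precedence over everything else) is an assumption inferred from the text rather than something stated explicitly.

```c
#include <stdbool.h>

typedef enum {
    PARMS_FROM_CALLER,        // non-NULL rdpq_texparms_t passed by the caller (assumed)
    PARMS_FROM_SPRITE,        // embedded in the .sprite file by mksprite
    PARMS_FROM_TEXPARAMETER,  // glTexParameteri(GL_TEXTURE_WRAP_S/T) set earlier
    PARMS_GL_DEFAULT          // infinite, non-mirrored wrapping on both axes
} parms_source_t;

// Decide where the sampler parameters come from, in priority order.
static parms_source_t pick_parms_source(bool caller_passed_parms,
                                        bool sprite_has_embedded_parms,
                                        bool wrap_set_via_texparameter)
{
    if (caller_passed_parms)
        return PARMS_FROM_CALLER;
    if (sprite_has_embedded_parms)
        return PARMS_FROM_SPRITE;
    if (wrap_set_via_texparameter)
        return PARMS_FROM_TEXPARAMETER;
    return PARMS_GL_DEFAULT;
}
```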
