hgiVk

Prototype Vulkan backend for Pixar USD Hydra Graphics Interface (HGI).

[Image: kitchen scene running in Vulkan; face normals, no subdivision]

If you wish to run the test application, you can download it here: https://github.com/lumonix/hgiVk/releases

⚠️ These are debug builds with Vulkan validation layers enabled. They run significantly slower than release builds and will assert on errors. There are known bugs and limitations: for example, you might experience flickering or tearing, and there are missing barriers between UI-thread reads and Hydra-thread writes.


The ImGui-based test application uses a custom, minimal HdRenderDelegate. The end goal is to be able to run Storm (HdSt) with Vulkan enabled.

The render delegate focuses on two parts of Hydra:

  • Resource Sync
  • Render Pass Execute

There is a third phase managed by the application:

  • EndFrame

During EndFrame we submit all recorded GPU work and proceed to the next CPU frame.

[Image: Nsight capture showing UI and Hydra rendering via the same device]

Resource Sync

When Hydra wants to create, destroy, or update prims, such as a mesh, it does so during Sync. The sync of many prims runs in parallel via tbb parallel_for loops.

HgiVk can record these resource changes immediately, in parallel and lock-free. It manages this via HgiVkCommandBufferManager (CBM), which ensures there is a command buffer for each thread. CBM tracks the per-thread assignment via thread-local storage (TLS); each frame it gives each thread exclusive access to a command buffer.

During EndFrame the command buffers of each thread are submitted to the queue.
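A minimal sketch of that idea, with hypothetical names (the real HgiVkCommandBufferManager differs and also resets the per-thread assignment each frame):

```cpp
// Hypothetical sketch of per-thread command buffer lookup via TLS;
// names are illustrative, not the actual HgiVkCommandBufferManager API.
// Assumes a single manager instance; the per-frame reset is handled elsewhere.
#include <vulkan/vulkan.h>
#include <atomic>
#include <cassert>
#include <vector>

class CommandBufferManager {
public:
    // 'cmdBufs' holds one pre-allocated command buffer per thread that
    // may participate in Sync this frame.
    explicit CommandBufferManager(std::vector<VkCommandBuffer> cmdBufs)
        : _cmdBufs(std::move(cmdBufs)) {}

    // Each thread claims a slot once (lock-free) and then always gets
    // the same command buffer back, so it can record without locking.
    VkCommandBuffer AcquireForThisThread() {
        thread_local int slot = -1;            // lives in TLS
        if (slot < 0) {
            slot = _nextSlot.fetch_add(1);
        }
        assert(slot < static_cast<int>(_cmdBufs.size()));
        return _cmdBufs[slot];
    }

private:
    std::vector<VkCommandBuffer> _cmdBufs;
    std::atomic<int> _nextSlot{0};
};
```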

Note that new resources are allocated via AMD's VulkanMemoryAllocator, which may not be lock-free.
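For reference, an allocation through VulkanMemoryAllocator looks roughly like this; this is standard VMA usage, not hgiVk code, and the surrounding function and parameter names are illustrative:

```cpp
// Standard VulkanMemoryAllocator usage for a device-local vertex buffer.
// 'allocator' is a VmaAllocator created once for the device.
#include <vk_mem_alloc.h>

bool CreateVertexBuffer(VmaAllocator allocator,
                        VkDeviceSize sizeInBytes,
                        VkBuffer* outBuffer,
                        VmaAllocation* outAllocation)
{
    VkBufferCreateInfo bufferInfo = {};
    bufferInfo.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
    bufferInfo.size = sizeInBytes;
    bufferInfo.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT |
                       VK_BUFFER_USAGE_TRANSFER_DST_BIT;
    bufferInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;

    VmaAllocationCreateInfo allocInfo = {};
    allocInfo.usage = VMA_MEMORY_USAGE_GPU_ONLY;

    // vmaCreateBuffer may take internal locks, which is why this path
    // is not guaranteed to be lock-free.
    return vmaCreateBuffer(allocator, &bufferInfo, &allocInfo,
                           outBuffer, outAllocation, nullptr) == VK_SUCCESS;
}
```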

Deletion of resources is managed via HgiVkGarbageCollector; more on that under EndFrame.

⚠️ It is unclear to me if having a command buffer for every thread is the right approach. With 16 cores that means 16 command buffers per frame, and if we do parallel encoding, another 16 secondary command buffers on top. Since Hydra uses tbb parallel_for (with grainSize=1) we have no direct control over how wide it goes. If this ends up being an unfavorable number of command buffers, we can always take the approach Storm currently takes: during sync it collects the resource changes (in a tbb concurrent queue) but does not record GPU commands. The GPU commands (OpenGL) are then generated at the start of RenderPass execute, where we can control exactly how many threads to use.
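A rough sketch of that deferred approach, with hypothetical types (Storm's actual bookkeeping differs):

```cpp
// Hypothetical sketch of the deferred approach described above:
// Sync pushes change descriptors into a tbb::concurrent_queue and no
// GPU commands are recorded until RenderPass execute begins.
#include <tbb/concurrent_queue.h>
#include <cstddef>

struct ResourceChange {
    // e.g. which buffer to (re)upload and the new data; illustrative only.
    int         resourceId;
    const void* data;
    std::size_t sizeInBytes;
};

tbb::concurrent_queue<ResourceChange> pendingChanges;

// Called from many threads during Hydra Sync (tbb parallel_for).
void QueueChange(const ResourceChange& change) {
    pendingChanges.push(change);          // thread-safe, no GPU work yet
}

// Called at the start of RenderPass execute, where we decide exactly
// how many threads record the GPU commands.
void FlushChanges() {
    ResourceChange change;
    while (pendingChanges.try_pop(change)) {
        // record the copy/upload commands for 'change' here
    }
}
```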

Render Pass Execute

A render pass in Hydra terms is the beginning (and end) of one frame of rendering. Hydra provides a list of render targets ('AOVs') to fill.

The basic responsibility during execute is to record one or more draw calls for each rprim. This can be single-threaded or multi-threaded; it is entirely up to the RenderDelegate to implement.

In HgiVk we support parallel draw-call recording via HgiVkParallelCommandEncoder (PCE). Similar to resource sync, PCE ensures there is a (draw) command buffer for each thread.

However, Vulkan requires that one 'render pass' (a set of attachments + draw calls) begins and ends in one primary command buffer. To record commands in parallel, Vulkan requires us to use secondary command buffers.

PCE interfaces with HgiVkCommandBufferManager to ensure each draw-call thread has exclusive access to a secondary command buffer, again via TLS. When the PCE finishes rendering, it 'executes' the secondary command buffers into the primary.
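The Vulkan side of that pattern looks roughly like this; these are the plain Vulkan calls, not the hgiVk wrappers, and the function name is illustrative:

```cpp
// Plain-Vulkan outline of parallel draw recording with secondary
// command buffers; hgiVk wraps this differently.
#include <vulkan/vulkan.h>
#include <vector>

void RecordParallelRenderPass(VkCommandBuffer primary,
                              const VkRenderPassBeginInfo& rpBegin,
                              const std::vector<VkCommandBuffer>& secondaries)
{
    // The render pass must begin and end in the primary command buffer.
    // SECONDARY_COMMAND_BUFFERS means draws come only from secondaries.
    vkCmdBeginRenderPass(primary, &rpBegin,
                         VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS);

    // Each secondary was recorded on its own thread with
    // VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT and inheritance
    // info pointing at the same render pass / framebuffer.
    vkCmdExecuteCommands(primary,
                         static_cast<uint32_t>(secondaries.size()),
                         secondaries.data());

    vkCmdEndRenderPass(primary);
}
```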

[Image: toy soldier Apple USDZ asset, showing primId buffer and parallel encoder in RenderDoc]

EndFrame

When all sync and execute work is completed, the application calls Hgi::EndFrame. This ends recording on all (parallel) command buffers and submits them to the Vulkan queue.
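In plain Vulkan terms the submission amounts to roughly the following (illustrative sketch; names and error handling are simplified):

```cpp
// Illustrative sketch of ending and submitting one frame's command
// buffers; 'fence' is signaled when the GPU finishes this frame.
#include <vulkan/vulkan.h>
#include <vector>

void SubmitFrame(VkQueue queue,
                 std::vector<VkCommandBuffer>& cmdBufs,
                 VkFence fence)
{
    for (VkCommandBuffer cb : cmdBufs) {
        vkEndCommandBuffer(cb);           // stop recording on each buffer
    }

    VkSubmitInfo submitInfo = {};
    submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submitInfo.commandBufferCount = static_cast<uint32_t>(cmdBufs.size());
    submitInfo.pCommandBuffers = cmdBufs.data();

    // One submit for the whole frame; the fence lets the CPU know when
    // this frame's resources can be reused (see the ring buffer below).
    vkQueueSubmit(queue, 1, &submitInfo, fence);
}
```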

There are three frames that we alternate between (ring buffer).

Frame 0 is where the CPU records new draw calls and resource changes. Frame 2 is being consumed by the GPU. Frame 1 is a buffer between CPU and GPU to ensure they are not touching each other's changes.
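A minimal sketch of that rotation, under the assumption that a per-frame fence tells us when the GPU is done (names are hypothetical):

```cpp
// Hypothetical three-frame ring: the CPU records into the current frame
// while an older frame may still be in flight on the GPU.
#include <vulkan/vulkan.h>
#include <cstdint>

constexpr uint32_t kFramesInFlight = 3;

struct Frame {
    VkFence fence;   // signaled when the GPU finished this frame
    // per-frame command buffers, garbage collector, etc. live here
};

uint32_t AdvanceFrame(VkDevice device, Frame frames[kFramesInFlight],
                      uint32_t currentFrame)
{
    uint32_t next = (currentFrame + 1) % kFramesInFlight;

    // Before reusing a frame, make sure the GPU is done with it. This is
    // also the point where that frame's garbage collector can safely
    // destroy resources (see below).
    vkWaitForFences(device, 1, &frames[next].fence, VK_TRUE, UINT64_MAX);
    vkResetFences(device, 1, &frames[next].fence);

    return next;
}
```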

Hydra Sync may choose to destroy (deallocate) a prim at any time, without considering whether the GPU is still consuming the resource. For that reason we record the intent to destroy a resource into the HgiVkGarbageCollector of each frame. When we are about to re-use a frame, we permanently destroy the GPU resources in its garbage collector. This ensures the GPU is no longer using a resource we are about to destroy.

The garbage collector is lock-free: it uses a vector per thread to collect the to-be-destroyed objects.
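A sketch of that idea with hypothetical names (the real HgiVkGarbageCollector presumably tracks concrete Vulkan objects rather than generic destroy callbacks):

```cpp
// Hypothetical per-thread, lock-free collection of to-be-destroyed
// resources; everything is destroyed only when the owning frame is
// about to be reused, so the GPU can no longer be reading it.
#include <functional>
#include <vector>

class GarbageCollector {
public:
    explicit GarbageCollector(int maxThreads)
        : _perThread(maxThreads) {}

    // Called from any Sync thread; 'threadIndex' is the same TLS slot
    // used for the per-thread command buffers, so no lock is needed.
    void ScheduleDestroy(int threadIndex, std::function<void()> destroyFn) {
        _perThread[threadIndex].push_back(std::move(destroyFn));
    }

    // Called when this frame's slot in the ring buffer is reused.
    void DestroyAll() {
        for (auto& list : _perThread) {
            for (auto& destroyFn : list) destroyFn();
            list.clear();
        }
    }

private:
    std::vector<std::vector<std::function<void()>>> _perThread;
};
```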


Areas explored:

  • Vulkan instance and device setup
  • ImGui interop (Hydra + UI share device)
  • Vulkan Memory Allocator
  • Runtime shader compiling (glslang)
  • GLSL #include directive
  • Vulkan debug (validation layers, markers)
  • Vulkan swapchain
  • Vulkan Pipeline and RenderPass (cache)
  • Resource bindings (descriptor sets)
  • Mesh create / destroy (vertex buffers)
  • Uniform buffers
  • Push constants
  • DrawIndexed
  • Parallel command buffer recording
  • Multi-frame rendering + garbage collection (CPU-GPU ring buffer)
  • CPU culling
  • GPU picking (GPU to CPU texture copy)
  • MSAA + resolve image
  • Fullscreen passes (post-effects)
  • TimeStamp queries
  • Compute

Todo:

  • UsdPreviewSurface + light dome
  • Blend modes (alpha blending)
  • GPU memory defragmentation (VulkanMemoryAllocator)
  • Pipeline cache
  • MSAA resolve for depth buffer
  • Storage buffers
  • HdComputations
  • Regular textures
  • Mipmap gen
  • UDIM textures
  • Ptex textures
  • Tessellation (OSD)
  • Basis curves
  • Point instancers
  • Volumes
  • Prim Instancing
  • GPU culling
  • Lighting (UsdLux)
