diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..37f4655 --- /dev/null +++ b/.gitignore @@ -0,0 +1,2 @@ +[Bb]in/ +[Bb]uild/ \ No newline at end of file diff --git a/README.md b/README.md index 20ee451..65b4fa4 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,120 @@ Vulkan Grass Rendering ================================== -**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5** +**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4** -* (TODO) YOUR NAME HERE -* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Utkarsh Dwivedi + * [LinkedIn](https://www.linkedin.com/in/udwivedi/), [personal website](https://utkarshdwivedi.com/) +* Tested on: Windows 11 Home, AMD Ryzen 7 5800H @ 3.2GHz 16 GB, Nvidia GeForce RTX 3060 Laptop GPU 6 GB -### (TODO: Your README) +![](img/grassHighlight.gif) -*DO NOT* leave the README to the last minute! It is a crucial part of the -project, and we will not be able to grade you without a good README. +## Introduction + +This is a Vulkan-based grass renderer heavily based on the paper [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf). It also draws inspiration from certain elements of [Ghost of Tsushima's procedural grass rendering pipeline](https://www.youtube.com/watch?v=Ibe1JBF5i5Y). + +## Overview + +|Overview of the process| +|:-:| +|![](img/processDiagram.png)| + +### Representing a grass blade + +![](img/grassRepresentation.png) + +A grass blade is represented according to the above image from the grass rendering paper.
Each blade struct contains: +- `v0`: position on a **2D plane** where it should be placed, +- `v1`: control point for a Bezier curve (this idea is also employed in Ghost of Tsushima), +- `v2`: end point of the blade, +- `up`: vector containing the up direction, in this case `[0,1,0]`, but this can easily be modified to be dynamic based on the ground plane, +- `width`: the width of the blade, +- `height`: the *length* of the blade (the paper's terminology calls this the height, but I think length makes more sense), +- `direction`: an angle that represents the orientation of the blade around the up-vector + +### Constructing a scene + +The paper uses Poisson-disk sampling to generate blade positions on the plane. Ghost of Tsushima's pipeline uses a grid of multiple `512x512` artist-authored textures to define which blade of grass should be placed in which areas. + +This project generates the positions using a simple random number generator, but the Ghost of Tsushima method is an interesting direction for the future. + +### GPU compute pipeline + +This is where the physical model for the grass blades is evaluated and grass blades are culled. + +First, orientation and distance culling are applied as described in the paper. Orientation culling removes blades whose width direction is nearly parallel to the view vector, and distance culling removes an increasing fraction of blades as their distance from the camera grows, by sorting them into distance buckets. + +|Orientation culling|Distance culling| +|:-:|:-:| +||| + +Next, physical forces are evaluated (wind and gravity). The wind force is a simple 2D Perlin noise, similar to the way Ghost of Tsushima handles wind. + +|Perlin noise based wind| +|:-:| +|| + +A recovery force is applied based on a stiffness coefficient to return the grass blade to its initial pose. This uses Hooke's Law of elasticity, which makes the force proportional to the blade's displacement from its initial pose:
+ +`(initial pose - updated pose) * stiffness coefficient` + +Applying these forces could lead to invalid configurations for `v2`, the tip of the blade. This is corrected by clamping its position to always be above the ground plane. A final correction is applied to maintain the length of the blade by adjusting `v1` and `v2`. + +Finally, frustum culling is applied to cull out blades that do not lie inside the view frustum, bounded by user-defined near and far clip planes. In the image below, the size of the frustum is *very slightly* reduced to show the effect of blades outside the frustum being culled. + +|Frustum culling| +|:-:| +|| + +The blades that remain after culling are sent to the graphics pipeline to be tessellated. + +### GPU graphics pipeline + +**Vertex Shader** + +The vertex shader is a simple pass-through shader that passes information on to the tessellation control shader after converting world-space positions to camera-space positions. + +**Tessellation Control Shader** + +The tessellation control shader defines the tessellation levels (blade LODs) on a grass blade based on the distance of the blade from the camera. + +|Blade LODs based on distance from camera| +|:-:| +|| + +**Tessellation Evaluation Shader** + +The tessellation evaluation shader positions the vertices generated by the tessellator. This is done using De Casteljau's algorithm to evaluate each vertex's position along the Bezier curve and its thickness value. Refer to the paper for more details. + +**Fragment Shader** + +The fragment shader applies a simple Lambertian shading model to colour the grass blades. + +## Performance Analysis + +For performance analysis, a scene resolution of 1280x720 was used. + +### Culling + +For analysing culling, the number of grass blades in the scene was kept at **2<sup>13</sup>**. + +|Frame rate at different culling strategies| +|:-:| +|![](img/fpsVsCulling.png)| + +This is as expected: each culling method improves performance on its own.
When all three culling methods are applied together after the physics calculations, performance improves further. Applying **orientation** and **distance** culling before computing physics, and only **frustum** culling after it, similar to Ghost of Tsushima, improves performance further still. Occlusion culling could improve this even more, and its benefit would be most visible at very high blade counts. + +### Varying blade counts + +To analyse FPS with an increasing number of grass blades, no culling was compared against "pre+post" culling (orientation and distance culling before physics, frustum culling after). + +|Frame rate at increasing blade counts| +|:-:| +|![](img/fpsVsBlades.png)| + +The frame rate starts to drop noticeably once the blade count grows beyond a fairly small number. This is expected, and this is exactly where strategies like occlusion culling, tiling (ref. Ghost of Tsushima talk), etc. will help. + +## References + +- [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf) +- [Sucker Punch Productions, Ghost of Tsushima's procedural grass rendering pipeline talk](https://www.youtube.com/watch?v=Ibe1JBF5i5Y) \ No newline at end of file diff --git a/img/LOD_tesc.gif b/img/LOD_tesc.gif new file mode 100644 index 0000000..dc28d02 Binary files /dev/null and b/img/LOD_tesc.gif differ diff --git a/img/distanceCulling.gif b/img/distanceCulling.gif new file mode 100644 index 0000000..13a8a5a Binary files /dev/null and b/img/distanceCulling.gif differ diff --git a/img/fpsVsBlades.png b/img/fpsVsBlades.png new file mode 100644 index 0000000..220ab7a Binary files /dev/null and b/img/fpsVsBlades.png differ diff --git a/img/fpsVsCulling.png b/img/fpsVsCulling.png new file mode 100644 index 0000000..a0bc651 Binary files /dev/null and b/img/fpsVsCulling.png differ diff --git a/img/frustumCulling.gif b/img/frustumCulling.gif new file mode
100644 index 0000000..ed02bbe Binary files /dev/null and b/img/frustumCulling.gif differ diff --git a/img/frustumCulling2.gif b/img/frustumCulling2.gif new file mode 100644 index 0000000..40726c7 Binary files /dev/null and b/img/frustumCulling2.gif differ diff --git a/img/grass.gif b/img/grass.gif index 78f008e..7d14d3b 100644 Binary files a/img/grass.gif and b/img/grass.gif differ diff --git a/img/grass1.png b/img/grass1.png new file mode 100644 index 0000000..6c39fa6 Binary files /dev/null and b/img/grass1.png differ diff --git a/img/grassHighlight.gif b/img/grassHighlight.gif new file mode 100644 index 0000000..e870944 Binary files /dev/null and b/img/grassHighlight.gif differ diff --git a/img/grassRepresentation.png b/img/grassRepresentation.png new file mode 100644 index 0000000..a83df80 Binary files /dev/null and b/img/grassRepresentation.png differ diff --git a/img/orientationCulling.gif b/img/orientationCulling.gif new file mode 100644 index 0000000..3886c90 Binary files /dev/null and b/img/orientationCulling.gif differ diff --git a/img/processDiagram.png b/img/processDiagram.png new file mode 100644 index 0000000..72690fb Binary files /dev/null and b/img/processDiagram.png differ diff --git a/img/wind_perlin.gif b/img/wind_perlin.gif new file mode 100644 index 0000000..c3fd0e7 Binary files /dev/null and b/img/wind_perlin.gif differ diff --git a/src/Blades.cpp b/src/Blades.cpp index 80e3d76..c7060cc 100644 --- a/src/Blades.cpp +++ b/src/Blades.cpp @@ -45,7 +45,8 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode indirectDraw.firstInstance = 0; BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory); - BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); + // This one is also going to be 
drawn so we need to also flag it as vertex buffer bit + BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, renderableBladesBuffer, renderableBladesBufferMemory); BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory); } @@ -53,8 +54,8 @@ VkBuffer Blades::GetBladesBuffer() const { return bladesBuffer; } -VkBuffer Blades::GetCulledBladesBuffer() const { - return culledBladesBuffer; +VkBuffer Blades::GetRenderableBladesBuffer() const { + return renderableBladesBuffer; } VkBuffer Blades::GetNumBladesBuffer() const { @@ -64,8 +65,8 @@ VkBuffer Blades::GetNumBladesBuffer() const { Blades::~Blades() { vkDestroyBuffer(device->GetVkDevice(), bladesBuffer, nullptr); vkFreeMemory(device->GetVkDevice(), bladesBufferMemory, nullptr); - vkDestroyBuffer(device->GetVkDevice(), culledBladesBuffer, nullptr); - vkFreeMemory(device->GetVkDevice(), culledBladesBufferMemory, nullptr); + vkDestroyBuffer(device->GetVkDevice(), renderableBladesBuffer, nullptr); + vkFreeMemory(device->GetVkDevice(), renderableBladesBufferMemory, nullptr); vkDestroyBuffer(device->GetVkDevice(), numBladesBuffer, nullptr); vkFreeMemory(device->GetVkDevice(), numBladesBufferMemory, nullptr); } diff --git a/src/Blades.h b/src/Blades.h index 9bd1eed..582c80e 100644 --- a/src/Blades.h +++ b/src/Blades.h @@ -7,10 +7,10 @@ constexpr static unsigned int NUM_BLADES = 1 << 13; constexpr static float MIN_HEIGHT = 1.3f; constexpr static float MAX_HEIGHT = 2.5f; -constexpr static float MIN_WIDTH = 0.1f; -constexpr static float MAX_WIDTH = 0.14f; +constexpr static float MIN_WIDTH = 0.05f; +constexpr static float MAX_WIDTH = 0.07f; constexpr static float MIN_BEND = 7.0f; -constexpr static float MAX_BEND = 13.0f; +constexpr static float MAX_BEND = 10.0f; 
struct Blade { // Position and direction @@ -72,17 +72,17 @@ struct BladeDrawIndirect { class Blades : public Model { private: VkBuffer bladesBuffer; - VkBuffer culledBladesBuffer; + VkBuffer renderableBladesBuffer; VkBuffer numBladesBuffer; VkDeviceMemory bladesBufferMemory; - VkDeviceMemory culledBladesBufferMemory; + VkDeviceMemory renderableBladesBufferMemory; VkDeviceMemory numBladesBufferMemory; public: Blades(Device* device, VkCommandPool commandPool, float planeDim); VkBuffer GetBladesBuffer() const; - VkBuffer GetCulledBladesBuffer() const; + VkBuffer GetRenderableBladesBuffer() const; VkBuffer GetNumBladesBuffer() const; ~Blades(); }; diff --git a/src/Camera.cpp b/src/Camera.cpp index 3afb5b8..aa5d063 100644 --- a/src/Camera.cpp +++ b/src/Camera.cpp @@ -15,6 +15,7 @@ Camera::Camera(Device* device, float aspectRatio) : device(device) { cameraBufferObject.viewMatrix = glm::lookAt(glm::vec3(0.0f, 1.0f, 10.0f), glm::vec3(0.0f, 1.0f, 0.0f), glm::vec3(0.0f, 1.0f, 0.0f)); cameraBufferObject.projectionMatrix = glm::perspective(glm::radians(45.0f), aspectRatio, 0.1f, 100.0f); cameraBufferObject.projectionMatrix[1][1] *= -1; // y-coordinate is flipped + cameraBufferObject.invViewMatrix = glm::inverse(cameraBufferObject.viewMatrix); BufferUtils::CreateBuffer(device, sizeof(CameraBufferObject), VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, buffer, bufferMemory); vkMapMemory(device->GetVkDevice(), bufferMemory, 0, sizeof(CameraBufferObject), 0, &mappedData); @@ -37,7 +38,7 @@ void Camera::UpdateOrbit(float deltaX, float deltaY, float deltaZ) { glm::mat4 finalTransform = glm::translate(glm::mat4(1.0f), glm::vec3(0.0f)) * rotation * glm::translate(glm::mat4(1.0f), glm::vec3(0.0f, 1.0f, r)); cameraBufferObject.viewMatrix = glm::inverse(finalTransform); - + cameraBufferObject.invViewMatrix = finalTransform; memcpy(mappedData, &cameraBufferObject, sizeof(CameraBufferObject)); } diff --git a/src/Camera.h 
b/src/Camera.h index 6b10747..0a4071c 100644 --- a/src/Camera.h +++ b/src/Camera.h @@ -7,6 +7,7 @@ struct CameraBufferObject { glm::mat4 viewMatrix; glm::mat4 projectionMatrix; + glm::mat4 invViewMatrix; }; class Camera { diff --git a/src/Instance.cpp b/src/Instance.cpp index 7f6b01c..e1c0d3f 100644 --- a/src/Instance.cpp +++ b/src/Instance.cpp @@ -258,6 +258,9 @@ void Instance::PickPhysicalDevice(std::vector deviceExtensions, Que } } + VkPhysicalDeviceProperties props; + vkGetPhysicalDeviceProperties(device, &props); + if (requiredQueues[QueueFlags::Present]) { // Get basic surface capabilities vkGetPhysicalDeviceSurfaceCapabilitiesKHR(device, surface, &surfaceCapabilities); diff --git a/src/Renderer.cpp b/src/Renderer.cpp index b445d04..4a28714 100644 --- a/src/Renderer.cpp +++ b/src/Renderer.cpp @@ -195,9 +195,33 @@ void Renderer::CreateTimeDescriptorSetLayout() { } void Renderer::CreateComputeDescriptorSetLayout() { - // TODO: Create the descriptor set layout for the compute pipeline + // Create the descriptor set layout for the compute pipeline // Remember this is like a class definition stating why types of information // will be stored at each binding + std::vector<VkDescriptorSetLayoutBinding> bindings = {}; + + for (int i = 0; i < 3; i++) + { + // One storage-buffer binding each for: all blades, renderable blades, indirect draw arguments + VkDescriptorSetLayoutBinding layoutBinding = {}; + layoutBinding.binding = i; + layoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + layoutBinding.descriptorCount = 1; + layoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + layoutBinding.pImmutableSamplers = nullptr; + + bindings.push_back(layoutBinding); + } + + VkDescriptorSetLayoutCreateInfo layoutInfo = {}; + layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; + layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size()); + layoutInfo.pBindings = bindings.data(); + + if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) + { + throw std::runtime_error("Failed to create compute
descriptor set layout"); + } } void Renderer::CreateDescriptorPool() { @@ -215,7 +239,10 @@ void Renderer::CreateDescriptorPool() { // Time (compute) { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 }, - // TODO: Add any additional types and counts of descriptors you will need to allocate + // Storage buffers (compute) + { VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, static_cast<uint32_t>(3 * scene->GetBlades().size())}, + + // Add any additional types and counts of descriptors you will need to allocate }; VkDescriptorPoolCreateInfo poolInfo = {}; @@ -318,8 +345,44 @@ void Renderer::CreateModelDescriptorSets() { } void Renderer::CreateGrassDescriptorSets() { - // TODO: Create Descriptor sets for the grass. + // Create Descriptor sets for the grass. // This should involve creating descriptor sets which point to the model matrix of each group of grass blades + grassDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { modelDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) + { + throw std::runtime_error("Failed to allocate grass descriptor sets"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size()); + + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo grassModelBufferInfo = {}; + grassModelBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer(); + grassModelBufferInfo.offset = 0; + grassModelBufferInfo.range = sizeof(ModelBufferObject); + + descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[i].dstSet = grassDescriptorSets[i]; + descriptorWrites[i].dstBinding = 0; + descriptorWrites[i].dstArrayElement = 0;
+ descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; + descriptorWrites[i].descriptorCount = 1; + descriptorWrites[i].pBufferInfo = &grassModelBufferInfo; + descriptorWrites[i].pImageInfo = nullptr; + descriptorWrites[i].pTexelBufferView = nullptr; + } + + // Update descriptor sets + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); } void Renderer::CreateTimeDescriptorSet() { @@ -358,8 +421,77 @@ void Renderer::CreateTimeDescriptorSet() { } void Renderer::CreateComputeDescriptorSets() { - // TODO: Create Descriptor sets for the compute pipeline - // The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades + // Create Descriptor sets for the compute pipeline + // The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades + computeDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { computeDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) + { + throw std::runtime_error("Failed to allocate compute descriptor sets"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size()); + + for (uint32_t i = 0; i < scene->GetBlades().size(); i++) + { + // Blades + VkDescriptorBufferInfo bladesBufferInfo = {}; + bladesBufferInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer(); + bladesBufferInfo.offset = 0; + bladesBufferInfo.range = NUM_BLADES * sizeof(Blade); + +
descriptorWrites[3 * i + 0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 0].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 0].dstBinding = 0; + descriptorWrites[3 * i + 0].dstArrayElement = 0; + descriptorWrites[3 * i + 0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 0].descriptorCount = 1; + descriptorWrites[3 * i + 0].pBufferInfo = &bladesBufferInfo; + descriptorWrites[3 * i + 0].pImageInfo = nullptr; + descriptorWrites[3 * i + 0].pTexelBufferView = nullptr; + + VkDescriptorBufferInfo renderableBladesBufferInfo = {}; + renderableBladesBufferInfo.buffer = scene->GetBlades()[i]->GetRenderableBladesBuffer(); + renderableBladesBufferInfo.offset = 0; + renderableBladesBufferInfo.range = NUM_BLADES * sizeof(Blade); + + descriptorWrites[3 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 1].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 1].dstBinding = 1; + descriptorWrites[3 * i + 1].dstArrayElement = 0; + descriptorWrites[3 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 1].descriptorCount = 1; + descriptorWrites[3 * i + 1].pBufferInfo = &renderableBladesBufferInfo; + descriptorWrites[3 * i + 1].pImageInfo = nullptr; + descriptorWrites[3 * i + 1].pTexelBufferView = nullptr; + + VkDescriptorBufferInfo nBladesBufferInfo = {}; + nBladesBufferInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer(); + nBladesBufferInfo.offset = 0; + nBladesBufferInfo.range = sizeof(BladeDrawIndirect); + + descriptorWrites[3 * i + 2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 2].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 2].dstBinding = 2; + descriptorWrites[3 * i + 2].dstArrayElement = 0; + descriptorWrites[3 * i + 2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 2].descriptorCount = 1; + descriptorWrites[3 * i + 2].pBufferInfo = 
&nBladesBufferInfo; + descriptorWrites[3 * i + 2].pImageInfo = nullptr; + descriptorWrites[3 * i + 2].pTexelBufferView = nullptr; + } + + // Update descriptor sets + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); } void Renderer::CreateGraphicsPipeline() { @@ -716,8 +848,8 @@ void Renderer::CreateComputePipeline() { computeShaderStageInfo.module = computeShaderModule; computeShaderStageInfo.pName = "main"; - // TODO: Add the compute dsecriptor set layout you create to this list - std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout }; + // Add the compute descriptor set layout you create to this list + std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout }; // Create pipeline layout VkPipelineLayoutCreateInfo pipelineLayoutInfo = {}; @@ -883,7 +1015,12 @@ void Renderer::RecordComputeCommandBuffer() { // Bind descriptor set for time uniforms vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr); - // TODO: For each group of blades bind its descriptor set and dispatch + // For each group of blades bind its descriptor set and dispatch + for (size_t i = 0; i < computeDescriptorSets.size(); i++) + { + vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[i], 0, nullptr); + vkCmdDispatch(computeCommandBuffer, (NUM_BLADES + WORKGROUP_SIZE - 1)/ WORKGROUP_SIZE, 1, 1); + } // ~ End recording ~ if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) { @@ -973,16 +1110,14 @@ void Renderer::RecordCommandBuffers() { vkCmdBindPipeline(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipeline); for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) { - VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() }; + VkBuffer
vertexBuffers[] = { scene->GetBlades()[j]->GetRenderableBladesBuffer() }; VkDeviceSize offsets[] = { 0 }; - // TODO: Uncomment this when the buffers are populated - // vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); - - // TODO: Bind the descriptor set for each grass blades model + vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); + // Bind the descriptor set for each grass blades model + vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr); // Draw - // TODO: Uncomment this when the buffers are populated - // vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); + vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); } // End render pass @@ -1057,6 +1192,7 @@ Renderer::~Renderer() { vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr); + vkDestroyDescriptorSetLayout(logicalDevice, computeDescriptorSetLayout, nullptr); vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr); diff --git a/src/Renderer.h b/src/Renderer.h index 95e025f..36caa9b 100644 --- a/src/Renderer.h +++ b/src/Renderer.h @@ -56,12 +56,15 @@ class Renderer { VkDescriptorSetLayout cameraDescriptorSetLayout; VkDescriptorSetLayout modelDescriptorSetLayout; VkDescriptorSetLayout timeDescriptorSetLayout; + VkDescriptorSetLayout computeDescriptorSetLayout; VkDescriptorPool descriptorPool; VkDescriptorSet cameraDescriptorSet; std::vector<VkDescriptorSet> modelDescriptorSets; VkDescriptorSet timeDescriptorSet; + std::vector<VkDescriptorSet> grassDescriptorSets; + std::vector<VkDescriptorSet> computeDescriptorSets; VkPipelineLayout graphicsPipelineLayout; VkPipelineLayout grassPipelineLayout; diff --git
a/src/SwapChain.cpp b/src/SwapChain.cpp index 711fec0..4017431 100644 --- a/src/SwapChain.cpp +++ b/src/SwapChain.cpp @@ -74,14 +74,17 @@ SwapChain::SwapChain(Device* device, VkSurfaceKHR vkSurface, unsigned int numBuf } } -void SwapChain::Create() { +void SwapChain::Create(int w, int h) { auto* instance = device->GetInstance(); const auto& surfaceCapabilities = instance->GetSurfaceCapabilities(); VkSurfaceFormatKHR surfaceFormat = chooseSwapSurfaceFormat(instance->GetSurfaceFormats()); VkPresentModeKHR presentMode = chooseSwapPresentMode(instance->GetPresentModes()); - VkExtent2D extent = chooseSwapExtent(surfaceCapabilities, GetGLFWWindow()); + VkExtent2D extent{ static_cast<uint32_t>(w), static_cast<uint32_t>(h) }; + if (w == 0 || h == 0) { + extent = chooseSwapExtent(surfaceCapabilities, GetGLFWWindow()); + } uint32_t imageCount = surfaceCapabilities.minImageCount + 1; imageCount = numBuffers > imageCount ? numBuffers : imageCount; @@ -188,9 +191,9 @@ VkSemaphore SwapChain::GetRenderFinishedVkSemaphore() const { return renderFinishedSemaphore; } -void SwapChain::Recreate() { +void SwapChain::Recreate(int w, int h) { Destroy(); - Create(); + Create(w, h); } bool SwapChain::Acquire() { @@ -199,13 +202,12 @@ bool SwapChain::Acquire() { vkQueueWaitIdle(device->GetQueue(QueueFlags::Present)); } VkResult result = vkAcquireNextImageKHR(device->GetVkDevice(), vkSwapChain, std::numeric_limits<uint64_t>::max(), imageAvailableSemaphore, VK_NULL_HANDLE, &imageIndex); - if (result != VK_SUCCESS && result != VK_SUBOPTIMAL_KHR) { - throw std::runtime_error("Failed to acquire swap chain image"); - } if (result == VK_ERROR_OUT_OF_DATE_KHR) { Recreate(); return false; + } else if (result != VK_SUCCESS && result != VK_SUBOPTIMAL_KHR) { + throw std::runtime_error("Failed to acquire swap chain image"); } return true; diff --git a/src/SwapChain.h b/src/SwapChain.h index dbafcf0..318b41b 100644 --- a/src/SwapChain.h +++ b/src/SwapChain.h @@ -17,14 +17,14 @@ class SwapChain { VkSemaphore GetImageAvailableVkSemaphore() const; VkSemaphore GetRenderFinishedVkSemaphore()
const; - void Recreate(); + void Recreate(int w = 0, int h = 0); bool Acquire(); bool Present(); ~SwapChain(); private: SwapChain(Device* device, VkSurfaceKHR vkSurface, unsigned int numBuffers); - void Create(); + void Create(int w = 0, int h = 0); void Destroy(); Device* device; diff --git a/src/main.cpp b/src/main.cpp index 8bf822b..34c1540 100644 --- a/src/main.cpp +++ b/src/main.cpp @@ -1,4 +1,5 @@ #include +#include <string> #include "Instance.h" #include "Window.h" #include "Renderer.h" @@ -16,7 +17,7 @@ namespace { if (width == 0 || height == 0) return; vkDeviceWaitIdle(device->GetVkDevice()); - swapChain->Recreate(); + swapChain->Recreate(width, height); renderer->RecreateFrameResources(); } @@ -143,7 +144,26 @@ int main() { glfwSetMouseButtonCallback(GetGLFWWindow(), mouseDownCallback); glfwSetCursorPosCallback(GetGLFWWindow(), mouseMoveCallback); + double fps = 0.0; + double lastTime = 0.0; + int frames = 0; + + std::string s; + while (!ShouldQuit()) { + frames++; + double curTime = glfwGetTime(); + + if (curTime - lastTime >= 1.0) // 1 second passed + { + fps = static_cast<double>(frames) / (curTime - lastTime); + lastTime = curTime; + frames = 0; + } + + s = ("Vulkan Grass Renderer | FPS: ") + std::to_string(fps); + glfwSetWindowTitle(GetGLFWWindow(), s.c_str()); + glfwPollEvents(); scene->UpdateTime(); renderer->Frame(); diff --git a/src/shaders/compute.comp b/src/shaders/compute.comp index 0fd0224..a73e801 100644 --- a/src/shaders/compute.comp +++ b/src/shaders/compute.comp @@ -7,6 +7,7 @@ layout(local_size_x = WORKGROUP_SIZE, local_size_y = 1, local_size_z = 1) in; layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 view; mat4 proj; + mat4 invView; } camera; layout(set = 1, binding = 0) uniform Time { @@ -21,36 +22,304 @@ struct Blade { vec4 up; }; -// TODO: Add bindings to: -// 1. Store the input blades -// 2. Write out the culled blades +// 1. All blades + +layout(set = 2, binding = 0) buffer AllBlades +{ + Blade allBlades[]; +}; +// 2.
Blades that are remaining after culling +layout(set = 2, binding = 1) buffer RenderableBlades +{ + Blade renderableBlades[]; +}; // 3. Write the total number of blades remaining +layout(set = 2, binding = 2) buffer NumBlades { + uint vertexCount; // Write the number of blades remaining here + uint instanceCount; // = 1 + uint firstVertex; // = 0 + uint firstInstance; // = 0 +} numBlades; + +/* ============================= */ +/* ====== NOISE FUNCTIONS ====== */ +/* ============================= */ + +// Most of these functions are from CIS 560 slides, slightly tweaked for this use case + +vec2 random2(vec2 p) +{ + return fract(sin(vec2(dot(p, vec2(127.1f, 311.7f)), + dot(p, vec2(269.5f,183.3f)))) + * 43758.5453f); +} + +float surflet2D(vec2 p, vec2 gridPoint) +{ + // Compute falloff function by converting linear distance to a polynomial + float distX = abs(p.x - gridPoint.x); + float distY = abs(p.y - gridPoint.y); + float tX = 1 - 6 * pow(distX, 5.f) + 15 * pow(distX, 4.f) - 10 * pow(distX, 3.f); + float tY = 1 - 6 * pow(distY, 5.f) + 15 * pow(distY, 4.f) - 10 * pow(distY, 3.f); + // Get the random vector for the grid point + vec2 rand = random2(gridPoint); + vec2 gradient = normalize(2.f * rand - vec2(1.f)); + // Get the vector from the grid point to P + vec2 diff = p - gridPoint; + // Get the value of our height field by dotting grid->P with our gradient + float height = dot(diff, gradient); + // Scale our height field (i.e. 
reduce it) by our polynomial falloff function + return height * tX * tY; +} + +float perlinNoise2D(vec2 p) +{ + float surfletSum = 0.f; + // Iterate over the four integer corners surrounding uv + for(int dx = 0; dx <= 1; ++dx) { + for(int dy = 0; dy <= 1; ++dy) { + surfletSum += surflet2D(p, floor(p) + vec2(dx, dy)); + } + } + return surfletSum; +} + +/* ==================== */ +/* ====== FORCES ====== */ +/* ==================== */ + +#define BLADE_MASS 0.01f +#define GRAVITY_DIRECTION vec3(0.0, 1.0, 0.0) +#define GRAVITY_ACCELERATION -1.0f +#define WIND_STRENGTH vec2(15.0, 0.0) + +#define WIND_SCROLL_SPEED vec2(1.0, 0.2) + + +// This is inspired from Ghost of Tsushima's procedural grass system +// source: https://www.youtube.com/watch?v=Ibe1JBF5i5Y +vec3 getWindForce(const vec3 v0, const vec3 v2, const vec3 up, const float height) +{ + // We don't care about y value here, only scrolling perlin noise in the 2D XZ plane + vec2 uv = v0.xz; + + // Scroll UV + uv += WIND_SCROLL_SPEED * totalTime; + + // Get wind noise + float noise = perlinNoise2D(uv * 0.5f) * 0.5f; + + vec2 wind2d = WIND_STRENGTH * noise; + + vec3 windInfluence = vec3(wind2d.x, 0.0f, wind2d.y); + + // Refer section 5.1 of paper + vec3 dir = v2 - v0; + float fd = 1.0 - abs(dot(normalize(windInfluence), normalize(dir))); + float fr = dot(dir, up) / height; + float theta = fd * fr; + + return windInfluence * theta; +} + +vec3 getExternalForces(const vec3 v0, const vec3 v1, const vec3 v2, const vec3 up, vec3 bitangent, + const float angle, const float width, const float height, const float stiffnessCoeff) +{ + // This is how we initialize v2 in Blades.cpp + // If we change that initialization this will have to be changed! 
+ const vec3 initialV2 = v0 + height * up; + + // Apply forces on every blade and update the vertices in the buffer + + // Recovery force + vec3 recoveryForce = (initialV2 - v2) * stiffnessCoeff * 0.5f; // Hooke's Law + + // Gravitational forces (main + front) + vec3 gravityEnvironmental = BLADE_MASS * GRAVITY_DIRECTION * GRAVITY_ACCELERATION; // f = ma + vec3 front = normalize(cross(bitangent, up)); + vec3 gravityFront = 0.25f * length(gravityEnvironmental) * front; // paper: a quarter of the gravity magnitude, along the blade's front direction + + vec3 gravity = gravityEnvironmental + gravityFront; + + vec3 wind = getWindForce(v0, v2, up, height); + + vec3 totalForce = (recoveryForce + gravity + wind) * deltaTime; + return totalForce; +} + +void validateState(const vec3 v0, inout vec3 v1, inout vec3 v2, const vec3 up, const float height) +{ + vec3 dir = v2 - v0; + float dotDirUp = dot(dir, up); + v2 = v2 - up * min(dotDirUp, 0); // maintain v2 above ground plane level + float lProj = length(dir - up * dotDirUp); + + float lProjOverHeight = lProj / height; + v1 = v0 + height * up * max(1.0 - lProjOverHeight, 0.05f * max(lProjOverHeight, 1.0f)); + + float L0 = distance(v0, v2); // distance between first and last control pt + float L1 = distance(v0, v1) + distance(v1, v2); // sum of distance between all consecutive pts + + // formula in paper is generalized L = (2 * L0 + (n-1) * L1) / (n+1) + // n is the degree of the Bezier curve, which is 2 for our quadratic curve (v0, v1, v2) + // so this simplifies to (2 * L0 + L1) / 3 + float L = (2.0f * L0 + L1) / 3.0f; + float r = height / L; -// The project is using vkCmdDrawIndirect to use a buffer as the arguments for a draw call -// This is sort of an advanced feature so we've showed you what this buffer should look like -// -// layout(set = ???, binding = ???)
buffer NumBlades { -// uint vertexCount; // Write the number of blades remaining here -// uint instanceCount; // = 1 -// uint firstVertex; // = 0 -// uint firstInstance; // = 0 -// } numBlades; + v1 = v0 + r * (v1 - v0); + v2 = v0 + r * (v2 - v0); // same as v1corr + r * (v2 - v1) with the uncorrected v1 +} + + +/* ==================== */ +/* ====== CULLING ===== */ +/* ==================== */ + +#define ORIENTATION_CULLING 1 +#define FRUSTUM_CULLING 1 +#define DISTANCE_CULLING 1 +#define DO_ALL_CULLING_POST_PHYSICS 0 + +#define ORIENTATION_CULLING_THRESHOLD 0.97 +#define FRUSTUM_CULLING_TOLERANCE -0.1f +#define FRUSTUM_NEAR_CLIP 0.5 +#define FRUSTUM_FAR_CLIP 20 // this is also the "max distance" in distance culling. +#define DISTANCE_CULLING_BUCKETS 40 + +bool shouldOrientationCull(const vec3 bitangent) +{ + vec3 viewDir = vec3(camera.view[0][2], camera.view[1][2], camera.view[2][2]); + if (abs(dot(viewDir, bitangent)) > ORIENTATION_CULLING_THRESHOLD) + { + return true; + } + + return false; +} bool inBounds(float value, float bounds) { return (value >= -bounds) && (value <= bounds); } +bool isPtInsideFrustum(vec3 p) +{ + vec4 pDash = camera.proj * camera.view * vec4(p, 1.0f); + float h = pDash.w + FRUSTUM_CULLING_TOLERANCE; + + return inBounds(pDash.x, h) && inBounds(pDash.y, h) && pDash.z > FRUSTUM_NEAR_CLIP && pDash.z < FRUSTUM_FAR_CLIP; +} + +bool shouldFrustrumCull(const vec3 v0, const vec3 v1, const vec3 v2) +{ + vec3 midPt = 0.25f * v0 + 0.5f * v1 + 0.25f * v2; + return !isPtInsideFrustum(v0) && !isPtInsideFrustum(midPt) && !isPtInsideFrustum(v2); +} + +bool shouldDistanceCull(const uint idx, const vec3 v0, const vec3 up) +{ + vec3 c = camera.invView[3].xyz; + vec3 dir = v0 - c; + float dProj = length(dir - up * dot(dir, up)); + if (dProj > FRUSTUM_FAR_CLIP) return true; // beyond the max distance: always cull + return idx % DISTANCE_CULLING_BUCKETS >= floor(DISTANCE_CULLING_BUCKETS * (1.0f - dProj / FRUSTUM_FAR_CLIP)); // cull more buckets as distance grows; nearby blades are kept +} + +// Some culling should happen before even evaluating physical forces +// There's no reason to evaluate physics compute if the blade will be
culled! +// This is also something Ghost of Tsushima does +bool shouldCullPrePhysics(const uint idx, const vec3 bitangent, const vec3 v0, const vec3 up) +{ +#if DO_ALL_CULLING_POST_PHYSICS + return false; +#endif + + bool shouldCull = false; + + #if ORIENTATION_CULLING + shouldCull = shouldCull || shouldOrientationCull(bitangent); + #endif + + #if DISTANCE_CULLING + shouldCull = shouldCull || shouldDistanceCull(idx, v0, up); + #endif + + return shouldCull; +} + +bool shouldCullPostPhysics(const uint idx, const vec3 bitangent, const vec3 v0, const vec3 v1, const vec3 v2, const vec3 up) +{ + // Since frustum culling in the paper uses v2 and a midpoint, we can only do this AFTER the physical model has been evaluated and v2 been updated + bool shouldCull = false; + +#if DO_ALL_CULLING_POST_PHYSICS + #if ORIENTATION_CULLING + shouldCull = shouldCull || shouldOrientationCull(bitangent); + #endif + + #if DISTANCE_CULLING + shouldCull = shouldCull || shouldDistanceCull(idx, v0, up); + #endif +#endif + + #if FRUSTUM_CULLING + shouldCull = shouldCull || shouldFrustrumCull(v0, v1, v2); + #endif + + return shouldCull; +} + +/* ==================== */ +/* ======= MAIN ======= */ +/* ==================== */ void main() { + const uint currBladeIdx = gl_GlobalInvocationID.x; + // Reset the number of blades to 0 - if (gl_GlobalInvocationID.x == 0) { - // numBlades.vertexCount = 0; + if (currBladeIdx == 0) { + numBlades.vertexCount = 0; } barrier(); // Wait till all threads reach this point - // TODO: Apply forces on every blade and update the vertices in the buffer + // Parameters + Blade currBlade = allBlades[currBladeIdx]; + vec3 v0 = currBlade.v0.xyz; + vec3 v1 = currBlade.v1.xyz; + vec3 v2 = currBlade.v2.xyz; + const vec3 up = currBlade.up.xyz; + const float angle = currBlade.v0.w; + const float height = currBlade.v1.w; + const float width = currBlade.v2.w; + const float stiffnessCoeff = currBlade.up.w; + vec3 bitangent = normalize(vec3(cos(angle), 0.0f, sin(angle))); + + // 
Cull blades before the physical model + if (shouldCullPrePhysics(currBladeIdx, bitangent, v0, up)) + { + return; + } + + // Get total force + vec3 totalForce = getExternalForces(v0, v1, v2, up, bitangent, angle, width, height, stiffnessCoeff); + v2 += totalForce; - // TODO: Cull blades that are too far away or not in the camera frustum and write them + // State validation (section 5.2 of paper) + validateState(v0, v1, v2, up, height); + + currBlade.v1.xyz = v1; + currBlade.v2.xyz = v2; + allBlades[currBladeIdx] = currBlade; + + // Cull blades that are too far away or not in the camera frustum and write them // to the culled blades buffer // Note: to do this, you will need to use an atomic operation to read and update numBlades.vertexCount // You want to write the visible blades to the buffer without write conflicts between threads + + // Cull blades + if (shouldCullPostPhysics(currBladeIdx, bitangent, v0, v1, v2, up)) + { + return; + } + + uint idx = atomicAdd(numBlades.vertexCount, 1); + renderableBlades[idx] = currBlade; } diff --git a/src/shaders/graphics.vert b/src/shaders/graphics.vert index fb9bf8e..ff87c80 100644 --- a/src/shaders/graphics.vert +++ b/src/shaders/graphics.vert @@ -4,6 +4,7 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 view; mat4 proj; + mat4 invView; } camera; layout(set = 1, binding = 0) uniform ModelBufferObject { diff --git a/src/shaders/grass.frag b/src/shaders/grass.frag index c7df157..1842c73 100644 --- a/src/shaders/grass.frag +++ b/src/shaders/grass.frag @@ -4,14 +4,24 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 view; mat4 proj; + mat4 invView; } camera; // TODO: Declare fragment shader inputs +layout(location = 0) in vec4 fs_pos; +layout(location = 1) in vec4 fs_nor; +layout(location = 2) in vec2 fs_UV; layout(location = 0) out vec4 outColor; +const vec3 lightPos = vec3(10, 50, 10); + void main() { - // TODO: Compute fragment color + vec3 lightDir = normalize(fs_pos.xyz - lightPos); + float 
diffuseTerm = clamp(dot(lightDir, fs_nor.xyz), 0, 1); + + float ambientTerm = 0.2; + float lightIntensity = diffuseTerm + ambientTerm; - outColor = vec4(1.0); + outColor = vec4(0.0, lightIntensity, 0.0, 1.0); } diff --git a/src/shaders/grass.tesc b/src/shaders/grass.tesc index f9ffd07..611755d 100644 --- a/src/shaders/grass.tesc +++ b/src/shaders/grass.tesc @@ -1,26 +1,50 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable +#define LOD_HIGH 10 +#define LOD_LOW 1 + layout(vertices = 1) out; layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 view; mat4 proj; + mat4 invView; } camera; -// TODO: Declare tessellation control shader inputs and outputs +// Declare tessellation control shader inputs and outputs +layout(location = 0) in vec4 tcs_v1[]; +layout(location = 1) in vec4 tcs_v2[]; +layout(location = 2) in vec4 tcs_up[]; + +layout(location = 0) out vec4 tes_v1[]; +layout(location = 1) out vec4 tes_v2[]; +layout(location = 2) out vec4 tes_up[]; + +in gl_PerVertex +{ + vec4 gl_Position; +} gl_in[gl_MaxPatchVertices]; void main() { // Don't move the origin location of the patch gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position; - // TODO: Write any shader outputs + // Write any shader outputs + tes_v1[gl_InvocationID] = tcs_v1[gl_InvocationID]; + tes_v2[gl_InvocationID] = tcs_v2[gl_InvocationID]; + tes_up[gl_InvocationID] = tcs_up[gl_InvocationID]; + + float dist = distance(gl_out[gl_InvocationID].gl_Position.xyz, camera.invView[3].xyz); + float t = 1.0 - smoothstep(1.0, 15.0, dist); // smoothstep is undefined when edge0 >= edge1, so invert the result instead of swapping the edges + + int tesselationLevel = int(ceil(mix(LOD_LOW, LOD_HIGH, t))); - // TODO: Set level of tesselation - // gl_TessLevelInner[0] = ??? - // gl_TessLevelInner[1] = ??? - // gl_TessLevelOuter[0] = ??? - // gl_TessLevelOuter[1] = ??? - // gl_TessLevelOuter[2] = ??? - // gl_TessLevelOuter[3] = ???
+ // Set level of tesselation + gl_TessLevelInner[0] = tesselationLevel; + gl_TessLevelInner[1] = tesselationLevel; + gl_TessLevelOuter[0] = tesselationLevel; + gl_TessLevelOuter[1] = tesselationLevel; + gl_TessLevelOuter[2] = tesselationLevel; + gl_TessLevelOuter[3] = tesselationLevel; } diff --git a/src/shaders/grass.tese b/src/shaders/grass.tese index 751fff6..38a129b 100644 --- a/src/shaders/grass.tese +++ b/src/shaders/grass.tese @@ -1,18 +1,75 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable -layout(quads, equal_spacing, ccw) in; +layout(triangles, equal_spacing, ccw) in; layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 view; mat4 proj; + mat4 invView; } camera; -// TODO: Declare tessellation evaluation shader inputs and outputs +// Declare tessellation evaluation shader inputs and outputs +layout(location = 0) in vec4 tes_v1[]; +layout(location = 1) in vec4 tes_v2[]; +layout(location = 2) in vec4 tes_up[]; + +layout(location = 0) out vec4 fs_pos; +layout(location = 1) out vec4 fs_nor; +layout(location = 2) out vec2 fs_UV; + +// These t functions are described in section 6.3 of the paper +float getQuadT(const float u, const float v) +{ + return u; +} + +float getTriT(const float u, const float v) +{ + return u + 0.5 * v - u * v; +} + +float getQuadraticT(const float u, const float v) +{ + return u - u * v * v; +} + +float getTriTipT(const float u, const float v) +{ + float tau = 0.3; + return 0.5 + (u - 0.5) * (1.0 - max(v - tau, 0.0) / (1.0 - tau)); +} void main() { float u = gl_TessCoord.x; float v = gl_TessCoord.y; + + // first element of arrays because we only have 1 vertex in patch!
+ vec3 v0 = gl_in[0].gl_Position.xyz; + vec3 v1 = tes_v1[0].xyz; + vec3 v2 = tes_v2[0].xyz; + + float angle = gl_in[0].gl_Position.w; + float height = tes_v1[0].w; + float width = tes_v2[0].w; + float stiff_coeff = tes_up[0].w; + + // From section 6.3 in the paper + vec3 a = v0 + v * (v1 - v0); + vec3 b = v1 + v * (v2 - v1); + vec3 c = a + v * (b - a); + + vec3 t1 = normalize(vec3(cos(angle), 0.0f, sin(angle))); // fast matrix multiplication for y rotation of vec3.right (1,0,0) + vec3 c0 = c - width * t1; + vec3 c1 = c + width * t1; - // TODO: Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade + vec3 t0 = normalize(b - a); + fs_nor = vec4(normalize(cross(t0, t1)), 0.0); + + float t = getTriTipT(u, v); + + vec4 pos = vec4(mix(c0, c1, t), 1.0); + gl_Position = camera.proj * camera.view * pos; + fs_pos = pos; + fs_UV = vec2(u, v); } diff --git a/src/shaders/grass.vert b/src/shaders/grass.vert index db9dfe9..cba95a8 100644 --- a/src/shaders/grass.vert +++ b/src/shaders/grass.vert @@ -2,16 +2,34 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable +layout(set = 0, binding = 0) uniform CameraBufferObject { + mat4 view; + mat4 proj; + mat4 invView; +} camera; + layout(set = 1, binding = 0) uniform ModelBufferObject { mat4 model; }; -// TODO: Declare vertex shader inputs and outputs +layout(location = 0) in vec4 vs_v0; // .w = orientation +layout(location = 1) in vec4 vs_v1; // .w = height +layout(location = 2) in vec4 vs_v2; // .w = width +layout(location = 3) in vec4 vs_up; // .w = stiffness coeff + +layout(location = 0) out vec4 tcs_v1; +layout(location = 1) out vec4 tcs_v2; +layout(location = 2) out vec4 tcs_up; out gl_PerVertex { vec4 gl_Position; }; void main() { - // TODO: Write gl_Position and any other shader outputs + // position of blade is v0 + gl_Position = vec4(vec3(model * vec4(vs_v0.xyz, 1.0)), vs_v0.w); + + tcs_v1 = vec4(vec3(model * vec4(vs_v1.xyz, 1.0)), vs_v1.w); + tcs_v2 = 
vec4(vec3(model * vec4(vs_v2.xyz, 1.0)), vs_v2.w); + tcs_up = vec4(vec3(model * vec4(vs_up.xyz, 0.0)), vs_up.w); }