Project 3 #19

Open · wants to merge 23 commits into master
4 changes: 4 additions & 0 deletions CMakeLists.txt

@@ -73,6 +73,8 @@ set(headers
src/sceneStructs.h
src/preview.h
src/utilities.h
src/common.h
src/efficient.h
)

set(sources
@@ -84,6 +86,8 @@ set(sources
src/scene.cpp
src/preview.cpp
src/utilities.cpp
src/common.cu
src/efficient.cu
)

list(SORT headers)
127 changes: 121 additions & 6 deletions README.md

@@ -1,13 +1,128 @@
CUDA Path Tracer
================

**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 3 - CUDA Path Tracer**

* Srinath Rajagopalan
* [LinkedIn](https://www.linkedin.com/in/srinath-rajagopalan-07a43155), [twitter](https://twitter.com/srinath132)
* Tested on: Windows 10, i7-6700 @ 3.4GHz 16GB, Nvidia Quadro P1000 4GB (Moore 100B Lab)

## Path Tracer
<p align='center'>
<img src="data/final_render.png">
</p>

### Introduction

Given the fundamental building blocks of a scene, how do we render it? That is the basic purpose of a path tracer. The building blocks are the various objects present in the scene, along with their material properties. Is the material reflective? Refractive? Diffuse? Some combination of these? Is it a light source? If so, what is its emittance? These properties are explicitly specified in the scene file.

We assume there is a camera present in the scene and project the 3D scene onto the camera's 2D image plane. A basic path tracing algorithm is as follows:

1) Initialize an HxW image and construct HxW rays originating from the camera and passing _through_ the image, one ray per pixel. The rays serve as messenger vehicles that figure out the color of their corresponding pixels. Each ray's color is initialized to RGB(1,1,1), so rendering the image at this point would give a white screen.
2) For each ray, find which object in the scene it intersects. If the ray does not intersect any object, it is assigned the color black. Since each pixel's ray is independent of the others, this computation can be performed for all rays in parallel.
3) For a given ray, if the object of intersection is a light source, the ray is assigned the color of the light source. If the object is not a light source, we check whether the ray reflects off or refracts through the object; the nature of the reflection/refraction is determined by the object's material properties. From this, a _new_ ray is constructed which absorbs the color of the object. This too is done for all rays in parallel.
4) Steps 2) and 3) are repeated for a fixed number of bounces. After each bounce, the ray absorbs the color of the material it bounced off. A ray "dies" after a fixed number of bounces OR earlier if it hits a light source. The final color of a ray is the product of the colors of all the materials it intersected, and the color of each pixel in the final 2D image is the color of its corresponding ray.

The above process is a 10,000-ft view of a basic path tracer. In reality, when we see an object, a light ray from some light source strikes the object, reflects countless times, and finally enters the retina. A path tracer simulates this in the _reverse_ direction: for each pixel in our retina/camera, we ask where its light came from by shooting a ray from the pixel and observing its life.
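The per-ray color accumulation described in steps 1-4 can be sketched on the host. This is a minimal illustration, not the project's actual kernel: `Color`, `shadeRay`, and the flat list of surface colors are hypothetical names, and intersection testing is abstracted away into a fixed bounce sequence.

```cpp
#include <array>

// Illustrative sketch of the ray color accumulation (steps 1-4 above).
struct Color { float r, g, b; };

Color mul(Color a, Color b) { return {a.r * b.r, a.g * b.g, a.b * b.b}; }

// A ray starts white (step 1); each bounce multiplies in the surface
// color (step 4). Reaching the light terminates it with the light's
// color absorbed; running out of bounces leaves it black (step 2).
Color shadeRay(const std::array<Color, 3>& surfaces, Color lightColor, int maxDepth) {
    Color c = {1.f, 1.f, 1.f};                 // initialized to RGB(1,1,1)
    for (int d = 0; d < maxDepth; ++d) {
        if (d == (int)surfaces.size())         // the ray hits the light source
            return mul(c, lightColor);
        c = mul(c, surfaces[d]);               // absorb the material color
    }
    return {0.f, 0.f, 0.f};                    // died without reaching a light
}
```

With three gray (0.5) bounces followed by a light of intensity 2, a ray with enough depth ends up at 0.5³ × 2 = 0.25 per channel, while a depth of 2 kills it before the light and leaves it black.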


## BSDF Computation

We explore three different possibilities for calculating the new ray:
1) Ideal diffuse reflection - the new ray does not depend on the incident ray and can scatter in any direction. We approximate this by probabilistically choosing a ray from the hemisphere above the point of intersection. The probability of a ray in a given direction is weighted by the cosine of the angle between that ray and the surface normal, so most diffuse rays point in directions close to the normal.
2) Ideal specular reflection - the new ray perfectly reflects off the surface.
3) Refraction - a ray can pass _through_ the object, and its direction can be calculated from Snell's law.
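The specular and refractive cases are deterministic and can be sketched directly (the cosine-weighted diffuse sample needs an RNG and is omitted here). This is an illustrative host-side sketch, not the project's API: `V3`, `reflectRay`, and `refractRay` are made-up names, and `refractRay` assumes unit-length inputs with the normal opposing the incident direction.

```cpp
#include <cmath>

// Minimal vector type for illustration.
struct V3 { float x, y, z; };
V3 sub(V3 a, V3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
V3 scale(V3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
float dot(V3 a, V3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Ideal specular reflection: r = d - 2(d . n) n
V3 reflectRay(V3 d, V3 n) { return sub(d, scale(n, 2.f * dot(d, n))); }

// Refraction via Snell's law with eta = n1/n2. Assumes d and n are
// unit length and d points into the surface (dot(d, n) < 0); returns
// false on total internal reflection.
bool refractRay(V3 d, V3 n, float eta, V3* out) {
    float cosI = -dot(d, n);
    float sin2T = eta * eta * (1.f - cosI * cosI);
    if (sin2T > 1.f) return false;          // total internal reflection
    float cosT = std::sqrt(1.f - sin2T);
    // t = eta*d + (eta*cosI - cosT) n
    *out = sub(scale(d, eta), scale(n, cosT - eta * cosI));
    return true;
}
```

Note the assumption that the normal opposes the incident ray: when the ray exits the object, the normal must be flipped (and `eta` inverted), which is exactly the failure mode shown in the refraction blooper at the end of this README.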

For the Cornell scene, three different cases illustrating the above are included below.

Diffusion | Reflection | Transmission
:-------------------------:|:-------------------------:|:-------------------------:
![](data/full_diffuse.png)| ![](data/full_reflect.png) |![](data/full_refract.png)

We can also have a material be a _combination_ of reflection, refraction, and diffusion. If the reflection coefficient is `p1` and the refraction coefficient is `p2`, then for a given ray we sample a specular reflected ray with probability `p1`, a refracted ray with probability `p2`, and a diffuse ray with probability `1 - p1 - p2`.
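Picking the scatter type from the coefficients is a simple inversion of a uniform sample. A sketch under assumed names (`Scatter`, `pickScatter` are illustrative; in the real kernel each thread would draw `u` from its own per-path RNG state):

```cpp
// Choose a scatter event from reflection coefficient p1 and
// refraction coefficient p2; diffuse gets the remaining probability
// mass 1 - p1 - p2. u is a uniform random number in [0, 1).
enum Scatter { REFLECT, REFRACT, DIFFUSE };

Scatter pickScatter(float p1, float p2, float u) {
    if (u < p1) return REFLECT;
    if (u < p1 + p2) return REFRACT;
    return DIFFUSE;
}
```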

<p align='center'>
<img src="data/reflect_refract.png" width=500>
</p>

We can see the red wall and the green wall on the corresponding side of the sphere. This highlights the reflective property of the material. But, more importantly, notice the shiny surface at the bottom of the sphere and the shadow sprinkled with a speck of white light. This is _because_ of refraction. The white light passed through the sphere by refracting from air to glass and came out by refracting from glass to air.

## Deeper the better

We can control the number of bounces for each ray using a `depth` parameter. The higher the depth, the more realistic we can expect our image to be, but, like everything else in life, that comes at the cost of expensive computation.

Depth 1 | Depth 3 | Depth 8
:-------------------------:|:-------------------------:|:-------------------------:
![](data/depth_0.png)| ![](data/depth_1.png) |![](data/full_refract.png)


The effect on realism can be better visualized by completely removing the light source. If there is no light source, everything should be black. But if depth is low enough, it won't be.

No Light Depth 0 | No Light Depth 8 | No Light Depth 15
:-------------------------:|:-------------------------:|:-------------------------:
![](data/no_light_depth_0.png)| ![](data/no_light_depth_8.png) |![](data/no_light_depth_15.png)

## Anti-Aliasing

As explained before, we use one ray per pixel, and the ray passes through the center of the pixel. This can lead to jagged edges, which are clearly evident when we zoom in. We can limit this by jittering the ray within the pixel: instead of always passing through the center, the ray's location is randomized within the pixel. This diminishes the discontinuity when jumping between pixels and renders an overall smoother image. The effects are illustrated below:


Aliased | Anti-Aliased
:-------------------------:|:-------------------------:
![](data/aliased.png)| ![](data/anti_aliased.png)

The jagged edges bordering the green reflection are smoother in the anti-aliased image.
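The jitter itself is a one-line change when generating camera rays. A host-side sketch with illustrative names (`Sample`, `jitteredSample`; the real kernel would use a per-thread RNG rather than `std::mt19937`):

```cpp
#include <random>

struct Sample { float x, y; };

// Instead of aiming at the pixel center (px + 0.5, py + 0.5), offset
// the target by a uniform amount in [0, 1) each iteration; averaging
// over iterations smooths the pixel's edges.
Sample jitteredSample(int px, int py, std::mt19937& rng) {
    std::uniform_real_distribution<float> u01(0.f, 1.f);
    return { px + u01(rng), py + u01(rng) };
}
```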

## Stream Compaction

Since we parallelize over rays, and after each depth some rays terminate (either because they didn't intersect any object or because they ended at a light source), it is better to launch threads only for the rays still alive. By stream compacting the alive rays and bringing them together, we reduce warp divergence, since we are only dealing with active threads _that are grouped together_.
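A host-side analogue of the compaction step, using `std::partition` in place of a device-side scan-based compaction (or `thrust::partition` over the path segments); `PathSegment` and `compactPaths` are illustrative names:

```cpp
#include <algorithm>
#include <vector>

struct PathSegment { int remainingBounces; };

// Move live paths (remainingBounces > 0) to the front and return the
// live count, so the next bounce launches threads only for them.
int compactPaths(std::vector<PathSegment>& paths) {
    auto mid = std::partition(paths.begin(), paths.end(),
        [](const PathSegment& p) { return p.remainingBounces > 0; });
    return (int)(mid - paths.begin());
}
```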

The performance of a complete shader call after each depth is shown below.

<p align="center">
<img src="data/perf_depth.png">
</p>

We can see that the time taken decreases, unsurprisingly. But this by itself doesn't justify why stream compaction is better. For that, we analyze the time taken for an entire iteration (across all depths). The performance graph is given below.

<p align="center">
<img src="data/perf_stream.png">
</p>

For smaller image resolutions, the effect of stream compaction is negligible; it might even be slower because the compaction overhead isn't worth it. But as we scale to larger image resolutions, we have a clear winner.

For a relatively open scene, more rays die out after each bounce, so the effect of stream compaction on performance will not be as significant as it is in a closed scene. The following plot, taken at a `1500x1500` image resolution, illustrates this.

<p align="center">
<img src="data/closed_open.png">
</p>

## Overall performance analysis

Sorting by material ID, though a good idea in principle to limit warp divergence, did not turn out to help in the present case because our scenes are not big enough to justify the sorting overhead. The performance below was measured at a screen resolution of `700x700`. As discussed above, this image resolution is not big enough for stream compaction to work its magic, which is why we get performance comparable to the implementation without it.

<p align="center">
<img src="data/per_bar.png">
</p>

The work-efficient stream compaction I implemented uses shared memory to perform the scan over a varying number of blocks. Though this passed the test cases provided in Project 2 (working up to array sizes of `2^27`), I could not scale it for the path tracer beyond an image resolution of `720x720`. The performance is comparable to thrust's implementation, though a bit slower. I attribute this to not accounting for shared-memory bank conflicts and to sloppy use of `cudaMemcpy` to transfer data back and forth so as to fit in with the previously written API.
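For reference, the Blelloch (work-efficient) exclusive scan at the heart of the compaction can be sketched on the host; the GPU version runs this per block in shared memory (where the bank conflicts mentioned above arise). `exclusiveScan` is an illustrative name, and this sketch assumes a power-of-two input size:

```cpp
#include <vector>

// Host sketch of the work-efficient (Blelloch) exclusive scan.
// Assumes the input size is a power of two.
std::vector<int> exclusiveScan(std::vector<int> a) {
    int n = (int)a.size();
    for (int d = 1; d < n; d *= 2)          // up-sweep (reduce) phase
        for (int i = 2 * d - 1; i < n; i += 2 * d)
            a[i] += a[i - d];
    a[n - 1] = 0;                           // clear the root
    for (int d = n / 2; d >= 1; d /= 2)     // down-sweep phase
        for (int i = 2 * d - 1; i < n; i += 2 * d) {
            int t = a[i - d];               // swap and accumulate partial sums
            a[i - d] = a[i];
            a[i] += t;
        }
    return a;
}
```

Scanning the boolean "is this ray alive?" flags this way yields each surviving ray's output index, which is what makes the compaction itself a single scatter.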

## Bloopers

While attempting motion blur, if the updates were too quick or the transform matrix was not updated properly, I got some funny results.



Motion Blur Rip Through | Motion Blur Shaved
:-------------------------:|:-------------------------:
![](data/blooper_motion_blur.png)| ![](data/blooper_motion_blur_2.png)



While calculating the refracted ray using Snell's law, if the normal to the surface is not flipped correctly, we get a doughnut.


<p align="center">
<img src="data/blooper_refract.png" width=500>
</p>
Binary file added data/aliased.png
Binary file added data/anti_aliased.png
Binary file added data/blooper_motion_blur.png
Binary file added data/blooper_motion_blur_2.png
Binary file added data/blooper_refract.png
Binary file added data/closed_open.png
Binary file added data/depth_0.png
Binary file added data/depth_1.png
Binary file added data/final_render.png
Binary file added data/full_diffuse.png
Binary file added data/full_reflect.png
Binary file added data/full_refract.png
Binary file added data/no_light_depth_0.png
Binary file added data/no_light_depth_15.png
Binary file added data/no_light_depth_8.png
Binary file added data/path_tracer_book.xlsx
Binary file added data/per_bar.png
Binary file added data/perf_depth.png
Binary file added data/perf_stream.png
Binary file added data/reflect_refract.png
165 changes: 165 additions & 0 deletions scenes/cornell_mod.txt

@@ -0,0 +1,165 @@
// Emissive material (light)
MATERIAL 0
RGB 1 1 1
SPECEX 0
SPECRGB 0 0 0
REFL 0
REFR 0
REFRIOR 0
EMITTANCE 3

// Diffuse white
MATERIAL 1
RGB .98 .98 .98
SPECEX 0
SPECRGB 0 0 0
REFL 0
REFR 0
REFRIOR 0
EMITTANCE 0

// Diffuse red
MATERIAL 2
RGB .85 .35 .35
SPECEX 0
SPECRGB 0 0 0
REFL 0
REFR 0
REFRIOR 0
EMITTANCE 0

// Diffuse green
MATERIAL 3
RGB .35 .85 .35
SPECEX 0
SPECRGB 0 0 0
REFL 0
REFR 0
REFRIOR 0
EMITTANCE 0

// Specular white
MATERIAL 4
RGB .98 .98 .98
SPECEX 0
SPECRGB .98 .98 .98
REFL 0.8
REFR 0.2
REFRIOR 1.33
EMITTANCE 0

MATERIAL 5
RGB .98 .98 0
SPECEX 0
SPECRGB 0 .9 0.9
REFL 0.3
REFR 0.8
REFRIOR 1.5
EMITTANCE 0

MATERIAL 6
RGB .98 .98 0
SPECEX 0
SPECRGB .98 .98 0
REFL 0.5
REFR 0.5
REFRIOR 1.5
EMITTANCE 0

MATERIAL 7
RGB 1 0.576 0.160
SPECEX 0
SPECRGB 0 0 0
REFL 0
REFR 0
REFRIOR 0
EMITTANCE 3

// Camera
CAMERA
RES 800 800
FOVY 45
ITERATIONS 5000
DEPTH 10
FILE cornell
EYE 0.0 5 10.5
LOOKAT 0 5 0
UP 0 1 0


// Ceiling light
OBJECT 0
cube
material 0
TRANS 0 10 0
ROTAT 0 0 0
SCALE 3 .3 3

// Floor
OBJECT 1
cube
material 1
TRANS 0 0 0
ROTAT 0 0 0
SCALE 10 .01 10

// Ceiling
OBJECT 2
cube
material 1
TRANS 0 10 0
ROTAT 0 0 90
SCALE .01 10 10

// Back wall
OBJECT 3
cube
material 1
TRANS 0 5 -5
ROTAT 0 90 0
SCALE .01 10 10

// Left wall
OBJECT 4
cube
material 2
TRANS -5 5 0
ROTAT 0 0 0
SCALE .01 10 10

// Right wall
OBJECT 5
cube
material 3
TRANS 5 5 0
ROTAT 0 0 0
SCALE .01 10 10

// Sphere
OBJECT 6
sphere
material 4
TRANS -2 4 0
ROTAT 0 0 0
SCALE 3 3 3

OBJECT 7
sphere
material 5
TRANS 2 4 0
ROTAT 0 0 0
SCALE 3 3 3

OBJECT 8
cube
material 6
TRANS 2 7 0
ROTAT 45 45 0
SCALE 1 1 1

OBJECT 9
cube
material 7
TRANS 4 0 3
ROTAT 0 0 0
SCALE 0.5 .3 0.5