Project 3 #19

Open · wants to merge 23 commits into master
4 changes: 4 additions & 0 deletions CMakeLists.txt

@@ -73,6 +73,8 @@ set(headers
src/sceneStructs.h
src/preview.h
src/utilities.h
src/common.h
src/efficient.h
)

set(sources
@@ -84,6 +86,8 @@ set(sources
src/scene.cpp
src/preview.cpp
src/utilities.cpp
src/common.cu
src/efficient.cu
)

list(SORT headers)
127 changes: 121 additions & 6 deletions README.md

@@ -1,13 +1,128 @@
CUDA Path Tracer
================

**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 3 - CUDA Path Tracer**

* Srinath Rajagopalan
* [LinkedIn](https://www.linkedin.com/in/srinath-rajagopalan-07a43155), [twitter](https://twitter.com/srinath132)
* Tested on: Windows 10, i7-6700 @ 3.4GHz 16GB, Nvidia Quadro P1000 4GB (Moore 100B Lab)

## Path Tracer
<p align='center'>
<img src="data/final_render.png">
</p>

### Introduction

Given the fundamental building blocks of a scene, how do we render it? That is the basic purpose of a path tracer. The building blocks are the various objects present in the scene, along with their material properties. Is the material reflective? Refractive? Diffuse? Some combination of these? Is it a light source? If so, what is its emittance? These properties are explicitly specified in the scene file.

We assume there is a camera present in the scene and project the 3D scene onto the camera's 2D image plane. A basic path tracing algorithm is as follows:

1) Initialize an HxW image and construct HxW rays originating from the camera and passing _through_ the image, one ray per pixel. The rays serve as messenger vehicles that figure out the color of their corresponding pixels. Each ray's color is initialized to RGB(1,1,1), so rendering the image at this point would give a white screen.
2) For each ray, find which object in the scene it intersects. If the ray does not intersect any object, it is assigned the color black. Since each pixel's ray is independent of the others, this computation can be performed for all rays in parallel.
3) For a given ray, if the object of intersection is a light source, the ray is assigned the color of the light source. If the object is not a light source, we check whether the ray reflects off or refracts through the object; the nature of the reflection/refraction is determined by the object's material properties. From this, a _new_ ray is constructed which absorbs the color of the object. This too is done for all rays in parallel.
4) Steps 2) and 3) are repeated for a fixed number of bounces. After each bounce, the ray absorbs the color of the material it bounced off. A ray "dies" after a fixed number of bounces OR earlier if it hits a light source. The final color of a ray is the product of the colors of all the materials it intersected, and the color of each pixel in the final 2D image is the color of its corresponding ray.

The above process is a 10,000-ft view of a basic path tracer. In reality, when we see an object, a light ray from some light source strikes the object, reflects countless times, and finally enters the retina. A path tracer simulates this in the _reverse_ direction: for each pixel in our retina/camera, we ask where its light came from by shooting a ray from the pixel and observing its life.
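The per-ray color accumulation described in steps 1-4 can be sketched on the host. This is a minimal illustration, not the project's actual kernel: `Color`, `shadeRay`, and the flat list of surface colors are hypothetical names, and intersection testing is abstracted away into a fixed bounce sequence.

```cpp
#include <array>

// Illustrative sketch of the ray color accumulation (steps 1-4 above).
struct Color { float r, g, b; };

Color mul(Color a, Color b) { return {a.r * b.r, a.g * b.g, a.b * b.b}; }

// A ray starts white (step 1); each bounce multiplies in the surface
// color (step 4). Reaching the light terminates it with the light's
// color absorbed; running out of bounces leaves it black (step 2).
Color shadeRay(const std::array<Color, 3>& surfaces, Color lightColor, int maxDepth) {
    Color c = {1.f, 1.f, 1.f};                 // initialized to RGB(1,1,1)
    for (int d = 0; d < maxDepth; ++d) {
        if (d == (int)surfaces.size())         // the ray hits the light source
            return mul(c, lightColor);
        c = mul(c, surfaces[d]);               // absorb the material color
    }
    return {0.f, 0.f, 0.f};                    // died without reaching a light
}
```

With three gray (0.5) bounces followed by a light of intensity 2, a ray with enough depth ends up at 0.5³ × 2 = 0.25 per channel, while a depth of 2 kills it before the light and leaves it black.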


## BSDF Computation

We explore three different possibilities for calculating the new ray:
1) Ideal diffuse reflection - the new ray does not depend on the incident ray and can scatter in any direction. We approximate this by probabilistically choosing a ray from the hemisphere above the point of intersection. The probability of a ray in a given direction is weighted by the cosine of the angle between that ray and the surface normal, so most diffuse rays point in directions close to the normal.
2) Ideal specular reflection - the new ray perfectly reflects off the surface.
3) Refraction - a ray can pass _through_ the object, and its direction can be calculated from Snell's law.
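The specular and refractive cases are deterministic and can be sketched directly (the cosine-weighted diffuse sample needs an RNG and is omitted here). This is an illustrative host-side sketch, not the project's API: `V3`, `reflectRay`, and `refractRay` are made-up names, and `refractRay` assumes unit-length inputs with the normal opposing the incident direction.

```cpp
#include <cmath>

// Minimal vector type for illustration.
struct V3 { float x, y, z; };
V3 sub(V3 a, V3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
V3 scale(V3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
float dot(V3 a, V3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Ideal specular reflection: r = d - 2(d . n) n
V3 reflectRay(V3 d, V3 n) { return sub(d, scale(n, 2.f * dot(d, n))); }

// Refraction via Snell's law with eta = n1/n2. Assumes d and n are
// unit length and d points into the surface (dot(d, n) < 0); returns
// false on total internal reflection.
bool refractRay(V3 d, V3 n, float eta, V3* out) {
    float cosI = -dot(d, n);
    float sin2T = eta * eta * (1.f - cosI * cosI);
    if (sin2T > 1.f) return false;          // total internal reflection
    float cosT = std::sqrt(1.f - sin2T);
    // t = eta*d + (eta*cosI - cosT) n
    *out = sub(scale(d, eta), scale(n, cosT - eta * cosI));
    return true;
}
```

Note the assumption that the normal opposes the incident ray: when the ray exits the object, the normal must be flipped (and `eta` inverted), which is exactly the failure mode shown in the refraction blooper at the end of this README.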

For the Cornell scene, three different cases illustrating the above are included below.

Diffusion | Reflection | Transmission
:-------------------------:|:-------------------------:|:-------------------------:
![](data/full_diffuse.png)| ![](data/full_reflect.png) |![](data/full_refract.png)

We can also have a material be a _combination_ of reflection, refraction, and diffusion. If the reflection coefficient is `p1` and the refraction coefficient is `p2`, then for a given ray we sample a specular reflected ray with probability `p1`, a refracted ray with probability `p2`, and a diffuse ray with probability `1 - p1 - p2`.
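Picking the scatter type from the coefficients is a simple inversion of a uniform sample. A sketch under assumed names (`Scatter`, `pickScatter` are illustrative; in the real kernel each thread would draw `u` from its own per-path RNG state):

```cpp
// Choose a scatter event from reflection coefficient p1 and
// refraction coefficient p2; diffuse gets the remaining probability
// mass 1 - p1 - p2. u is a uniform random number in [0, 1).
enum Scatter { REFLECT, REFRACT, DIFFUSE };

Scatter pickScatter(float p1, float p2, float u) {
    if (u < p1) return REFLECT;
    if (u < p1 + p2) return REFRACT;
    return DIFFUSE;
}
```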

<p align='center'>
<img src="data/reflect_refract.png" width=500>
</p>

We can see the red wall and the green wall on the corresponding side of the sphere. This highlights the reflective property of the material. But, more importantly, notice the shiny surface at the bottom of the sphere and the shadow sprinkled with a speck of white light. This is _because_ of refraction. The white light passed through the sphere by refracting from air to glass and came out by refracting from glass to air.

## Deeper the better

We can control the number of bounces for each ray using a `depth` parameter. The higher the depth, the more realistic we can expect our image to be, but, like everything else in life, that comes at the cost of expensive computation.

Depth 1 | Depth 3 | Depth 8
:-------------------------:|:-------------------------:|:-------------------------:
![](data/depth_0.png)| ![](data/depth_1.png) |![](data/full_refract.png)


The effect on realism can be better visualized by completely removing the light source. If there is no light source, everything should be black. But if depth is low enough, it won't be.

No Light Depth 0 | No Light Depth 8 | No Light Depth 15
:-------------------------:|:-------------------------:|:-------------------------:
![](data/no_light_depth_0.png)| ![](data/no_light_depth_8.png) |![](data/no_light_depth_15.png)

## Anti-Aliasing

As explained before, we use one ray per pixel, and the ray passes through the center of the pixel. This can lead to jagged edges, which are clearly evident when we zoom in. We can limit this by jittering the ray within the pixel: instead of always passing through the center, the ray's location is randomized within the pixel. This diminishes the discontinuity when jumping between pixels and renders an overall smoother image. The effects are illustrated below:


Aliased | Anti-Aliased
:-------------------------:|:-------------------------:
![](data/aliased.png)| ![](data/anti_aliased.png)

The jagged edges bordering the green reflection are smoother in the anti-aliased image.
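The jitter itself is a one-line change when generating camera rays. A host-side sketch with illustrative names (`Sample`, `jitteredSample`; the real kernel would use a per-thread RNG rather than `std::mt19937`):

```cpp
#include <random>

struct Sample { float x, y; };

// Instead of aiming at the pixel center (px + 0.5, py + 0.5), offset
// the target by a uniform amount in [0, 1) each iteration; averaging
// over iterations smooths the pixel's edges.
Sample jitteredSample(int px, int py, std::mt19937& rng) {
    std::uniform_real_distribution<float> u01(0.f, 1.f);
    return { px + u01(rng), py + u01(rng) };
}
```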

## Stream Compaction

Since we parallelize over rays, and after each depth some rays terminate (either because they didn't intersect any object or because they ended at a light source), it is better to launch threads only for the rays still alive. By stream compacting the alive rays and bringing them together, we reduce warp divergence, since we are only dealing with active threads _that are grouped together_.
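A host-side analogue of the compaction step, using `std::partition` in place of a device-side scan-based compaction (or `thrust::partition` over the path segments); `PathSegment` and `compactPaths` are illustrative names:

```cpp
#include <algorithm>
#include <vector>

struct PathSegment { int remainingBounces; };

// Move live paths (remainingBounces > 0) to the front and return the
// live count, so the next bounce launches threads only for them.
int compactPaths(std::vector<PathSegment>& paths) {
    auto mid = std::partition(paths.begin(), paths.end(),
        [](const PathSegment& p) { return p.remainingBounces > 0; });
    return (int)(mid - paths.begin());
}
```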

The performance of a complete shader call after each depth is shown below.

<p align="center">
<img src="data/perf_depth.png">
</p>

We can see that the time taken decreases, unsurprisingly. But this by itself doesn't justify why stream compaction is better. For that, we analyze the time taken for an entire iteration (across all depths). The performance graph is given below.

<p align="center">
<img src="data/perf_stream.png">
</p>

For smaller image resolutions, the effect of stream compaction is negligible; it might even be slower because the compaction overhead isn't worth it. But as we scale to larger image resolutions, we have a clear winner.

For a relatively open scene, more rays die out after each bounce, so the effect of stream compaction on performance will not be as significant as it is in a closed scene. The following plot, taken at a `1500x1500` image resolution, illustrates this.

<p align="center">
<img src="data/closed_open.png">
</p>

## Overall performance analysis

Sorting by material ID, though a good idea in principle to limit warp divergence, did not turn out to help in the present case because our scenes are not big enough to justify the sorting overhead. The performance below was measured at a screen resolution of `700x700`. As discussed above, this image resolution is not big enough for stream compaction to work its magic, which is why we get performance comparable to the implementation without it.

<p align="center">
<img src="data/per_bar.png">
</p>

The work-efficient stream compaction I implemented uses shared memory to perform the scan over a varying number of blocks. Though this passed the test cases provided in Project 2 (working up to array sizes of `2^27`), I could not scale it for the path tracer beyond an image resolution of `720x720`. The performance is comparable to thrust's implementation, though a bit slower. I attribute this to not accounting for shared-memory bank conflicts and to sloppy use of `cudaMemcpy` to transfer data back and forth so as to fit in with the previously written API.
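For reference, the Blelloch (work-efficient) exclusive scan at the heart of the compaction can be sketched on the host; the GPU version runs this per block in shared memory (where the bank conflicts mentioned above arise). `exclusiveScan` is an illustrative name, and this sketch assumes a power-of-two input size:

```cpp
#include <vector>

// Host sketch of the work-efficient (Blelloch) exclusive scan.
// Assumes the input size is a power of two.
std::vector<int> exclusiveScan(std::vector<int> a) {
    int n = (int)a.size();
    for (int d = 1; d < n; d *= 2)          // up-sweep (reduce) phase
        for (int i = 2 * d - 1; i < n; i += 2 * d)
            a[i] += a[i - d];
    a[n - 1] = 0;                           // clear the root
    for (int d = n / 2; d >= 1; d /= 2)     // down-sweep phase
        for (int i = 2 * d - 1; i < n; i += 2 * d) {
            int t = a[i - d];               // swap and accumulate partial sums
            a[i - d] = a[i];
            a[i] += t;
        }
    return a;
}
```

Scanning the boolean "is this ray alive?" flags this way yields each surviving ray's output index, which is what makes the compaction itself a single scatter.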

## Bloopers

While attempting motion blur, if the updates were too quick or the transform matrix was not updated properly, I got some funny results.



Motion Blur Rip Through | Motion Blur Shaved
:-------------------------:|:-------------------------:
![](data/blooper_motion_blur.png)| ![](data/blooper_motion_blur_2.png)



While calculating the refracted ray using Snell's law, if the normal to the surface is not flipped correctly, we get a doughnut.


<p align="center">
<img src="data/blooper_refract.png" width=500>
</p>
Binary file added data/aliased.png
Binary file added data/anti_aliased.png
Binary file added data/blooper_motion_blur.png
Binary file added data/blooper_motion_blur_2.png
Binary file added data/blooper_refract.png
Binary file added data/closed_open.png
Binary file added data/depth_0.png
Binary file added data/depth_1.png
Binary file added data/final_render.png
Binary file added data/full_diffuse.png
Binary file added data/full_reflect.png
Binary file added data/full_refract.png
Binary file added data/no_light_depth_0.png
Binary file added data/no_light_depth_15.png
Binary file added data/no_light_depth_8.png
Binary file added data/path_tracer_book.xlsx
Binary file added data/per_bar.png
Binary file added data/perf_depth.png
Binary file added data/perf_stream.png
Binary file added data/reflect_refract.png
165 changes: 165 additions & 0 deletions scenes/cornell_mod.txt

@@ -0,0 +1,165 @@
// Emissive material (light)
MATERIAL 0
RGB 1 1 1
SPECEX 0
SPECRGB 0 0 0
REFL 0
REFR 0
REFRIOR 0
EMITTANCE 3

// Diffuse white
MATERIAL 1
RGB .98 .98 .98
SPECEX 0
SPECRGB 0 0 0
REFL 0
REFR 0
REFRIOR 0
EMITTANCE 0

// Diffuse red
MATERIAL 2
RGB .85 .35 .35
SPECEX 0
SPECRGB 0 0 0
REFL 0
REFR 0
REFRIOR 0
EMITTANCE 0

// Diffuse green
MATERIAL 3
RGB .35 .85 .35
SPECEX 0
SPECRGB 0 0 0
REFL 0
REFR 0
REFRIOR 0
EMITTANCE 0

// Specular white
MATERIAL 4
RGB .98 .98 .98
SPECEX 0
SPECRGB .98 .98 .98
REFL 0.8
REFR 0.2
REFRIOR 1.33
EMITTANCE 0

MATERIAL 5
RGB .98 .98 0
SPECEX 0
SPECRGB 0 .9 0.9
REFL 0.3
REFR 0.8
REFRIOR 1.5
EMITTANCE 0

MATERIAL 6
RGB .98 .98 0
SPECEX 0
SPECRGB .98 .98 0
REFL 0.5
REFR 0.5
REFRIOR 1.5
EMITTANCE 0

MATERIAL 7
RGB 1 0.576 0.160
SPECEX 0
SPECRGB 0 0 0
REFL 0
REFR 0
REFRIOR 0
EMITTANCE 3

// Camera
CAMERA
RES 800 800
FOVY 45
ITERATIONS 5000
DEPTH 10
FILE cornell
EYE 0.0 5 10.5
LOOKAT 0 5 0
UP 0 1 0


// Ceiling light
OBJECT 0
cube
material 0
TRANS 0 10 0
ROTAT 0 0 0
SCALE 3 .3 3

// Floor
OBJECT 1
cube
material 1
TRANS 0 0 0
ROTAT 0 0 0
SCALE 10 .01 10

// Ceiling
OBJECT 2
cube
material 1
TRANS 0 10 0
ROTAT 0 0 90
SCALE .01 10 10

// Back wall
OBJECT 3
cube
material 1
TRANS 0 5 -5
ROTAT 0 90 0
SCALE .01 10 10

// Left wall
OBJECT 4
cube
material 2
TRANS -5 5 0
ROTAT 0 0 0
SCALE .01 10 10

// Right wall
OBJECT 5
cube
material 3
TRANS 5 5 0
ROTAT 0 0 0
SCALE .01 10 10

// Sphere
OBJECT 6
sphere
material 4
TRANS -2 4 0
ROTAT 0 0 0
SCALE 3 3 3

OBJECT 7
sphere
material 5
TRANS 2 4 0
ROTAT 0 0 0
SCALE 3 3 3

OBJECT 8
cube
material 6
TRANS 2 7 0
ROTAT 45 45 0
SCALE 1 1 1

OBJECT 9
cube
material 7
TRANS 4 0 3
ROTAT 0 0 0
SCALE 0.5 .3 0.5