Project 3: Rhuta Joshi #18

Open
wants to merge 90 commits into
base: main
90 commits
3364bd2
white banded output for basic diffused implementation
rcj9719 Sep 23, 2022
bfbd4c1
WHite band issue resolved by changing rng seed per iteration
rcj9719 Sep 24, 2022
eab2670
Basic diffused FINALLY working, use of new_num_paths in pathtrace and…
rcj9719 Sep 26, 2022
1cd8aac
Basic transmissive material with refraction
rcj9719 Sep 28, 2022
6bd2add
total internal reflection
rcj9719 Sep 28, 2022
a26fd21
total internal reflection corrected
rcj9719 Sep 28, 2022
397f6be
total internal reflection corrected
rcj9719 Sep 28, 2022
e9d2afb
total internal reflection corrected
rcj9719 Sep 28, 2022
ddf6b04
added tiny obj loader header file
rcj9719 Sep 28, 2022
dd69613
cached intersections
rcj9719 Sep 28, 2022
65bbac6
Loading triangles
rcj9719 Sep 29, 2022
17899ed
Loading objs as triangle geoms
rcj9719 Sep 29, 2022
82f67a7
incorrect object loading implementation
rcj9719 Sep 29, 2022
60d3ebe
incorrect obj mesh load
rcj9719 Sep 29, 2022
f5bab9e
obj mesh loading works with incorrect normals
rcj9719 Sep 30, 2022
0edba5c
mesh obj loader with some artifacts
rcj9719 Oct 1, 2022
f8823c2
antialiasing implemented
rcj9719 Oct 1, 2022
1f9731d
macros added
rcj9719 Oct 1, 2022
1a5afd6
Mesh obj loading working for bunny
rcj9719 Oct 3, 2022
af7cd6b
Basic ray marching for implicit surface implemented
rcj9719 Oct 3, 2022
bb12772
Implicit surface corrected normals
rcj9719 Oct 3, 2022
635eacc
sdf procedural shapes
rcj9719 Oct 6, 2022
5d6c655
cleanup for sdfs
rcj9719 Oct 6, 2022
e58977e
latest updates
rcj9719 Oct 6, 2022
33268d6
sdfs for light
rcj9719 Oct 7, 2022
a2640c0
testing
Oct 7, 2022
e78bb27
adding objs
rcj9719 Oct 7, 2022
8f0a15b
obj
rcj9719 Oct 7, 2022
e6372aa
obj loading scenes
Oct 7, 2022
ce7feb2
dof amd procedural scenes
Oct 7, 2022
e296393
removed objs
Oct 7, 2022
6c65e1e
Merge branch 'CIS565-Fall-2022:main' into main
rcj9719 Oct 7, 2022
36eb46f
adding demo scene
Oct 7, 2022
c9ca0cd
adding objs
Oct 7, 2022
74edb2e
material types
Oct 8, 2022
afb6871
updated readme
rcj9719 Oct 8, 2022
ac7f40f
Merge branch 'main' of https://github.com/rcj9719/GPU-Project3-CUDA-P…
rcj9719 Oct 8, 2022
8daa51d
material types scene update
Oct 8, 2022
8d5854f
Merge branch 'main' of https://github.com/rcj9719/GPU-Project3-CUDA-P…
Oct 8, 2022
4ab28bd
antialiasing scenes
Oct 8, 2022
894bb11
readme update
rcj9719 Oct 8, 2022
98d088e
adding objs
rcj9719 Oct 8, 2022
a955b74
adding antialiasing scenes
rcj9719 Oct 8, 2022
1ff2dd4
pearls
Oct 8, 2022
d6b0380
pearls final
Oct 8, 2022
ded305a
dof toggle settings for scene
Oct 8, 2022
a9e1248
adding dof focal dist images
Oct 8, 2022
3c2a50d
dof lens radius change scenes
Oct 8, 2022
9aa500c
obj loading scenes and more dof renders
Oct 8, 2022
3ef88b8
demosceneannotate updated
rcj9719 Oct 8, 2022
a41184c
Merge branch 'main' of https://github.com/rcj9719/GPU-Project3-CUDA-P…
rcj9719 Oct 8, 2022
c7fa14c
dof readme update
rcj9719 Oct 8, 2022
6d4bbc2
dof readme update
rcj9719 Oct 8, 2022
bf373ef
updated dof
rcj9719 Oct 8, 2022
2e20454
Update README.md
rcj9719 Oct 8, 2022
ee8c67e
objimage
Oct 8, 2022
e45009b
updated readme
rcj9719 Oct 8, 2022
0684dcd
Update README.md
rcj9719 Oct 8, 2022
e84c954
dof image updates
rcj9719 Oct 8, 2022
401a34d
Merge branch 'main' of https://github.com/rcj9719/GPU-Project3-CUDA-P…
rcj9719 Oct 8, 2022
ff32bc4
Update README.md
rcj9719 Oct 8, 2022
6828af1
procedural scene added
Oct 8, 2022
1539430
adding procedural annotation
rcj9719 Oct 8, 2022
c5a0fc6
Update README.md
rcj9719 Oct 8, 2022
7bf8ab0
Update README.md
rcj9719 Oct 8, 2022
6a89b6a
Update README.md
rcj9719 Oct 8, 2022
5a8f980
adding basic cornell box
Oct 8, 2022
ca2c950
cache performance analysis and folder structuring
Oct 9, 2022
7c978a4
Updated README.md
Oct 9, 2022
8892a7b
Update README.md
rcj9719 Oct 9, 2022
b0b265e
Update README.md
rcj9719 Oct 9, 2022
9f621ea
Updated graphics specifications
Oct 9, 2022
41b7841
Merge branch 'main' of https://github.com/rcj9719/GPU-Project3-CUDA-P…
Oct 9, 2022
fb94408
antialiasing performance
Oct 9, 2022
c69c7fe
antialiasing performance
Oct 9, 2022
1053af4
Update README.md
rcj9719 Oct 9, 2022
c8289e3
stream compaction analysis
Oct 9, 2022
e600f73
antialiasing performance
Oct 9, 2022
961e436
basic cornell added
Oct 9, 2022
7e59d80
basic cornell added
Oct 9, 2022
dd5c6b5
sphere marching image added
rcj9719 Oct 9, 2022
cb399db
Update README.md
rcj9719 Oct 9, 2022
148664c
sphere marching image added
rcj9719 Oct 9, 2022
2f52a4b
img added
rcj9719 Oct 9, 2022
626a47e
Update README.md
rcj9719 Oct 9, 2022
85a10f3
updated stream compaction analysis for closed box renders
Oct 10, 2022
07d6003
Update README.md
rcj9719 Oct 10, 2022
4b799b0
sorting rays analysis
Oct 10, 2022
4cf2143
added bloopers
Oct 10, 2022
91a240a
adding blooper and removing extra scenes
rcj9719 Oct 10, 2022
11 changes: 8 additions & 3 deletions CMakeLists.txt
@@ -60,6 +60,7 @@ endif(UNIX)

set(GLM_ROOT_DIR "external")
find_package(GLM REQUIRED)

include_directories(${GLM_INCLUDE_DIRS})

set(headers
@@ -74,6 +75,10 @@ set(headers
src/preview.h
src/utilities.h
src/ImGui/imconfig.h
src/tiny_obj_loader.h
src/noise.h
src/common.h
src/testing_helpers.hpp

src/ImGui/imgui.h
src/ImGui/imconfig.h
@@ -111,11 +116,11 @@ list(SORT sources)
source_group(Headers FILES ${headers})
source_group(Sources FILES ${sources})

#add_subdirectory(src/ImGui)
#add_subdirectory(stream_compaction) # TODO: uncomment if using your stream compaction
add_subdirectory(src/ImGui)
add_subdirectory(stream_compaction) # TODO: uncomment if using your stream compaction

cuda_add_executable(${CMAKE_PROJECT_NAME} ${sources} ${headers})
target_link_libraries(${CMAKE_PROJECT_NAME}
${LIBRARIES}
#stream_compaction # TODO: uncomment if using your stream compaction
stream_compaction # TODO: uncomment if using your stream compaction
)
158 changes: 152 additions & 6 deletions README.md
@@ -1,13 +1,159 @@
CUDA Path Tracer
================

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3**
**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 2**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* RHUTA JOSHI
* [LinkedIn](https://www.linkedin.com/in/rcj9719/)
* [Website](https://sites.google.com/view/rhuta-joshi)

### (TODO: Your README)
* Tested on: Windows 10 - 21H2, i7-12700 CPU @ 2.10 GHz, NVIDIA T1000 4096 MB
* GPU Compute Capability: 7.5

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.

## Introduction

Ray tracing is a computer graphics technique for rendering 3-dimensional scenes in which we calculate the path of reflection or refraction of each ray and trace it all the way back to one or more light sources. Path tracing is a specific form of ray tracing that simulates the way light scatters off surfaces and through media by generating multiple rays for each pixel (sampling) and bouncing those rays based on material properties.

Since we cast many rays per pixel in order to gather enough light information, we can get effects like caustics, soft shadows, anti-aliasing, and depth of field. Since this technique involves computing a large number of rays independently, it can be highly parallelized, so images converge much faster on a GPU than with a path tracer implemented on the CPU. In this project, I have used CUDA to compute intersections and shading for multiple rays in parallel in each iteration.

## Features

|Implemented features|
|---|
|![](img/demoSceneAnnotate.png)|

Some of the visual improvements implemented include:
- [Specular refraction and reflection](#specular-refraction-and-reflection)
- [Physically based depth of field](#physically-based-depth-of-field)
- [Stochastic sampled antialiasing](#stochastic-sampled-antialiasing)
- [Procedural shapes and textures](#procedural-shapes-and-textures)
- [Arbitrary OBJ mesh loading](#aritrary-obj-mesh-loading)

Some performance improvements implemented include:
- [First bounce cached intersections](#first-bounce-cached-intersections)
- [Path continuation/termination using stream compaction](#path-continuationtermination-using-stream-compaction)
- [Sorting rays by material](#sorting-rays-by-material)

## Visual Improvements

### Specular refraction and reflection

Perfectly smooth surfaces exhibit specular reflection and transmission of incident light. The outgoing direction for reflecting surfaces is the incident direction reflected about the surface normal. For transmissive materials, it is determined by the refractive indices of the media on the incoming and outgoing sides of the surface. [Schlick's approximation](https://en.wikipedia.org/wiki/Schlick%27s_approximation) gives us a formula for approximating the contribution of the Fresnel factor in the specular reflection of light from a non-conducting interface (surface) between two media.

Since this approximation uses only primitive arithmetic operations without multiple conditions, it can be easily parallelized on the GPU. A CPU implementation can be significantly less efficient in comparison.
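
As a minimal sketch of the formula described above (not the code from this repository), Schlick's approximation can be written as a single branch-free function. The name `schlickReflectance` and its parameters are hypothetical; in a CUDA build the same function would typically be marked `__host__ __device__`.

```cpp
#include <cassert>
#include <cmath>

// Schlick's approximation of the Fresnel reflectance at an interface
// between two non-conducting media (hypothetical helper for illustration).
//   cosTheta : cosine of the angle between the incident ray and the normal, in [0, 1]
//   iorIn    : refractive index of the medium the ray is travelling in
//   iorOut   : refractive index of the medium the ray is about to enter
// Returns the fraction of light that is reflected, in [0, 1].
inline float schlickReflectance(float cosTheta, float iorIn, float iorOut) {
    assert(cosTheta >= 0.0f && cosTheta <= 1.0f);
    float r0 = (iorIn - iorOut) / (iorIn + iorOut);
    r0 *= r0;                        // reflectance at normal incidence
    float m = 1.0f - cosTheta;
    return r0 + (1.0f - r0) * m * m * m * m * m;   // r0 + (1 - r0)(1 - cos)^5
}
```

One common way to use this value, assumed here rather than taken from the project, is to compare it against a uniform random number per ray and pick reflection or refraction accordingly.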

The following image obtained from the path tracer shows a fully specular transmissive material, a specular reflective material and a Lambertian diffuse material.
|Left to Right - Transmissive (IOR=1.5), Mirror-like, Diffuse|
|---|
| ![](img/renders/materialTypes3.png) |

The following render has a cube placed inside spheres of varying refractive index. Note how the size distortion of the refracted sphere varies.
|Schlick's approximation for varying index of refraction|
|---|
|![](img/materialTypes2Annotate.png)|

### Physically based depth of field

Depth of field (DOF) gives a sense of how near or far an object is by adjusting two parameters: focal length and aperture (2 x lens radius of the camera). Focal length determines how far objects must be from the camera to be in focus; this number should always be positive. Aperture size determines how blurry out-of-focus objects appear.

![](img/renders/dof_focaldist12.png)

Generally, renders without this effect are generated by a pinhole camera, which has a lens radius of 0. This restricts the origin of every camera ray to a single point. If we set an aperture for the camera, rays can originate from anywhere within its aperture area. So, to implement depth of field, the ray origin is obtained by sampling a point within a circle of the lens radius around the camera position.
The following renders of a string of pearls show how cameras with different lens radii produce varying depth of field.
|As lens radius increases, the blur/jitter on non-focused objects increases|
|---|
| ![](img/dof_lensRadius.png) |

The focal distance sets how far along each ray the point of sharp focus lies. To get the focal point, we can just multiply ray.direction by the focal length of the camera, measured from the ray origin. We can set the focal length of our camera to focus on the foreground, mid-ground or background. The render comparisons are as follows.
|A smaller focal length focuses on objects closer to the camera and blurs the rest|
|---|
|![](img/dof_focalDist.png)|
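
The two steps above (sample an origin on the lens, then aim at the focal point) can be sketched as follows. This is an illustrative host-side version under stated assumptions, not the project's kernel: `Vec3`, `Ray`, `applyDepthOfField`, and the camera basis vectors `camRight` and `camUp` are hypothetical names, and `u1`, `u2` are assumed to be uniform random numbers in [0, 1).

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return v * (1.0f / len);
}

struct Ray { Vec3 origin, direction; };

// Turns a pinhole camera ray into a thin-lens ray.
// pinhole.direction is assumed to be normalized.
inline Ray applyDepthOfField(const Ray& pinhole, Vec3 camRight, Vec3 camUp,
                             float lensRadius, float focalDistance,
                             float u1, float u2) {
    assert(lensRadius >= 0.0f && focalDistance > 0.0f);

    // The point that stays sharp: focalDistance along the original ray.
    Vec3 focalPoint = pinhole.origin + pinhole.direction * focalDistance;

    // Uniformly sample a disk of radius lensRadius (polar mapping).
    float r = lensRadius * std::sqrt(u1);
    float phi = 6.2831853f * u2;
    Vec3 lensOffset = camRight * (r * std::cos(phi)) + camUp * (r * std::sin(phi));

    // New origin on the lens, re-aimed at the focal point.
    Ray out;
    out.origin = pinhole.origin + lensOffset;
    out.direction = normalize(focalPoint - out.origin);
    return out;
}
```

With `lensRadius` set to 0 the function returns the original pinhole ray, which is a convenient way to toggle the effect per scene.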

### Stochastic sampled antialiasing

To implement antialiasing by this method, in every iteration a small random offset (jitter) smaller than the pixel's dimensions is added to each pixel's sample position. This effectively adds noise to each generated camera ray, which smooths (and slightly blurs) the image.
|Observe aliasing along the surface of the sphere in the left image|
|---|
|![](img/antialiasingAnnotate.png)|
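
A minimal sketch of the jitter step, assuming a standard-library random generator in place of whatever per-thread RNG the kernel uses. `PixelSample` and `jitteredSample` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <random>

struct PixelSample { float x, y; };

// Returns a random sample position inside pixel (px, py).
// Without antialiasing every iteration would use the same fixed position;
// with jitter, each iteration samples a different point of the pixel footprint.
inline PixelSample jitteredSample(int px, int py, std::mt19937& rng) {
    assert(px >= 0 && py >= 0);
    std::uniform_real_distribution<float> u01(0.0f, 1.0f);
    float jx = u01(rng);   // horizontal offset, less than one pixel
    float jy = u01(rng);   // vertical offset, less than one pixel
    return {px + jx, py + jy};
}
```

The camera ray is then generated through the jittered position instead of the fixed pixel coordinate, and averaging over iterations filters the edges.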

Anti-aliasing can slightly decrease the performance of our path tracer; a detailed analysis is presented [below](#first-bounce-cached-intersections).


### Procedural shapes and textures

An implicit surface is the set of all points (x, y, z) that solve some equation F(x, y, z) = 0. To render implicit surfaces, instead of tracing a ray and calculating an intersection analytically, we march along the ray: given a ray and an implicit surface function F(x, y, z), we test points along the ray to see if they solve F = 0.

|Using SDF operations by [Inigo Quilez](https://iquilezles.org/articles/distfunctions/) and noise functions|
|---|
|![](img/proceduralAnnotate.png)|

A signed distance function (SDF) takes a point and returns the distance to a shape’s surface. An SDF evaluates to 0 on the surface, > 0 outside, and < 0 inside. An SDF must change linearly with distance from the surface, so that its value never overestimates the distance to the shape.
Since we know the distance to every SDF in the scene, we know that if the ray marches forward by the distance to the closest SDF, it will never overshoot anything.
Therefore we can take big, variable steps instead of small uniform ones.
![](img/sphereMarching.png)
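
The variable-step marching described above is often called sphere tracing. The sketch below shows it for a single sphere SDF centered at the origin; `sdSphere`, `sphereTrace`, and the default limits (`tMax`, `maxSteps`, `eps`) are illustrative assumptions, not values from this project.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float length(Vec3 v) { return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z); }

// Signed distance to a sphere centered at the origin:
// 0 on the surface, > 0 outside, < 0 inside.
inline float sdSphere(Vec3 p, float radius) { return length(p) - radius; }

// Marches along origin + t * dir (dir assumed normalized).
// Returns the hit distance t, or -1 if nothing is hit within tMax or maxSteps.
inline float sphereTrace(Vec3 origin, Vec3 dir, float radius,
                         float tMax = 100.0f, int maxSteps = 128,
                         float eps = 1e-4f) {
    assert(radius > 0.0f);
    float t = 0.0f;
    for (int i = 0; i < maxSteps && t < tMax; ++i) {
        float d = sdSphere(origin + dir * t, radius);
        if (d < eps) return t;   // close enough to the surface: report a hit
        t += d;                  // safe step: cannot overshoot the surface
    }
    return -1.0f;
}
```

For a scene with several shapes, the single `sdSphere` call would be replaced by the minimum over all SDFs in the scene, which is the closest-SDF distance the text refers to.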

In terms of performance, ray marching is generally efficient for shapes built from standard primitives. As in the render example above, boolean operations on simple primitive SDFs can be used to generate objects. If these objects were loaded as meshes instead, they could consist of many triangles, resulting in a large number of intersection calculations. More on this [linked here](https://graphicscodex.courses.nvidia.com/app.html?page=_rn_rayMrch).

### Aritrary obj mesh loading

OBJ meshes are triangulated and all triangles are stored in contiguous memory locations. The mesh is rendered by performing an intersection test against each of the triangles. To make this more efficient, we store a bounding box around the mesh by calculating the minimum and maximum coordinates in 3 dimensions, and test a ray against the triangles only if it hits the box.

|OBJ mesh loading with bounding box intersection culling|
|---|
|![](img/renders/objLoading2.png)|

The bounding box method eliminates many unnecessary checks, but much more effective optimizations could be made for this feature, especially with hierarchical data structures such as a k-d tree, octree or BVH.
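
A sketch of the bounding box culling described above, using the standard slab test. The names `AABB`, `computeBounds`, and `rayHitsBox` are hypothetical and this is not the project's implementation; it only illustrates how the min/max box is built and used as an early-out before the per-triangle loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cfloat>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
struct AABB { Vec3 min, max; };

// Axis-aligned bounding box from the minimum and maximum vertex coordinates.
inline AABB computeBounds(const std::vector<Vec3>& verts) {
    assert(!verts.empty());
    AABB b{{FLT_MAX, FLT_MAX, FLT_MAX}, {-FLT_MAX, -FLT_MAX, -FLT_MAX}};
    for (const Vec3& v : verts) {
        b.min = {std::min(b.min.x, v.x), std::min(b.min.y, v.y), std::min(b.min.z, v.z)};
        b.max = {std::max(b.max.x, v.x), std::max(b.max.y, v.y), std::max(b.max.z, v.z)};
    }
    return b;
}

// Slab test: true if the ray o + t * d (t >= 0) passes through the box.
inline bool rayHitsBox(Vec3 o, Vec3 d, const AABB& b) {
    const float ro[3] = {o.x, o.y, o.z};
    const float rd[3] = {d.x, d.y, d.z};
    const float lo[3] = {b.min.x, b.min.y, b.min.z};
    const float hi[3] = {b.max.x, b.max.y, b.max.z};
    float tNear = 0.0f, tFar = FLT_MAX;
    for (int a = 0; a < 3; ++a) {
        if (std::fabs(rd[a]) < 1e-8f) {
            // Ray is parallel to this slab: it must start between the planes.
            if (ro[a] < lo[a] || ro[a] > hi[a]) return false;
            continue;
        }
        float t0 = (lo[a] - ro[a]) / rd[a];
        float t1 = (hi[a] - ro[a]) / rd[a];
        if (t0 > t1) std::swap(t0, t1);
        tNear = std::max(tNear, t0);
        tFar = std::min(tFar, t1);
        if (tNear > tFar) return false;   // slab intervals do not overlap
    }
    return true;
}
```

Only rays for which `rayHitsBox` returns true need to run the per-triangle intersection loop; all other rays skip the mesh entirely.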

## Performance Optimizations

All performance analysis has been performed using the following Cornell box with a single material.
![](img/performanceAnalysis/cornell.png)

### First bounce cached intersections

Since the target pixel coordinates don't change with each iteration, the camera shoots the same rays into the scene and hits the same geometry every time, before using random sampling to scatter into different parts of the scene. Thus, the rays' first-bounce intersections can be stored to improve performance, since they won't be recalculated in each iteration.
As you can see in the graph below, while caching the intersections takes some time initially, in most cases the time taken after the initial caching is less than in the implementation without caching.

|Tested on basic Cornell box with 1 diffuse sphere in the center|
|---|
|![](img/performanceAnalysis/cacheIntersectionsCol.png)|
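
The caching logic can be sketched on the host side as a small wrapper that computes the first-bounce intersections once and reuses them afterwards. `FirstBounceCache`, `Intersection`, and the `compute` callback are hypothetical names; the sketch assumes, as the text does, that the camera rays are identical in every iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Intersection { float t; int materialId; };

class FirstBounceCache {
public:
    // Returns the first-bounce intersections, calling compute() only when
    // the cache is empty or has been invalidated.
    template <typename ComputeFn>
    const std::vector<Intersection>& get(std::size_t numPixels, ComputeFn compute) {
        if (!valid_) {
            cache_ = compute();                   // expensive: full scene intersection
            assert(cache_.size() == numPixels);   // one intersection per pixel
            valid_ = true;
        }
        return cache_;
    }

    // Call when the camera or the scene changes.
    void invalidate() { valid_ = false; }

private:
    std::vector<Intersection> cache_;
    bool valid_ = false;
};
```

In a CUDA implementation the cached data would more likely live in a device buffer that is copied into the working intersection buffer at depth 0, but the control flow is the same.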

For the above observations, anti-aliasing was turned off. Anti-aliasing is an expensive operation as you can see from the following chart, especially as the number of iterations increases.
Note, it becomes even more expensive if implemented along with intersection caching.

|Anti-aliasing tested with intersection caching|
|---|
|![](img/performanceAnalysis/antialiasingCol.png)|

### Path continuation/termination using stream compaction

In the open Cornell box used above, stream compaction greatly improves performance. Stream compaction removes the rays that have terminated, which greatly reduces the time taken per bounce. Since the number of rays terminated at each depth is very large in an open Cornell box, we also need to check whether it improves performance in a closed Cornell box. The following graph shows the number of remaining paths per bounce with stream compaction off and with it on, for an open box as well as a closed Cornell box. With stream compaction off, the number of paths remains constant for both the open and the closed box.

| Number of paths remaining versus depth (1 iteration cycle)|
|---|
|![](img/performanceAnalysis/streamcompaction_path.png)|

While the number of remaining rays decreases for both the open and the closed box, the following graph shows that stream compaction is very efficient for an open Cornell box because many rays get terminated per iteration when they do not hit anything. In the closed Cornell box, however, even though the time taken per bounce decreases, the cost of performing stream compaction increases the overall time taken.

| Time taken by GPU versus depth (1 iteration cycle)|
|---|
|![](img/performanceAnalysis/streamcompaction_time.png)|
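
As an illustration of what compaction does to the path buffer, the sketch below partitions active paths to the front using the C++ standard library. It is a sequential stand-in for the GPU version (for example Thrust's `thrust::partition`, or a custom `stream_compaction` library as linked in the CMake changes); `PathSegment`, its fields, and `compactPaths` are names assumed for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct PathSegment {
    int pixelIndex;         // pixel this path contributes to
    int remainingBounces;   // 0 means the path has terminated
};

// Moves paths that are still alive to the front of the first numActive entries
// and returns how many of them remain. Terminated paths stay in the buffer
// (behind the returned count) so their colors can still be gathered later.
inline std::size_t compactPaths(std::vector<PathSegment>& paths, std::size_t numActive) {
    assert(numActive <= paths.size());
    auto first = paths.begin();
    auto last = first + static_cast<std::ptrdiff_t>(numActive);
    auto mid = std::stable_partition(first, last, [](const PathSegment& p) {
        return p.remainingBounces > 0;
    });
    return static_cast<std::size_t>(mid - first);
}
```

Each bounce then launches work only for the returned number of paths, which is why the per-bounce time drops as paths terminate.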


### Sorting rays by material

Each material in the scene has a unique ID that scene intersections reference whenever a ray hits that material. Contiguous intersections in memory can have different materials, leading to random memory access in the materials' global memory. To improve memory access, intersections that share the same material can be sorted by material ID, so that intersections with the same material are contiguous in memory.

![](img/performanceAnalysis/sorting.png)
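
A sequential sketch of the sorting step, assuming the intersections and their paths are stored in two parallel arrays that must be reordered together. The GPU analogue would be a key-value sort such as Thrust's `thrust::sort_by_key` with the material ID as the key; `Intersection`, `PathSegment`, and `sortByMaterial` are names assumed for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Intersection { float t; int materialId; };
struct PathSegment { int pixelIndex; };

// Reorders both arrays so that entries with the same material ID are adjacent.
// Element i of isects and element i of paths belong to the same ray.
inline void sortByMaterial(std::vector<Intersection>& isects,
                           std::vector<PathSegment>& paths) {
    assert(isects.size() == paths.size());

    // Sort a permutation of indices by material ID.
    std::vector<std::size_t> order(isects.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return isects[a].materialId < isects[b].materialId;
    });

    // Apply the same permutation to both arrays.
    std::vector<Intersection> sortedIsects;
    std::vector<PathSegment> sortedPaths;
    sortedIsects.reserve(order.size());
    sortedPaths.reserve(order.size());
    for (std::size_t i : order) {
        sortedIsects.push_back(isects[i]);
        sortedPaths.push_back(paths[i]);
    }
    isects.swap(sortedIsects);
    paths.swap(sortedPaths);
}
```

After sorting, neighboring threads in the shading kernel read the same material, at the cost of running the sort on every bounce.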


## Bloopers

![](img/bloopers/blooper1.png)

![](img/bloopers/blooper2.png)

![](img/bloopers/blooper3.png)

![](img/bloopers/blooper4.png)
Binary file added img/antialiasingAnnotate.png
Binary file added img/bloopers/blooper1.png
Binary file added img/bloopers/blooper2.png
Binary file added img/bloopers/blooper3.png
Binary file added img/bloopers/blooper4.png
Binary file added img/demoSceneAnnotate.png
Binary file added img/dof_focalDist.png
Binary file added img/dof_focaldist12.png
Binary file added img/dof_focaldist20.png
Binary file added img/dof_focaldist6.png
Binary file added img/dof_lensRadius.png
Binary file added img/materialTypes2Annotate.png
Binary file added img/performanceAnalysis/antialiasing.png
Binary file added img/performanceAnalysis/antialiasingCol.png
Binary file added img/performanceAnalysis/cacheIntersections.png
Binary file added img/performanceAnalysis/cacheIntersectionsCol.png
Binary file added img/performanceAnalysis/cornell.png
Binary file added img/performanceAnalysis/cornell_material.png
Binary file added img/performanceAnalysis/sorting.png
Binary file added img/performanceAnalysis/streamcompaction_path.png
Binary file added img/proceduralAnnotate.png
Binary file added img/renders/demoScene.png
Binary file added img/renders/dof.png
Binary file added img/renders/dof_F12_R1.png
Binary file added img/renders/dof_F12_Rhalf.png
Binary file added img/renders/dof_F20_R1.png
Binary file added img/renders/dof_F20_Rhalf.png
Binary file added img/renders/dof_F6_R1.png
Binary file added img/renders/dof_F6_Rhalf.png
Binary file added img/renders/dof_focaldist12.png
Binary file added img/renders/dof_focaldist20.png
Binary file added img/renders/dof_focaldist6.png
Binary file added img/renders/materialTypes.png
Binary file added img/renders/materialTypes2.png
Binary file added img/renders/materialTypes3.png
Binary file added img/renders/objLoading.png
Binary file added img/renders/objLoading2.png
Binary file added img/renders/procedural.png
Binary file added img/renders/procedural2.png
Binary file added img/renders/withAntialiasing.png
Binary file added img/renders/withoutAntialiasing.png
Binary file added img/sphereMarching.png