Project4: Annie Qiu #24

Open: wants to merge 24 commits into base: main
116 changes: 103 additions & 13 deletions README.md
@@ -3,30 +3,120 @@ WebGL Forward+ and Clustered Deferred Shading

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) **Google Chrome 222.2** on
Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Annie Qiu
* [LinkedIn](https://www.linkedin.com/in/annie-qiu-30531921a)
* Tested on: Windows 11, i9-12900H @ 2500 MHz, 16GB, RTX 3070 Ti 8GB (Personal)

### Live Demo
## Overview
This project implements Naive, Forward+ and Clustered Deferred Shading techniques using WebGPU. It showcases the Sponza Atrium model with a large number of point lights. A GUI is provided to switch between the different rendering modes for comparison.

[![](img/thumb.png)](http://TODO.github.io/Project4-WebGPU-Forward-Plus-and-Clustered-Deferred)
### Features
Naive
- Naive rendering is plain forward rendering: each object is rendered directly, and every fragment runs the same lighting calculation.

### Demo Video/GIF
Forward+
- Forward+ is an optimized form of forward rendering. It divides the view frustum into clusters and assigns lights to these clusters in a compute shader.
- Only lights that affect a specific cluster are considered when shading fragments in that cluster, so this method reduces unnecessary light computations and improves performance in scenes with many lights.
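To make the cluster lookup concrete, here is a minimal sketch of how a fragment could be mapped to a cluster index. It is not taken from this project's shaders: the `ClusterGrid` type, the `clusterIndex` helper, the near/far values, and the logarithmic depth slicing are all assumptions chosen for illustration.

```typescript
// Hypothetical grid description; 16 x 9 x 24 matches the cluster size quoted in this README.
interface ClusterGrid {
  dimX: number;
  dimY: number;
  dimZ: number;
  near: number; // assumed camera near plane
  far: number;  // assumed camera far plane
}

// Map a fragment (pixel position plus positive view-space depth) to a flat cluster index.
function clusterIndex(
  grid: ClusterGrid,
  pixelX: number,
  pixelY: number,
  screenWidth: number,
  screenHeight: number,
  viewDepth: number,
): number {
  const clamp = (v: number, lo: number, hi: number): number => Math.min(Math.max(v, lo), hi);

  // Screen-space tiles in X and Y.
  const cx = clamp(Math.floor((pixelX / screenWidth) * grid.dimX), 0, grid.dimX - 1);
  const cy = clamp(Math.floor((pixelY / screenHeight) * grid.dimY), 0, grid.dimY - 1);

  // Logarithmic depth slicing (an assumption; other slicing schemes are possible).
  const t = Math.log(viewDepth / grid.near) / Math.log(grid.far / grid.near);
  const cz = clamp(Math.floor(t * grid.dimZ), 0, grid.dimZ - 1);

  // Flatten (cx, cy, cz) into a single index, X fastest.
  return cx + cy * grid.dimX + cz * grid.dimX * grid.dimY;
}

const exampleGrid: ClusterGrid = { dimX: 16, dimY: 9, dimZ: 24, near: 0.1, far: 1000 };
console.log(clusterIndex(exampleGrid, 960, 540, 1920, 1080, 5.0));
```

The fragment shader would then loop only over the lights stored for that index instead of over every light in the scene.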

[![](img/video.mp4)](TODO)
Clustered Deferred
- A rendering technique that stores intermediate shading information (such as colors, normals, and positions) in multiple G-buffers during the first pass.
- In the second pass, lighting is calculated by reading from the G-buffers. As in Forward+, only the relevant lights within each 3D cluster are used.

### (TODO: Your README)
## Screenshot
![](img/foward.png)
- Number of Lights: 500
- Mode: Forward+
- FPS: 165 (6.06ms)
- Cluster Size: 16 X 9 X 24

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.
![](img/deferred.png)
- Number of Lights: 2526
- Mode: Clustered Deferred
- FPS: 120 (8.33ms)
- Cluster Size: 16 X 9 X 24
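FPS and frame time are reciprocals: frame time in milliseconds is 1000 divided by FPS. A minimal sketch of that conversion, with `frameTimeMs` being a helper name made up for illustration:

```typescript
// Convert frames per second to milliseconds per frame.
function frameTimeMs(fps: number): number {
  return 1000 / fps;
}

console.log(frameTimeMs(165).toFixed(2)); // "6.06"
console.log(frameTimeMs(120).toFixed(2)); // "8.33"
```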

This assignment has a considerable amount of performance analysis compared
to implementation work. Complete the implementation early to leave time!
## Live Demo
[Live Demo Link](https://annieqiuuu.github.io/Project4-WebGPU-Forward-Plus-and-Clustered-Deferred/)

### Credits
## Demo Video/GIF
[4K Demo Video Link](https://youtu.be/UlBPg0pRh2A)

### Naive
- Mode: Naive
- Number of lights: 500
![](./img/naive.gif)

### Forward+
- Mode: Forward+
- Number of lights: 500
![](./img/forwardplus.gif)

### Clustered Deferred
- Mode: Clustered Deferred
- Number of lights: 500
![](./img/deferred.gif)

## Performance Analysis

### Number of Lights Chart
![](img/chart.png)
- X axis: ms
- Y axis (Number of lights): [100, 200, 500, 1000, 2000, 3000, 5000]
- Blue Line: Naive
- Red Line: Forward+
- Yellow Line: Clustered Deferred
- Cluster size: 16 X 9 X 24
- Compute pass dispatch Workgroup: (4, 3, 6)
- Cluster workgroup size: [4, 4, 4]
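The dispatch counts listed above follow from dividing the cluster grid by the workgroup size and rounding up. A small sketch of that arithmetic, where `dispatchCounts` and the `Vec3` alias are illustrative names rather than code from this project:

```typescript
type Vec3 = [number, number, number];

// Number of workgroups needed per axis so that every cluster gets one invocation.
function dispatchCounts(clusterDims: Vec3, workgroupSize: Vec3): Vec3 {
  return [
    Math.ceil(clusterDims[0] / workgroupSize[0]),
    Math.ceil(clusterDims[1] / workgroupSize[1]),
    Math.ceil(clusterDims[2] / workgroupSize[2]),
  ];
}

console.log(dispatchCounts([16, 9, 24], [4, 4, 4])); // [ 4, 3, 6 ]
```

Because 3 workgroups of 4 cover 12 rows while the grid has only 9, a compute shader dispatched this way typically needs an early exit for invocations whose index falls outside the grid.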

As the chart shows, frame time (ms) increases with the number of lights, which means performance decreases. Naive is the slowest.
Clustered Deferred is the fastest, followed by Forward+. With fewer than 500 lights, both Forward+ and Clustered Deferred hit the refresh rate limit and stay at 6.06ms (165 FPS).

### Cluster Size Table
| Cluster Size | 16 X 9 X 24 | 16 X 9 X 12 | 16 X 9 X 6 | 16 X 9 X 3 |
|:------------------:|:----------------:|:----------------:|:----------------:|:----------------:|
| Forward+ | 6.06ms | 10ms | 15.87ms | 29.41ms |
| Deferred | 6.06ms | 6.06ms | 6.45ms | 8.20ms |

- Cluster workgroup size: [4, 4, 4]
- Number of Lights: 500

Larger clusters (fewer Z slices) mean more lights are grouped into each cluster. As a result, more lights are processed per fragment, which increases computation time.
In Forward+ shading, performance drops significantly as the clusters get larger (from 6.06ms with 24 Z slices to 29.41ms with 3), because each fragment ends up processing more lights. Clustered Deferred shading handles changes in cluster size much better: its frame time stays nearly steady (6.06ms to 8.20ms), since lighting is calculated from G-buffer data after depth testing rather than for every rasterized fragment.
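To put numbers on the configurations in the table, the sketch below counts the clusters in each grid. The helper name `clusterCount` is made up for illustration. Halving the Z slices halves the number of clusters, so each cluster covers a larger depth range and tends to collect more lights.

```typescript
// Total number of clusters in an X x Y x Z grid.
function clusterCount(dimX: number, dimY: number, dimZ: number): number {
  return dimX * dimY * dimZ;
}

for (const z of [24, 12, 6, 3]) {
  console.log(`16 x 9 x ${z}: ${clusterCount(16, 9, z)} clusters`);
}
// 16 x 9 x 24: 3456 clusters
// 16 x 9 x 12: 1728 clusters
// 16 x 9 x 6: 864 clusters
// 16 x 9 x 3: 432 clusters
```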

### Performance Overview:
Clustered Deferred is the fastest implementation, followed by Forward+ as the second fastest. The Naive method is the slowest.
Due to refresh rate limitations, both Forward+ and Clustered Deferred can achieve up to 165 fps when the number of lights is fewer than 500.

### Performance Difference:
Forward+ may be faster in simpler scenes with fewer lights, or in scenes with transparent objects, as it avoids the G-buffer passes and uses less memory bandwidth.
Clustered Deferred excels in complex scenes with more geometry and lights, because each visible fragment is shaded only once, using just the lights in its cluster.

### Trade-offs
- Forward+ Shading:
  - Benefits:
    - Easier to handle transparency and MSAA.
    - Lower memory usage by avoiding multiple G-buffers, reducing memory bandwidth usage.
  - Tradeoffs:
    - Suffers from overdraw, as occluded fragments are still shaded.
    - Performance drops in scenes with many lights due to recalculating the full lighting equation for each fragment.

- Clustered Deferred Shading:
  - Benefits:
    - Reduces overdraw by performing depth testing before lighting calculations.
    - Well-suited for complex scenes with many lights and detailed geometry.
  - Tradeoffs:
    - Higher memory bandwidth consumption due to multiple G-buffer reads.
    - Challenging to implement MSAA and transparency, often requiring extra passes.
    - More complex pipeline and higher memory usage.
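The memory cost of the G-buffers can be estimated from the resolution and the texture formats. The sketch below is only an illustration: the three-texture layout is hypothetical and not necessarily what this project uses, while the per-texel sizes are the standard sizes of those WebGPU formats.

```typescript
// Bytes per texel for a few WebGPU texture formats.
const BYTES_PER_TEXEL: Record<string, number> = {
  rgba8unorm: 4,
  rgba16float: 8,
  rgba32float: 16,
};

// Total bytes needed for one full-screen texture per listed format.
function gBufferBytes(width: number, height: number, formats: string[]): number {
  const bytesPerPixel = formats.reduce((sum, format) => {
    const size = BYTES_PER_TEXEL[format];
    if (size === undefined) throw new Error(`unknown format: ${format}`);
    return sum + size;
  }, 0);
  return width * height * bytesPerPixel;
}

// Hypothetical layout: albedo, normal, position.
const assumedLayout = ["rgba8unorm", "rgba16float", "rgba32float"];
console.log(gBufferBytes(1920, 1080, assumedLayout)); // 58060800 bytes
```

Every texel written in the first pass is read again in the lighting pass, which is where the extra bandwidth comes from.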

## Bloopers & Debug
- [Fixed] Forward+ did not work as expected at first: it was very slow. The fix was to index `clusterSet.clusters[clusterIdx]` directly in the fragment shader instead of writing `let cluster = clusterSet.clusters[clusterIdx]`. The `let` binding created a copy of the entire cluster at the specified index. Because a fragment shader runs per pixel, this can lead to millions of copies and a large memory overhead. Before the fix: 10 FPS. After the fix: 165 FPS.
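A rough back-of-the-envelope estimate shows why such a copy can hurt. This sketch assumes a hypothetical cluster layout (one `u32` light count plus a fixed-size array of `u32` light indices) and made-up sizes. The project's real struct may differ, and a shader compiler may optimize some copies away, so the first figure is a worst case.

```typescript
const BYTES_PER_U32 = 4;

// Size of the assumed cluster struct: a light count plus maxLightsPerCluster indices.
function clusterStructBytes(maxLightsPerCluster: number): number {
  return BYTES_PER_U32 * (1 + maxLightsPerCluster);
}

// Worst-case bytes read per frame if every fragment copies the whole struct.
function fullCopyBytesPerFrame(width: number, height: number, maxLightsPerCluster: number): number {
  return width * height * clusterStructBytes(maxLightsPerCluster);
}

// Bytes read per frame if each fragment reads only the count and the indices in use.
function indexedBytesPerFrame(width: number, height: number, lightsUsed: number): number {
  return width * height * BYTES_PER_U32 * (1 + lightsUsed);
}

// Hypothetical numbers: 1920 x 1080 target, 512 index slots per cluster, 20 lights in use.
console.log(fullCopyBytesPerFrame(1920, 1080, 512)); // 4255027200
console.log(indexedBytesPerFrame(1920, 1080, 20));   // 174182400
```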

## Credits
- [Vite](https://vitejs.dev/)
- [loaders.gl](https://loaders.gl/)
- [dat.GUI](https://github.com/dataarts/dat.gui)
- [stats.js](https://github.com/mrdoob/stats.js)
- [wgpu-matrix](https://github.com/greggman/wgpu-matrix)
- [Clustered-method](https://github.com/DaveH355/clustered-shading/tree/main/img)
Binary file added img/chart.png
Binary file added img/deferred.gif
Binary file added img/deferred.png
Binary file added img/forwardplus.gif
Binary file added img/foward.png
Binary file added img/naive.gif
2 changes: 1 addition & 1 deletion src/main.ts
@@ -50,7 +50,7 @@ function setRenderer(mode: string) {
}

const renderModes = { naive: 'naive', forwardPlus: 'forward+', clusteredDeferred: 'clustered deferred' };
-let renderModeController = gui.add({ mode: renderModes.naive }, 'mode', renderModes);
+let renderModeController = gui.add({ mode: renderModes.forwardPlus }, 'mode', renderModes);
renderModeController.onChange(setRenderer);

setRenderer(renderModeController.getValue());
9 changes: 9 additions & 0 deletions src/renderer.ts
@@ -22,6 +22,8 @@ export async function initWebGPU() {
const devicePixelRatio = window.devicePixelRatio;
canvas.width = canvas.clientWidth * devicePixelRatio;
canvas.height = canvas.clientHeight * devicePixelRatio;
console.log("InitWebGPU: The canvas width is: ", canvas.width);
console.log("InitWebGPU: The canvas height is: ", canvas.height);

aspectRatio = canvas.width / canvas.height;

@@ -51,6 +53,13 @@ export async function initWebGPU() {
});

console.log("WebGPU init successsful");
//check device limits
const limits = device.limits;
console.log("Max workgroup size X:", limits.maxComputeWorkgroupSizeX);
console.log("Max workgroup size Y:", limits.maxComputeWorkgroupSizeY);
console.log("Max workgroup size Z:", limits.maxComputeWorkgroupSizeZ);
console.log("Max total workgroup size:", limits.maxComputeInvocationsPerWorkgroup);
console.log("Max workgroups per dimension (X, Y, Z):", device.limits.maxComputeWorkgroupsPerDimension);

modelBindGroupLayout = device.createBindGroupLayout({
label: "model bind group layout",
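The limits logged in this diff could also be checked in code before a workgroup size is chosen. The sketch below is not part of the PR. `ComputeLimits` mirrors only the `device.limits` fields logged above, `workgroupFits` is a hypothetical helper, and the numeric limits are placeholder values for illustration.

```typescript
// Subset of device.limits fields relevant to compute workgroup sizing.
interface ComputeLimits {
  maxComputeWorkgroupSizeX: number;
  maxComputeWorkgroupSizeY: number;
  maxComputeWorkgroupSizeZ: number;
  maxComputeInvocationsPerWorkgroup: number;
}

// True if the workgroup size respects both the per-axis limits and the total invocation limit.
function workgroupFits(size: [number, number, number], limits: ComputeLimits): boolean {
  const [x, y, z] = size;
  return (
    x <= limits.maxComputeWorkgroupSizeX &&
    y <= limits.maxComputeWorkgroupSizeY &&
    z <= limits.maxComputeWorkgroupSizeZ &&
    x * y * z <= limits.maxComputeInvocationsPerWorkgroup
  );
}

// Placeholder limits; real values come from device.limits at runtime.
const placeholderLimits: ComputeLimits = {
  maxComputeWorkgroupSizeX: 256,
  maxComputeWorkgroupSizeY: 256,
  maxComputeWorkgroupSizeZ: 64,
  maxComputeInvocationsPerWorkgroup: 256,
};

console.log(workgroupFits([4, 4, 4], placeholderLimits));   // true (64 invocations)
console.log(workgroupFits([16, 16, 4], placeholderLimits)); // false (1024 invocations)
```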