Skip to content

Commit 17ea1fb

Browse files
committed
README: pipeline; code: small cleanup
1 parent 4e379a7 commit 17ea1fb

File tree

6 files changed

+130
-104
lines changed

6 files changed

+130
-104
lines changed

README.md

Lines changed: 51 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,7 @@ A rasterizer is **NOT**:
3535
(... unless you do some fancy raytraced effects in your fragment shader).
3636
This project will let you generate graphics WITHOUT the need for ray casting!
3737
* An OpenGL rendering engine. You shouldn't write any new OpenGL code - think
38-
of your project as a reimplementation of a few of OpenGL's features.
38+
of your project as a reimplementation of OpenGL's core pipeline.
3939

4040
Finally, note that, while this base code is meant to serve as a strong starting
4141
point for a CUDA path tracer, you are not required to use it if you don't want
@@ -86,10 +86,10 @@ You will need to implement the following features/pipeline stages:
8686
* Vertex shading.
8787
* (Vertex shader) perspective transformation.
8888
* Primitive assembly with support for triangle VBOs/IBOs.
89-
* Rasterization through either a scanline or a tiled approach.
89+
* Rasterization: **either** a scanline or a tiled approach.
9090
* Fragment shading.
9191
* A depth buffer for storing and depth testing fragments.
92-
* Fragment to framebuffer writing (**with** atomics for race avoidance).
92+
* Fragment to depth buffer writing (**with** atomics for race avoidance).
9393
* (Fragment shader) simple lighting scheme, such as Lambert or Blinn-Phong.
9494

9595
See below for more guidance.
@@ -128,50 +128,83 @@ For each extra feature, please provide the following analysis:
128128
* How might this feature be optimized beyond your current implementation?
129129

130130

131-
## Minimal Rasterization Pipeline
131+
## Rasterization Pipeline
132132

133133
**INSTRUCTOR TODO**: update README to explain a minimal pipeline to see a
134134
triangle, e.g., no depth test, draw in NDC, etc.
135135

136-
* Vertex shading.
136+
Possible pipelines are described below. Pseudo-type-signatures are given.
137+
Not all of the pseudocode arrays will necessarily actually exist in practice.
138+
139+
### Minimal Pipeline
140+
141+
This describes a minimal version *one possible* graphics pipeline, similar to
142+
modern hardware (DX/OpenGL). Yours need not match precisely. To begin, try to
143+
write a minimal amount of code as described here. This will reduce the
144+
necessary time spent debugging.
145+
146+
* Vertex shading:
147+
* `VertexIn[n] vs_input -> VertexOut[n] vs_output`
137148
* A minimal vertex shader will apply no transformations at all - it draws
138149
directly in normalized device coordinates (NDC).
139150
* Primitive assembly.
151+
* `vertexOut[n] vs_output -> triangle[n/3] primitives`
140152
* Start by supporting ONLY triangles.
141153
* Rasterization.
154+
* `triangle[n/3] primitives -> fragmentIn[m] fs_input`
142155
* Scanline: TODO
143-
* Optimization: scissor around rasterized triangle
144156
* Tiled: TODO
145157
* Fragment shading.
146-
* A test fragment shader can produce the same color for every fragment.
147-
* Try displaying various debug views (normals, etc.)
158+
* `fragmentIn[m] fs_input -> fragmentOut[m] fs_output`
159+
* A super-simple test fragment shader: output same color for every fragment.
160+
* Also try Tdisplaying various debug views (normals, etc.)
161+
* Fragments to depth buffer.
162+
* `fragmentOut[m] -> fragmentOut[resolution]`
163+
* Can really be done inside the fragment shader.
164+
* Results in race conditions - don't bother to fix these until it works!
148165
* A depth buffer for storing and depth testing fragments.
166+
* `fragmentOut[resolution] depthbuffer`
149167
* An array of `fragment` objects.
150168
* At the end of a frame, it should contain the fragments drawn to the screen.
151169
* Fragment to framebuffer writing.
152-
* You will need to use atomics for race avoidance, to prevent different
153-
primitives from overwriting each other in the wrong order.
154-
* You can ignore this when starting! The race conditions will only cause
155-
visual artifacts.
156-
* TODO
170+
* `fragmentOut[resolution] depthbuffer -> vec3[resolution] framebuffer`
171+
* Simply copies the colors out of the depth buffer into the framebuffer
172+
(to be displayed on the screen).
173+
174+
### Better Pipeline
175+
176+
INSTRUCTOR TODO
177+
178+
* Rasterization.
179+
* Scanline:
180+
* Optimization: scissor around rasterized triangle
181+
182+
* Fragments to depth buffer.
183+
* `fragmentOut[m] -> fragmentOut[resolution]`
184+
* Can really be done inside the fragment shader.
185+
* This allows you to do depth tests before spending execution time in
186+
complex fragment shader code.
187+
* When writing to the depth buffer, you will need to use atomics for race
188+
avoidance, to prevent different primitives from overwriting each other in
189+
the wrong order.
157190

158191

159192
## Base Code Tour
160193

161-
**INSTRUCTOR TODO:** update according to any code changes. LOOK -> CHECKITOUT.
194+
**INSTRUCTOR TODO:** update according to any code changes.
162195
TODO: simple structs for every part of the pipeline, intended to be changed?
163196
(e.g. vertexPre, vertexPost, triangle = vertexPre[3], fragment).
164197
TODO: autoformat code
165198
TODO: pragma once
166199
TODO: doxygen
167200

168-
You will be working primarily in two files: `rasterizeKernel.cu`, and
169-
`rasterizerTools.h`. Within these files, areas that you need to complete are
201+
You will be working primarily in two files: `rasterize.cu`, and
202+
`rasterizeTools.h`. Within these files, areas that you need to complete are
170203
marked with a `TODO` comment. Areas that are useful to and serve as hints for
171204
optional features are marked with `TODO (Optional)`. Functions that are useful
172-
for reference are marked with the comment `LOOK`.
205+
for reference are marked with the comment `CHECKITOUT`.
173206

174-
* `src/rasterizeKernels.cu` contains the core rasterization pipeline.
207+
* `src/rasterize.cu` contains the core rasterization pipeline.
175208
* A suggested sequence of kernels exists in this file, but you may choose to
176209
alter the order of this sequence or merge entire kernels if you see fit.
177210
For example, if you decide that doing has benefits, you can choose to merge

src/checkCUDAError.h

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
#define ERRORCHECK 1
2+
3+
#define FILENAME (strrchr(__FILE__, '/') ? strrchr(__FILE__, '/') + 1 : __FILE__)
4+
#define checkCUDAError(msg) checkCUDAErrorFn(msg, FILENAME, __LINE__)
5+
void checkCUDAErrorFn(const char *msg, const char *file, int line) {
6+
#if ERRORCHECK
7+
cudaDeviceSynchronize();
8+
cudaError_t err = cudaGetLastError();
9+
if (cudaSuccess == err) {
10+
return;
11+
}
12+
13+
fprintf(stderr, "CUDA error");
14+
if (file) {
15+
fprintf(stderr, " (%s:%d)", file, line);
16+
}
17+
fprintf(stderr, ": %s: %s\n", msg, cudaGetErrorString(err));
18+
# ifdef _WIN32
19+
getchar();
20+
# endif
21+
exit(EXIT_FAILURE);
22+
#endif
23+
}

src/rasterizeKernels.cu renamed to src/rasterize.cu

Lines changed: 38 additions & 47 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,10 @@
1-
// CIS565 CUDA Rasterizer: A simple rasterization pipeline for Patrick Cozzi's CIS565: GPU Computing at the University of Pennsylvania
2-
// Written by Yining Karl Li, Copyright (c) 2012 University of Pennsylvania
1+
#include "rasterize.h"
32

4-
#include <stdio.h>
5-
#include <cuda.h>
63
#include <cmath>
4+
#include <cstdio>
5+
#include <cuda.h>
76
#include <thrust/random.h>
8-
#include "rasterizeKernels.h"
7+
#include "checkCUDAError.h"
98
#include "rasterizeTools.h"
109

1110
glm::vec3* framebuffer;
@@ -15,15 +14,7 @@ float* device_cbo;
1514
int* device_ibo;
1615
triangle* primitives;
1716

18-
void checkCUDAError(const char *msg) {
19-
cudaError_t err = cudaGetLastError();
20-
if( cudaSuccess != err) {
21-
fprintf(stderr, "Cuda error: %s: %s.\n", msg, cudaGetErrorString( err) );
22-
exit(EXIT_FAILURE);
23-
}
24-
}
25-
26-
//Handy dandy little hashing function that provides seeds for random number generation
17+
// Handy dandy little hashing function that provides seeds for random number generation
2718
__host__ __device__ unsigned int hash(unsigned int a){
2819
a = (a+0x7ed55d16) + (a<<12);
2920
a = (a^0xc761c23c) ^ (a>>19);
@@ -34,15 +25,15 @@ __host__ __device__ unsigned int hash(unsigned int a){
3425
return a;
3526
}
3627

37-
//Writes a given fragment to a fragment buffer at a given location
28+
// Writes a given fragment to a fragment buffer at a given location
3829
__host__ __device__ void writeToDepthbuffer(int x, int y, fragment frag, fragment* depthbuffer, glm::vec2 resolution){
3930
if(x<resolution.x && y<resolution.y){
4031
int index = (y*resolution.x) + x;
4132
depthbuffer[index] = frag;
4233
}
4334
}
4435

45-
//Reads a fragment from a given location in a fragment buffer
36+
// Reads a fragment from a given location in a fragment buffer
4637
__host__ __device__ fragment getFromDepthbuffer(int x, int y, fragment* depthbuffer, glm::vec2 resolution){
4738
if(x<resolution.x && y<resolution.y){
4839
int index = (y*resolution.x) + x;
@@ -53,15 +44,15 @@ __host__ __device__ fragment getFromDepthbuffer(int x, int y, fragment* depthbuf
5344
}
5445
}
5546

56-
//Writes a given pixel to a pixel buffer at a given location
47+
// Writes a given pixel to a pixel buffer at a given location
5748
__host__ __device__ void writeToFramebuffer(int x, int y, glm::vec3 value, glm::vec3* framebuffer, glm::vec2 resolution){
5849
if(x<resolution.x && y<resolution.y){
5950
int index = (y*resolution.x) + x;
6051
framebuffer[index] = value;
6152
}
6253
}
6354

64-
//Reads a pixel from a pixel buffer at a given location
55+
// Reads a pixel from a pixel buffer at a given location
6556
__host__ __device__ glm::vec3 getFromFramebuffer(int x, int y, glm::vec3* framebuffer, glm::vec2 resolution){
6657
if(x<resolution.x && y<resolution.y){
6758
int index = (y*resolution.x) + x;
@@ -71,7 +62,7 @@ __host__ __device__ glm::vec3 getFromFramebuffer(int x, int y, glm::vec3* frameb
7162
}
7263
}
7364

74-
//Kernel that clears a given pixel buffer with a given color
65+
// Kernel that clears a given pixel buffer with a given color
7566
__global__ void clearImage(glm::vec2 resolution, glm::vec3* image, glm::vec3 color){
7667
int x = (blockIdx.x * blockDim.x) + threadIdx.x;
7768
int y = (blockIdx.y * blockDim.y) + threadIdx.y;
@@ -81,7 +72,7 @@ __global__ void clearImage(glm::vec2 resolution, glm::vec3* image, glm::vec3 col
8172
}
8273
}
8374

84-
//Kernel that clears a given fragment buffer with a given fragment
75+
// Kernel that clears a given fragment buffer with a given fragment
8576
__global__ void clearDepthBuffer(glm::vec2 resolution, fragment* buffer, fragment frag){
8677
int x = (blockIdx.x * blockDim.x) + threadIdx.x;
8778
int y = (blockIdx.y * blockDim.y) + threadIdx.y;
@@ -94,7 +85,7 @@ __global__ void clearDepthBuffer(glm::vec2 resolution, fragment* buffer, fragmen
9485
}
9586
}
9687

97-
//Kernel that writes the image to the OpenGL PBO directly.
88+
// Kernel that writes the image to the OpenGL PBO directly.
9889
__global__ void sendImageToPBO(uchar4* PBOpos, glm::vec2 resolution, glm::vec3* image){
9990

10091
int x = (blockIdx.x * blockDim.x) + threadIdx.x;
@@ -128,29 +119,29 @@ __global__ void sendImageToPBO(uchar4* PBOpos, glm::vec2 resolution, glm::vec3*
128119
}
129120
}
130121

131-
//TODO: Implement a vertex shader
122+
// TODO: Implement a vertex shader
132123
__global__ void vertexShadeKernel(float* vbo, int vbosize){
133124
int index = (blockIdx.x * blockDim.x) + threadIdx.x;
134125
if(index<vbosize/3){
135126
}
136127
}
137128

138-
//TODO: Implement primative assembly
129+
// TODO: Implement primative assembly
139130
__global__ void primitiveAssemblyKernel(float* vbo, int vbosize, float* cbo, int cbosize, int* ibo, int ibosize, triangle* primitives){
140131
int index = (blockIdx.x * blockDim.x) + threadIdx.x;
141132
int primitivesCount = ibosize/3;
142133
if(index<primitivesCount){
143134
}
144135
}
145136

146-
//TODO: Implement a rasterization method, such as scanline.
137+
// TODO: Implement a rasterization method, such as scanline.
147138
__global__ void rasterizationKernel(triangle* primitives, int primitivesCount, fragment* depthbuffer, glm::vec2 resolution){
148139
int index = (blockIdx.x * blockDim.x) + threadIdx.x;
149140
if(index<primitivesCount){
150141
}
151142
}
152143

153-
//TODO: Implement a fragment shader
144+
// TODO: Implement a fragment shader
154145
__global__ void fragmentShadeKernel(fragment* depthbuffer, glm::vec2 resolution){
155146
int x = (blockIdx.x * blockDim.x) + threadIdx.x;
156147
int y = (blockIdx.y * blockDim.y) + threadIdx.y;
@@ -159,7 +150,7 @@ __global__ void fragmentShadeKernel(fragment* depthbuffer, glm::vec2 resolution)
159150
}
160151
}
161152

162-
//Writes fragment colors to the framebuffer
153+
// Writes fragment colors to the framebuffer
163154
__global__ void render(glm::vec2 resolution, fragment* depthbuffer, glm::vec3* framebuffer){
164155

165156
int x = (blockIdx.x * blockDim.x) + threadIdx.x;
@@ -179,15 +170,15 @@ void cudaRasterizeCore(uchar4* PBOpos, glm::vec2 resolution, float frame, float*
179170
dim3 threadsPerBlock(tileSize, tileSize);
180171
dim3 fullBlocksPerGrid((int)ceil(float(resolution.x)/float(tileSize)), (int)ceil(float(resolution.y)/float(tileSize)));
181172

182-
//set up framebuffer
173+
// set up framebuffer
183174
framebuffer = NULL;
184175
cudaMalloc((void**)&framebuffer, (int)resolution.x*(int)resolution.y*sizeof(glm::vec3));
185176

186-
//set up depthbuffer
177+
// set up depthbuffer
187178
depthbuffer = NULL;
188179
cudaMalloc((void**)&depthbuffer, (int)resolution.x*(int)resolution.y*sizeof(fragment));
189180

190-
//kernel launches to black out accumulated/unaccumlated pixel buffers and clear our scattering states
181+
// kernel launches to black out accumulated/unaccumlated pixel buffers and clear our scattering states
191182
clearImage<<<fullBlocksPerGrid, threadsPerBlock>>>(resolution, framebuffer, glm::vec3(0,0,0));
192183

193184
fragment frag;
@@ -196,9 +187,9 @@ void cudaRasterizeCore(uchar4* PBOpos, glm::vec2 resolution, float frame, float*
196187
frag.position = glm::vec3(0,0,-10000);
197188
clearDepthBuffer<<<fullBlocksPerGrid, threadsPerBlock>>>(resolution, depthbuffer,frag);
198189

199-
//------------------------------
200-
//memory stuff
201-
//------------------------------
190+
// ------------------------------
191+
// memory stuff
192+
// ------------------------------
202193
primitives = NULL;
203194
cudaMalloc((void**)&primitives, (ibosize/3)*sizeof(triangle));
204195

@@ -217,34 +208,34 @@ void cudaRasterizeCore(uchar4* PBOpos, glm::vec2 resolution, float frame, float*
217208
tileSize = 32;
218209
int primitiveBlocks = ceil(((float)vbosize/3)/((float)tileSize));
219210

220-
//------------------------------
221-
//vertex shader
222-
//------------------------------
211+
// ------------------------------
212+
// vertex shader
213+
// ------------------------------
223214
vertexShadeKernel<<<primitiveBlocks, tileSize>>>(device_vbo, vbosize);
224215

225216
cudaDeviceSynchronize();
226-
//------------------------------
227-
//primitive assembly
228-
//------------------------------
217+
// ------------------------------
218+
// primitive assembly
219+
// ------------------------------
229220
primitiveBlocks = ceil(((float)ibosize/3)/((float)tileSize));
230221
primitiveAssemblyKernel<<<primitiveBlocks, tileSize>>>(device_vbo, vbosize, device_cbo, cbosize, device_ibo, ibosize, primitives);
231222

232223
cudaDeviceSynchronize();
233-
//------------------------------
234-
//rasterization
235-
//------------------------------
224+
// ------------------------------
225+
// rasterization
226+
// ------------------------------
236227
rasterizationKernel<<<primitiveBlocks, tileSize>>>(primitives, ibosize/3, depthbuffer, resolution);
237228

238229
cudaDeviceSynchronize();
239-
//------------------------------
240-
//fragment shader
241-
//------------------------------
230+
// ------------------------------
231+
// fragment shader
232+
// ------------------------------
242233
fragmentShadeKernel<<<fullBlocksPerGrid, threadsPerBlock>>>(depthbuffer, resolution);
243234

244235
cudaDeviceSynchronize();
245-
//------------------------------
246-
//write fragments to framebuffer
247-
//------------------------------
236+
// ------------------------------
237+
// write fragments to framebuffer
238+
// ------------------------------
248239
render<<<fullBlocksPerGrid, threadsPerBlock>>>(resolution, depthbuffer, framebuffer);
249240
sendImageToPBO<<<fullBlocksPerGrid, threadsPerBlock>>>(PBOpos, resolution, framebuffer);
250241

src/rasterize.h

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
#pragma once
2+
3+
#include <glm/glm.hpp>
4+
5+
void kernelCleanup();
6+
void cudaRasterizeCore(uchar4* pos, glm::vec2 resolution, float frame, float* vbo, int vbosize, float* cbo, int cbosize, int* ibo, int ibosize);

src/rasterizeKernels.h

Lines changed: 0 additions & 16 deletions
This file was deleted.

0 commit comments

Comments
 (0)