Commit b3a7a70

Add gpu shader debug injec idea
1 parent 904ea34 commit b3a7a70

1 file changed

+116
-0
lines changed

docs/own/gpu-shader-debug.md

@@ -0,0 +1,116 @@
# Shader debugging via GPU shader patching

Idea: instead of CPU emulation, do shader debugging directly on the GPU,
patching the shaders to store state where it is interesting.

Assumptions:
- shader/pipeline recompilation is fast. Like <1s
- For the beginning, we can just assume buffer device address support.
  Saves us from the hassle with pipeline layouts, can pass the buffer dst
  address via specialization constant.
- we need to know/support all pipeline extensions to make this work and
  allow proper recompilation.
  SPIR-V extensions are only a minor concern but *might* cause issues when
  not supported.

Flow:
- user selects shader debugging in UI
- the shader source code is shown
- then, the user selects a line for a breakpoint
  and can select the thread/vertex/pixel to be debugged.
  (could potentially be implicitly selected by "debug this vertex/pixel")
- at this point, the UI installs a hook target
	- (future opt: already start shader patching and pipeline recompilation
	  at this point, async)
- work is submitted with hook:
	- when the target command is executed, use the patched pipeline instead.
	  (block the first time it is submitted? idk, might be ok.
	  But then we need a caching mechanism. Hash by shader+breakpointLine or smth)
	- in the hook state, a buffer and struct layout + names are returned,
	  or nullopt when the breakpoint was not hit.
	  (future: could make this work for multiple breakpoints at once,
	  returning multiple such pairs)

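The caching mechanism mentioned above (hash by shader + breakpoint line) could look roughly like the following sketch. All names here (`PatchKey`, `PatchedPipeline`, `PatchCache`) are made up for illustration and are not taken from the codebase. It also assumes the shader is already identified by some precomputed 64-bit hash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <unordered_map>

// Hypothetical key: hash of the original SPIR-V plus the breakpoint line.
struct PatchKey {
	std::uint64_t shaderHash;
	std::uint32_t breakpointLine;

	bool operator==(const PatchKey& other) const {
		return shaderHash == other.shaderHash &&
			breakpointLine == other.breakpointLine;
	}
};

struct PatchKeyHash {
	std::size_t operator()(const PatchKey& key) const {
		// Simple hash combine of both members.
		std::size_t h = std::hash<std::uint64_t>{}(key.shaderHash);
		h ^= std::hash<std::uint32_t>{}(key.breakpointLine) +
			0x9e3779b9u + (h << 6) + (h >> 2);
		return h;
	}
};

// Placeholder for whatever the patching step produces.
struct PatchedPipeline {
	std::uint64_t handle; // stand-in for a real pipeline handle
};

class PatchCache {
public:
	// Returns the cached pipeline, or nullopt if it was not built yet.
	std::optional<PatchedPipeline> find(const PatchKey& key) const {
		auto it = entries_.find(key);
		if(it == entries_.end()) {
			return std::nullopt;
		}
		return it->second;
	}

	// Adds or overwrites the entry for the given key.
	void insert(const PatchKey& key, PatchedPipeline pipeline) {
		entries_.insert_or_assign(key, pipeline);
	}

private:
	std::unordered_map<PatchKey, PatchedPipeline, PatchKeyHash> entries_;
};
```

On a cache miss, the hook would either block on patching + recompilation or skip the capture for that submission, which is the open question from the list above.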
Shader patching:
- Probably easiest just by hand, without a framework.
- Remember some header information
- Iterate over instructions
	- Remember the OpLine where the breakpoint is set.
	- Remember all OpVariables (later: named instructions? if anyone is using it)
	  and their function owner?
		- ideally we build the CFG here and check which OpVariable stores
		  came before the breakpoint line. But that is for later, just
		  consider every variable in the function scope for now, I think.
	- what about callstacks? for now, do not capture anything I guess.
	  Later on: before any call of the breakpoint function,
	  write additional data (at least OpLine, possibly also local state?).
	  Repeat recursively for those functions, too.
		- hm, just OpLine should be enough. If that call is selected in UI,
		  just switch breakpoint to that position then.
- Then, we know all variables at the breakpoint line.
	- build a buffer layout: just a linear list of all the (local?) variables
	  known and accessible at the breakpoint position. Also capture global
	  state, e.g. stage inputs?

- allocate a device_address buffer with the needed size (known via type layout)
	- that buffer is then hard-connected to the patched module,
	  same lifetime
- Patch shader
	- Insert constant global value with the device address
	- After breakpoint OpLine, insert ops to copy all variables into
	  that buffer device address at their respective offsets
- create the shader module, compile the pipeline, if needed
	- what about shader objects?
- draw/dispatch/traceRays with new module/pipeline
- afterwards, copy from the associated buffer to a hook-specific, host_local
  one
- we have to make sure that no two queues can use the shader-buffer at the
  same time. But we disallow this shader debugging in local captures ->
  it can only be one queue. So shouldn't be an issue.

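The "linear list of variables at their respective offsets" could be computed along these lines. This is only a sketch: `CapturedVar`, `LayoutEntry`, `BufferLayout` and `buildLayout` are hypothetical names, and it assumes that size and (power-of-two) alignment of every variable are already known from its SPIR-V type.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one variable visible at the breakpoint.
struct CapturedVar {
	std::string name;
	std::uint32_t size;  // size in bytes
	std::uint32_t align; // alignment in bytes, must be a power of two
};

struct LayoutEntry {
	std::string name;
	std::uint32_t offset; // byte offset inside the destination buffer
};

struct BufferLayout {
	std::vector<LayoutEntry> entries;
	std::uint32_t size = 0; // total buffer size needed
};

// Rounds v up to the next multiple of the power-of-two alignment a.
inline std::uint32_t alignUp(std::uint32_t v, std::uint32_t a) {
	assert(a != 0 && (a & (a - 1)) == 0);
	return (v + a - 1) & ~(a - 1);
}

// Places all variables one after another, respecting their alignment.
inline BufferLayout buildLayout(const std::vector<CapturedVar>& vars) {
	BufferLayout layout;
	std::uint32_t maxAlign = 1;
	for(const auto& var : vars) {
		std::uint32_t offset = alignUp(layout.size, var.align);
		layout.entries.push_back({var.name, offset});
		layout.size = offset + var.size;
		if(var.align > maxAlign) {
			maxAlign = var.align;
		}
	}
	// Pad the total size to the largest member alignment.
	layout.size = alignUp(layout.size, maxAlign);
	return layout;
}
```

The resulting offsets would then be used both for the member decorations of the patched struct type and for interpreting the copied buffer on the host.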
```
// declare type of struct holding all variables to save
// (needs Block and member Offset decorations, omitted here)
%DstStruct = OpTypeStruct ... /* the variable types to store */
// declare PhysicalStorageBuffer pointer type for struct
%PhysicalBufferPointerDst = OpTypePointer PhysicalStorageBuffer %DstStruct

// Create struct of variables to save (%type1 = pointee type of %var1)
%mem1 = OpLoad %type1 %var1
...
%structSrc = OpCompositeConstruct %DstStruct %mem1 ...

// Hard-coded buffer address. OpConstant only allows scalar int/float result
// types, so declare it as 64-bit uint (%uint64) and convert it to a pointer.
%bufAddress = OpConstant %uint64 /* hard-coded address */
%structDst = OpConvertUToPtr %PhysicalBufferPointerDst %bufAddress
// Store the whole struct, no access chain needed for that.
// PhysicalStorageBuffer accesses need the Aligned operand (16 is assumed here).
OpStore %structDst %structSrc Aligned 16

```

I dislike the hardcoded address a bit.
Maybe at least use the same (fixed size, idk, 64k) buffer for all patched
modules? can't be active at the same time.

---

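WAIT, can we create a pointer with OpConstant? No: OpConstant only allows
scalar integer or floating-point result types. The address has to be an
integer constant that is converted to a pointer, e.g. via OpConvertUToPtr.

---
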
I guess for the CPU-side representation of the struct, we can use buffmt.
It becomes just one vil::Type (and a LinearAllocator).
So, we have in CommandHookState:

```
struct CopiedShaderData {
	Type type; // always a struct
	OwnBuffer data;
};
```

Later on, we can add additional metadata (e.g. callstack) to the dst data buffer.
We build this Type during shader patching.
For OpLine candidates for the breakpoint, remember the start of their
function. When we have our selected candidate, evaluate all OpVariable
values in that function (that came before the OpLine instruction itself).
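
The scan described above could be sketched like this, walking the raw SPIR-V words by hand. Assumptions: the module is a valid SPIR-V binary in host endianness, the breakpoint is matched by line number only (the file operand of OpLine is ignored), and `BreakpointInfo` / `findBreakpoint` are hypothetical names. The opcode values are the ones from the SPIR-V specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Opcode values as defined by the SPIR-V specification.
constexpr std::uint16_t opLine = 8;
constexpr std::uint16_t opFunction = 54;
constexpr std::uint16_t opFunctionEnd = 56;
constexpr std::uint16_t opVariable = 59;

struct BreakpointInfo {
	std::size_t functionStart; // word offset of the enclosing OpFunction
	std::size_t lineOffset;    // word offset of the matching OpLine
	std::vector<std::uint32_t> variables; // ids of OpVariables seen before it
};

// Returns information about the first OpLine with the given line number
// inside a function, or nullopt if there is none.
inline std::optional<BreakpointInfo> findBreakpoint(
		const std::vector<std::uint32_t>& spv, std::uint32_t line) {
	constexpr std::size_t headerWords = 5; // magic, version, generator, bound, schema
	std::size_t functionStart = 0;
	bool inFunction = false;
	std::vector<std::uint32_t> variables;

	for(std::size_t i = headerWords; i < spv.size();) {
		// First word of each instruction: high 16 bits word count, low 16 bits opcode.
		auto opcode = static_cast<std::uint16_t>(spv[i] & 0xFFFFu);
		auto wordCount = static_cast<std::uint16_t>(spv[i] >> 16u);
		if(wordCount == 0 || i + wordCount > spv.size()) {
			return std::nullopt; // malformed module
		}

		if(opcode == opFunction) {
			inFunction = true;
			functionStart = i;
			variables.clear();
		} else if(opcode == opFunctionEnd) {
			inFunction = false;
		} else if(inFunction && opcode == opVariable && wordCount >= 4) {
			// Operands: result type, result id, storage class, [initializer]
			variables.push_back(spv[i + 2]);
		} else if(inFunction && opcode == opLine && wordCount >= 4 &&
				spv[i + 2] == line) {
			// Operands: file id, line, column
			return BreakpointInfo{functionStart, i, variables};
		}

		i += wordCount;
	}

	return std::nullopt;
}
```

Only function-local variables are collected here. Global state such as stage inputs would need a separate pass over the OpVariables outside of functions.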
