D3D11 Direct draw batching #4706
It's a cool idea and the implementation looks more straightforward than I might have supposed. My mild and probably worrying-over-nothing concern with anything that defers real work (rather than antiwork like barriers and binding 😀) is that it increases the risk of the GPU falling idle when it could be doing something useful-albeit-suboptimally-batched. Perhaps only a real-world risk in something like a shadow pass where practically a whole scene worth of geom might be deferred; I don't see a batching limit in my cursory reading of the PR diffs.
The batching limit is implied by the CS chunk size, which is 16 KiB (which translates to up to 1022 non-indexed draws or 817 indexed draws per batch). This really doesn't change much for our submission logic since we end up filling up CS chunks either way, and the GPU submission logic is (loosely) based on that.
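For readers who want to see where limits like 1022 and 817 could come from, here is a rough back-of-the-envelope sketch. It is not taken from the PR: the struct layouts (`DrawInfo`, `IndexedDrawInfo`) and the 32 bytes of per-chunk overhead are assumptions chosen because they happen to reproduce the quoted figures; the real bookkeeping may differ.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical per-draw records (assumed layout, not taken from the PR).
struct DrawInfo {
  uint32_t vertexCount;
  uint32_t instanceCount;
  uint32_t firstVertex;
  uint32_t firstInstance;
};

struct IndexedDrawInfo {
  uint32_t indexCount;
  uint32_t instanceCount;
  uint32_t firstIndex;
  int32_t  vertexOffset;
  uint32_t firstInstance;
};

static_assert(sizeof(DrawInfo) == 16, "assumed 16-byte non-indexed record");
static_assert(sizeof(IndexedDrawInfo) == 20, "assumed 20-byte indexed record");

// CS chunk size as stated in the comment above: 16 KiB.
constexpr std::size_t kChunkSize = 16 * 1024;

// ASSUMPTION: bytes reserved per chunk for the command header and similar
// bookkeeping. 32 is simply a value that matches both quoted limits.
constexpr std::size_t kAssumedOverhead = 32;

// Number of per-draw records of type T that fit into one chunk.
template<typename T>
constexpr std::size_t maxDrawsPerChunk() {
  return (kChunkSize - kAssumedOverhead) / sizeof(T);
}

static_assert(maxDrawsPerChunk<DrawInfo>() == 1022, "non-indexed limit");
static_assert(maxDrawsPerChunk<IndexedDrawInfo>() == 817, "indexed limit");
```

Under these assumptions, (16384 - 32) / 16 = 1022 and (16384 - 32) / 20 = 817 (rounded down), which lines up with the numbers quoted above.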
Also, again, this is a relatively rare scenario in general; most of the time there will be some sort of state change going on between draws even in a shadow pass (the game binds a vertex shader, changes the vertex/index buffer, updates a constant buffer, you name it). It is actually uncommon for games to run into this on a regular basis. One thing this covers, though, is that e.g. Atelier Sophie 2 renders particles one at a time (as in, literally
Allows us to allocate a (potentially growing) array of arbitrary data structures for a CS command.
Not super useful without backend support though.
Otherwise we'll count the HUD by accident. Only keep the barrier counter since there are so many different places where we issue pipeline barriers, and they are interesting anyway.
Because @DadSchoorse bullied me into it.
This uses VK_EXT_multi_draw to batch consecutive draws with no state changes, in much the same way we already batch consecutive indirect draws into a single indirect multidraw. Even if the extension isn't supported, this may slightly reduce CPU overhead because we're no longer redundantly checking dirty states all the time.
Games that see a notable reduction in draw calls include the Atelier series, Yakuza 0 / Kiwami, Nier Automata, Watch Dogs 2.
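As an illustration of the batching idea described above, here is a minimal, self-contained sketch. It is not the code from this PR: `DrawBatcher`, `bindState`, and `exampleBatchSizes` are hypothetical names, pipeline state is reduced to a single integer, and the flush only records how many draws would be submitted together. In a real backend, the flush is where a single `vkCmdDrawMultiEXT` call would presumably be issued for the whole batch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One non-indexed draw range (hypothetical, simplified).
struct DrawRange {
  uint32_t firstVertex;
  uint32_t vertexCount;
};

// Collects consecutive draws and flushes them as one batch whenever the
// bound state changes or the batch limit is reached.
class DrawBatcher {
public:
  explicit DrawBatcher(std::size_t maxBatch) : m_maxBatch(maxBatch) {
    assert(maxBatch > 0);
  }

  // Stand-in for any state change (shader, buffer binding, ...).
  // Re-binding the same state does not break the current batch.
  void bindState(uint32_t stateId) {
    if (stateId != m_state) {
      flush();
      m_state = stateId;
    }
  }

  void draw(uint32_t firstVertex, uint32_t vertexCount) {
    m_pending.push_back({firstVertex, vertexCount});
    if (m_pending.size() == m_maxBatch)
      flush();
  }

  // Submits all pending draws as one batch. Here we only record the
  // batch size; a real backend would emit one multi-draw command.
  void flush() {
    if (m_pending.empty())
      return;
    m_batchSizes.push_back(m_pending.size());
    m_pending.clear();
  }

  const std::vector<std::size_t>& batchSizes() const { return m_batchSizes; }

private:
  std::size_t              m_maxBatch;
  uint32_t                 m_state = 0;
  std::vector<DrawRange>   m_pending;
  std::vector<std::size_t> m_batchSizes;
};

// Three draws, a state change, then two more draws: two batches.
inline std::vector<std::size_t> exampleBatchSizes() {
  DrawBatcher batcher(1022);
  batcher.draw(0, 3);
  batcher.draw(3, 3);
  batcher.draw(6, 3);
  batcher.bindState(1);  // state change ends the first batch
  batcher.draw(0, 6);
  batcher.draw(6, 6);
  batcher.flush();       // end of command stream
  return batcher.batchSizes();
}
```

In this sketch the five draws collapse into two submissions (of three and two draws), which is the kind of draw-call reduction the description refers to. The explicit batch limit plays the role of the CS chunk size mentioned in the conversation.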
Based on #4699 to avoid rebase hell.