D3D11 Direct draw batching #4706

Merged: 7 commits into master, Feb 23, 2025

@doitsujin (Owner) commented Feb 21, 2025

Because @DadSchoorse bullied me into it.

This uses VK_EXT_multi_draw to batch consecutive draws with no state changes, in much the same way we already batch consecutive indirect draws into a single indirect multidraw. Even if the extension isn't supported, this may slightly reduce CPU overhead because we're no longer redundantly checking dirty states all the time.
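To make the batching idea concrete, here is a minimal sketch of a recorder that merges consecutive non-indexed draws until a state change occurs. It is not DXVK's actual code: `DrawRange`, `DrawBatcher`, `stateChanged` and `submitted` are hypothetical names, and the rule that draws with different instance parameters start a new batch is an assumption of this sketch, motivated by `vkCmdDrawMultiEXT` taking a single `instanceCount` / `firstInstance` pair for all ranges.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in mirroring the layout of VkMultiDrawInfoEXT.
struct DrawRange {
  uint32_t firstVertex;
  uint32_t vertexCount;
};

// Hypothetical recorder for illustration only; not DXVK's classes.
class DrawBatcher {
public:
  // Queue a non-indexed draw. In this sketch, draws share a batch only
  // if their instance parameters match (assumption, see lead-in).
  void draw(uint32_t vertexCount, uint32_t instanceCount,
            uint32_t firstVertex, uint32_t firstInstance) {
    if (!m_pending.empty()
     && (instanceCount != m_instanceCount || firstInstance != m_firstInstance))
      flush();

    m_instanceCount = instanceCount;
    m_firstInstance = firstInstance;
    m_pending.push_back({ firstVertex, vertexCount });
  }

  // Any state change (shader, buffer binding, ...) ends the current batch.
  void stateChanged() { flush(); }

  // Emit all pending ranges as one backend submission.
  void flush() {
    if (m_pending.empty())
      return;
    // A real backend could issue one vkCmdDrawMultiEXT call here, or fall
    // back to a loop of vkCmdDraw calls after validating state once.
    m_submitted.push_back(m_pending);
    m_pending.clear();
  }

  // Batches emitted so far, one inner vector per submission.
  const std::vector<std::vector<DrawRange>>& submitted() const {
    return m_submitted;
  }

private:
  std::vector<DrawRange>              m_pending;
  std::vector<std::vector<DrawRange>> m_submitted;
  uint32_t m_instanceCount = 0;
  uint32_t m_firstInstance = 0;
};
```

With this sketch, a run of four-vertex draws at different vertex offsets followed by a state change ends up as a single submission containing one range per draw.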

Games that see a notable reduction in draw calls include the Atelier series, Yakuza 0 / Kiwami, Nier Automata, and Watch Dogs 2.

Based on #4699 to avoid rebase hell.

@adamdmoss

It's a cool idea and the implementation looks more straightforward than I might have supposed.

My mild and probably worrying-over-nothing concern with anything that defers real work (rather than antiwork like barriers and binding 😀) is that it increases the risk of the GPU falling idle when it could be doing something useful, albeit suboptimally batched. Perhaps that is only a real-world risk in something like a shadow pass, where practically a whole scene's worth of geometry might be deferred; I don't see a batching limit in my cursory reading of the PR diffs.

@doitsujin (Owner, Author) commented Feb 21, 2025

The batching limit is implied by the CS chunk size, which is 16 KiB; that translates to at most 1022 non-indexed draws or 817 indexed draws per chunk. This really doesn't change much for our submission logic, since we end up filling up CS chunks either way and the GPU submission logic is (loosely) based on that.
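As a rough check of those figures, the arithmetic can be sketched as follows. Everything except the 16 KiB chunk size is an assumption: the per-draw payloads `DrawArgs` and `DrawIndexedArgs` (four and five 32-bit values) and the 32-byte `kAssumedOverhead` are hypothetical, chosen only because they happen to reproduce the quoted numbers. The actual layout in DXVK may differ.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed per-draw payloads; hypothetical, not taken from the PR.
struct DrawArgs {
  uint32_t vertexCount, instanceCount, firstVertex, firstInstance;
};

struct DrawIndexedArgs {
  uint32_t indexCount, instanceCount, firstIndex;
  int32_t  vertexOffset;
  uint32_t firstInstance;
};

constexpr std::size_t kChunkSize       = 16 * 1024; // 16 KiB, as stated above
constexpr std::size_t kAssumedOverhead = 32;        // hypothetical bookkeeping bytes

// Number of draw records of the given size that fit into one chunk
// after subtracting the assumed overhead.
constexpr std::size_t maxDraws(std::size_t bytesPerDraw) {
  return (kChunkSize - kAssumedOverhead) / bytesPerDraw;
}

static_assert(sizeof(DrawArgs) == 16, "four 32-bit values");
static_assert(sizeof(DrawIndexedArgs) == 20, "five 32-bit values");
static_assert(maxDraws(sizeof(DrawArgs)) == 1022, "non-indexed limit");
static_assert(maxDraws(sizeof(DrawIndexedArgs)) == 817, "indexed limit");
```

Under these assumptions a single batch can never exceed roughly a thousand draws, which is why the deferred work per batch stays bounded.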

@doitsujin (Owner, Author) commented Feb 21, 2025

Also, again, this is a relatively rare scenario in general: most of the time there will be some sort of state change going on between draws, even in a shadow pass (the game binds a vertex shader, changes the vertex/index buffer, updates a constant buffer, you name it). It is actually uncommon for games to run into this on a regular basis.

One thing this does cover, though, is cases like Atelier Sophie 2, which renders particles one at a time (as in, literally Draw(4, ...) for every single particle) with a different vertex offset each time.

@doitsujin force-pushed the direct-draw-batching branch 4 times, most recently from ccb874c to 6038895, on February 22, 2025 00:44

Allows us to allocate a (potentially growing) array of arbitrary data structures for a CS command.

Not super useful without backend support though.

Otherwise we'll count the HUD by accident. Only keep the barrier counter since there are so many different places where we issue pipeline barriers, and they are interesting anyway.

@doitsujin force-pushed the direct-draw-batching branch from 6038895 to c9ffa30 on February 23, 2025 11:07
@doitsujin marked this pull request as ready for review on February 23, 2025 11:15
@doitsujin merged commit c69dbc4 into master on Feb 23, 2025
8 checks passed