Merge horiz. and vert. passes in HBD Neon 2D avg convolution
The current Neon approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically, average with the dst block and store the final
output.
This patch merges the two phases for high bitdepth 2D convolution to
avoid storing to, and re-loading from, the intermediate buffer. This
provides a small gain (<5%) for large block sizes, but the benefit
increases for small block sizes, where the proportion of compute to
memory access is lower. These effects are amplified further on
little (in-order) cores.
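
For illustration only, here is a scalar C sketch of the two approaches
(not the Neon code in this patch). All names, the 8-tap kernel length,
the rounding shift of 7, the example kernels and the 64x64 size limit
are assumptions made for the sketch. `src` is assumed to point at the
top-left of the (h + 7)-row input region. The merged version keeps the
last eight horizontally filtered samples of each column in a small
window instead of writing them to an intermediate buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TAPS 8   /* assumed kernel length */
#define MAX_W 64 /* assumed maximum block width */
#define MAX_H 64 /* assumed maximum block height */

/* Hypothetical helper: 8-tap filter over samples spaced `step` apart,
 * rounded by 7 bits and clamped to [0, max]. */
static uint16_t filter8(const uint16_t *s, ptrdiff_t step, const int16_t *f,
                        int max) {
  int32_t sum = 0;
  for (int k = 0; k < TAPS; ++k) sum += s[k * step] * f[k];
  sum = (sum + 64) >> 7;
  return (uint16_t)(sum < 0 ? 0 : sum > max ? max : sum);
}

/* Two passes: horizontal results go through an intermediate buffer. */
static void avg_2d_two_pass(const uint16_t *src, int src_stride, uint16_t *dst,
                            int dst_stride, const int16_t *fx,
                            const int16_t *fy, int w, int h, int max) {
  uint16_t tmp[(MAX_H + TAPS - 1) * MAX_W];
  assert(w <= MAX_W && h <= MAX_H);
  for (int r = 0; r < h + TAPS - 1; ++r)
    for (int c = 0; c < w; ++c)
      tmp[r * MAX_W + c] = filter8(src + r * src_stride + c, 1, fx, max);
  for (int r = 0; r < h; ++r)
    for (int c = 0; c < w; ++c) {
      uint16_t v = filter8(tmp + r * MAX_W + c, MAX_W, fy, max);
      uint16_t *d = dst + r * dst_stride + c;
      *d = (uint16_t)((*d + v + 1) >> 1); /* average with dst */
    }
}

/* Merged: a sliding window of 8 horizontally filtered samples per column
 * replaces the intermediate buffer. */
static void avg_2d_merged(const uint16_t *src, int src_stride, uint16_t *dst,
                          int dst_stride, const int16_t *fx, const int16_t *fy,
                          int w, int h, int max) {
  for (int c = 0; c < w; ++c) {
    uint16_t win[TAPS];
    const uint16_t *s = src + c;
    /* Prime the window with the first 7 horizontally filtered rows. */
    for (int k = 0; k < TAPS - 1; ++k)
      win[k] = filter8(s + k * src_stride, 1, fx, max);
    for (int r = 0; r < h; ++r) {
      /* One new horizontal result per output row. */
      win[TAPS - 1] = filter8(s + (r + TAPS - 1) * src_stride, 1, fx, max);
      uint16_t v = filter8(win, 1, fy, max);
      uint16_t *d = dst + r * dst_stride + c;
      *d = (uint16_t)((*d + v + 1) >> 1); /* average with dst */
      memmove(win, win + 1, (TAPS - 1) * sizeof(win[0]));
    }
  }
}

/* Runs both versions on the same synthetic 10-bit data and returns 1 if
 * the outputs are identical. */
static int outputs_match(int w, int h) {
  enum { SW = MAX_W + TAPS - 1, SH = MAX_H + TAPS - 1 };
  static const int16_t fx[TAPS] = { 0, 1, -5, 126, 8, -3, 1, 0 };
  static const int16_t fy[TAPS] = { -1, 3, -10, 122, 18, -6, 2, 0 };
  static uint16_t src[SH * SW], a[MAX_H * MAX_W], b[MAX_H * MAX_W];
  for (int i = 0; i < SH * SW; ++i) src[i] = (uint16_t)((i * 37 + 11) & 1023);
  for (int i = 0; i < MAX_H * MAX_W; ++i)
    a[i] = b[i] = (uint16_t)((i * 13) & 1023);
  avg_2d_two_pass(src, SW, a, MAX_W, fx, fy, w, h, 1023);
  avg_2d_merged(src, SW, b, MAX_W, fx, fy, w, h, 1023);
  return memcmp(a, b, sizeof(a)) == 0;
}
```

Both functions perform the same arithmetic in the same order per output
sample, so `outputs_match(w, h)` should return 1 for any supported block
size. A vectorised version would presumably hold several columns of the
window in registers at once rather than one scalar column at a time.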
Change-Id: I84f1cafcfbbfa48b2cfe4b20881da9c4bc3b56ac