Commit 491c16a

Merge horiz. and vert. passes in HBD Neon 2D avg convolution
The current Neon approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically, average with the dst block and store the final output.

This patch merges the two phases for high bitdepth 2D convolution to avoid storing to and re-loading from the intermediate buffer.

This provides a small gain (<5%) for large block sizes, but the benefit increases for small block sizes, as the proportion of compute to memory access decreases. These effects are amplified further when considering little (in-order) core performance.

Change-Id: I84f1cafcfbbfa48b2cfe4b20881da9c4bc3b56ac
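The difference between the two approaches can be illustrated with a scalar C sketch. This is not the Neon code from the patch: the function names, the 8-tap kernel, the 7-bit rounding shift, the clipping, the `(a + b + 1) >> 1` averaging and the buffer limits are all assumptions chosen for illustration. `src` is assumed to point at the top-left of the filter support region, so it must hold `h + TAPS - 1` rows and `w + TAPS - 1` columns. The merged version keeps only a sliding window of `TAPS` horizontally filtered values per column instead of a full intermediate buffer.

```c
#include <assert.h>
#include <stdint.h>

#define TAPS 8        /* assumed filter length */
#define FILTER_BITS 7 /* assumed: taps sum to 1 << FILTER_BITS */
#define MAX_W 64      /* assumed limits for the two-pass scratch buffer */
#define MAX_H 64

/* Filter p[0], p[step], ..., then round and clip to bd bits. */
static uint16_t filter_taps(const uint16_t *p, int step, const int16_t *f,
                            int bd) {
  const int32_t max = (1 << bd) - 1;
  int32_t sum = 1 << (FILTER_BITS - 1); /* rounding offset */
  for (int k = 0; k < TAPS; ++k) sum += (int32_t)p[k * step] * f[k];
  if (sum < 0) return 0;
  sum >>= FILTER_BITS;
  return (uint16_t)(sum > max ? max : sum);
}

static uint16_t avg2(uint16_t a, uint16_t b) {
  return (uint16_t)((a + b + 1) >> 1);
}

/* Two passes: the horizontal results go through an intermediate buffer. */
static void conv2d_avg_two_pass(const uint16_t *src, int src_stride,
                                uint16_t *dst, int dst_stride,
                                const int16_t *fx, const int16_t *fy, int w,
                                int h, int bd) {
  static uint16_t tmp[(MAX_H + TAPS - 1) * MAX_W];
  assert(w <= MAX_W && h <= MAX_H);
  for (int y = 0; y < h + TAPS - 1; ++y)
    for (int x = 0; x < w; ++x)
      tmp[y * MAX_W + x] = filter_taps(src + y * src_stride + x, 1, fx, bd);
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x) {
      const uint16_t v = filter_taps(tmp + y * MAX_W + x, MAX_W, fy, bd);
      dst[y * dst_stride + x] = avg2(dst[y * dst_stride + x], v);
    }
}

/* Merged: each column keeps a sliding window of TAPS filtered rows. */
static void conv2d_avg_merged(const uint16_t *src, int src_stride,
                              uint16_t *dst, int dst_stride,
                              const int16_t *fx, const int16_t *fy, int w,
                              int h, int bd) {
  for (int x = 0; x < w; ++x) {
    uint16_t win[TAPS];
    /* Prime the window with the first TAPS - 1 horizontally filtered rows. */
    for (int k = 0; k < TAPS - 1; ++k)
      win[k] = filter_taps(src + k * src_stride + x, 1, fx, bd);
    for (int y = 0; y < h; ++y) {
      /* Filter one new row horizontally, then filter the window vertically. */
      win[TAPS - 1] =
          filter_taps(src + (y + TAPS - 1) * src_stride + x, 1, fx, bd);
      const uint16_t v = filter_taps(win, 1, fy, bd);
      dst[y * dst_stride + x] = avg2(dst[y * dst_stride + x], v);
      for (int k = 0; k < TAPS - 1; ++k) win[k] = win[k + 1];
    }
  }
}

/* Returns 1 if both versions give identical output on a small test block. */
int two_pass_matches_merged(void) {
  static const int16_t kernel[TAPS] = { -1, 6, -19, 78, 78, -19, 6, -1 };
  enum { W = 8, H = 8, SRC_STRIDE = 16 };
  uint16_t src[(H + TAPS - 1) * SRC_STRIDE], a[H * W], b[H * W];
  for (int y = 0; y < H + TAPS - 1; ++y)
    for (int x = 0; x < SRC_STRIDE; ++x)
      src[y * SRC_STRIDE + x] = (uint16_t)((x * 37 + y * 101) % 1024);
  for (int i = 0; i < H * W; ++i) a[i] = b[i] = (uint16_t)((i * 13) % 1024);
  conv2d_avg_two_pass(src, SRC_STRIDE, a, W, kernel, kernel, W, H, 10);
  conv2d_avg_merged(src, SRC_STRIDE, b, W, kernel, kernel, W, H, 10);
  for (int i = 0; i < H * W; ++i)
    if (a[i] != b[i]) return 0;
  return 1;
}
```

Both functions perform the same arithmetic in the same order per output pixel, so their results should match exactly. The merged form only changes where the intermediate values live. In a vectorised implementation the window would presumably be held in registers and several columns processed at once, which is where the saved stores and loads would come from.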
1 parent 364326c commit 491c16a

1 file changed

+209
-264
lines changed

