Commit
[ NEON ] Integrate multithread and general in GEMM, GEMV
- With precise indexing, we can integrate the multithreaded function and the general-case function for GEMM and GEMV.
- For the non-8-divisible case, an additional code block is added. It is expected to work the same way as the SIMD path, but iterates value by value to avoid a segmentation fault.
- For the 8-divisible case, we need to set aside the last 8-block because:
  1. To apply OpenMP, we need a for-loop that iterates with only a single index variable.
  2. This will not degrade the function latency, but might make the code longer than before. Needs further discussion.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
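The indexing scheme described above can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: the function name `gemv_blocked`, the use of `float`, and the row-major layout are all assumptions, and the 8-wide block is written as a plain inner loop where the real kernel would use NEON intrinsics. It shows one OpenMP loop over a single index, a blocked path that stops at the last full 8-block, and a value-by-value tail for the non-8-divisible remainder so that no read goes past the end of a row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the commit): y = A * x for a row-major
// M x N matrix. Works for any N, whether or not it is divisible by 8.
void gemv_blocked(const float *A, const float *x, float *y, int M, int N) {
  assert(A != nullptr && x != nullptr && y != nullptr);
  const int n8 = (N / 8) * 8; // end of the last full 8-block

  // A single loop index (i) keeps the loop in the form OpenMP can parallelize.
  // Without OpenMP enabled at compile time the pragma is simply ignored.
#pragma omp parallel for
  for (int i = 0; i < M; ++i) {
    const float *row = A + static_cast<std::size_t>(i) * N;
    float acc[8] = {0.0f}; // stands in for one 8-lane SIMD accumulator

    // Blocked path: 8 values per step (the part a SIMD kernel would handle).
    for (int j = 0; j < n8; j += 8)
      for (int k = 0; k < 8; ++k)
        acc[k] += row[j + k] * x[j + k];

    float sum = 0.0f;
    for (int k = 0; k < 8; ++k)
      sum += acc[k];

    // Tail path: value-by-value, so nothing past column N - 1 is read.
    for (int j = n8; j < N; ++j)
      sum += row[j] * x[j];

    y[i] = sum;
  }
}
```

Because each row writes only its own `y[i]`, the threads need no synchronization. When `N` is divisible by 8 the tail loop runs zero times, so the same function covers both cases at the cost of a few extra lines.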