[ NEON ] Integrate multithread and general in GEMM, GEMV
- With precise indexing, we can integrate the multithreaded function and the general-case function for GEMM and GEMV
- For the non-8-divisible case, an additional code block is added. It is expected to work the same way as the SIMD path, but iterates value by value to avoid a segmentation fault
- For the 8-divisible case, we need to save room for the last 8-block because:
	1. To apply OpenMP, the for-loop must iterate with a single index variable
	2. This will not deteriorate the function latency, but might make the code longer than before. Needs discussion later.
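
A minimal sketch of the loop shape described above, not the code from this commit. The function name `gemv_blocked`, the row-major layout, and the use of `float` are assumptions for illustration, and the inner 8-wide loop is plain scalar code standing in for one NEON vector operation so the example compiles on any target:

```cpp
#include <cstddef>

// Hypothetical helper: y = A * x for a row-major A of size rows x cols.
void gemv_blocked(const float *A, const float *x, float *y,
                  std::size_t rows, std::size_t cols) {
  // Largest multiple of 8 that does not exceed cols.
  const std::size_t cols8 = (cols / 8) * 8;

  // The outer loop uses a single signed index so OpenMP can split it across
  // threads. Without OpenMP enabled the pragma is ignored and it runs serially.
#pragma omp parallel for
  for (long i = 0; i < static_cast<long>(rows); ++i) {
    const float *row = A + static_cast<std::size_t>(i) * cols;
    float acc = 0.0f;

    // 8-divisible part: each pass covers one block of 8 values
    // (the place where a SIMD load and multiply-accumulate would go).
    for (std::size_t j = 0; j < cols8; j += 8) {
      for (std::size_t k = 0; k < 8; ++k) {
        acc += row[j + k] * x[j + k];
      }
    }

    // Non-8-divisible remainder: value-by-value, so nothing is read
    // past the end of the row.
    for (std::size_t j = cols8; j < cols; ++j) {
      acc += row[j] * x[j];
    }

    y[i] = acc;
  }
}
```

Because each row writes only its own `y[i]`, the threads do not need any synchronization in this sketch.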

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
skykongkong8 committed Feb 5, 2024
1 parent d88a34b commit d18cf56
Showing 2 changed files with 425 additions and 745 deletions.