[ neon ] Apply openMP in HGEMM, HGEMV in neon for multithreading #2415
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2415. Please follow the 1commit/1PR (one commit per PR) policy to get comments from reviewers quickly. Your PR must pass all verification processes of cibot before reviewers start the review process. If you are a new member of this project, please read the manuals in the documentation folder and on the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.
@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.
LGTM
LGTM
test/include/nntrainer_test_util.h
    for (int j = 0; j < channel; ++j) {       \
      for (int k = 0; k < height_b; ++k) {    \
        for (int l = 0; l < width_b; ++l) {   \
          float val = equation_i_j_k_l;       \
Suggested change:
    - float val = equation_i_j_k_l;   \
    + float val = (equation_i_j_k_l); \
what difference would this make? just curious
will do it anyway
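On the question of what the parentheses change: a macro argument is pasted in as raw tokens, so operators next to it in the macro body can regroup the caller's expression. The sketch below uses two hypothetical macros, `SCALE_UNSAFE` and `SCALE_SAFE` (not from this PR), to show the effect. In a plain initializer such as `float val = equation_i_j_k_l;` the parentheses are unlikely to change the result, so the suggestion there is mainly a defensive habit that keeps the macro safe if the line is edited later.

```cpp
// Hypothetical macros for illustration only; they are not part of nntrainer.
// The argument is substituted token by token, so `expr * 2.0f` can bind to
// only the last operand of the caller's expression.
#define SCALE_UNSAFE(expr) (expr * 2.0f)
#define SCALE_SAFE(expr) ((expr) * 2.0f)

// Expands to (1.0f + 2.0f * 2.0f): the multiplication binds first.
inline float scale_unsafe_demo() { return SCALE_UNSAFE(1.0f + 2.0f); }

// Expands to ((1.0f + 2.0f) * 2.0f): the argument is evaluated as a unit.
inline float scale_safe_demo() { return SCALE_SAFE(1.0f + 2.0f); }
```

Calling `scale_unsafe_demo()` and `scale_safe_demo()` side by side shows the two expansions giving different values for the same argument.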
- Get the OpenMP thread count option at build time and allocate threads accordingly.
  - Detecting the system core count will be added in the near future.
- With precise indexing, the multithreaded function and the general-case function can be integrated for GEMM and GEMV.
- For the non-8-divisible case, an additional code block will be added. It is expected to work the same way as the SIMD path, but with value-by-value iteration to avoid segmentation faults.
- For the 8-divisible case, we need to save room for the last 8-block because:
  1. To apply OpenMP, we need a for-loop that iterates with a single index variable.
  2. This will not degrade function latency, but it might make the code longer than before. Needs discussion later.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
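The loop structure described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: it uses `float` instead of half precision so it builds on any host, a scalar inner loop stands in for one 8-lane NEON operation, and the names `gemv_sketch` and `SKETCH_OMP_NUM_THREADS` are hypothetical. Parallelizing over rows is also an assumption. Without `-fopenmp` the pragma is ignored and the code runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the build-time thread count option.
#ifndef SKETCH_OMP_NUM_THREADS
#define SKETCH_OMP_NUM_THREADS 4
#endif

// Computes y = A * x for a row-major rows x cols matrix A.
std::vector<float> gemv_sketch(const std::vector<float> &A,
                               const std::vector<float> &x, int rows,
                               int cols) {
  assert(A.size() == static_cast<std::size_t>(rows) * cols);
  assert(x.size() == static_cast<std::size_t>(cols));
  std::vector<float> y(rows, 0.0f);
  const int blocked_cols = (cols / 8) * 8; // largest multiple of 8 <= cols

  // The parallelized loop iterates over a single index variable (the row),
  // which is the loop form OpenMP's parallel-for requires.
#pragma omp parallel for schedule(static) num_threads(SKETCH_OMP_NUM_THREADS)
  for (int i = 0; i < rows; ++i) {
    const float *a = A.data() + static_cast<std::size_t>(i) * cols;
    float acc = 0.0f;
    // 8-divisible part: each step stands in for one 8-lane SIMD operation.
    for (int j = 0; j < blocked_cols; j += 8) {
      float lane_sum = 0.0f;
      for (int k = 0; k < 8; ++k)
        lane_sum += a[j + k] * x[j + k];
      acc += lane_sum;
    }
    // Remainder: value-by-value iteration, so no read goes past the row end.
    for (int j = blocked_cols; j < cols; ++j)
      acc += a[j] * x[j];
    y[i] = acc;
  }
  return y;
}
```

Each thread writes only its own `y[i]`, so no reduction clause or locking is needed in this row-wise layout.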
- Previously, there were only a few half-precision GEMM and GEMV unit test cases, all with small dimensions.
- This commit adds more cases to evaluate current GEMM and GEMV performance in practical use.
- This commit might be helpful for:
  - Verification of custom-made GEMM and GEMV functions
  - Determining OMP_THRESHOLD
- Users can change the distribution of Tensor values by modifying MOD and alpha for their own use.
- Additionally, GEN_TEST_INPUT_B is added. Previous test cases generated input data incorrectly when A (op) B = C.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
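To illustrate how MOD and alpha can shape the test data, here is a small sketch of a deterministic generator. The commit names MOD and alpha, but the formula below, the function name `gen_test_input_sketch`, and its parameters are assumptions made for illustration and may differ from the actual macros in nntrainer_test_util.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills a batch x channel x height x width buffer with deterministic values.
// `mod` bounds how many distinct values appear; `alpha` scales them, so the
// values fall in {0, alpha, 2 * alpha, ..., (mod - 1) * alpha}.
std::vector<float> gen_test_input_sketch(int batch, int channel, int height,
                                         int width, int mod, float alpha) {
  assert(mod > 0);
  std::vector<float> data;
  data.reserve(static_cast<std::size_t>(batch) * channel * height * width);
  for (int i = 0; i < batch; ++i)
    for (int j = 0; j < channel; ++j)
      for (int k = 0; k < height; ++k)
        for (int l = 0; l < width; ++l) {
          // Row-major linear position of element (i, j, k, l).
          const int linear = ((i * channel + j) * height + k) * width + l;
          data.push_back(static_cast<float>((linear + 1) % mod) * alpha);
        }
  return data;
}
```

A smaller `alpha` keeps products and partial sums small, which matters for half precision, where large accumulations lose accuracy quickly.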
- With the current version of HGEMV transpose, there is no need to modify how many batches go into each partial sum.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
- Redundant use of int instead of unsigned int causes compiler warnings.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
- For more intuitive code

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
FYI: for GEMM, an updated version is a work in progress.
Self evaluation: