[ neon ] Apply openMP in HGEMM, HGEMV in neon for multithreading #2415

Merged
merged 5 commits into from
Feb 23, 2024

skykongkong8
Member

  • In certain cases, multithreading gives a computational speedup.
  • The best multithreading settings may differ from case to case.

Self evaluation:

  1. Build test: [X]Passed [ ]Failed [ ]Skipped
  2. Run test: [X]Passed [ ]Failed [ ]Skipped

@taos-ci

taos-ci commented Jan 18, 2024

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2415. Please follow the 1commit/1PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before the review process starts. If you are a new member of this project, please read the manuals in the documentation folder and the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.

@taos-ci

taos-ci commented Jan 18, 2024

:octocat: cibot: @skykongkong8, nntrainer/tensor/omp_setting.h does not include Doxygen tags such as @file @brief @author @bug. You must include the Doxygen tags in the source code. Please refer to a Doxygen manual at http://github.com/nnstreamer/TAOS-CI/blob/main/ci/doc/doxygen-documentation.md

@taos-ci taos-ci left a comment

@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.

@skykongkong8 skykongkong8 force-pushed the omp_neon branch 2 times, most recently from 3b40a41 to ccfe651 on January 18, 2024 10:26
@skykongkong8 skykongkong8 changed the title from "[ WIP ] [ neon ] Apply use openMP in HGEMM, HGEMV in neon" to "[ WIP ] [ neon ] Apply openMP in HGEMM, HGEMV in neon for multithreading" on Jan 18, 2024

@taos-ci taos-ci left a comment

@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.

@skykongkong8 skykongkong8 force-pushed the omp_neon branch 2 times, most recently from b549840 to 83f65c5 on January 22, 2024 02:36

@taos-ci taos-ci left a comment

@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.

@skykongkong8 skykongkong8 force-pushed the omp_neon branch 2 times, most recently from 63d6e56 to 45a5fbc on February 14, 2024 02:45

@taos-ci taos-ci left a comment

@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.

Contributor

@djeong20 djeong20 left a comment

LGTM

Contributor

@jihochu jihochu left a comment

LGTM

for (int j = 0; j < channel; ++j) { \
  for (int k = 0; k < height_b; ++k) { \
    for (int l = 0; l < width_b; ++l) { \
      float val = equation_i_j_k_l; \
Contributor

Suggested change
- float val = equation_i_j_k_l; \
+ float val = (equation_i_j_k_l); \

Member Author

what difference would this make? just curious
will do it anyway
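
On the question above: wrapping a macro argument in parentheses is a general defensive habit, because the preprocessor pastes the argument text in verbatim and operator precedence can then regroup it. Whether it changes anything for this particular initializer depends on the expressions passed in. A minimal sketch with hypothetical macros (`DOUBLE_UNSAFE` and `DOUBLE_SAFE` are illustrative names, not taken from this PR):

```cpp
#include <cassert>

// Hypothetical macros for illustration only; they are not from the PR.
#define DOUBLE_UNSAFE(expr) (expr * 2)   // argument pasted without parentheses
#define DOUBLE_SAFE(expr) ((expr) * 2)   // argument wrapped in parentheses

// Expands to (1 + 2 * 2): the multiplication binds before the addition.
int double_unsafe_of_sum() { return DOUBLE_UNSAFE(1 + 2); }

// Expands to ((1 + 2) * 2): the argument is evaluated as one unit.
int double_safe_of_sum() { return DOUBLE_SAFE(1 + 2); }

// Small self-check showing that the two expansions disagree.
void check_macro_expansion() {
  assert(double_unsafe_of_sum() != double_safe_of_sum());
}
```

The two functions return different values for the same argument, which is the kind of surprise the suggested parentheses guard against.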

@taos-ci taos-ci left a comment

@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.

- Get the OpenMP thread count option at build time and allocate threads accordingly.
- Querying the system core count will be added in the near future.
- With precise indexing, the multithreaded function and the general-case function can be integrated for GEMM and GEMV.
- For the non-8-divisible case, an additional code block will be added. It is expected to work the same way as the SIMD path, but iterates value by value to avoid a segmentation fault.
- For the 8-divisible case, we need to save room for the last 8-block because:
	1. To apply OpenMP, the for-loop must iterate over a single index variable.
	2. This will not degrade the function latency, but it might make the code longer than before. To be discussed later.
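
The loop structure described in the list above can be sketched roughly as follows. This is a simplified illustration, not the PR's code: it uses plain `float` instead of NEON half-precision intrinsics, parallelizes over rows (an assumption), does not reproduce the "save room for the last 8-block" handling, and `gemv_sketch` and `SKETCH_OMP_NUM_THREADS` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical build-time option; the PR's actual macro name is not shown.
#ifndef SKETCH_OMP_NUM_THREADS
#define SKETCH_OMP_NUM_THREADS 4
#endif

// Computes y = A * x for a row-major rows x cols matrix A.
std::vector<float> gemv_sketch(const std::vector<float> &A,
                               const std::vector<float> &x, int rows,
                               int cols) {
  assert(A.size() == static_cast<std::size_t>(rows) * cols);
  assert(x.size() == static_cast<std::size_t>(cols));
  std::vector<float> y(rows, 0.0f);
  const int blocked_cols = (cols / 8) * 8; // largest multiple of 8 <= cols

  // A single loop over a single index variable, so OpenMP can split it.
  // Without -fopenmp the pragma is ignored and the loop runs serially.
#pragma omp parallel for num_threads(SKETCH_OMP_NUM_THREADS)
  for (int i = 0; i < rows; ++i) {
    const float *row = A.data() + static_cast<std::size_t>(i) * cols;
    float acc = 0.0f;
    // 8-wide blocks: a stand-in for one SIMD multiply-accumulate per block.
    for (int j = 0; j < blocked_cols; j += 8) {
      for (int lane = 0; lane < 8; ++lane)
        acc += row[j + lane] * x[j + lane];
    }
    // Non-8-divisible tail: value by value, so no read goes past the row end.
    for (int j = blocked_cols; j < cols; ++j)
      acc += row[j] * x[j];
    y[i] = acc; // each thread writes only its own rows, so no race here
  }
  return y;
}
```

Handling the tail separately keeps the 8-wide path free of bounds checks, at the cost of a second, scalar loop.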

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
- Previously, there were only a few half-precision GEMM and GEMV unit test cases, and they used small dimensions.
- This commit adds more cases to evaluate current GEMM and GEMV performance in practical use.
- This commit might be helpful for:
  - Verification of custom-made GEMM, GEMV functions
  - Determining OMP_THRESHOLD
- Users can change the distribution of Tensor values by modifying MOD and alpha for their own use.
- Additionally, GEN_TEST_INPUT_B is added. Previous test cases generated input data incorrectly for A (op) B = C.
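
As a rough illustration of how two knobs like MOD and alpha could shape the generated values, the sketch below fills a buffer from its indices. It assumes MOD acts as a modulus and alpha as a scale factor; the actual formula used by the test macros is not shown in this PR page, and `gen_test_input` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper; the PR's GEN_TEST_INPUT macros are not reproduced.
// Assumption: mod bounds the raw value and alpha rescales it.
std::vector<float> gen_test_input(int height, int width, int mod,
                                  float alpha) {
  assert(mod > 0);
  std::vector<float> data(static_cast<std::size_t>(height) * width);
  for (int k = 0; k < height; ++k) {
    for (int l = 0; l < width; ++l) {
      // (index % mod) keeps the raw value in [0, mod); alpha scales it.
      float val = static_cast<float>((k * width + l) % mod) * alpha;
      data[static_cast<std::size_t>(k) * width + l] = val;
    }
  }
  return data;
}
```

Keeping the values small and periodic in this way limits the accumulated magnitude in large half-precision products, which may be one reason to expose such parameters.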

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
- With the current version of HGEMV transpose, there is no need to modify the number of batches used per partial sum.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
- Unnecessary use of int instead of unsigned int causes compiler warnings.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
- For more intuitive code

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>

@taos-ci taos-ci left a comment

@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.

@jijoongmoon jijoongmoon merged commit 84fa87f into nnstreamer:main Feb 23, 2024
29 checks passed
@skykongkong8
Member Author

FYI: an updated version of GEMM is a work in progress.

@skykongkong8 skykongkong8 deleted the omp_neon branch March 4, 2024 03:41
5 participants