feature(rf optimizations): enabling oneDPL and sort primitive refactoring #3046

Merged

Contributor

@Alexandr-Solovev Alexandr-Solovev commented Jan 16, 2025

Description:

RF optimizations: enable oneDPL, refactor the sort primitive, and optimize several functions.

Summary:

This PR enables oneDPL and replaces the radix sort primitive. It also adds engine_type support for RF on GPU. Many CPU functions have been replaced with GPU analogues.

PR completeness and readability

  • I have reviewed my changes thoroughly before submitting this pull request.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with update and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have added a respective label(s) to PR if I have a permission for that.
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least summary table with measured data, if performance change is expected.
  • I have provided justification why performance has changed or why changes are not expected.
  • I have provided justification why quality metrics have changed or why changes are not expected.
  • I have extended benchmarking suite and provided corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

@david-cortes-intel
Contributor

Before merging, please remember to add this new dependency to the installation instructions in INSTALL.md, along with instructions for setting necessary env. variables when using conda:
https://github.com/uxlfoundation/oneDAL/blob/main/INSTALL.md

@Alexandr-Solovev Alexandr-Solovev added the dpc++ and dependencies labels Jan 22, 2025
@Alexandr-Solovev Alexandr-Solovev changed the title from "init adding dpl" to "feature: enabling oneDPL and sorting primitive refactoring" Jan 22, 2025
@Alexandr-Solovev Alexandr-Solovev marked this pull request as ready for review January 22, 2025 20:28
@Alexandr-Solovev
Contributor Author

/intelci: run

@Alexandr-Solovev Alexandr-Solovev changed the title from "feature: enabling oneDPL and sorting primitive refactoring" to "feature: enabling oneDPL and sort primitive refactoring" Jan 22, 2025
@Alexandr-Solovev
Contributor Author

/intelci: run

@ethanglaser ethanglaser left a comment

LGTM, CI is green, and I have run this enough times on the cluster to know it works :) but I would wait for feedback from others on specific implementation details.

@Alexandr-Solovev
Contributor Author

/azp run CI

Azure Pipelines successfully started running 1 pipeline(s).

@Alexandr-Solovev
Contributor Author

/intelci: run

@Alexandr-Solovev
Contributor Author

/intelci: run

@ethanglaser
Contributor

/azp run CI

Azure Pipelines successfully started running 1 pipeline(s).

function install_mkl {
sudo apt-get install -y intel-oneapi-mkl-devel-2025.0
install_tbb
install_dpl
Contributor

Is dpl a dependency of MKL? I thought tbb was integrated into install_mkl for that reason.

Contributor Author

I guess mkl and tbb have no dependencies on each other, but my understanding is that this is a step for installing all the necessary deps for oneDAL.

@@ -129,6 +134,9 @@ elif [ "${component}" == "tbb" ]; then
elif [ "${component}" == "mkl" ]; then
add_repo
install_mkl
elif [ "${component}" == "dpl" ]; then
add_repo
install_dpl
Contributor

Add "dpl" to the help list at the end of this file.

name = "dpl",
root_env_var = "DPL_ROOT",
urls = [
"https://files.pythonhosted.org/packages/95/f6/18f78cb933e01ecd9e99d37a10da4971a795fcfdd1d24640799b4050fdbb/onedpl_devel-2022.7.1-py2.py3-none-manylinux_2_28_x86_64.whl",
Contributor

Dumb question, but how do we find these values/maintain them? It looks painful.

Contributor Author

We do the same thing for all other packages, like tbb and mkl: find the package on PyPI and copy the links.

auto src_ind = pr::ndarray<Index, 1>::empty(queue_, { src.get_count() });
return pr::radix_sort_indices_inplace<Float, Index>{ queue_ }(src, src_ind, deps);
if (device_name.find("Data Center GPU Max") != std::string::npos) {
Contributor

@icfaust icfaust Mar 21, 2025

This feels dangerous somehow. Definitely add some comments. Ideally device checking should exist as a primitive rather than in an algo because this is a bit of a nasty surprise to anyone not well-versed in this algo when trying to debug on various hardware.

Contributor Author

@Vika-F has planned to add this feature in the future.
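
For illustration only, the review suggestion above (keeping device checks out of the algorithm body) could look roughly like the sketch below. The helper names `device_name_contains` and `is_data_center_gpu_max` are hypothetical and are not oneDAL primitives. The sketch assumes the device name string has already been obtained, for example through SYCL's `sycl::info::device::name` query, and it only reuses the string literal that appears in the diff.

```cpp
#include <string_view>

// Hypothetical helper (not part of oneDAL): substring match on a device name.
// In real code the name would typically come from
// queue.get_device().get_info<sycl::info::device::name>().
constexpr bool device_name_contains(std::string_view device_name,
                                    std::string_view marker) {
    return device_name.find(marker) != std::string_view::npos;
}

// Hypothetical named predicate wrapping the literal used in the diff, so the
// hardware-specific branch is documented and searchable in one place.
constexpr bool is_data_center_gpu_max(std::string_view device_name) {
    return device_name_contains(device_name, "Data Center GPU Max");
}
```

A call site could then read `if (is_data_center_gpu_max(device_name)) { ... }`, which keeps the matching rule in one place. Matching on product names remains fragile either way, so a comment at the call site explaining why the branch exists is still worth adding.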

@@ -61,6 +61,10 @@ class descriptor_impl : public base {
error_metric_mode error_metric_mode_value = error_metric_mode::none;
infer_mode infer_mode_value = infer_mode::class_responses;

// The default engine has been switched from mt2203 to philox for GPU,
Contributor

Very good, I would love to see what this does to overall performance.

@Alexandr-Solovev
Contributor Author

/intelci: run

@Alexandr-Solovev Alexandr-Solovev merged commit 2d21aad into uxlfoundation:main Mar 21, 2025
10 of 11 checks passed
Labels: dependencies, dpc++
6 participants