[mt] Split up gradient types for the GPU hist. #11798
Conversation
- Use reduced gradient for tree structure exploration. The PR adds a gradient container that holds two different gradient types: one for tree splits and the other for leaf values. This is an optimization for vector-leaf to reduce the overhead of finding the tree structure.
Pull Request Overview
This PR introduces support for reduced gradients in multi-target tree construction, enabling different gradients for split evaluation versus leaf value calculation. This is part of an experimental feature for improving multi-target learning.
Key changes:
- Introduces a new GradientContainer struct to hold both split gradients and optional value gradients
- Adds an experimental TreeObjective Python class with a split_grad method for custom gradient reduction (a hedged sketch follows this list)
- Refactors tree updaters and booster code to use GradientContainer instead of raw gradient matrices
- Renames SetLeaf to SetRoot for multi-target trees to clarify semantics
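To make the two gradient types concrete, here is a minimal sketch of a custom tree objective that reduces the per-target gradient for split evaluation. The class and method names follow the summary above (TreeObjective, split_grad), but the exact signatures and call convention are assumptions rather than the finalized API, and the class is written standalone instead of subclassing the new module.

```python
import numpy as np

# Hypothetical custom objective mirroring the experimental TreeObjective
# interface described above (signatures assumed, not the finalized API).
# The regular gradient/hessian pair is used for the leaf values, while
# split_grad returns a reduced gradient used only for tree-structure search.
class ReducedGradSquaredError:
    def __call__(self, predt: np.ndarray, y: np.ndarray):
        # Full per-target gradient and hessian, shape (n_samples, n_targets).
        grad = predt - y
        hess = np.ones_like(grad)
        return grad, hess

    def split_grad(self, grad: np.ndarray, hess: np.ndarray):
        # Collapse the per-target gradients into a single column so split
        # evaluation only handles one gradient value per sample.
        return grad.sum(axis=1, keepdims=True), hess.sum(axis=1, keepdims=True)
```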
Reviewed Changes
Copilot reviewed 54 out of 54 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| include/xgboost/gradient.h | New header defining GradientContainer struct for holding split and value gradients |
| include/xgboost/tree_updater.h | Updates Update signature to accept GradientContainer* instead of gradient matrix |
| include/xgboost/tree_model.h | Renames SetLeaf to SetRoot and adds SetLeaves method for batch leaf updates |
| include/xgboost/multi_target_tree_model.h | Updates multi-target tree API to use SetRoot and adds SetLeaves |
| include/xgboost/learner.h | Updates BoostOneIter signature to accept GradientContainer |
| include/xgboost/gbm.h | Updates DoBoost signature to accept GradientContainer |
| include/xgboost/objective.h | Changes parameter type from std::int32_t to bst_target_t for consistency |
| include/xgboost/linalg.h | Adds EmptyLike utility function for tensor creation |
| src/tree/updater_gpu_hist.cuh | Implements reduced gradient support with separate split/value quantizers |
| src/tree/updater_gpu_hist.cu | Updates GPU histogram builder to handle gradient container |
| src/tree/leaf_sum.cuh | New file implementing leaf gradient sum calculations for GPU |
| src/tree/leaf_sum.cu | Implementation of leaf weight calculation from value gradients |
| src/tree/multi_target_tree_model.cc | Refactors weight setting with SetRoot and implements SetLeaves |
| src/tree/tree_model.cc | Adds SetLeaves wrapper method |
| src/tree/updater_*.cc | Updates all tree updaters to accept GradientContainer |
| src/gbm/gbtree.cc | Refactors boosting logic to handle gradient containers |
| src/gbm/gblinear.cc | Updates linear booster to use gradient container |
| src/learner.cc | Updates learner to use gradient containers throughout |
| src/c_api/c_api.cc | Adds experimental XGBoosterTrainOneIterWithObj API and renames function |
| src/c_api/c_api.cu | Renames CopyGradientFromCUDAArrays to CopyGradientFromCudaArrays |
| src/common/device_helpers.cuh | Removes deprecated DebugSyncDevice function |
| src/common/device_debug.cuh | New file with debug utilities moved from device_helpers |
| src/common/algorithm.h | Removes incorrect GPU check from ArgSort |
| python-package/xgboost/objective.py | New module with experimental objective interface |
| python-package/xgboost/core.py | Implements support for tree objectives with split gradients |
| python-package/xgboost/testing/multi_target.py | Adds comprehensive tests for reduced gradient feature |
| python-package/xgboost/testing/__init__.py | Fixes ls_obj to handle multi-dimensional arrays |
| tests/python-gpu/test_gpu_multi_target.py | Adds test for reduced gradient on GPU |
| tests/cpp/**/test_*.cc | Updates test files to use GradientContainer |
cc @rongou
The PR adds a gradient container with two different gradient types: one for tree splits and the other for leaf values. This is an optimization for vector-leaf to reduce the overhead of finding the tree structure. Currently, the interface is exposed through a custom objective function by creating a specialized tree objective.
An alternative approach would be to decouple the dimension-reduction function from the objective function by introducing an additional parameter, such as grad_reducer. This is actually the first approach I tried. I don't have a strong opinion on which one we should choose. The objective approach makes it easier to associate the reduced gradient with the prediction and label, but may be less modular. A sketch of this alternative is included below the reference.
ref #9043
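For comparison, the following is a minimal sketch of the decoupled alternative, where the reduction is passed separately from the objective. The grad_reducer parameter is hypothetical and was not adopted in this PR; it only illustrates the trade-off discussed above.

```python
import numpy as np

# Hypothetical standalone reducer for the alternative design: the objective
# stays unchanged and only produces per-target gradients, and this reducer
# collapses them for tree-structure search.
def sum_reducer(grad: np.ndarray, hess: np.ndarray):
    return grad.sum(axis=1, keepdims=True), hess.sum(axis=1, keepdims=True)

# The `grad_reducer` argument in the call below is an assumption, not an
# actual xgboost.train parameter:
# booster = xgboost.train(
#     params, dtrain, obj=squared_error_obj, grad_reducer=sum_reducer
# )
```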