[enhancement](workload policy) Add username-based backend workload policy support#60559
[enhancement](workload policy) Add username-based backend workload policy support#60559wenzhenghu wants to merge 56 commits intoapache:masterfrom
Conversation
support workload policy base on username
Removed unused import statement for DebugUtil.
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
FE UT Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 30745 ms |
ClickBench: Total hot run time: 28.3 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
TPC-H: Total hot run time: 30346 ms |
ClickBench: Total hot run time: 28.26 s |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
FE UT Coverage ReportIncrement line coverage |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
|
/review |
There was a problem hiding this comment.
Code Review Summary: [enhancement](workload policy) Add username-based backend workload policy support
Overview
This PR extends workload scheduling policies to support USERNAME as a metric that can be evaluated on the BE side, enabling policies like CONDITIONS(username='admin', query_time > 1000) ACTIONS(cancel_query). Previously, USERNAME was FE-only. The changes span FE validation logic, Thrift definitions, BE condition evaluation, and ThriftPlansBuilder for sending user identity to BE.
Critical Checkpoint Conclusions
1. Goal & Correctness: The goal is to allow USERNAME conditions in BE-evaluated policies (combined with BE metrics like query_time). The core logic is correct for the Nereids query path. However, there are gaps in other FE-to-BE paths (see issues below).
2. Modification Minimality: The change is reasonably focused. The PythonUDFMeta initialization fix (int64_t id = 0) is unrelated and should ideally be in a separate PR, but it's a trivial fix.
3. Concurrency: No new concurrency concerns. get_user() safely locks a weak_ptr before accessing QueryContext, and the return-by-value avoids dangling reference issues. The existing checkPolicyCondition logic uses the immutable FE_METRIC_SET/BE_METRIC_SET sets which are safe.
4. Lifecycle Management: QueryTaskController::get_user() correctly handles the case where QueryContext has been destroyed (returns empty string). No issues.
5. Configuration: No new configuration items added. N/A.
6. Incompatible Changes / Rolling Upgrades: New Thrift enum value USERNAME = 4 added. During rolling upgrades, an old BE that doesn't know about USERNAME will hit the default case in the factory and return nullptr / log error. The policy will not match, which is safe (no crash, just no enforcement). Acceptable.
7. Parallel Code Paths: Issue found. The resource_info is only set in ThriftPlansBuilder (Nereids path). The legacy Coordinator.FragmentExecParams.toThrift() and NereidsStreamLoadPlanner.plan() paths do NOT set resource_info. See inline comment.
8. Operator Validation: Issue found. No FE-side validation restricts operators for USERNAME to EQUAL only. A user can create CONDITIONS(username > 'admin') which will silently never match on BE and throw RuntimeException on FE evaluation. See inline comment.
9. Test Coverage: Good unit test coverage for BE (workload_condition_test, query_task_controller_test, workload_sched_policy_test). FE unit test covers the key policy creation scenarios. Regression test verifies end-to-end policy creation and basic execution. However, the regression test only verifies SELECT 1 completes (not that the policy actually triggers/matches). Negative test for invalid operators on USERNAME is missing.
10. Observability: No new metrics or log additions needed. The existing LOG(ERROR) in compare_string for unexpected operators and LOG(ERROR) in the factory for unknown metric types provide adequate observability.
11. Transaction/Persistence: The checkPolicyCondition returning null when feCondition is null bypasses the FE/BE consistency check, allowing ambiguous policies to be persisted. This is by design, since isFePolicy() is determined by actions, not conditions. Correct.
12. FE-BE Variable Passing: The TResourceInfo is sent via TPipelineFragmentParams and consumed in fragment_mgr.cpp:_get_or_create_query_ctx(). However, only one path (ThriftPlansBuilder) was updated. See Issue #1.
13. Performance: No performance concerns. The get_user() call involves a weak_ptr::lock() which is lightweight. String comparison is O(n) on username length which is negligible.
14. Other Issues: None beyond what's noted above.
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 26805 ms |
TPC-DS: Total hot run time: 168722 ms |
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
FE UT Coverage ReportIncrement line coverage |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
FE UT Coverage ReportIncrement line coverage |
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
/run buildall |
|
run buildall |
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)