@wtn commented Dec 3, 2025

Fixes #20951.

When group_by() was used with a quantile expression that varies per group (e.g., pl.col.quantile.first()), every group was aggregated with the same quantile value instead of its own.

Reproduction

import polars as pl

df = pl.DataFrame({
    "value": [1, 2, 1, 2],
    "quantile": [0, 0, 1, 1],
})
df.group_by("quantile").agg(pl.col("value").quantile(pl.col("quantile").first()))
# Expected: quantile=0 -> 1.0, quantile=1 -> 2.0
# Actual: both groups returned 1.0

Cause

AggQuantileExpr::evaluate_on_groups() always called get_quantile(), which evaluates the quantile expression against the full dataframe and returns a single scalar. This worked for literal quantile values but gave wrong results when the quantile expression varied per group (e.g., a first() aggregation).
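
To make the cause concrete, here is a minimal pure-Python model of the two behaviours. It is a sketch rather than the Polars implementation: `quantile_nearest` is a hypothetical helper using a simple nearest-index rule, and the grouping is done by hand on the rows from the reproduction.

```python
from collections import defaultdict


def quantile_nearest(values, q):
    """Nearest-index quantile of a non-empty list (illustrative helper, not Polars code)."""
    ordered = sorted(values)
    return float(ordered[round(q * (len(ordered) - 1))])


# (value, quantile) rows from the reproduction.
rows = [(1, 0), (2, 0), (1, 1), (2, 1)]

# Group rows by the "quantile" column, preserving row order within each group.
groups = defaultdict(list)
for value, q in rows:
    groups[q].append((value, q))

# Old behaviour (modelled): first() is evaluated once over the whole frame,
# so every group is aggregated with the same scalar.
global_q = rows[0][1]
buggy = {
    key: quantile_nearest([v for v, _ in members], global_q)
    for key, members in groups.items()
}

# Intended behaviour: first() is evaluated inside each group.
fixed = {
    key: quantile_nearest([v for v, _ in members], members[0][1])
    for key, members in groups.items()
}

print(buggy)  # {0: 1.0, 1: 1.0}
print(fixed)  # {0: 1.0, 1: 2.0}
```

The `buggy` result matches the "Actual" line in the reproduction and `fixed` matches the "Expected" line.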

Fix

Added agg_varying_quantile, which accepts a slice of quantile values (one per group) and computes each group's quantile using the existing aggregation helpers.

polars-core changes:

  • Added agg_helper_idx_on_all_with_idx and _agg_helper_slice_with_idx helpers that pass the group index to closures
  • Added agg_varying_quantile_generic that iterates over groups with their corresponding quantile values
  • Added agg_varying_quantile methods to Float32Chunked, Float64Chunked, integer ChunkedArray, Series, and Column
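
The idea behind the index-passing helpers can be sketched in Python as follows. This is only a model under stated assumptions: `agg_with_group_index` and `agg_varying_quantile_model` are hypothetical names, groups are represented as lists of row indices, and the nearest-index quantile rule is chosen for simplicity. The actual Rust signatures are not shown in this PR description.

```python
def agg_with_group_index(groups, f):
    """Call f(group_index, row_indices) for every group and collect the results."""
    return [f(i, idx) for i, idx in enumerate(groups)]


def agg_varying_quantile_model(values, groups, quantiles):
    """Compute one quantile per group, using quantiles[i] for group i."""
    if len(quantiles) != len(groups):
        raise ValueError("expected exactly one quantile per group")

    def per_group(i, idx):
        if not idx:
            return None  # empty group: no value to report
        ordered = sorted(values[j] for j in idx)
        # The group index i is what lets the closure pick its own quantile.
        pos = round(quantiles[i] * (len(ordered) - 1))
        return float(ordered[pos])

    return agg_with_group_index(groups, per_group)


values = [1, 2, 1, 2]
groups = [[0, 1], [2, 3]]  # row indices of each group
print(agg_varying_quantile_model(values, groups, [0.0, 1.0]))  # [1.0, 2.0]
```

Passing the group index into the closure, rather than only the row indices, is what allows a per-group lookup into the quantile slice without changing how groups are iterated.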

polars-expr changes:

  • AggQuantileExpr::evaluate_on_groups() now detects whether the quantile is uniform (literal/scalar) or varies per group, and dispatches to the appropriate path
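
A rough Python model of this dispatch is shown below. How the PR actually detects a uniform quantile is not spelled out in the description, so the length-based check here is an assumption, and `quantile_on_groups` and `quantile_nearest` are hypothetical names.

```python
def quantile_nearest(values, q):
    """Nearest-index quantile of a non-empty list (illustrative helper, not Polars code)."""
    ordered = sorted(values)
    return float(ordered[round(q * (len(ordered) - 1))])


def quantile_on_groups(values, groups, quantiles):
    """Dispatch between a uniform quantile and one quantile per group."""
    if len(quantiles) == 1:
        # Uniform path (assumed check): a single scalar applies to every group.
        per_group_q = quantiles * len(groups)
    elif len(quantiles) == len(groups):
        # Varying path: each group gets its own quantile.
        per_group_q = quantiles
    else:
        raise ValueError("quantile must be a scalar or have one value per group")
    return [
        quantile_nearest([values[i] for i in idx], q)
        for idx, q in zip(groups, per_group_q)
    ]


values = [1, 2, 1, 2]
groups = [[0, 1], [2, 3]]
print(quantile_on_groups(values, groups, [0.0]))       # [1.0, 1.0]
print(quantile_on_groups(values, groups, [0.0, 1.0]))  # [1.0, 2.0]
```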

@github-actions bot added the fix, python, and rust labels Dec 3, 2025
@wtn marked this pull request as ready for review December 3, 2025 18:00
@wtn force-pushed the quantile branch 2 times, most recently from 1269250 to ea1dbcd December 3, 2025 18:27

codecov bot commented Dec 3, 2025

Codecov Report

❌ Patch coverage is 94.35897% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.47%. Comparing base (8ab1657) to head (bc86c4b).

Files with missing lines                                 Patch %   Lines
...s-core/src/frame/group_by/aggregations/dispatch.rs    90.56%    5 Missing ⚠️
...polars-core/src/frame/group_by/aggregations/mod.rs    95.45%    4 Missing ⚠️
crates/polars-expr/src/expressions/aggregation.rs        95.45%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #25606      +/-   ##
==========================================
- Coverage   79.58%   79.47%   -0.11%     
==========================================
  Files        1743     1743              
  Lines      240439   240618     +179     
  Branches     3038     3038              
==========================================
- Hits       191347   191242     -105     
- Misses      48310    48594     +284     
  Partials      782      782              

@wtn force-pushed the quantile branch 3 times, most recently from d9f4626 to 630ab28 December 3, 2025 21:01

Development

Successfully merging this pull request may close these issues.

Varying quantile by group is broken