Add built-in categorical_agg operation for link_selections with categorical Bars
Problem
HoloViews' link_selections is a powerful cross-filtering mechanism, but it only works out of the box with numeric dimensions via hv.operation.histogram. There is no built-in operation for categorical dimensions. Every natural approach a user tries fails:
Failure 1 — histogram() on a categorical dimension
from holoviews.operation import histogram
hist_cat = histogram(points, dimension='category')
ValueError: Categorical data found. Cannot create histogram from categorical data.
histogram explicitly rejects non-numeric data, so it cannot be repurposed for categorical counts.
Failure 2 — Pre-aggregated hv.Bars with link_selections
bars_agg = hv.Bars(
    df.groupby('category').size().reset_index(name='count'),
    kdims='category', vdims='count',
)
layout = ls(points) + ls(bars_agg)
The layout renders initially, but the moment a selection is made (lasso, box-select, or programmatic expression), it raises:
CallbackError: linked_selection aborted because it could not display selection
for all elements: One or more dimensions in the expression dim('x')>0 could not
resolve on ':Dataset [category] (count)' Ensure all dimensions referenced by
the expression are present on the supplied object on ':Bars [category] (count)'.
Because bars_agg was constructed from an independent DataFrame, it has no lineage back to points and cannot resolve the cross-filter expression.
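The missing lineage can be seen without HoloViews at all. The sketch below uses only pandas and mirrors the df from this report; it shows that the aggregated table has no x column for a filter such as dim('x') > 0 to act on. It illustrates the shape of the data only and is not a description of how link_selections resolves expressions internally.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
df = pd.DataFrame({
    'category': rng.choice(['A', 'B', 'C', 'D'], 100),
    'x': rng.normal(size=100),
    'y': rng.normal(size=100),
})

# Same aggregation that feeds bars_agg in Failure 2
agg = df.groupby('category').size().reset_index(name='count')

print(list(agg.columns))   # only 'category' and 'count' survive
print('x' in agg.columns)  # the column the filter needs is gone

# A filter on x works on the source rows ...
filtered_source = df[df['x'] > 0]

# ... but has nothing to act on in the aggregated table
try:
    agg[agg['x'] > 0]
    filter_on_agg_failed = False
except KeyError:
    filter_on_agg_failed = True

print(filter_on_agg_failed)
```

The source rows can be filtered and re-aggregated; the pre-aggregated table cannot, which is the gap an operation with a source reference closes.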
Failure 3 — Raw hv.Bars from unaggregated data
bars_raw = hv.Bars(df, kdims=['category'])
print(bars_raw.kdims) # [Dimension('category')]
print(bars_raw.vdims) # [Dimension('x'), Dimension('y')]
No exception is raised, but the result is silently wrong. HoloViews auto-detects the remaining DataFrame columns (x, y) as vdims, producing 100 individual bars rather than a 4-bar categorical count chart. The element technically renders with link_selections, but the visualization is meaningless — it is not a count (or any aggregation) over categories.
Minimum reproducible example
Self-contained script demonstrating all three failures (HoloViews 1.22.1, Python 3.12):
import holoviews as hv
import numpy as np
import pandas as pd
from holoviews.operation import histogram
from holoviews.util.transform import dim

hv.extension('bokeh')

# --- Sample data ---
rng = np.random.default_rng(42)
df = pd.DataFrame({
    'category': rng.choice(['A', 'B', 'C', 'D'], 100),
    'x': rng.normal(size=100),
    'y': rng.normal(size=100),
})
ls = hv.link_selections.instance()
points = hv.Points(df, kdims=['x', 'y'], vdims=['category'])

# --- Failure 1: histogram on categorical dimension ---
try:
    hist_cat = histogram(points, dimension='category')
    print('Attempt 1 succeeded (unexpected)')
except Exception as e:
    print(f'Attempt 1 (histogram on categorical):\n  {type(e).__name__}: {e}\n')

# --- Failure 2: Pre-aggregated Bars with link_selections ---
bars_agg = hv.Bars(
    df.groupby('category').size().reset_index(name='count'),
    kdims='category', vdims='count',
)
try:
    layout = ls(points) + ls(bars_agg)
    # Trigger a selection to expose the error
    ls.selection_expr = (dim('x') > 0)
    hv.render(layout)
    print('Attempt 2 succeeded (unexpected)')
except Exception as e:
    print(f'Attempt 2 (pre-aggregated Bars):\n  {type(e).__name__}: {e}\n')

# --- Failure 3: Raw Bars from unaggregated data (silent wrong behavior) ---
bars_raw = hv.Bars(df, kdims=['category'])
print(f'Attempt 3 (raw Bars) auto-detected vdims: {bars_raw.vdims}')
print(f'  Expected 4 category-count bars, got {len(bars_raw)} individual rows.')
print('  No exception, but the chart is meaningless: not a categorical aggregation.\n')
Proposed solution — a built-in categorical_agg operation
Add a new operation to holoviews.operation (e.g. categorical_agg) that generalises what histogram does for numeric dimensions to categorical dimensions, supporting arbitrary aggregation functions — not just counting.
Suggested API
from holoviews.operation import categorical_agg
# Count occurrences per category (default)
bars_count = categorical_agg(points, dimension='category')
# Sum a numeric value dimension per category
bars_sum = categorical_agg(points, dimension='category', value_dimension='y', function=np.sum)
# Mean
bars_mean = categorical_agg(points, dimension='category', value_dimension='y', function=np.mean)
# Standard deviation
bars_std = categorical_agg(points, dimension='category', value_dimension='y', function=np.std)
# Min / Max
bars_min = categorical_agg(points, dimension='category', value_dimension='y', function=np.min)
bars_max = categorical_agg(points, dimension='category', value_dimension='y', function=np.max)
# All work seamlessly with link_selections
ls = hv.link_selections.instance()
layout = ls(points) + ls(bars_count) + ls(bars_mean)
Key parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| dimension | str | (required) | Categorical dimension to group by |
| value_dimension | str or None | None | Numeric dimension to aggregate. When None, counts rows per category. |
| function | callable | np.size | Aggregation function applied to value_dimension (or row count when value_dimension is None). Any function accepting an array and returning a scalar: np.sum, np.mean, np.std, np.min, np.max, or a custom callable. |
| label | str or None | None | Label for the value axis. Auto-generated from function name if None (e.g. "mean(y)"). |
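To illustrate the function and label parameters, here is a small sketch of a custom callable and of how an auto-generated label could be derived. The helper names iqr and auto_label are hypothetical; the __name__ lookup with an "agg" fallback mirrors the workaround later in this issue and is not an existing HoloViews API.

```python
from functools import partial

import numpy as np


def iqr(values):
    """Custom aggregation: interquartile range of a 1-D array."""
    q75, q25 = np.percentile(values, [75, 25])
    return q75 - q25


def auto_label(function, value_dimension, label=None):
    """Return the explicit label, else '<function name>(<value_dimension>)'."""
    if label:
        return label
    # Callables without __name__ (e.g. functools.partial) fall back to 'agg'
    func_name = getattr(function, '__name__', 'agg')
    return f"{func_name}({value_dimension})"


print(auto_label(np.mean, 'y'))                        # mean(y)
print(auto_label(iqr, 'y'))                            # iqr(y)
print(auto_label(partial(np.percentile, q=90), 'y'))   # agg(y)
print(auto_label(np.sum, 'y', label='Total y'))        # Total y
print(iqr(np.array([1.0, 2.0, 3.0, 4.0, 5.0])))        # 2.0
```

The fallback matters in practice: lambdas report "<lambda>" and partial objects have no __name__ at all, so an explicit label is likely the better choice for anything other than a plain named function.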
Requirements
- Must be a holoviews.core.Operation subclass so it preserves data lineage: link_selections recurses into the operation's source element to resolve cross-filter expressions, then re-runs the operation on the filtered subset.
- Returns hv.Bars with kdims=[dimension] and vdims=[label].
- Should handle edge cases: empty selections (return zero-height bars for all categories), missing categories in a filtered subset (fill with 0 or NaN as appropriate for the aggregation).
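One way to meet the edge-case requirement is to fix the category set from the unfiltered source and aggregate the filtered subset against it. The sketch below uses plain NumPy. The name aggregate_by_category and its fill_value argument are hypothetical, and the choice of 0 for counts and NaN for other aggregations is an assumption about sensible defaults rather than settled behaviour.

```python
import numpy as np


def aggregate_by_category(cat_vals, num_vals, all_categories,
                          function=np.size, fill_value=np.nan):
    """Aggregate num_vals per category, keeping every entry of all_categories.

    Categories absent from cat_vals get fill_value instead of being dropped.
    """
    # Object dtype keeps the == comparison elementwise even for empty input
    cat_vals = np.asarray(cat_vals, dtype=object)
    num_vals = np.asarray(num_vals, dtype=float)
    results = []
    for cat in all_categories:
        mask = cat_vals == cat
        if mask.any():
            results.append(float(function(num_vals[mask])))
        else:
            results.append(float(fill_value))
    return list(zip(all_categories, results))


all_categories = ['A', 'B', 'C', 'D']

# Filtered subset in which category 'D' no longer appears
cats = ['A', 'A', 'B', 'C']
vals = [1.0, 3.0, 5.0, 7.0]

counts = aggregate_by_category(cats, vals, all_categories, np.size, fill_value=0)
means = aggregate_by_category(cats, vals, all_categories, np.mean)

# Empty selection: every category is still present, with zero height
empty = aggregate_by_category([], [], all_categories, np.size, fill_value=0)

print(counts)
print(means)
print(empty)
```

Keeping the category axis stable across selections also stops the bars from reordering or disappearing as the user brushes, which is arguably as important as the fill value itself.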
Working workaround
Until a built-in operation exists, this custom Operation subclass achieves the same result. The pattern comes from a real-world energy trading dashboard:
import param
import numpy as np
import holoviews as hv
from holoviews.core import Operation


class categorical_agg(Operation):
    """Aggregate a categorical dimension, returning Bars.

    Preserves data lineage back to the source element so that
    link_selections can resolve all source dimensions during cross-filtering.
    """

    dimension = param.String(doc="Categorical dimension to group by")
    value_dimension = param.String(default=None, allow_None=True,
        doc="Numeric dimension to aggregate. None means count rows.")
    function = param.Callable(default=np.size,
        doc="Aggregation function (np.sum, np.mean, np.std, np.min, np.max, ...)")
    label = param.String(default=None, allow_None=True,
        doc="Label for the value axis. Auto-generated if None.")

    def _process(self, element, key=None):
        cat_vals = element.dimension_values(self.p.dimension, expanded=True)
        unique_cats = np.unique(cat_vals)
        if self.p.value_dimension is None:
            # Pure count
            _, counts = np.unique(cat_vals, return_counts=True)
            agg_label = self.p.label or "Count"
            data = list(zip(unique_cats, counts))
        else:
            num_vals = element.dimension_values(self.p.value_dimension, expanded=True)
            results = []
            for cat in unique_cats:
                mask = cat_vals == cat
                results.append(self.p.function(num_vals[mask]))
            func_name = getattr(self.p.function, '__name__', 'agg')
            agg_label = self.p.label or f"{func_name}({self.p.value_dimension})"
            data = list(zip(unique_cats, results))
        return hv.Bars(data, kdims=[self.p.dimension], vdims=[agg_label])
Full working example with cross-filtering
import holoviews as hv
import numpy as np
import pandas as pd
import panel as pn
import param
from holoviews.core import Operation
from holoviews.operation import histogram


class categorical_agg(Operation):
    """Aggregate a categorical dimension, returning Bars.

    Preserves data lineage back to the source element so that
    link_selections can resolve all source dimensions during cross-filtering.
    """

    dimension = param.String(doc="Categorical dimension to group by")
    value_dimension = param.String(
        default=None,
        allow_None=True,
        doc="Numeric dimension to aggregate. None means count rows.",
    )
    function = param.Callable(
        default=np.size,
        doc="Aggregation function (np.sum, np.mean, np.std, np.min, np.max, ...)",
    )
    label = param.String(
        default=None,
        allow_None=True,
        doc="Label for the value axis. Auto-generated if None.",
    )

    def _process(self, element, key=None):
        cat_vals = element.dimension_values(self.p.dimension, expanded=True)
        unique_cats = np.unique(cat_vals)
        if self.p.value_dimension is None:
            # Pure count
            _, counts = np.unique(cat_vals, return_counts=True)
            agg_label = self.p.label or "Count"
            data = list(zip(unique_cats, counts))
        else:
            num_vals = element.dimension_values(self.p.value_dimension, expanded=True)
            results = []
            for cat in unique_cats:
                mask = cat_vals == cat
                results.append(self.p.function(num_vals[mask]))
            func_name = getattr(self.p.function, "__name__", "agg")
            agg_label = self.p.label or f"{func_name}({self.p.value_dimension})"
            data = list(zip(unique_cats, results))
        return hv.Bars(data, kdims=[self.p.dimension], vdims=[agg_label])


hv.extension("bokeh")
pn.extension()

# Sample data: two numeric columns and one categorical column
rng = np.random.default_rng(42)
df = pd.DataFrame(
    {
        "x": rng.normal(0, 1, 200),
        "y": rng.normal(0, 1, 200),
        "category": rng.choice(["A", "B", "C", "D"], 200),
    }
)

ls = hv.link_selections.instance()
points = hv.Points(df, kdims=["x", "y"], vdims=["category"])

# Every linked element is derived from the same source element (points)
scatter = ls(points)
bars_count = ls(categorical_agg(points, dimension="category"))
bars_mean = ls(
    categorical_agg(points, dimension="category", value_dimension="y", function=np.mean)
)
hist_x = ls(histogram(points, dimension="x", num_bins=20))

layout = scatter + bars_count + bars_mean + hist_x
pn.panel(layout, sizing_mode="stretch_both").servable()
Why this works
HoloViews operations maintain a reference to their source element. When link_selections encounters a selection expression like dim('x') > 0, it recurses into the operation's source (points) where x exists, applies the filter, then re-runs the operation on the filtered subset. Pre-aggregated hv.Bars lack this source reference, so the expression cannot resolve and raises CallbackError.
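The mechanism described above can be modelled in a few lines of plain Python. This is a conceptual sketch only: DerivedCounts and its source and select members are hypothetical stand-ins, not HoloViews classes, and the real implementation is considerably more involved. It shows the one idea that matters here: a derived object that remembers its source can filter the source and recompute, whereas a bare aggregate cannot.

```python
from collections import Counter


class DerivedCounts:
    """Toy stand-in for an operation's output that keeps its source rows."""

    def __init__(self, source, dimension):
        self.source = source          # lineage: the unaggregated rows
        self.dimension = dimension
        self.data = self._aggregate(source)

    def _aggregate(self, rows):
        counts = Counter(row[self.dimension] for row in rows)
        return dict(sorted(counts.items()))

    def select(self, predicate):
        """Filter the source rows, then re-run the aggregation."""
        filtered = [row for row in self.source if predicate(row)]
        return DerivedCounts(filtered, self.dimension)


rows = [
    {'category': 'A', 'x': -1.0},
    {'category': 'A', 'x': 0.5},
    {'category': 'B', 'x': 2.0},
    {'category': 'C', 'x': -0.3},
]

bars = DerivedCounts(rows, 'category')
selected = bars.select(lambda row: row['x'] > 0)   # analogous to dim('x') > 0

print(bars.data)      # {'A': 2, 'B': 1, 'C': 1}
print(selected.data)  # {'A': 1, 'B': 1}
```

Note that category C drops out of the selected result entirely, which is the same missing-category behaviour the Requirements section asks a built-in operation to handle.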
Documentation ask
The Linked Brushing user guide currently only demonstrates numeric dimensions with histogram. Please update it to:
- Show the categorical_agg operation (once built in) as the recommended approach for categorical Bars, including examples of count, sum, mean, etc.
- Document the general principle: operations that derive from a source element preserve lineage, enabling link_selections to cross-filter through them.
- Demonstrate how to build and use a custom Operation with linked selections.
Related issues
- #3842: Original linked selections proposal
- PR #3951: Initial link_selections implementation