stacked_eventstudy implements a stacked difference-in-differences estimator with
rolling-window controls by age at first birth.
The current package is built around the heterogeneity-robust stacked estimator described in Melentyeva and Riedel (2025), where each treatment-age cohort is estimated in its own stacked subevent and the resulting event-study coefficients are aggregated using focal cohort observation shares.
The implementation has not yet been checked against the authors' own code, because no replication code is available at the time of writing.
The package currently includes:
- public APIs for validation and estimation
- validation of panel structure, treatment-age consistency, and cohort feasibility
- stacked data construction with rolling-window controls
- cohort-specific and aggregated event-study outputs
- clustered standard errors from a joint stacked regression
- optional pre-birth scaling
- synthetic tests for validation, aggregation, and recovery in simple designs
This repository uses pixi for environment management.

    pixi install

Run tests with:

    pixi run pytest

Input data must be an individual-level panel with:
- an individual identifier
- age in integer years
- age at first birth for every individual
- an outcome variable
Optional inputs:
- calendar year
- covariates
- weights
- a custom clustering variable
The current implementation assumes:
- all individuals are eventually treated
- age is observed in integer years
- treatment age is constant within individual
- there are no duplicate `id`-`age` observations
- if `calendar_year_col` is supplied, there are no duplicate `id`-`calendar_year` observations
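The assumptions above can be checked up front with plain pandas. The sketch below is illustrative only: `check_panel_assumptions` is a hypothetical helper, not part of the package API, and the package's own validation may check more conditions or report them differently.

```python
import pandas as pd


def check_panel_assumptions(df, id_col, age_col, treatment_age_col):
    """Return a list of problems found; an empty list means all checks pass.

    Hypothetical helper for illustration, not the package's validator.
    """
    problems = []
    # Everyone is eventually treated, so treatment age cannot be missing.
    if df[treatment_age_col].isna().any():
        problems.append("missing treatment age")
    # Age must be recorded in whole years.
    ages = df[age_col].dropna()
    if not (ages == ages.round()).all():
        problems.append("non-integer age")
    # Treatment age must not vary within an individual.
    if (df.groupby(id_col)[treatment_age_col].nunique(dropna=False) > 1).any():
        problems.append("treatment age varies within individual")
    # Each id-age pair may appear only once.
    if df.duplicated(subset=[id_col, age_col]).any():
        problems.append("duplicate id-age observations")
    return problems


# Toy panel with two deliberate violations for individual 2.
panel = pd.DataFrame(
    {
        "id": [1, 1, 2, 2, 2],
        "age": [24, 25, 24, 24, 25],
        "treatment_age": [25, 25, 26, 26, 27],
    }
)
problems = check_panel_assumptions(panel, "id", "age", "treatment_age")
print(problems)
```

In this toy panel, individual 2 has two rows at age 24 and a treatment age that changes, so both of those checks fire.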
Let:
- `A_i` be individual `i`'s age at first birth
- `a` be observed age
- `l = a - A_i` be event time
For each treated cohort A:
- treated observations come from individuals with `A_i == A`
- control observations come from individuals with `A_i in {A + 1, ..., A + G}`, where `G = control_window`
- controls are restricted to pre-birth observations, `age < treatment_age`
- both treated and controls are aligned relative to the treated cohort's birth age `A`
That last point matters:
- controls are not aligned to their own birth age
- in subevent `A`, both groups use `event_time = age - A`
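The stacking rule for a single subevent can be sketched in a few lines of pandas. This is a minimal illustration assuming columns named `id`, `age`, and `treatment_age`; `build_one_subevent` is a hypothetical helper, not the package's `build_subevent_stack`, and it omits weights, covariates, and feasibility checks.

```python
import pandas as pd


def build_one_subevent(df, cohort_age, control_window, l_min, l_max):
    """Stack treated and control rows for one treated cohort A = cohort_age.

    Hypothetical helper for illustration only.
    """
    a_i = df["treatment_age"]
    treated = df[a_i == cohort_age].assign(treated=1)
    # Controls: first birth 1..G years later, observed before their own first birth.
    is_control = (a_i > cohort_age) & (a_i <= cohort_age + control_window)
    controls = df[is_control & (df["age"] < a_i)].assign(treated=0)
    stack = pd.concat([treated, controls], ignore_index=True)
    # Both groups are aligned to the treated cohort's birth age, not their own.
    stack["subevent"] = cohort_age
    stack["event_time"] = stack["age"] - cohort_age
    in_window = stack["event_time"].between(l_min, l_max)
    return stack[in_window].reset_index(drop=True)


data = pd.DataFrame(
    {
        "id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
        "age": [24, 25, 26, 24, 25, 26, 24, 25, 26],
        "treatment_age": [25, 25, 25, 26, 26, 26, 27, 27, 27],
    }
)
stack = build_one_subevent(data, cohort_age=25, control_window=2, l_min=-1, l_max=0)
print(stack)
```

For cohort 25, individual 2's row at age 26 is dropped because it is not pre-birth, while individual 3's row at age 26 is pre-birth but falls outside the requested window.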
The estimator then:
- runs a cohort-specific regression for each admissible treatment-age cohort
- estimates a joint stacked model for covariance extraction
- aggregates cohort-specific coefficients using focal treated cohort observation shares
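The aggregation step is a linear combination of cohort-specific coefficients, so its covariance follows from the joint covariance matrix. The NumPy sketch below uses made-up numbers and treats the cohort shares as fixed; that simplification and the coefficient ordering are assumptions of this example, not a description of the package internals.

```python
import numpy as np

# Hypothetical inputs: 2 cohorts x 2 event times, ordered cohort by cohort.
# Order: (cohort 25, l=-1), (cohort 25, l=0), (cohort 26, l=-1), (cohort 26, l=0).
beta = np.array([0.0, -0.30, 0.0, -0.10])
joint_cov = np.diag([0.0, 0.04, 0.0, 0.01])  # joint covariance of beta
shares = np.array([0.75, 0.25])  # focal treated observation shares, sum to 1

n_event_times = 2
# W maps the stacked cohort coefficients to one weighted average per event time.
W = np.kron(shares, np.eye(n_event_times))  # shape (2, 4)

avg = W @ beta                 # aggregated coefficient per event time
avg_cov = W @ joint_cov @ W.T  # covariance of the aggregated coefficients
std_error = np.sqrt(np.diag(avg_cov))

print(avg)
print(std_error)
```

The reference period keeps a zero estimate and zero variance, and the event-time-0 average is `0.75 * (-0.30) + 0.25 * (-0.10) = -0.25`.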
The table below maps the main implementation claims from Section IV of Melentyeva and Riedel (2025) to the package modules and behavioral tests that guard them.
| Paper component | Implementation | Test coverage |
|---|---|---|
| Rolling-window controls use future-treated cohorts and exclude already-treated control observations. | `src/stacked_eventstudy/stacking.py::build_subevent_stack` | `tests/test_paper_alignment.py::test_clean_room_reference_matches_package_cohort_estimates`; `tests/test_paper_alignment.py::test_stacked_data_satisfies_paper_invariants` |
| Subevent-specific event-study indicators are estimated with subevent-by-age fixed effects, unit-by-subevent fixed effects, and clustered standard errors. | `src/stacked_eventstudy/estimation.py::fit_joint_model` | `tests/test_paper_alignment.py::test_clean_room_reference_matches_package_cohort_estimates`; `tests/test_estimation_properties.py::test_pyfixest_backend_matches_statsmodels_average_effects` |
| The requested event-time window must be compatible with the rolling control window and feasible cohort range. | `src/stacked_eventstudy/validate.py::_validate_with_config` | `tests/test_validation.py::test_validate_rejects_infeasible_window`; `tests/test_paper_alignment.py::test_stacked_data_satisfies_paper_invariants` |
| Age-at-birth-specific estimates are aggregated using cohort sample shares and the joint covariance matrix. | `src/stacked_eventstudy/aggregation.py::compute_cohort_weights`; `src/stacked_eventstudy/aggregation.py::aggregate_cohort_params` | `tests/test_estimation_properties.py::test_aggregation_matches_weighted_cohort_average`; `tests/test_paper_alignment.py::test_cohort_weights_use_focal_observation_mass` |
| Pre-birth scaling divides cohort-specific effects and covariances by cohort-specific pre-birth levels. | `src/stacked_eventstudy/scaling.py::compute_pre_birth_levels`; `src/stacked_eventstudy/scaling.py::scale_cohort_params` | `tests/test_estimation_properties.py::test_pre_birth_scaling_matches_manual_scaling` |
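The pre-birth scaling row can be read as dividing each cohort's coefficients by that cohort's pre-birth level, and each covariance entry by the product of the two levels involved. The sketch below illustrates that reading with made-up numbers and treats the levels as fixed constants; both are assumptions of this example, and the package's `scale_cohort_params` may differ in detail.

```python
import numpy as np

# Hypothetical values: one coefficient per cohort at the same event time.
beta = np.array([-2.0, -1.0])
cov = np.array([[0.16, 0.02], [0.02, 0.09]])
# Assumed pre-birth level per cohort (treated mean outcome at the reference period).
levels = np.array([10.0, 20.0])

scaled_beta = beta / levels
# Cov(b_i / c_i, b_j / c_j) = Cov(b_i, b_j) / (c_i * c_j) when levels are fixed.
scaled_cov = cov / np.outer(levels, levels)
scaled_se = np.sqrt(np.diag(scaled_cov))

print(scaled_beta)
print(scaled_se)
```

Scaled this way, the coefficients read as effects relative to the pre-birth outcome level, here a 20 percent and a 5 percent drop.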
Use `validate_stacked_eventstudy(...)` first when you want to inspect whether the requested event-time window and rolling control design are feasible in your sample.
It returns a StackedEventStudyValidation object with:
- `is_valid`
- `errors`
- `warnings`
- `cohort_diagnostics`
- `sample_counts`
- `window_feasibility`
Main estimator signature:

    estimate_stacked_eventstudy(
        data,
        id_col,
        age_col,
        treatment_age_col,
        outcome_col,
        l_min=-3,
        l_max=4,
        control_window=5,
        reference_event_time=-1,
        min_treatment_age=None,
        max_treatment_age=None,
        observed_min_age=None,
        calendar_year_col=None,
        covariates=(),
        weights_col=None,
        cluster_col=None,
        heterogeneity_col=None,
        heterogeneity_weighting="within",
        scale="none",
        backend="statsmodels",
        return_stacked_data=False,
    )

Important arguments:
- `control_window`: width of the future-treated control window
- `reference_event_time`: omitted event time, usually `-1`
- `backend`: regression backend, either `"statsmodels"` or `"pyfixest"`
- `heterogeneity_col`: optional categorical, time-invariant column for group-specific effects
- `heterogeneity_weighting`: group-specific aggregation weights, either `"within"` or `"overall"`
- `scale="pre_birth"`: rescales effects by the treated cohort's mean outcome at the reference period
- `cluster_col`: overrides default clustering on the original individual id
The estimator always requires complete treated and control support in the requested event-time window for an admissible cohort.

    import pandas as pd

    from stacked_eventstudy import (
        estimate_stacked_eventstudy,
        validate_stacked_eventstudy,
    )

    # Three individuals observed at ages 24-26, with first births at 25, 26, and 27.
    data = pd.DataFrame(
        {
            "id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
            "age": [24, 25, 26, 24, 25, 26, 24, 25, 26],
            "treatment_age": [25, 25, 25, 26, 26, 26, 27, 27, 27],
            "outcome": [10.0, 10.5, 11.2, 10.1, 10.6, 10.9, 10.2, 10.7, 11.0],
            "calendar_year": [2004, 2005, 2006, 2004, 2005, 2006, 2004, 2005, 2006],
        }
    )

    # Check first that the requested window and control design are feasible.
    validation = validate_stacked_eventstudy(
        data=data,
        id_col="id",
        age_col="age",
        treatment_age_col="treatment_age",
        outcome_col="outcome",
        l_min=-1,
        l_max=0,
        control_window=2,
        reference_event_time=-1,
        calendar_year_col="calendar_year",
    )

    print("Validation status:", validation.is_valid)
    print("Validation errors:", validation.errors)

    # Estimate only when validation passes, using the same settings.
    if validation.is_valid:
        result = estimate_stacked_eventstudy(
            data=data,
            id_col="id",
            age_col="age",
            treatment_age_col="treatment_age",
            outcome_col="outcome",
            l_min=-1,
            l_max=0,
            control_window=2,
            reference_event_time=-1,
            calendar_year_col="calendar_year",
            return_stacked_data=True,
        )

        print(result.cohort_params)
        print(result.average_params)

A larger executable example is available in `examples/basic_usage.py`.
estimate_stacked_eventstudy(...) returns a StackedEventStudyResult.
`cohort_params` has one row per subevent x event_time, including:
- `subevent`
- `event_time`
- `term_label`
- `estimate`
- `std_error`
- `ci_low`
- `ci_high`
- `scale`
- `n_treated_individuals`
- `n_control_individuals`
- `n_treated_obs`
- `n_control_obs`
When heterogeneity_col is supplied, this table also includes heterogeneity_col and
heterogeneity_value.
`average_params` has one row per event time, including:
- `event_time`
- `estimate`
- `std_error`
- `ci_low`
- `ci_high`
- `n_cohorts`
- `scale`
When heterogeneity_col is supplied, this table is one row per group and event time and
also includes heterogeneity_col, heterogeneity_value, and weight_scheme.
`cohort_weights` has one row per admissible treated cohort:
- `subevent`
- `n_observations`
- `weight_mass`
- `weight`
n_observations counts focal treated cohort rows in the requested event-time window.
weight_mass is the corresponding sum of input weights; in unweighted data this equals
n_observations. The normalized weight is the cohort's weight_mass divided by the
total weight_mass across retained cohorts.
When heterogeneity_col is supplied, weights are returned by group and cohort.
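The weight definitions above amount to a grouped count, a grouped sum, and a normalization. A minimal pandas sketch, assuming a hypothetical table `focal` that already holds only focal treated rows inside the requested event-time window, with an illustrative weight column named `w`:

```python
import pandas as pd

# Hypothetical focal treated rows within the requested event-time window.
focal = pd.DataFrame(
    {
        "subevent": [25, 25, 25, 26, 26],
        "w": [1.0, 1.0, 2.0, 1.0, 3.0],  # input weights; all 1.0 if unweighted
    }
)

weights = (
    focal.groupby("subevent")
    .agg(n_observations=("w", "size"), weight_mass=("w", "sum"))
    .reset_index()
)
# Normalize so that weights sum to one across retained cohorts.
weights["weight"] = weights["weight_mass"] / weights["weight_mass"].sum()

print(weights)
```

Here cohort 25 has three rows and cohort 26 has two, but both carry a weight mass of 4.0, so each receives a normalized weight of 0.5.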
Covariance matrix for the aggregated event-study coefficients, indexed by event time.
When heterogeneity_col is supplied, rows and columns are indexed by
heterogeneity_value and event_time.
Pairwise group differences for each event time when heterogeneity_col is supplied. The
table is empty otherwise.
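A pairwise group difference at one event time is a contrast of two aggregated coefficients, so its standard error depends on their covariance as well as their variances. The sketch below uses made-up numbers and hypothetical group labels `a` and `b`; it illustrates the standard contrast formula, not necessarily how the package computes the table.

```python
import numpy as np

# Hypothetical aggregated estimates for two groups at the same event time.
estimates = {"a": -0.25, "b": -0.10}
# Hypothetical covariance of the two estimates, ordered (a, b).
cov = np.array([[0.0225, 0.0025], [0.0025, 0.0100]])

contrast = np.array([1.0, -1.0])  # picks out a - b
difference = estimates["a"] - estimates["b"]
# Var(a - b) = Var(a) + Var(b) - 2 Cov(a, b)
std_error = float(np.sqrt(contrast @ cov @ contrast))

print(difference, std_error)
```

Ignoring the off-diagonal term would overstate the variance here (0.0325 instead of 0.0275), which is why the joint covariance matters for group comparisons.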
Returned only when return_stacked_data=True. This is useful for debugging cohort
construction and control alignment.
- `cohort_params` tells you how the estimated effect evolves within each admissible treatment-age cohort
- `average_params` tells you the weighted average effect across admissible cohorts at each event time
- `cohort_weights` tells you how much each cohort contributes to that average
- Controls are aligned to the treated cohort's event time, not their own.
- The post-birth horizon is constrained by `control_window`.
- Supported regression backends are `statsmodels` and `pyfixest`.
- There is no conventional benchmark estimator in the current package version.
- The current implementation is aimed at clean panel inputs and synthetic validation first; broader empirical hardening is still ongoing.
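The constraint on the post-birth horizon follows from the control definition: a control whose first birth is `g` years after the treated cohort's contributes only pre-birth rows, so it reaches event times strictly below `g`. The sketch below spells out that arithmetic with a hypothetical helper; the package's own validation may impose further conditions.

```python
def control_offsets_with_support(event_time, control_window):
    """Offsets g = A_i - A whose pre-birth rows can reach this event time.

    Hypothetical helper for illustration only.
    """
    # A control with offset g is observed at ages below A + g,
    # which corresponds to event times strictly below g.
    return [g for g in range(1, control_window + 1) if event_time < g]


# With the default control_window=5, event time 4 is still covered by g = 5.
print(control_offsets_with_support(4, 5))
# Event time 5 has no pre-birth controls left.
print(control_offsets_with_support(5, 5))
```

Under this reading, the largest event time with any control support is `control_window - 1`, which matches the default pairing of `l_max=4` with `control_window=5`.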
Run the included example with:

    PYTHONPATH=src pixi run python examples/basic_usage.py

Run the heterogeneity example with:

    PYTHONPATH=src pixi run python examples/heterogeneity_usage.py

Melentyeva, V., & Riedel, L. (2025). Child penalty estimation and mothers' age at first birth (No. 25-033). ZEW Discussion Papers.