StaggeredDifferenceInDifferences
- class causalpy.experiments.staggered_did.StaggeredDifferenceInDifferences
A class to analyse data from staggered adoption Difference-in-Differences settings.
This estimator uses an imputation-based approach: it fits a model on untreated observations only (pre-treatment periods for eventually-treated units plus all periods for never-treated units), then predicts counterfactual outcomes for all observations. Treatment effects are computed as the difference between observed and predicted outcomes for treated observations.
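The imputation steps described above can be sketched with plain NumPy and pandas. This is a minimal illustration under stated assumptions, not the class's implementation: the toy panel, the noise-free outcome, the true effect of 2.0, and the use of ordinary least squares on unit and time dummies are all chosen here for clarity.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel: units 0 and 1 are never treated,
# unit 2 is treated from t=3 and unit 3 from t=4.
treat_time = {0: np.inf, 1: np.inf, 2: 3, 3: 4}
rows = []
for unit, g in treat_time.items():
    for t in range(6):
        treated = int(t >= g)
        # Untreated outcome = unit effect + time effect; the true effect is 2.0.
        y = 1.0 * unit + 0.5 * t + 2.0 * treated
        rows.append({"unit": unit, "time": t, "treated": treated, "y": y})
df = pd.DataFrame(rows)

# Design matrix with an intercept plus unit and time dummies
# (the analogue of "y ~ 1 + C(unit) + C(time)").
X = pd.get_dummies(
    df[["unit", "time"]].astype("category"), drop_first=True, dtype=float
)
X.insert(0, "intercept", 1.0)

# Step 1: fit on untreated observations only.
untreated = df["treated"] == 0
beta, *_ = np.linalg.lstsq(
    X[untreated].to_numpy(), df.loc[untreated, "y"].to_numpy(), rcond=None
)

# Step 2: predict the untreated (counterfactual) outcome for every row.
df["y_hat0"] = X.to_numpy() @ beta

# Step 3: effect = observed minus counterfactual, defined on treated rows only.
df["tau_hat"] = np.where(df["treated"] == 1, df["y"] - df["y_hat0"], np.nan)
att = df.loc[df["treated"] == 1, "tau_hat"].mean()
```

Because the treated rows never enter the fit, already-treated units are not used as controls for later-treated ones, which is the motivation for imputation over a single two-way fixed effects regression with a treatment dummy.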
- Parameters:
data (pd.DataFrame) – A pandas dataframe with panel data (unit x time observations).
formula (str) – A statistical model formula. Recommended: "y ~ 1 + C(unit) + C(time)" for unit and time fixed effects.
unit_variable_name (str) – Name of the column identifying units.
time_variable_name (str) – Name of the column identifying time periods.
treated_variable_name (str, optional) – Name of the column indicating treatment status (0/1). Defaults to "treated".
treatment_time_variable_name (str, optional) – Name of the column containing unit-level treatment time (G_i). If None, treatment time is inferred from the treated_variable_name column.
never_treated_value (Any, optional) – Value that marks never-treated units in the treatment-time column. Defaults to np.inf.
model (PyMCModel or RegressorMixin, optional) – A model for the untreated outcome. Defaults to None.
event_window (tuple[int, int], optional) – Tuple (min_event_time, max_event_time) to restrict event-time aggregation. If None, uses all available event-times.
reference_event_time (int, optional) – Event-time to use as reference (normalized to zero effect) in plots. Defaults to -1.
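When treatment_time_variable_name is None, the treatment time has to be derived from the 0/1 treatment column. One plausible way to do this is sketched below; it is an assumption about the inference rule (first period with treated == 1, with never-treated units assigned never_treated_value), not a copy of the library's code. The small frame and the column name G are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical panel: unit 0 is never treated, unit 1 is treated
# from t=1 and unit 2 from t=2.
df = pd.DataFrame(
    {
        "unit": [0, 0, 0, 1, 1, 1, 2, 2, 2],
        "time": [0, 1, 2, 0, 1, 2, 0, 1, 2],
        "treated": [0, 0, 0, 0, 1, 1, 0, 0, 1],
    }
)

never_treated_value = np.inf

# First period in which each unit is observed as treated. Units that are
# never treated have no entry here, so mapping them yields NaN.
first_treated = df.loc[df["treated"] == 1].groupby("unit")["time"].min()

# Unit-level treatment time G_i, with never-treated units filled in.
df["G"] = df["unit"].map(first_treated).fillna(never_treated_value)
```

This rule presumes treatment is absorbing, meaning a unit stays treated once it switches on. If the treated column can switch back to 0, supplying an explicit treatment-time column is the safer choice.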
- data_
Augmented data with G (treatment time), event_time, y_hat0 (counterfactual), and tau_hat (treatment effect) columns.
- Type:
pd.DataFrame
- att_group_time_
Group-time ATT estimates: ATT(g, t) for each cohort g and calendar time t.
- Type:
pd.DataFrame
- att_event_time_
Event-time ATT estimates: ATT(e) for each event-time e = t - G.
- Type:
pd.DataFrame
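The relationship between the observation-level effects and the two aggregated tables can be illustrated as follows. This is a sketch under assumptions: the numbers are made up, the aggregation is an unweighted mean of tau_hat, and event_window is treated as inclusive at both ends. The actual attributes may carry additional columns such as uncertainty intervals.

```python
import pandas as pd

# Hypothetical treated observations with observation-level effect estimates.
df = pd.DataFrame(
    {
        "G": [3, 3, 3, 4, 4, 4],
        "time": [3, 4, 5, 4, 5, 6],
        "tau_hat": [1.0, 2.0, 3.0, 3.0, 4.0, 5.0],
    }
)

# Event time e = t - G: periods elapsed since the unit's cohort was treated.
df["event_time"] = df["time"] - df["G"]

# ATT(g, t): average effect for cohort g at calendar time t.
att_group_time = df.groupby(["G", "time"])["tau_hat"].mean()

# ATT(e): average effect at event time e, restricted to an event window.
event_window = (0, 1)  # assumed inclusive on both ends
in_window = df["event_time"].between(*event_window)
att_event_time = df.loc[in_window].groupby("event_time")["tau_hat"].mean()
```

Here both cohorts contribute to event times 0 and 1, so ATT(0) averages 1.0 and 3.0, and ATT(1) averages 2.0 and 4.0. Event time 2 is dropped by the window.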
Example
>>> import causalpy as cp
>>> from causalpy.data.simulate_data import generate_staggered_did_data
>>> df = generate_staggered_did_data(n_units=30, n_time_periods=15, seed=42)
>>> result = cp.StaggeredDifferenceInDifferences(
...     df,
...     formula="y ~ 1 + C(unit) + C(time)",
...     unit_variable_name="unit",
...     time_variable_name="time",
...     treated_variable_name="treated",
...     treatment_time_variable_name="treatment_time",
...     model=cp.pymc_models.LinearRegression(
...         sample_kwargs={
...             "tune": 100,
...             "draws": 200,
...             "chains": 2,
...             "progressbar": False,
...         }
...     ),
... )
References
Borusyak, K., Jaravel, X., & Spiess, J. (2024). Revisiting Event Study Designs: Robust and Efficient Estimation. Review of Economic Studies.
Methods
Generate a decision-ready summary of causal effects.
StaggeredDifferenceInDifferences.fit(*args, ...) – Recover the data of an experiment along with the prediction and causal impact information.
StaggeredDifferenceInDifferences.get_plot_data_bayesian([...]) – Get plotting data for Bayesian model.
Get plotting data for OLS model.
Validate the input data and parameters.
StaggeredDifferenceInDifferences.plot(*args, ...) – Plot the model.
Ask the model to print its coefficients.
Print summary of main results.
Attributes
idata – Return the InferenceData object of the model.
supports_bayes
supports_ols
labels
- __init__(data, formula, unit_variable_name, time_variable_name, treated_variable_name='treated', treatment_time_variable_name=None, never_treated_value=inf, model=None, event_window=None, reference_event_time=-1, **kwargs)
- Parameters:
- Return type:
None
- classmethod __new__(*args, **kwargs)