Causal Effects of Antibiotic Exposure on Antimicrobial Resistance

A multi-site double machine learning study of 1.2 million culture episodes

Aravind V. Kuruvikkattil, Shikhar Shukla, Leo A. Celi, Zanthia Wiley, Judy W. Gichoya, Saptarshi Purkayastha

This repository contains the analysis code for estimating the causal effect of prior antibiotic exposure on subsequent antimicrobial resistance. The analysis uses double machine learning (DML) with GPU-accelerated XGBoost nuisance models across three U.S. health systems: Mass General Brigham (MGB), Stanford Health Care, and Beth Israel Deaconess Medical Center (BIDMC, via the MIMIC-IV dataset).

Data Requirements

MGB: ARMD-MGB v1.0.0 (Wei & Kanjilal, 2025). Requires PhysioNet credentialed access.
Stanford: ARMD-Stanford (Nateghi Haredasht et al., 2025; Oct 22, 2025 release). Available on Dryad under CC0.
BIDMC/MIMIC: MIMIC-IV v3.1 (Johnson et al., 2023). Requires CITI training and a signed data use agreement (DUA).

Raw data files should be placed in:

/data0/armd/ (MGB)
/data0/armd-stanford/ (Stanford)
/data0/mimic-iv/ (BIDMC / MIMIC-IV)
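
Before starting a long run, it can help to confirm that the three directories above are populated. The following is a minimal sketch, not part of the repository: the function name `missing_data_dirs` is hypothetical, and it only checks that each directory exists and is non-empty, not that the expected files are present.

```python
from pathlib import Path

# Directory layout copied from the list above; adjust if your data lives elsewhere.
DATA_DIRS = {
    "MGB": Path("/data0/armd"),
    "Stanford": Path("/data0/armd-stanford"),
    "BIDMC / MIMIC-IV": Path("/data0/mimic-iv"),
}


def missing_data_dirs(data_dirs=DATA_DIRS):
    """Return the site names whose directory is absent or empty (hypothetical helper)."""
    missing = []
    for site, path in data_dirs.items():
        # A directory counts as usable only if it exists and holds at least one entry.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(site)
    return missing


if __name__ == "__main__":
    for site in missing_data_dirs():
        print(f"Missing or empty data directory for {site}: {DATA_DIRS[site]}")
```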

Installation

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Requires Python 3.10+ and an NVIDIA GPU with CUDA support for XGBoost GPU acceleration.
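
A quick way to sanity-check these requirements is sketched below, using only the standard library. The helper `environment_report` is hypothetical. Finding `nvidia-smi` on the PATH suggests an NVIDIA driver is installed, but it does not prove that the installed XGBoost build can use CUDA.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated above


def environment_report():
    """Summarize whether the interpreter and GPU tooling look usable (hypothetical helper)."""
    return {
        "python_ok": sys.version_info >= MIN_PYTHON,
        # nvidia-smi on PATH hints at an installed driver; it is not a CUDA test.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    print(environment_report())
```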

Usage

Run the full analysis pipeline:

./run_analysis.sh          # All 5 notebooks sequentially
./run_analysis.sh part1    # Data pipeline + primary DML only
./run_analysis.sh part2    # Sensitivity analyses only
./run_analysis.sh part3    # Cross-site downstream analyses only
./run_analysis.sh part4    # Empiric failure analysis only
./run_analysis.sh part5    # MRSA outcome + organism distribution by specimen only

Notebooks are executed via jupyter nbconvert --execute with a 2-hour timeout per notebook. Total runtime: approximately 4-6 hours on a single GPU.
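
To execute a single notebook outside the runner script, an equivalent call might look like the sketch below. The exact flags that `run_analysis.sh` passes are not shown in this README, so the output directory (`executed`) and the helper names here are assumptions; `--execute`, `--to notebook`, `--output-dir`, and `--ExecutePreprocessor.timeout` are standard nbconvert options.

```python
import subprocess

TIMEOUT_SECONDS = 2 * 60 * 60  # the 2-hour per-notebook limit described above


def nbconvert_command(notebook, output_dir="executed"):
    """Build an nbconvert invocation; output_dir is an assumed location."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        f"--ExecutePreprocessor.timeout={TIMEOUT_SECONDS}",
        "--output-dir", output_dir,
        notebook,
    ]


def run_notebook(notebook):
    # check=True raises CalledProcessError if execution fails or times out.
    subprocess.run(nbconvert_command(notebook), check=True)


if __name__ == "__main__":
    # Print the command rather than running it, so this is safe to try anywhere.
    print(" ".join(nbconvert_command("notebooks/01_data_pipeline_dml.ipynb")))
```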

The XGBoost nuisance models run on the GPU by default. Notebooks 02 and 03 honor a CV_DEVICE=cpu environment variable for GPU-free re-runs (results are equivalent; runtime is longer); notebooks 01 and 05 require a CUDA GPU.
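
One way a notebook could honor `CV_DEVICE` is sketched below. This is an illustration under stated assumptions rather than the repository's code: it assumes XGBoost 2.0 or later (where hardware is selected with the `device` parameter), defaults to `cuda` to match the GPU-by-default behavior described above, and uses a hypothetical helper name.

```python
import os


def xgb_device_params(env=os.environ):
    """Map CV_DEVICE to XGBoost parameters (hypothetical helper, XGBoost >= 2.0)."""
    device = env.get("CV_DEVICE", "cuda").strip().lower()
    if device not in {"cpu", "cuda"}:
        raise ValueError(f"CV_DEVICE must be 'cpu' or 'cuda', got {device!r}")
    # In XGBoost 2.0+ the "hist" tree method runs on either device.
    return {"tree_method": "hist", "device": device}


if __name__ == "__main__":
    print(xgb_device_params())
```

The returned dictionary could then be unpacked into an estimator, for example `xgboost.XGBClassifier(**xgb_device_params())`.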

Pipeline overview

Step	Notebook	Description	Key outputs
1	`01_data_pipeline_dml.ipynb`	Data build + primary DML, cross-class adjustment, organism-stratified / clustered-bootstrap / propensity-score sensitivity	`*_dml_primary.csv`, `cross_class_7x7_matrix.csv`
2	`02_sensitivity_analyses.ipynb`	IPTW robustness, E-values, window sensitivity, dose-response, calendar-period drift	`iptw_results.csv`, `dml_vs_iptw.csv`, `evalue_sensitivity.csv`
3	`03_cross_site_downstream.ipynb`	Forest plot, heterogeneity, random-effects pooling, time decay, permutations, CEM	`fig1_forest_plot.pdf`, `fig_random_effects_forest.pdf`
4	`04_empiric_failure_analysis.ipynb`	Empiric therapy failure rates, preventable failures, monotherapy vs combination	`ef_regimen_mono_vs_combo.csv`, failure figures
5	`05_mrsa_and_specimen.ipynb`	MRSA as an outcome + organism distribution by specimen type	`mrsa_outcome_dml.csv`, `organism_distribution_by_specimen.csv`

Notebooks must be run in order (each depends on outputs from previous steps).
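
Because of this ordering, it can be useful to confirm that an earlier step produced its key outputs before launching the next one. The sketch below searches for the file names listed in the table; the search root `outputs` and the helper name are assumptions, and the "failure figures" from step 4 are omitted because their file names are not given here.

```python
from pathlib import Path

# Key outputs per step, copied from the pipeline table above.
EXPECTED = {
    1: ["*_dml_primary.csv", "cross_class_7x7_matrix.csv"],
    2: ["iptw_results.csv", "dml_vs_iptw.csv", "evalue_sensitivity.csv"],
    3: ["fig1_forest_plot.pdf", "fig_random_effects_forest.pdf"],
    4: ["ef_regimen_mono_vs_combo.csv"],
    5: ["mrsa_outcome_dml.csv", "organism_distribution_by_specimen.csv"],
}


def missing_outputs(step, root=Path("outputs")):
    """Return expected patterns for `step` with no match under root (assumed location)."""
    if not root.is_dir():
        return list(EXPECTED[step])
    # rglob searches every subdirectory, so results/ and figures/ are both covered.
    return [pattern for pattern in EXPECTED[step] if not any(root.rglob(pattern))]


if __name__ == "__main__":
    for step in sorted(EXPECTED):
        print(step, missing_outputs(step))
```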

Repository Structure

amr_causal/
├── README.md
├── LICENSE                            (MIT)
├── .gitignore
├── requirements.txt
├── run_analysis.sh                    (pipeline runner)
├── validate_pipeline.py               (output validation)
├── notebooks/
│   ├── 01_data_pipeline_dml.ipynb     (data build + primary DML)
│   ├── 02_sensitivity_analyses.ipynb  (IPTW, E-values, window, dose-response)
│   ├── 03_cross_site_downstream.ipynb (heterogeneity, permutations, CEM)
│   ├── 04_empiric_failure_analysis.ipynb (empiric therapy failure)
│   └── 05_mrsa_and_specimen.ipynb     (MRSA outcome + organism by specimen)
├── outputs/
│   ├── data/                          (intermediate CSVs, gitignored)
│   ├── results/                       (analysis result CSVs)
│   └── figures/                       (publication figures)
├── manuscript/
│   ├── manuscript.tex
│   ├── supplementary.tex
│   └── figures/                       (copied from outputs/figures/)
└── executed/                          (executed notebooks, gitignored)

Key Results

Drug Class	MGB (ACE, pp)	Stanford (ACE, pp)	MIMIC-IV (ACE, pp)
Fluoroquinolones	11.9	12.6	8.6
3rd-gen cephalosporins	4.0	5.7	2.8
Carbapenems	4.0	4.7	2.6
Glycopeptides	3.5	5.6	1.8
Sulfonamides	12.8	4.8*	---
Extended-spectrum penicillins	2.8	5.0	1.8
Aminoglycosides	3.3	6.0	3.2

ACE is the average causal effect, reported in percentage points (pp). All estimates have P < 0.001 except Stanford sulfonamides (marked *; P = 0.059; 95% CI −0.2 to 9.7, which includes the null). Sulfonamides were not testable in MIMIC-IV (shown as ---). Estimates adjust for concurrent cross-class exposure and the expanded reviewer-requested confounder set, which attenuates them relative to earlier single-class models.
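
As a rough consistency check on the Stanford sulfonamide row, the reported interval can be converted back into an approximate P value. The sketch assumes a symmetric Wald-type 95% interval on the normal scale; the repository may compute its intervals differently (for example by bootstrap), so this is illustrative only.

```python
import math


def wald_p_from_ci(estimate, ci_low, ci_high, z_crit=1.959964):
    """Two-sided P value implied by a symmetric 95% Wald interval (assumed form)."""
    se = (ci_high - ci_low) / (2 * z_crit)  # half-width divided by the critical value
    z = estimate / se
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


if __name__ == "__main__":
    # Stanford sulfonamides: ACE 4.8 pp, 95% CI -0.2 to 9.7 (values from the table and note above)
    print(f"implied P = {wald_p_from_ci(4.8, -0.2, 9.7):.3f}")
```

Under these assumptions the implied value lands near 0.06, in line with the reported P = 0.059; small differences are expected because the published estimate and interval are rounded to one decimal place.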

Citation

@article{kuruvikkattil2026amr,
  title   = {Prior Antibiotic Exposure and the Causal Risk of Antimicrobial Resistance: A Multi-Site Study of 1.2 Million Culture Episodes},
  author  = {Kuruvikkattil, Aravind V. and Shukla, Shikhar and Celi, Leo A. and Wiley, Zanthia and Gichoya, Judy W. and Purkayastha, Saptarshi},
  year    = {2026},
  journal = {Submitted},
}

License

MIT License. See LICENSE.
