A multi-site double machine learning study of 1.2 million culture episodes
Aravind V. Kuruvikkattil, Shikhar Shukla, Leo A. Celi, Zanthia Wiley, Judy W. Gichoya, Saptarshi Purkayastha
This repository contains the analysis code for estimating the causal effect of prior antibiotic exposure on subsequent antimicrobial resistance using double machine learning (DML) with XGBoost GPU nuisance models across three U.S. health systems: Mass General Brigham (MGB), Stanford Health Care, and Beth Israel Deaconess Medical Center (BIDMC, via the MIMIC-IV dataset).
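The idea behind DML can be sketched in a few lines. The block below is a minimal, stdlib-only illustration of the cross-fitted partialling-out estimator for a partially linear model, not the repository's implementation: it swaps the XGBoost nuisance models for a one-covariate least-squares fit, uses simulated continuous data, and every name in it (`fit_line`, `dml_plr`, `simulate`) is hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def dml_plr(y, d, x, n_folds=2, seed=0):
    """Cross-fitted estimate of theta in y = theta*d + g(x) + noise.

    Nuisance functions E[y|x] and E[d|x] are fit on the training folds and
    used to residualize the held-out fold; theta is the slope of the
    outcome residuals on the exposure residuals.
    """
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    res_y, res_d = [], []
    for k, held_out in enumerate(folds):
        train = [i for j, f in enumerate(folds) if j != k for i in f]
        ay, by = fit_line([x[i] for i in train], [y[i] for i in train])
        ad, bd = fit_line([x[i] for i in train], [d[i] for i in train])
        for i in held_out:
            res_y.append(y[i] - (ay + by * x[i]))
            res_d.append(d[i] - (ad + bd * x[i]))
    num = sum(ry * rd for ry, rd in zip(res_y, res_d))
    den = sum(rd * rd for rd in res_d)
    return num / den


def simulate(n, theta, seed=0):
    """Toy data (assumption): x confounds both exposure d and outcome y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
    y = [theta * di + 1.2 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
    return y, d, x


y, d, x = simulate(4000, theta=0.5)
print(f"estimated theta: {dml_plr(y, d, x):.3f} (simulated truth: 0.5)")
```

In the actual pipeline the exposure and outcome are binary and the nuisance learners are gradient-boosted trees, so treat this only as a picture of the residualize-then-regress structure.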
- MGB: ARMD-MGB v1.0.0 (Wei & Kanjilal, 2025). Requires PhysioNet credentialed access.
- Stanford: ARMD-Stanford (Nateghi Haredasht et al., 2025; Oct 22, 2025 release). Available on Dryad under CC0.
- BIDMC/MIMIC: MIMIC-IV v3.1 (Johnson et al., 2023). Requires CITI training and signed DUA.
Raw data files should be placed in:
/data0/armd/ (MGB)
/data0/armd-stanford/ (Stanford)
/data0/mimic-iv/ (BIDMC / MIMIC-IV)
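Before starting a multi-hour run it can help to confirm that the three data directories are in place. A minimal sketch, assuming only the paths listed above; the helper name `missing_data_dirs` is hypothetical and not part of the repository:

```python
from pathlib import Path

# Expected raw-data locations, taken from the list above.
DATA_DIRS = {
    "MGB": Path("/data0/armd"),
    "Stanford": Path("/data0/armd-stanford"),
    "BIDMC / MIMIC-IV": Path("/data0/mimic-iv"),
}


def missing_data_dirs(dirs=DATA_DIRS):
    """Return the site labels whose data directory does not exist."""
    return [site for site, path in dirs.items() if not path.is_dir()]


missing = missing_data_dirs()
if missing:
    print("Missing data directories for:", ", ".join(missing))
else:
    print("All data directories found.")
```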
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Requires Python 3.10+ and an NVIDIA GPU with CUDA support for XGBoost GPU acceleration.
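A quick pre-flight check of these requirements might look like the sketch below. It is an assumption-laden convenience, not a repository script: the function name `check_environment` is hypothetical, and finding `nvidia-smi` on the PATH is only a proxy for a usable CUDA GPU, not proof that XGBoost can use it.

```python
import shutil
import sys


def check_environment(min_python=(3, 10)):
    """Report whether the interpreter is new enough and nvidia-smi is on PATH."""
    return {
        "python_ok": sys.version_info >= min_python,
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


for name, ok in check_environment().items():
    print(f"{name}: {'yes' if ok else 'no'}")
```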
Run the full analysis pipeline:
./run_analysis.sh # All 5 notebooks sequentially
./run_analysis.sh part1 # Data pipeline + primary DML only
./run_analysis.sh part2 # Sensitivity analyses only
./run_analysis.sh part3 # Cross-site downstream analyses only
./run_analysis.sh part4 # Empiric failure analysis only
./run_analysis.sh part5 # MRSA outcome + organism distribution by specimen only
Notebooks are executed via jupyter nbconvert --execute with a 2-hour timeout per notebook.
Total runtime: approximately 4-6 hours on a single GPU.
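For readers who want to execute a single notebook outside the runner script, the nbconvert invocation can be sketched as follows. The exact flags used by run_analysis.sh are not shown here, so the output directory and the helper names (`nbconvert_command`, `run_notebook`) are assumptions; only the `--execute` flag and the 2-hour timeout come from the description above.

```python
import subprocess


def nbconvert_command(notebook, timeout_s=2 * 60 * 60, output_dir="executed"):
    """Build a jupyter nbconvert command that executes one notebook."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        f"--ExecutePreprocessor.timeout={timeout_s}",  # per-cell limit in seconds
        "--output-dir", output_dir,
        notebook,
    ]


def run_notebook(notebook):
    """Execute the notebook; raises CalledProcessError if nbconvert fails."""
    subprocess.run(nbconvert_command(notebook), check=True)


# Print the command instead of running it, so the sketch has no side effects.
print(" ".join(nbconvert_command("notebooks/01_data_pipeline_dml.ipynb")))
```

Note that nbconvert's `ExecutePreprocessor.timeout` applies per cell, so how the script enforces a per-notebook limit may differ from this sketch.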
The XGBoost nuisance models run on the GPU by default. Notebooks 02 and 03 honor a
CV_DEVICE=cpu environment variable for GPU-free re-runs (results are equivalent;
runtime is longer); notebooks 01 and 05 require a CUDA GPU.
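One plausible way for a notebook to honor CV_DEVICE is sketched below. How notebooks 02 and 03 actually read the variable is not documented here, so the function name `xgb_device_params` and the default value are assumptions; the `device` and `tree_method` keys are standard XGBoost (2.0+) parameters.

```python
import os


def xgb_device_params(default="cuda"):
    """Map the CV_DEVICE environment variable to XGBoost parameters."""
    device = os.environ.get("CV_DEVICE", default).strip().lower()
    if device not in {"cpu", "cuda"}:
        raise ValueError(f"CV_DEVICE must be 'cpu' or 'cuda', got {device!r}")
    # tree_method="hist" works on both devices in XGBoost 2.0 and later.
    return {"tree_method": "hist", "device": device}


print(xgb_device_params())
```

With this pattern, a GPU-free re-run would be started as `CV_DEVICE=cpu ./run_analysis.sh part2`.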
| Step | Notebook | Description | Key outputs |
|---|---|---|---|
| 1 | 01_data_pipeline_dml.ipynb | Data build + primary DML, cross-class adjustment, organism-stratified / clustered-bootstrap / propensity-score sensitivity | *_dml_primary.csv, cross_class_7x7_matrix.csv |
| 2 | 02_sensitivity_analyses.ipynb | IPTW robustness, E-values, window sensitivity, dose-response, calendar-period drift | iptw_results.csv, dml_vs_iptw.csv, evalue_sensitivity.csv |
| 3 | 03_cross_site_downstream.ipynb | Forest plot, heterogeneity, random-effects pooling, time decay, permutations, CEM | fig1_forest_plot.pdf, fig_random_effects_forest.pdf |
| 4 | 04_empiric_failure_analysis.ipynb | Empiric therapy failure rates, preventable failures, monotherapy vs combination | ef_regimen_mono_vs_combo.csv, failure figures |
| 5 | 05_mrsa_and_specimen.ipynb | MRSA as an outcome + organism distribution by specimen type | mrsa_outcome_dml.csv, organism_distribution_by_specimen.csv |
Notebooks must be run in order (each depends on outputs from previous steps).
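The E-values reported by notebook 02 quantify how strong an unmeasured confounder would need to be to explain away an estimate. As a minimal sketch, the standard formula on the risk-ratio scale is RR + sqrt(RR × (RR − 1)); how the notebook converts the percentage-point ACEs to a ratio scale is not described here, so the function name `e_value` and the example input are purely illustrative.

```python
import math


def e_value(rr):
    """E-value for a risk ratio (VanderWeele and Ding formula)."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr  # protective effects are inverted first
    return rr + math.sqrt(rr * (rr - 1))


# Hypothetical input, not a result from this study.
print(f"E-value for RR = 2.0: {e_value(2.0):.2f}")
```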
amr_causal/
├── README.md
├── LICENSE (MIT)
├── .gitignore
├── requirements.txt
├── run_analysis.sh (pipeline runner)
├── validate_pipeline.py (output validation)
├── notebooks/
│ ├── 01_data_pipeline_dml.ipynb (data build + primary DML)
│ ├── 02_sensitivity_analyses.ipynb (IPTW, E-values, window, dose-response)
│ ├── 03_cross_site_downstream.ipynb (heterogeneity, permutations, CEM)
│ ├── 04_empiric_failure_analysis.ipynb (empiric therapy failure)
│ └── 05_mrsa_and_specimen.ipynb (MRSA outcome + organism by specimen)
├── outputs/
│ ├── data/ (intermediate CSVs, gitignored)
│ ├── results/ (analysis result CSVs)
│ └── figures/ (publication figures)
├── manuscript/
│ ├── manuscript.tex
│ ├── supplementary.tex
│ └── figures/ (copied from outputs/figures/)
└── executed/ (executed notebooks, gitignored)
| Drug Class | MGB (ACE, pp) | Stanford (ACE, pp) | MIMIC-IV (ACE, pp) |
|---|---|---|---|
| Fluoroquinolones | 11.9 | 12.6 | 8.6 |
| 3rd-gen cephalosporins | 4.0 | 5.7 | 2.8 |
| Carbapenems | 4.0 | 4.7 | 2.6 |
| Glycopeptides | 3.5 | 5.6 | 1.8 |
| Sulfonamides | 12.8 | 4.8* | --- |
| Ext-spec penicillins | 2.8 | 5.0 | 1.8 |
| Aminoglycosides | 3.3 | 6.0 | 3.2 |
ACE is the average causal effect in percentage points (pp). All P < 0.001 except *Stanford sulfonamides (P = 0.059; 95% CI −0.2 to 9.7, which crosses the null). Sulfonamides were not testable in MIMIC-IV. Estimates adjust for concurrent cross-class exposure and the expanded reviewer-requested confounder set, which attenuates the ACEs relative to earlier single-class models.
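The Stanford sulfonamide figures can be sanity-checked with a little arithmetic. The sketch below backs out a standard error from the reported 95% CI and recomputes a two-sided P value, assuming a symmetric normal-approximation interval; the pipeline may derive its intervals differently (for example by bootstrap), so small differences from the reported P = 0.059 are expected. The function name `p_from_ci` is hypothetical.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Two-sided P value implied by a symmetric normal 95% CI."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


p = p_from_ci(4.8, -0.2, 9.7)
print(f"implied two-sided P: {p:.3f}")
```

The implied value sits just above 0.05, in line with the interval crossing zero.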
@article{kuruvikkattil2026amr,
title = {Prior Antibiotic Exposure and the Causal Risk of Antimicrobial Resistance: A Multi-Site Study of 1.2 Million Culture Episodes},
author = {Kuruvikkattil, Aravind V. and Shukla, Shikhar and Celi, Leo A. and Wiley, Zanthia and Gichoya, Judy W. and Purkayastha, Saptarshi},
year = {2026},
journal = {Submitted},
}
MIT License. See LICENSE.