
Commit 6851520

gigikenneth, bms63, manciniedoardo, rossfarrugia, and StefanThoma authored
Closes #239 packages for managing clinical trial data (#273)
* packages for managing clinical trial data * Update WORDLIST.txt * Delete posts/2025-02-17_theres_a_pharmaverse_package_for_that/appendix.R * Update managing-clinical-trial-data.qmd * Update managing-clinical-trial-data.qmd * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Ben Straub <[email protected]> * Update managing-clinical-trial-data.qmd * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Ben Straub <[email protected]> * Update WORDLIST.txt * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Ross Farrugia <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Ross Farrugia <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update 
posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update managing-clinical-trial-data.qmd * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update managing-clinical-trial-data.qmd * Update managing-clinical-trial-data.qmd * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Edoardo Mancini <[email protected]> * Update managing-clinical-trial-data.qmd * Update managing-clinical-trial-data.qmd * Update managing-clinical-trial-data.qmd * Update managing-clinical-trial-data.qmd * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: StefanThoma <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: StefanThoma <[email protected]> * Update 
posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: StefanThoma <[email protected]> * Update posts/2025-02-17_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: StefanThoma <[email protected]> * Update managing-clinical-trial-data.qmd * Update managing-clinical-trial-data.qmd * Update WORDLIST.txt * Update managing-clinical-trial-data.qmd * Updates following review * Date change * Update posts/2025-02-28_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Ross Farrugia <[email protected]> * Update posts/2025-02-28_theres_a_pharmaverse_package_for_that/managing-clinical-trial-data.qmd Co-authored-by: Ross Farrugia <[email protected]> * spelling --------- Co-authored-by: Ben Straub <[email protected]> Co-authored-by: Edoardo Mancini <[email protected]> Co-authored-by: Ross Farrugia <[email protected]> Co-authored-by: StefanThoma <[email protected]> Co-authored-by: Edoardo Mancini <[email protected]>
1 parent ececc29 commit 6851520

9 files changed

+325
-0
lines changed

inst/WORDLIST.txt

+65
Original file line numberDiff line numberDiff line change
@@ -1150,3 +1150,68 @@ zxqguo
11501150
Żyła
11511151
ZZHPh
11521152
zzz
1153+
biostatistics
1154+
covtracer
1155+
cykuXxFc
1156+
danieldsjoberg
1157+
datacutr
1158+
dm
1159+
ds
1160+
DSCAT
1161+
DSDECOD
1162+
DSST
1163+
DSSTDT
1164+
DSSTDTC
1165+
EXEN
1166+
EXENDT
1167+
EXENDTC
1168+
EXST
1169+
EXSTDT
1170+
EXSTDTC
1171+
grflaRJu
1172+
gtsummary
1173+
hy
1174+
Kaplan
1175+
KlQ
1176+
lXR
1177+
MkRIRlA
1178+
nADSL
1179+
nSummary
1180+
openpharma
1181+
PLexAKolMzPcpzPAXNU
1182+
rdrr
1183+
responsepage
1184+
SAFFL
1185+
SDE
1186+
sdtmchecks
1187+
shorturl
1188+
storybench
1189+
Sunil
1190+
tfrmt
1191+
thevalidatoR
1192+
TlZSNDhUSC
1193+
tryCatch
1194+
UAPF
1195+
UMTE
1196+
UpIFsuEPCx
1197+
VC
1198+
visR
1199+
VpX
1200+
xeEJLj
1201+
EMA
1202+
ACTARMCD
1203+
CDISCPILOT
1204+
DMDTC
1205+
DMDY
1206+
DTHDTC
1207+
NA's
1208+
Pbo
1209+
QC'ing
1210+
Qu
1211+
RFENDTC
1212+
RFPENDTC
1213+
RFSTDTC
1214+
RFXENDTC
1215+
RFXSTDTC
1216+
SITEID
1217+
Xan

posts/2025-02-28_theres_a_pharmaverse_package_for_that/appendix.R

Whitespace-only changes.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,260 @@
1+
---
2+
title: "Working with Clinical Trial Data? There’s a Pharmaverse Package for That"
3+
author:
4+
- name: Gift Kenneth
5+
- name: Sunil Gupta
6+
- name: APPSILON
7+
description: "Looking for R packages to manage clinical trial data? Pharmaverse has tools for every stage from data collection to submission!"
8+
date: "2025-02-28"
9+
# please do not use any non-default categories.
10+
# You can find the default categories in the repository README.md
11+
categories: [Technical, Community]
12+
# feel free to change the image
13+
image: "pharmaverse-post.png"
14+
---
15+
16+
<!--------------- typical setup ----------------->
17+
18+
```{r setup, include=FALSE}
19+
long_slug <- "2025-02-28_managing_clinical_trial_data..."
20+
# renv::use(lockfile = "renv.lock")
21+
22+
library(admiraldev)
23+
```
24+
25+
<!--------------- post begins here ----------------->
26+
27+
Working with clinical trial data is no small task. It needs to be precise, compliant, and efficient. Traditionally, this meant using proprietary tools and working within siloed systems, which often made the process more complicated and expensive than necessary. But we think there’s a better way.
28+
29+
The [**pharmaverse**](https://pharmaverse.org/) is an open-source ecosystem of R packages built specifically for clinical trials. These tools integrate seamlessly with the [Tidyverse](https://www.tidyverse.org/), making data management more flexible, efficient, and transparent.
30+
31+
Whether you’re collecting, validating, analyzing, or preparing data for regulatory submission, there’s a pharmaverse package designed to support your workflow and help you work smarter.
32+
33+
This post covers:
34+
35+
- Key stages of clinical trials and the R packages that support them
36+
37+
- Creating ADSL datasets and essential programming steps
38+
39+
- Key players in pharmaverse and whether you need all packages
40+
41+
- How pharmaverse compares to Tidyverse and how to learn it
42+
43+
By the end, you'll have a clear understanding of how pharmaverse supports clinical trial operations and how to apply these tools in your work.
44+
45+
## Key Stages of Clinical Reporting
46+
47+
Managing clinical trial data involves multiple stages, each with its own challenges. **Pharmaverse** provides a range of R packages that support different parts of the process, sometimes even offering multiple options for the same task. This flexibility allows organizations to choose the best tools for their specific needs rather than sticking to a one-size-fits-all approach.
48+
49+
A **metadata-driven approach** helps ensure that clinical trial data is consistently structured and aligned with regulatory standards. The typical process follows this sequence:
50+
51+
**Metadata** → **OAK** → **Admiral** → **Define.xml** → **TLGs** → **Submissions**
52+
53+
Some examples of **pharmaverse** packages that support clinical reporting include:
54+
55+
- {diffdf} – Tracking differences in datasets.
56+
- {metatools} – Metadata management and transformation.
57+
- {sdtm.oak} – The primary **pharmaverse** package for SDTM dataset creation.
58+
- {datacutr} – Performing data cuts.
59+
- {admiral} – Standardized data derivations.
60+
- {metacore} – Metadata-driven structures.
61+
- The **pharmaverse** provides multiple table-making packages, such as {chevron} (which builds on {rtables}), {Tplyr}, {pharmaRTF}, {gtsummary}, {cards}, {tfrmt}, and {tidytlg}. More tools are listed on the [TLGs page](https://pharmaverse.org/e2eclinical/tlg/).
62+
- {xportr} – CDISC-compliant dataset export.
63+
- {pkglite} – Packing R package source code into plain-text files for submission.
64+
- {metacore} and {metatools} – For standardized metadata structures and validation.
65+
- {logrx} – For logging R scripts.
66+
67+
Pharmaverse packages are built on top of **[Tidyverse](https://www.tidyverse.org/)** tools and integrate seamlessly with packages like {dplyr} for data manipulation and {ggplot2} for visualization.
68+
69+
> **Note:** This post highlights some key **pharmaverse** packages relevant to clinical reporting. For a full and up-to-date list, visit the [Pharmaverse website](https://pharmaverse.org/). If there's a package we missed that should be included, let us know, and we’d be happy to update this post.
70+
71+
By using these tools, organizations can optimize their data pipeline, ensuring clinical data is well-structured and ready for regulatory submission with ease.
72+
73+
## **Example: Creating ADSL**
74+
75+
Building an ADSL dataset involves several key steps, from reading in data to deriving treatment variables and population flags. While these steps apply regardless of the tools used, **pharmaverse packages like {admiral} simplify the process** with functions designed for CDISC-compliant datasets.
76+
77+
This example is based on the [ADSL template](https://cran.r-project.org/web/packages/admiral/vignettes/adsl.html), which provides a structured approach to creating an ADSL dataset.
78+
79+
#### **Step 1: Read in Data**
80+
81+
To begin, the SDTM source datasets are read in. A full ADSL typically draws on domains such as **DM, EX, DS, AE, and LB**; this example only needs DM, EX, and DS. The {pharmaversesdtm} package provides sample CDISC SDTM datasets:
82+
83+
``` {R, eval = TRUE}
84+
library(admiral)
85+
library(dplyr, warn.conflicts = FALSE)
86+
library(pharmaversesdtm)
87+
library(stringr)
88+
89+
# Load sample data
90+
data("dm", package = "pharmaversesdtm")
91+
data("ex", package = "pharmaversesdtm")
92+
data("ds", package = "pharmaversesdtm")
93+
```
94+
95+
ADSL is typically **built from the DM dataset**, removing unnecessary columns and adding treatment variables in one step:
96+
97+
``` {R, eval = TRUE}
98+
adsl <- dm %>%
99+
select(-DOMAIN) %>%
100+
mutate(
101+
TRT01P = ARM,
102+
TRT01A = ACTARM
103+
)
104+
```
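
Since ADSL holds one record per subject, it can be worth checking that property right after this step. The sketch below repeats the same `select()` and `mutate()` on a made-up two-subject data frame; `dm_demo` and all of its values are assumptions for illustration, not the sample data loaded above.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical DM-like input with two subjects
dm_demo <- data.frame(
  STUDYID = "STUDY01",
  DOMAIN  = "DM",
  USUBJID = c("S1", "S2"),
  ARM     = c("Placebo", "Xanomeline High Dose"),
  ACTARM  = c("Placebo", "Xanomeline High Dose")
)

# Same pattern as above: drop DOMAIN, copy ARM/ACTARM to TRT01P/TRT01A
adsl_demo <- dm_demo %>%
  select(-DOMAIN) %>%
  mutate(TRT01P = ARM, TRT01A = ACTARM)

# ADSL should hold exactly one record per subject
stopifnot(!anyDuplicated(adsl_demo[c("STUDYID", "USUBJID")]))
names(adsl_demo)
```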
105+
106+
#### **Step 2: Derive Treatment Dates**
107+
108+
Using {admiral}, we convert the character start and end dates in the EX dataset (`EXSTDTC`, `EXENDTC`) into date variables (`EXSTDT`, `EXENDT`) with `derive_vars_dt()`:
109+
110+
``` {R, eval = TRUE}
111+
ex_ext <- ex %>%
112+
filter(!is.na(USUBJID)) %>%
113+
derive_vars_dt(
114+
dtc = EXSTDTC,
115+
new_vars_prefix = "EXST"
116+
) %>%
117+
derive_vars_dt(
118+
dtc = EXENDTC,
119+
new_vars_prefix = "EXEN"
120+
)
121+
```
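
To make the conversion concrete, here is a minimal sketch on a made-up two-row data frame (`ex_demo` and its values are illustrative assumptions, not part of the sample data). With its default settings, `derive_vars_dt()` does not impute, so a partial date comes back as `NA`:

```r
library(admiral)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy input: one complete and one partial ISO 8601 date
ex_demo <- data.frame(
  USUBJID = c("S1", "S2"),
  EXSTDTC = c("2014-01-02", "2014-01")
)

ex_demo_dt <- ex_demo %>%
  derive_vars_dt(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  )

# EXSTDT is a Date column; the partial date is left missing
class(ex_demo_dt$EXSTDT)
is.na(ex_demo_dt$EXSTDT)
```

If partial dates should be imputed rather than left missing, the `highest_imputation` and `date_imputation` arguments of `derive_vars_dt()` control that; the function documentation describes the available options.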
122+
123+
Then merge the first and last valid dosing dates into ADSL as `TRTSDT` and `TRTEDT`:
124+
125+
``` {R, eval = TRUE}
126+
adsl <- adsl %>%
127+
derive_vars_merged(
128+
dataset_add = ex_ext,
129+
filter_add = (EXDOSE > 0 |
130+
(EXDOSE == 0 &
131+
str_detect(EXTRT, "PLACEBO"))) & !is.na(EXSTDT),
132+
new_vars = exprs(TRTSDT = EXSTDT),
133+
order = exprs(EXSTDT, EXSEQ),
134+
mode = "first",
135+
by_vars = exprs(STUDYID, USUBJID)
136+
) %>%
137+
derive_vars_merged(
138+
dataset_add = ex_ext,
139+
filter_add = (EXDOSE > 0 |
140+
(EXDOSE == 0 &
141+
str_detect(EXTRT, "PLACEBO"))) & !is.na(EXENDT),
142+
new_vars = exprs(TRTEDT = EXENDT),
143+
order = exprs(EXENDT, EXSEQ),
144+
mode = "last",
145+
by_vars = exprs(STUDYID, USUBJID)
146+
)
147+
```
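
Conceptually, each `derive_vars_merged()` call above filters the exposure records, sorts them within subject, keeps the first (or last) record, and left-joins the selected date onto ADSL. The dplyr-only sketch below mimics that logic for `TRTSDT` on made-up data. `adsl_demo`, `ex_demo`, and their values are assumptions for illustration, and `grepl()` stands in for `str_detect()`; it is not a substitute for the {admiral} function, which also validates its inputs.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical subject-level and exposure data
adsl_demo <- data.frame(USUBJID = c("S1", "S2"))

ex_demo <- data.frame(
  USUBJID = c("S1", "S1", "S1", "S2"),
  EXSEQ   = c(1, 2, 3, 1),
  EXDOSE  = c(0, 54, 81, 0),
  EXTRT   = c("XANOMELINE", "XANOMELINE", "XANOMELINE", "PLACEBO"),
  EXSTDT  = as.Date(c("2014-01-01", "2014-01-15", "2014-02-01", "2014-03-01"))
)

# Keep valid doses only: a positive dose, or a zero dose of placebo
valid_doses <- ex_demo %>%
  filter((EXDOSE > 0 | (EXDOSE == 0 & grepl("PLACEBO", EXTRT))) & !is.na(EXSTDT))

# mode = "first": earliest record per subject after ordering by date and sequence
first_dose <- valid_doses %>%
  arrange(USUBJID, EXSTDT, EXSEQ) %>%
  group_by(USUBJID) %>%
  slice(1) %>%
  ungroup() %>%
  select(USUBJID, TRTSDT = EXSTDT)

# Left join keeps every ADSL subject, with or without a valid dose
adsl_demo <- left_join(adsl_demo, first_dose, by = "USUBJID")
adsl_demo
```

In this toy data the zero-dose active record for S1 is filtered out, so S1's start date comes from its second exposure record, while S2's zero-dose placebo record counts as a valid dose.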
148+
149+
#### **Step 3: Derive End of Study (EOS) Status**
150+
151+
The disposition dataset (DS) is used to determine when a patient exited the study:
152+
153+
``` {R, eval = TRUE}
154+
ds_ext <- ds %>%
155+
filter(!is.na(DSSTDTC)) %>%
156+
derive_vars_dt(
157+
dtc = DSSTDTC,
158+
new_vars_prefix = "DSST"
159+
)
160+
161+
adsl <- adsl %>%
162+
derive_vars_merged(
163+
dataset_add = ds_ext,
164+
by_vars = exprs(STUDYID, USUBJID),
165+
new_vars = exprs(EOSDT = DSSTDT),
166+
filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
167+
)
168+
```
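
Because no `order` and `mode` are supplied in this merge, it assumes at most one matching disposition record per subject. A quick pre-check along the following lines can confirm that assumption before merging; the `ds_demo` data is made up for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical DS-like records for three subjects
ds_demo <- data.frame(
  USUBJID = c("S1", "S1", "S2", "S3"),
  DSCAT   = c("PROTOCOL MILESTONE", "DISPOSITION EVENT",
              "DISPOSITION EVENT", "DISPOSITION EVENT"),
  DSDECOD = c("RANDOMIZED", "COMPLETED", "ADVERSE EVENT", "SCREEN FAILURE")
)

# Apply the same filter as the merge, then count records per subject
eos_counts <- ds_demo %>%
  filter(DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE") %>%
  count(USUBJID, name = "n_records")

# Subjects with more than one candidate record would need an order/mode rule
duplicates <- filter(eos_counts, n_records > 1)
nrow(duplicates) == 0
```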
169+
170+
#### **Step 4: Assign Population Flags**
171+
172+
For the safety population flag (`SAFFL`), we check whether the patient has at least one exposure record with a non-zero dose or a placebo treatment:
173+
174+
``` {R, eval = TRUE}
175+
adsl <- adsl %>%
176+
derive_var_merged_exist_flag(
177+
dataset_add = ex,
178+
by_vars = exprs(STUDYID, USUBJID),
179+
new_var = SAFFL,
180+
condition = EXDOSE > 0 | str_detect(EXTRT, "PLACEBO")
181+
)
182+
```
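
By default, `derive_var_merged_exist_flag()` sets the flag to `"Y"` when at least one record meets the condition and leaves it missing otherwise. The dplyr-only sketch below reproduces that behaviour on made-up data (`adsl_demo` and `ex_demo` are illustrative assumptions, and `grepl()` stands in for `str_detect()`):

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical data: S3 has no exposure records at all
adsl_demo <- data.frame(USUBJID = c("S1", "S2", "S3"))

ex_demo <- data.frame(
  USUBJID = c("S1", "S2", "S2"),
  EXDOSE  = c(54, 0, 0),
  EXTRT   = c("XANOMELINE", "PLACEBO", "PLACEBO")
)

# Subjects with at least one record meeting the dosing condition
dosed <- ex_demo %>%
  filter(EXDOSE > 0 | grepl("PLACEBO", EXTRT)) %>%
  distinct(USUBJID)

# Flag dosed subjects with "Y"; everyone else stays NA
adsl_demo <- adsl_demo %>%
  mutate(SAFFL = if_else(USUBJID %in% dosed$USUBJID, "Y", NA_character_))
adsl_demo
```

If `"N"` is preferred over a missing value, the `false_value` and `missing_value` arguments of `derive_var_merged_exist_flag()` can be set accordingly.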
183+
184+
#### **Step 5: Generate and Save Results**
185+
186+
Finally, we save the dataset as a CSV file and view some of its columns:
187+
188+
```{R, eval = FALSE}
189+
# Save to a CSV file
190+
write.csv(adsl, "adsl_output.csv", row.names = FALSE)
191+
192+
adsl
193+
```
194+
195+
```{r, eval=TRUE, echo=FALSE}
196+
adsl %>%
197+
dataset_vignette(
198+
display_vars = exprs(USUBJID, TRT01P, TRT01A, TRTSDT, TRTEDT, SAFFL)
199+
)
200+
```
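
One thing to keep in mind with the CSV export above is that CSV stores no type information, so date variables such as `TRTSDT` do not come back as dates when the file is read again. A small base R sketch with a made-up data frame (`adsl_demo` is an assumption for illustration):

```r
# Hypothetical one-row ADSL-like data frame with a Date column
adsl_demo <- data.frame(
  USUBJID = "S1",
  TRTSDT  = as.Date("2014-01-15")
)

csv_path <- tempfile(fileext = ".csv")
write.csv(adsl_demo, csv_path, row.names = FALSE)

# Without further instructions the date column is not read back as a Date
adsl_plain <- read.csv(csv_path)
class(adsl_plain$TRTSDT)

# Supplying colClasses restores the Date type on import
adsl_date <- read.csv(csv_path, colClasses = c(TRTSDT = "Date"))
class(adsl_date$TRTSDT)
```

For submission-ready transport files rather than CSV, {xportr} from the package list above is the more appropriate tool.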
201+
202+
#### **More Details on ADSL Creation**
203+
204+
This is just a **high-level example**; the full process includes deriving death variables, grouping populations, and applying labels. For a deeper dive, check out the [ADSL Implementation Guide](https://cran.r-project.org/web/packages/admiral/vignettes/adsl.html).
205+
206+
## **Who Are the Key Players in Pharmaverse, and Do You Need to Use All Packages?**
207+
208+
### **Key Players in pharmaverse**
209+
210+
- **Pharmaverse Council and Community** – A collaborative group of developers, industry experts, and contributors maintaining and expanding the ecosystem.
211+
- **Open-Source Contributors** – Individuals and organizations developing and refining **pharmaverse** packages.
212+
- **Pharmaverse is part of [PHUSE](https://phuse.global/)** – PHUSE plays an active role in supporting and advancing the **pharmaverse** initiative.
213+
- **The pharmaverse community collaborates with organizations like the FDA, EMA, R Consortium, and CDISC** to align with industry standards and best practices for clinical data reporting.
214+
215+
216+
### **Do You Need to Use All Pharmaverse Packages?**
217+
218+
- No. Organizations can select only the packages that fit their needs.
219+
220+
- Many packages are modular and independent, allowing selective integration.
221+
222+
- Pharmaverse hosts multiple packages with similar aims, giving users the flexibility to choose what works best for them rather than prescribing a single approach.
223+
224+
- Pharmaverse complements [Tidyverse](https://www.tidyverse.org/), allowing organizations to continue using familiar R workflows.
225+
226+
## **How Pharmaverse Differs from Tidyverse & How to Learn It Effectively**
227+
228+
#### **Differences Between pharmaverse and Tidyverse**
229+
230+
- Tidyverse provides general-purpose data science tools for tasks such as data wrangling and visualization.
231+
232+
- pharmaverse builds on Tidyverse functions and adds compliance, validation, and reporting features for pharma-specific clinical data structuring, reporting, and regulatory submissions.
233+
234+
## **Getting Started with the Pharmaverse**
235+
236+
Pharmaverse provides an open-source ecosystem for clinical reporting, extending Tidyverse with validation, compliance, and regulatory submission capabilities. By following a structured approach from raw data to ADaMs, organizations can enhance efficiency while maintaining data integrity.
237+
238+
- You can [start with Pharmaverse Examples](https://pharmaverse.github.io/examples/) – A curated set of documentation and tutorials.
239+
240+
- Attend Pharma Industry Webinars and Conferences – Stay updated on new developments through events like [PHUSE events and webinars](https://phuse.global/Events_Calendar), [R/Pharma conferences and events](https://rinpharma.com/), [CDISC events](https://www.cdisc.org/events), [Shiny Gatherings x Pharmaverse webinars](https://www.youtube.com/playlist?list=PLexAKolMzPcpzPAXNU6KlQ_UpIFsuEPCx), etc.
241+
242+
- Engage with the Open-Source Community – Contribute to package improvements or discussions. [Join the pharmaverse community on Slack to get started](https://join.slack.com/t/pharmaverse/shared_invite/zt-yv5atkr4-Np2ytJ6W_QKz_4Olo7Jo9A).
243+
244+
- Explore packages on the [pharmaverse website](https://pharmaverse.org).
245+
246+
- Try implementing an ADSL dataset by following the [ADSL Implementation Guide](https://cran.r-project.org/web/packages/admiral/vignettes/adsl.html).
247+
248+
- Refer to [this grid for guidance on using Tidyverse or pharmaverse](https://r-guru.com/pharma) to complete tasks in the submission process. 
249+
250+
251+
252+
### **Resources**
253+
254+
- This blog post was based on this presentation by Sunil Gupta: [R and pharmaverse: The New Frontier for Today’s Statistical Programmers](https://phuse.s3.eu-central-1.amazonaws.com/Archive/2024/SDE/US/Mississauga/PRE_Mississauga05.pdf)
255+
256+
- [R-Guru Resource Hub for Rapid R Learning](https://www.lexjansen.com/phuse-us/2024/pd/PAP_PD05.pdf)
257+
258+
- [Explore more posts in the pharmaverse blog](https://pharmaverse.github.io/blog/)
259+
260+
- [Subscribe to the pharmaverse newsletter](https://forms.office.com/pages/responsepage.aspx?id=xeEJLj1cykuXxFc6VpX1UAPF0grflaRJu8z6VC7-hy5UMTE0M0lXR1JON1Q0MkRIRlA1TlZSNDhUSC4u&route=shorturl)