pharmaversesdtm

Test data (SDTM) for the pharmaverse family of packages

Purpose
Installation
Data Sources
Naming Conventions
How To Update

Purpose {#purpose}

To provide a one-stop shop for SDTM test data in the pharmaverse family of packages. This includes datasets that are therapeutic area (TA)-agnostic (DM, VS, EG, etc.) as well as TA-specific ones (RS, TR, OE, etc.).

Installation {#installation}

The package is available from CRAN and can be installed by running install.packages("pharmaversesdtm"). To install the latest development version of the package directly from GitHub, use the following code:

if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

remotes::install_github("pharmaverse/pharmaversesdtm", ref = "main") # This command installs the latest development version directly from GitHub.

Data Sources {#data-sources}

Some test datasets have been sourced from the CDISC pilot project, while other datasets have been constructed ad-hoc by the {admiral} team. Please check the Reference page for detailed information regarding the source of specific datasets.

Naming Conventions {#naming}

Datasets that are TA-agnostic: same as SDTM domain name (e.g., dm, rs).
Datasets that are TA-specific: domain_TA_others, where the others part goes from broader categories to more specific ones (e.g., oe_ophtha, rs_onco, rs_onco_irecist).

Note: If an SDTM domain is used by multiple TAs, {pharmaversesdtm} may provide multiple versions of the corresponding test dataset. For instance, the package contains ex and ex_ophtha as the latter contains ophthalmology-specific variables such as EXLAT and EXLOC, and EXROUTE is exchanged for a plausible ophthalmology value.
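As a small illustration of the naming convention, the sketch below splits a dataset name into its domain, TA, and any further qualifiers. The helper parse_dataset_name() is hypothetical and is not part of {pharmaversesdtm}; it only assumes that the parts of a name are separated by underscores, as in the examples above.

```r
# Hypothetical helper (not part of {pharmaversesdtm}): split a dataset name
# into its SDTM domain, therapeutic area (TA), and any further qualifiers.
parse_dataset_name <- function(name) {
  parts <- strsplit(name, "_", fixed = TRUE)[[1]]
  list(
    domain = parts[1],
    # TA-agnostic datasets have no second part
    ta = if (length(parts) >= 2) parts[2] else NA_character_,
    # Anything after the TA, ordered from broader to more specific
    others = if (length(parts) >= 3) parts[-(1:2)] else character(0)
  )
}

parse_dataset_name("dm")
parse_dataset_name("rs_onco_irecist")
```

For a TA-agnostic name such as dm, the ta element is NA and others is empty; for rs_onco_irecist, the TA is onco and the remaining qualifier is irecist.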

How To Update {#how-to-update}

First, open a GitHub issue in {pharmaversesdtm} describing the planned updates and tag @pharmaverse/admiral so that a member of the core development team can sanity check the request. There are then two main ways to extend the test data: adding new datasets, or extending existing datasets with new records/variables. Whichever method you choose, it is worth noting the following:

Programs that generate test data are stored in the data-raw/ folder.
Each of these programs is written as a standalone R script: if any packages need to be loaded for a given program, then call library() at the start of the program (but please do not call library(pharmaversesdtm)).
When you have created a program in the data-raw/ folder, you need to run it as a standalone R script, in order to generate a test dataset that will become part of the {pharmaversesdtm} package, but you do not need to build the package.
Following best practice, each dataset is stored as a .rda file whose name is consistent with the name of the dataset, e.g., dataset xx is stored as xx.rda. The easiest way to achieve this is to use usethis::use_data(xx).
The programs in data-raw/ are stored within the {pharmaversesdtm} GitHub repository, but they are not part of the {pharmaversesdtm} package--the data-raw/ folder is specified in .Rbuildignore.
When you run a program that is in the data-raw/ folder, you generate a dataset that is written to the data/ folder, which will become part of the {pharmaversesdtm} package.
The names and sources of test datasets are specified in R/*.R, for the purpose of generating documentation in the man/ folder.
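The points above can be sketched as a minimal standalone program for a hypothetical dataset xx. The variable names and values are invented for illustration, and the sketch writes to a temporary folder with save() so that it runs anywhere; inside the repository, usethis::use_data(xx, overwrite = TRUE) would write data/xx.rda instead.

```r
# Sketch of a standalone data-raw/ program for a hypothetical dataset `xx`.
# Any packages the program needs would be loaded here with library(),
# but never library(pharmaversesdtm).

# Build a small synthetic dataset (all values are invented for illustration).
xx <- data.frame(
  STUDYID = "STUDY01",
  USUBJID = c("STUDY01-001", "STUDY01-002"),
  XXSEQ = c(1L, 1L),
  stringsAsFactors = FALSE
)

# In the repository, usethis::use_data(xx, overwrite = TRUE) writes data/xx.rda.
# To stay runnable outside a package, this sketch uses a temporary data/ folder.
out_dir <- file.path(tempdir(), "data")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
save(xx, file = file.path(out_dir, "xx.rda"))
```

Either way, the file name matches the object name, which is the property the naming rule above asks for.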

Note: The documentation process in {pharmaversesdtm} is automated for consistency and ease of maintenance.

Centralized Metadata `(inst/extdata/sdtms-specs.json)`

{pharmaversesdtm} uses a single JSON file to store metadata for all SDTM datasets. This file contains information such as:

dataset name
dataset label
dataset description
author
source
therapeutic area
any other dataset-specific metadata.

This metadata drives the automated documentation process, and the file is read by data-raw/create_sdtms_data.R to help generate:

Documentation .R files in R/
.Rd files in man/
Test Name/Test Code table inclusion (when present)
Dataset grouping by Therapeutic Area.
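To make the role of the metadata more concrete, the sketch below represents a few entries as an R list and groups the dataset names by therapeutic area. The field names (label, therapeutic_area) and the values are assumptions made for illustration; they may not match the keys actually used in inst/extdata/sdtms-specs.json, and this is not the logic of data-raw/create_sdtms_data.R.

```r
# Hypothetical metadata entries; the field names are assumptions and may not
# match the keys used in inst/extdata/sdtms-specs.json.
specs <- list(
  dm = list(label = "Demographics", therapeutic_area = "General"),
  vs = list(label = "Vital Signs", therapeutic_area = "General"),
  oe_ophtha = list(
    label = "Ophthalmic Examinations",
    therapeutic_area = "Ophthalmology"
  )
)

# Extract the therapeutic area of each entry, then group dataset names by it,
# as a reference page grouped by TA might do.
ta <- vapply(specs, function(entry) entry$therapeutic_area, character(1))
datasets_by_ta <- split(names(specs), ta)
datasets_by_ta
```

Keeping every field in one place means a change to a label or therapeutic area only has to be made once before the documentation is regenerated.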

Adding New SDTM Datasets

Create a program in the data-raw/ folder, named <name>.R, where <name> should follow the naming convention, to generate the test data and output <name>.rda to the data/ folder.
- Use CDISC pilot data such as dm as input in this program in order to create realistic synthetic data that remains consistent with other domains (not mandatory).
- Note that no personal data should be used as part of this package, even if anonymized.
Run the program.
Update inst/extdata/sdtms-specs.json with the new dataset metadata, including:
- Assigning the dataset label, description, author, source, purpose, and structure.
- Assigning or updating the dataset therapeutic area (used for reference-page grouping).
Run data-raw/create_sdtms_data.R in order to update NAMESPACE and update the .Rd files in man/.
Add your GitHub handle to .github/CODEOWNERS.
Update NEWS.md.
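Before opening a pull request, it can help to confirm that the program and its output both exist where the steps above expect them. The helper check_dataset_files() below is a hypothetical sketch and is not part of the package; it assumes only the data-raw/ and data/ layout described above, and the dataset name xx_onco is invented.

```r
# Hypothetical helper: report which of the expected files exist for a dataset.
# `root` is the repository root; paths follow the layout described above.
check_dataset_files <- function(name, root = ".") {
  paths <- c(
    program = file.path(root, "data-raw", paste0(name, ".R")),
    data = file.path(root, "data", paste0(name, ".rda"))
  )
  vapply(paths, file.exists, logical(1))
}

# Usage against a throwaway folder that contains the program but no output yet.
root <- tempfile("pkg")
dir.create(file.path(root, "data-raw"), recursive = TRUE)
file.create(file.path(root, "data-raw", "xx_onco.R"))
check_dataset_files("xx_onco", root)
```

A FALSE for data would suggest that the program has been created but not yet run.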

Updating Existing SDTM Datasets

Locate the existing program <name>.R in the data-raw/ folder and update it accordingly.
Update the corresponding entry in inst/extdata/sdtms-specs.json to reflect the changes, including:
- Changing the dataset label, description, author, or source.
- Modifying the dataset purpose or structure.
- Updating the dataset therapeutic area.
- Removing a dataset (delete its entry from the JSON entirely).
Run the program, and output updated <name>.rda to the data/ folder.
Run data-raw/create_sdtms_data.R in order to update NAMESPACE and update the .Rd files in man/.
Add your GitHub handle to .github/CODEOWNERS.
Update NEWS.md.
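When writing the NEWS.md entry, a quick comparison of the old and new versions of a dataset can help describe what changed. The helper summarise_update() below is a hypothetical sketch, not a function provided by {pharmaversesdtm}, and the two small data frames are invented stand-ins for the dataset before and after the update.

```r
# Hypothetical helper: summarise how an updated dataset differs from the old one.
summarise_update <- function(old, new) {
  list(
    added_vars = setdiff(names(new), names(old)),
    removed_vars = setdiff(names(old), names(new)),
    added_rows = nrow(new) - nrow(old)
  )
}

# Invented stand-ins for a dataset before and after an update.
old <- data.frame(USUBJID = c("001", "002"), EXSEQ = 1:2)
new <- data.frame(
  USUBJID = c("001", "002", "003"),
  EXSEQ = 1:3,
  EXLAT = c("LEFT", "RIGHT", "LEFT")
)
summarise_update(old, new)
```

Here the summary reports one new variable (EXLAT), no removed variables, and one additional record.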

Acknowledgments

Along with the authors and contributors, thanks to the following people for their work on the package:

G Gayatri, Pooja Kumari, Sadchla Mascary, Kangjie Zhang and Zelos Zhu.
