
Commit a6e2eb6

Merge pull request #236 from cmu-delphi/lcb/fix-various-doc-and-vignette-issues
Fix various documentation bugs noted in #217
2 parents 94d5ca5 + db755e2 commit a6e2eb6

16 files changed, +61 -73 lines changed

NEWS.md

+11-10
@@ -7,6 +7,7 @@ development versions. A ".9999" suffix indicates a development version.
 ## Cleanup:

 * Added a `NEWS.md` file to track changes to the package.
+* Fixed various small documentation issues ([#217](https://github.com/cmu-delphi/epiprocess/issues/217)).

 # epiprocess 0.5.0:

@@ -19,13 +20,13 @@ development versions. A ".9999" suffix indicates a development version.

 ## Improvements:

-* Fixed `epix_merge`, `<epi_archive>$merge` always raising error on `sync="truncate"`
+* Fixed `epix_merge`, `<epi_archive>$merge` always raising error on `sync="truncate"`.

 ## Cleanup:

-* Added `Remotes:` entry for `genlasso`, which was removed from CRAN
-* Added `as_epi_archive` tests
-* Added missing `epix_merge` test for `sync="truncate"`
+* Added `Remotes:` entry for `genlasso`, which was removed from CRAN.
+* Added `as_epi_archive` tests.
+* Added missing `epix_merge` test for `sync="truncate"`.

 # epiprocess 0.4.0:

@@ -68,7 +69,7 @@ development versions. A ".9999" suffix indicates a development version.
 * `epix_<method>` will not mutate input `epi_archive`s, but may alias them
 or alias their fields (which should not be a worry if a user sticks to
 these `epix_*` functions and "regular" R functions with
-copy-on-write-like behavior, avoiding mutating functions `[.data.table`)
+copy-on-write-like behavior, avoiding mutating functions `[.data.table`).
 * `x$<method>` may mutate `x`; if it mutates `x`, it will return `x`
 invisibly (where this makes sense), and, for each of its fields, may
 either mutate the object to which it refers or reseat the reference (but
@@ -109,7 +110,7 @@ development versions. A ".9999" suffix indicates a development version.
 * New function `epix_fill_through_version`, method
 `<epi_archive>$fill_through_version`: non-mutating & mutating way to
 ensure that an archive contains versions at least through some
-`fill_versions_end`, extrapolating according to `how` if necessary
+`fill_versions_end`, extrapolating according to `how` if necessary.
 * Example archive data object is now constructed on demand from its
 underlying data, so it will be based on the user's version of
 `epi_archive` rather than an outdated R6 implementation from whenever the
@@ -215,7 +216,7 @@ Classes:
 * `epi_cor` calculates Pearson, Kendall, or Spearman correlations
 between two (optionally time-shifted) variables in an `epi_df` within
 user-specified groups.
-* Convenience function: `is_epi_df`
+* Convenience function: `is_epi_df`.
 * `epi_archive`: R6 class for version (patch) data for geotemporal
 epidemiological time series data sets. Comes with S3 methods and regular
 functions that wrap around this functionality for those unfamiliar with R6
@@ -224,23 +225,23 @@ Classes:
 containing snapshots and/or patch data for every available version of
 the data set.
 * `as_of`: extracts a snapshot of the data set as of some requested
-version, in `epi_df` format
+version, in `epi_df` format.
 * `epix_slide`, `<epi_archive>$slide`: similar to `epi_slide`, but for
 `epi_archive`s; for each requested `ref_time_value` and group, applies
 a time window and user-specified computation to a snapshot of the data
 as of `ref_time_value`.
 * `epix_merge`, `<epi_archive>$merge`: like `merge` for `epi_archive`s,
 but allowing for the last version of each observation to be carried
 forward to fill in gaps in `x` or `y`.
-* Convenience function: `is_epi_archive`
+* Convenience function: `is_epi_archive`.

 Additional functions:
 * `growth_rate`: estimates growth rate of a time series using one of a few
 built-in `method`s based on relative change, linear regression,
 smoothing splines, or trend filtering.
 * `detect_outlr`: applies one or more outlier detection methods to a given
 signal variable, and optionally aggregates the outputs to create a
-consensus result
+consensus result.
 * `detect_outlr_rm`: outlier detection function based on a
 rolling-median-based outlier detection function; one of the methods
 included in `detect_outlr`.

R/methods-epi_archive.R

+5-4
@@ -76,7 +76,7 @@ epix_as_of = function(x, max_version, min_time_value = -Inf) {
 #' `x`), and returns the updated `x` [invisibly][base::invisible].
 #'
 #' @param x An `epi_archive`
-#' @param fill_versions_end Length-1, same class&type as `%s$version`: the
+#' @param fill_versions_end Length-1, same class&type as `x$version`: the
 #' version through which to fill in missing version history; this will be the
 #' result's `$versions_end` unless it already had a later
 #' `$versions_end`.
@@ -397,9 +397,10 @@ epix_merge = function(x, y,
 #' @param new_col_name String indicating the name of the new column that will
 #' contain the derivative values. Default is "slide_value"; note that setting
 #' `new_col_name` equal to an existing column name will overwrite this column.
-#' @param as_list_col Should the new column be stored as a list column? Default
-#' is `FALSE`, in which case a list object returned by `f` would be unnested
-#' (using `tidyr::unnest()`), and the names of the resulting columns are given
+#' @param as_list_col If the computations return data frames, should the slide
+#' result hold these in a single list column or try to unnest them? Default is
+#' `FALSE`, in which case a list object returned by `f` would be unnested
+#' (using [`tidyr::unnest()`]), and the names of the resulting columns are given
 #' by prepending `new_col_name` to the names of the list elements.
 #' @param names_sep String specifying the separator to use in `tidyr::unnest()`
 #' when `as_list_col = FALSE`. Default is "_". Using `NULL` drops the prefix

R/slide.R

+4-3
@@ -46,9 +46,10 @@
 #' @param new_col_name String indicating the name of the new column that will
 #' contain the derivative values. Default is "slide_value"; note that setting
 #' `new_col_name` equal to an existing column name will overwrite this column.
-#' @param as_list_col Should the new column be stored as a list column? Default
-#' is `FALSE`, in which case a list object returned by `f` would be unnested
-#' (using `tidyr::unnest()`), and the names of the resulting columns are given
+#' @param as_list_col If the computations return data frames, should the slide
+#' result hold these in a single list column or try to unnest them? Default is
+#' `FALSE`, in which case a list object returned by `f` would be unnested
+#' (using [`tidyr::unnest()`]), and the names of the resulting columns are given
 #' by prepending `new_col_name` to the names of the list elements.
 #' @param names_sep String specifying the separator to use in `tidyr::unnest()`
 #' when `as_list_col = FALSE`. Default is "_". Using `NULL` drops the prefix
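
Reviewer note: the unnesting and naming behavior that the revised `as_list_col` text describes can be sketched by calling `tidyr::unnest()` directly on a hand-built tibble, without going through `epi_slide()`. The tibble `slid`, its values, and the inner column names `mean` and `sd` are made up for illustration; only `tidyr::unnest()` and its `names_sep` argument are taken from the documentation in this hunk.

```r
library(tibble)
library(tidyr)

# Hypothetical stand-in for a slide result kept as a list column
# (roughly what as_list_col = TRUE would hold): one small data frame per row.
slid <- tibble(
  geo_value = c("ca", "fl"),
  slide_value = list(
    tibble(mean = 1.5, sd = 0.2),
    tibble(mean = 2.5, sd = 0.4)
  )
)

# Unnesting with names_sep = "_" prefixes each inner column name with the
# name of the list column, mirroring the documented default naming.
unnested <- unnest(slid, slide_value, names_sep = "_")
names(unnested)
```

With `names_sep = "_"`, the inner columns come out as `slide_value_mean` and `slide_value_sd`, which matches the "prepending `new_col_name`" wording in the `@param` text.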

README.md

+4-4
@@ -45,10 +45,10 @@ The second main data structure in the package is called
 wrapped around a data table that stores the archive (version history) of some
 signal variables of interest.

-By convention, functions in the `epiprocess` package that operate on `epi_df`
-objects begin with `epix` (the "x" is meant to remind you of "archive"). These
-are just wrapper functions around the public methods for the `epi_archive` R6
-class. For example:
+By convention, functions in the `epiprocess` package that operate on
+`epi_archive` objects begin with `epix` (the "x" is meant to remind you of
+"archive"). These are just wrapper functions around the public methods for the
+`epi_archive` R6 class. For example:

 - `epix_as_of()`, for generating a snapshot in `epi_df` from the data archive,
 which represents the most up-to-date values of the signal variables, as of the

_pkgdown.yml

+7-5
@@ -6,9 +6,11 @@ url: https://cmu-delphi.github.io/epiprocess/
 home:
 links:
 - text: Get the epipredict R package
-href: https://cmu-delphi.github.io/epiprocess/
+href: https://cmu-delphi.github.io/epipredict/
 - text: Get the covidcast R package
 href: https://cmu-delphi.github.io/covidcast/covidcastR/
+- text: Get the epidatr R package
+href: https://github.com/cmu-delphi/epidatr

 articles:
 - title: Using the package
@@ -34,11 +36,11 @@ repo:


 reference:
-- title: epi_df basics
+- title: "`epi_df` basics"
 desc: Details on `epi_df` format, and basic functionality.
 - contents:
 - matches("epi_df")
-- title: epi_*() functions
+- title: "`epi_*()` functions"
 desc: Functions that act on `epi_df` objects.
 - contents:
 - epi_slide
@@ -50,11 +52,11 @@ reference:
 - detect_outlr
 - detect_outlr_rm
 - detect_outlr_stl
-- title: epi_archive basics
+- title: "`epi_archive` basics"
 desc: Details on `epi_archive`, and basic functionality.
 - contents:
 - matches("archive")
-- title: epix_*() functions
+- title: "`epix_*()` functions"
 desc: Functions that act on an `epi_archive` object.
 - contents:
 - starts_with("epix")

man/epi_slide.Rd

+4-3
Generated file; diff not rendered by default.

man/epix_fill_through_version.Rd

+1-1
Generated file; diff not rendered by default.

man/epix_slide.Rd

+4-3
Generated file; diff not rendered by default.

vignettes/aggregation.Rmd

+1-1
@@ -241,7 +241,7 @@ head(xt_filled_month)
 TODO

 ## Attribution
-This document contains dataset that is a modified part of the [COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University](https://github.com/CSSEGISandData/COVID-19) as [republished in the COVIDcast Epidata API](https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/jhu-csse.html). This data set is licensed under the terms of the [Creative Commons Attribution 4.0 International license](https://creativecommons.org/licenses/by/4.0/) by the Johns Hopkins University on behalf of its Center for Systems Science in Engineering. Copyright Johns Hopkins University 2020.
+This document contains a dataset that is a modified part of the [COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University](https://github.com/CSSEGISandData/COVID-19) as [republished in the COVIDcast Epidata API](https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/jhu-csse.html). This data set is licensed under the terms of the [Creative Commons Attribution 4.0 International license](https://creativecommons.org/licenses/by/4.0/) by the Johns Hopkins University on behalf of its Center for Systems Science in Engineering. Copyright Johns Hopkins University 2020.

 [From the COVIDcast Epidata API](https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/jhu-csse.html):
 These signals are taken directly from the JHU CSSE [COVID-19 GitHub repository](https://github.com/CSSEGISandData/COVID-19) without changes.

vignettes/archive.Rmd

+7-14
@@ -30,6 +30,8 @@ library(epidatr)
 library(epiprocess)
 library(data.table)
 library(dplyr)
+library(purrr)
+library(ggplot2)

 dv <- covidcast(
 data_source = "doctor-visits",
@@ -48,6 +50,8 @@ library(epidatr)
 library(epiprocess)
 library(data.table)
 library(dplyr)
+library(purrr)
+library(ggplot2)
 ```

 ## Getting data into `epi_archive` format
@@ -141,15 +145,6 @@ as in `x$geo_type` or `x$time_type`, etc. Just like `as_epi_df()`, the function
 object is instantiated, if they are not explicitly specified in the function
 call (as it did in the case above).

-Note that `compactify` is **NOT** metadata and is an argument passed when creating
-the dataset, without being stored in the end:
-
-```{r,message=FALSE}
-# `dt` here is taken from the tests
-as_epi_archive(archive_cases_dv_subset$DT,compactify=TRUE)$geo_type # "state"
-as_epi_archive(archive_cases_dv_subset$DT,compactify=TRUE)$compactify # NULL
-```
-
 ## Producing snapshots in `epi_df` form

 A key method of an `epi_archive` class is `as_of()`, which generates a snapshot
@@ -189,8 +184,6 @@ marked by dotted vertical lines, and draw the latest curve in black (from the
 latest snapshot `x_latest` that the archive can provide).

 ```{r, fig.width = 8, fig.height = 7}
-library(purrr)
-library(ggplot2)
 theme_set(theme_bw())

 self_max = max(x$DT$version)
@@ -203,15 +196,15 @@ snapshots <- map_dfr(versions, function(v) {

 ggplot(snapshots %>% filter(!latest),
 aes(x = time_value, y = percent_cli)) +
-geom_line(aes(color = factor(version))) +
+geom_line(aes(color = factor(version)), na.rm=TRUE) +
 geom_vline(aes(color = factor(version), xintercept = version), lty = 2) +
 facet_wrap(~ geo_value, scales = "free_y", ncol = 1) +
 scale_x_date(minor_breaks = "month", date_labels = "%b %y") +
 labs(x = "Date", y = "% of doctor's visits with CLI") +
 theme(legend.position = "none") +
 geom_line(data = snapshots %>% filter(latest),
 aes(x = time_value, y = percent_cli),
-inherit.aes = FALSE, color = "black")
+inherit.aes = FALSE, color = "black", na.rm=TRUE)
 ```

 We can see some interesting and highly nontrivial revision behavior: at some
@@ -444,7 +437,7 @@ the vignettes in the current package are only meant to demo the slide
 functionality with some of the most basic forecasting methodology possible.

 ## Attribution
-This document contains dataset that is a modified part of the [COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University](https://github.com/CSSEGISandData/COVID-19) as [republished in the COVIDcast Epidata API](https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/jhu-csse.html). This data set is licensed under the terms of the [Creative Commons Attribution 4.0 International license](https://creativecommons.org/licenses/by/4.0/) by the Johns Hopkins University on behalf of its Center for Systems Science in Engineering. Copyright Johns Hopkins University 2020.
+This document contains a dataset that is a modified part of the [COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University](https://github.com/CSSEGISandData/COVID-19) as [republished in the COVIDcast Epidata API](https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/jhu-csse.html). This data set is licensed under the terms of the [Creative Commons Attribution 4.0 International license](https://creativecommons.org/licenses/by/4.0/) by the Johns Hopkins University on behalf of its Center for Systems Science in Engineering. Copyright Johns Hopkins University 2020.

 The `percent_cli` data is a modified part of the [COVIDcast Epidata API Doctor Visits data](https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/doctor-visits.html). This dataset is licensed under the terms of the [Creative Commons Attribution 4.0 International license](https://creativecommons.org/licenses/by/4.0/). Copyright Delphi Research Group at Carnegie Mellon University 2020.

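
Reviewer note: the two `na.rm=TRUE` additions to `geom_line()` in the plotting chunk of `vignettes/archive.Rmd` are presumably there to keep missing-value warnings out of the rendered vignette; the commit itself does not state the reason. A minimal sketch of the argument on made-up data follows. The data frame `toy` and its values are hypothetical and do not come from the vignette.

```r
library(ggplot2)

# Hypothetical toy series whose last observation is missing.
toy <- data.frame(
  time_value = as.Date("2020-06-01") + 0:4,
  percent_cli = c(1.2, 1.5, 1.3, 1.1, NA)
)

# na.rm = TRUE asks ggplot2 to drop missing values silently
# instead of emitting a warning when the plot is drawn.
layer <- geom_line(na.rm = TRUE)
p <- ggplot(toy, aes(x = time_value, y = percent_cli)) + layer

# The setting is stored with the layer's geom parameters.
layer$geom_params$na.rm
```

Printing `p` with and without `na.rm = TRUE` is a quick way to check whether a given data set triggers the warning at all.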
vignettes/compactify.Rmd

+1-1
@@ -28,7 +28,7 @@ For this example, we have one chart using LOCF values, while another doesn't
 use them to illustrate LOCF. Notice how the head of the first dataset differs
 from the second from the third value included.

-```{r}
+```{r, message=FALSE}
 library(epiprocess)
 library(dplyr)
