Commit 6231927

Copy edit docs/clean code
1 parent 91b7288 commit 6231927

4 files changed: 103 additions and 155 deletions

R/fit.R

Lines changed: 36 additions & 94 deletions
@@ -3,11 +3,9 @@ NULL
33

44
#' Fit a spatial or spatiotemporal GLMM with TMB
55
#'
6-
#' Fit a spatial or spatiotemporal Gaussian random field generalized linear
7-
#' mixed effects model (GLMM) with the TMB (Template Model Builder) R package and
8-
#' the SPDE (stochastic partial differential equation) approach. This can be
9-
#' useful for (dynamic) species distribution models and relative abundance index
10-
#' standardization among many other uses.
6+
#' Fit a spatial or spatiotemporal generalized linear mixed effects model (GLMM)
7+
#' with the TMB (Template Model Builder) R package and the SPDE (stochastic
8+
#' partial differential equation) approximation to Gaussian random fields.
119
#'
1210
#' @param formula Model formula. IID random intercepts are possible using
1311
#' \pkg{lme4} syntax, e.g., `+ (1 | g)` where `g` is a column of class
@@ -82,8 +80,7 @@ NULL
8280
#' \doi{10.1111/ecog.05176} and the [spatial trends
8381
#' vignette](https://pbs-assess.github.io/sdmTMB/articles/spatial-trend-models.html).
8482
#' Note this predictor should usually be centered to have mean zero and have a
85-
#' standard deviation of approximately 1 and should likely also be included as
86-
#' a main effect.
83+
#' standard deviation of approximately 1.
8784
#' **The spatial intercept is controlled by the `spatial` argument**; therefore,
8885
#' include or exclude the spatial intercept by setting `spatial = 'on'` or
8986
#' `'off'`. The only time when it matters whether `spatial_varying` excludes
@@ -97,8 +94,8 @@ NULL
9794
#' a name of the variable in the data frame. See the Details section below.
9895
#' @param offset A numeric vector representing the model offset *or* a character
9996
#' value representing the column name of the offset. In delta/hurdle models,
100-
#' this applies only to the positive component. Usually a log
101-
#' transformed variable.
97+
#' this applies only to the positive component. Usually a log transformed
98+
#' variable.
10299
#' @param extra_time Optional extra time slices (e.g., years) to include for
103100
#' interpolation or forecasting with the predict function. See the Details
104101
#' section below.
@@ -163,7 +160,7 @@ NULL
163160
#' An object (list) of class `sdmTMB`. Useful elements include:
164161
#'
165162
#' * `sd_report`: output from [TMB::sdreport()]
166-
#' * `gradients`: log likelihood gradients with respect to each fixed effect
163+
#' * `gradients`: marginal log likelihood gradients with respect to each fixed effect
167164
#' * `model`: output from [stats::nlminb()]
168165
#' * `data`: the fitted data
169166
#' * `mesh`: the object that was supplied to the `mesh` argument
@@ -202,8 +199,8 @@ NULL
202199
#' with `+ s(x, y)` or `+ t2(x, y)`; smooths can be specific to various factor
203200
#' levels, `+ s(x, by = group)`; the basis function dimensions may be specified,
204201
#' e.g. `+ s(x, k = 4)`; and various types of splines may be constructed such as
205-
#' cyclic splines to model seasonality, `+ s(month, bs = "cc", k = 12)` (perhaps
206-
#' with the `knots` argument also be supplied).
202+
#' cyclic splines to model seasonality (perhaps with the `knots` argument also
203+
#' be supplied).
207204
#'
208205
#' **Threshold models**
209206
#'
@@ -216,7 +213,8 @@ NULL
216213
#' `+ logistic(variable)`. This option models the relationship as a logistic
217214
#' function of the 50% and 95% values. This is similar to length- or size-based
218215
#' selectivity in fisheries, and is parameterized by the points at which f(x) =
219-
#' 0.5 or 0.95. See the [threshold vignette](https://pbs-assess.github.io/sdmTMB/articles/threshold-models.html).
216+
#' 0.5 or 0.95. See the
217+
#' [threshold vignette](https://pbs-assess.github.io/sdmTMB/articles/threshold-models.html).
220218
#'
221219
#' Note that only a single threshold covariate can be included and the same covariate
222220
#' is included in both components for the delta families.
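
As a rough illustration of the logistic threshold documented in this hunk, the sketch below uses a common fisheries selectivity form that passes through 0.5 and 0.95 at two chosen covariate values. The function name `logistic_threshold` and the arguments `s50` and `s95` are hypothetical names for this example only; the parameterization used inside the package may differ (for example, it may include a scaling term).

```r
# Hypothetical helper (not taken from the package): a logistic curve defined
# by the covariate values at which f(x) = 0.5 (s50) and f(x) = 0.95 (s95).
logistic_threshold <- function(x, s50, s95) {
  # log(19) makes the curve equal 0.95 exactly at x = s95,
  # because 1 / (1 + 1/19) = 19/20
  1 / (1 + exp(-log(19) * (x - s50) / (s95 - s50)))
}

# Evaluate at the two defining points
f50 <- logistic_threshold(2, s50 = 2, s95 = 5)
f95 <- logistic_threshold(5, s50 = 2, s95 = 5)
```

Evaluating the curve at `s50` and `s95` is a quick way to check that a given parameterization matches the "50% and 95% values" wording in the documentation.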
@@ -236,16 +234,7 @@ NULL
236234
#' time slices with process error.
237235
#'
238236
#' `extra_time` can also be used to fill in missing time steps for the purposes
239-
#' of a random walk or AR(1) process if their inclusion makes the gaps between
240-
#' time steps even.
241-
#'
242-
#' **Index standardization**
243-
#'
244-
#' For index standardization, you may wish to include `0 + as.factor(year)`
245-
#' (or whatever the time column is called) in the formula. See a basic
246-
#' example of index standardization in the relevant
247-
#' [package vignette](https://pbs-assess.github.io/sdmTMB/articles/index-standardization.html).
248-
#' You will need to specify the `time` argument. See [get_index()].
237+
#' of a random walk or AR(1) process if the gaps between time steps are uneven.
249238
#'
250239
#' **Regularization and priors**
251240
#'
@@ -279,11 +268,19 @@ NULL
279268
#' The main advantage of specifying such models using a delta family (compared
280269
#' to fitting two separate models) is (1) coding simplicity and (2) calculation
281270
#' of uncertainty on derived quantities such as an index of abundance with
282-
#' [get_index()] using the generalized delta method within TMB. Also, parameters
283-
#' can be shared across the models.
271+
#' [get_index()] using the generalized delta method within TMB. Also, selected
272+
#' parameters can be shared across the models.
284273
#'
285274
#' See the [delta-model vignette](https://pbs-assess.github.io/sdmTMB/articles/delta-models.html).
286275
#'
276+
#' **Index standardization**
277+
#'
278+
#' For index standardization, you may wish to include `0 + as.factor(year)`
279+
#' (or whatever the time column is called) in the formula. See a basic
280+
#' example of index standardization in the relevant
281+
#' [package vignette](https://pbs-assess.github.io/sdmTMB/articles/index-standardization.html).
282+
#' You will need to specify the `time` argument. See [get_index()].
283+
#'
287284
#' @references
288285
#'
289286
#' **Main reference introducing the package to cite when using sdmTMB:**
@@ -305,7 +302,7 @@ NULL
305302
#' English, P., E.J. Ward, C.N. Rooper, R.E. Forrest, L.A. Rogers, K.L. Hunter,
306303
#' A.M. Edwards, B.M. Connors, S.C. Anderson. 2021. Contrasting climate velocity
307304
#' impacts in warm and cool locations show that effects of marine warming are
308-
#' worse in already warmer temperate waters. In press at Fish and Fisheries.
305+
#' worse in already warmer temperate waters. Fish and Fisheries. 23(1) 239-255.
309306
#' \doi{10.1111/faf.12613}.
310307
#'
311308
#' *Discussion of and illustration of some decision points when fitting these
@@ -319,17 +316,18 @@ NULL
319316
#' *Application and description of threshold/break-point models:*
320317
#'
321318
#' Essington, T.E. S.C. Anderson, L.A.K. Barnett, H.M. Berger, S.A. Siedlecki,
322-
#' E.J. Ward. Advancing statistical models to reveal the effect of dissolved
323-
#' oxygen on the spatial distribution of marine taxa using thresholds and a
324-
#' physiologically based index. In press at Ecography. \doi{10.1111/ecog.06249}.
319+
#' E.J. Ward. 2022. Advancing statistical models to reveal the effect of
320+
#' dissolved oxygen on the spatial distribution of marine taxa using thresholds
321+
#' and a physiologically based index. Ecography. 2022: e06249
322+
#' \doi{10.1111/ecog.06249}.
325323
#'
326324
#' *Application to fish body condition:*
327325
#'
328326
#' Lindmark, M., S.C. Anderson, M. Gogina, M. Casini. Evaluating drivers of
329327
#' spatiotemporal individual condition of a bottom-associated marine fish.
330328
#' bioRxiv 2022.04.19.488709. \doi{10.1101/2022.04.19.488709}.
331329
#'
332-
#' *A number of sections of the original TMB model code were adapted from the
330+
#' *Several sections of the original TMB model code were adapted from the
333331
#' VAST R package:*
334332
#'
335333
#' Thorson, J.T., 2019. Guidance for decisions using the Vector Autoregressive
@@ -363,13 +361,14 @@ NULL
363361
#'
364362
#' # Build a mesh to implement the SPDE approach:
365363
#' mesh <- make_mesh(pcod_2011, c("X", "Y"), cutoff = 20)
366-
#' # * this example uses a fairly coarse mesh so these examples run quickly
367-
#' # * `cutoff` is the minimum distance between mesh vertices in units of the
364+
#'
365+
#' # - this example uses a fairly coarse mesh so these examples run quickly
366+
#' # - 'cutoff' is the minimum distance between mesh vertices in units of the
368367
#' # x and y coordinates
369-
#' # * `cutoff = 10` or `cutoff = 15` might make more sense in applied situations
370-
#' # for this dataset
371-
#' # * or build any mesh in 'fmesher' and pass it to the `mesh` argument in `make_mesh()`
372-
#' # * not needed if you will be turning off all spatial/spatiotemporal random fields
368+
#' # - 'cutoff = 10' might make more sense in applied situations for this dataset
369+
#' # - or build any mesh in 'fmesher' and pass it to the 'mesh' argument in make_mesh()`
370+
#' # - the mesh is not needed if you will be turning off all
371+
#' # spatial/spatiotemporal random fields
373372
#'
374373
#' # Quick mesh plot:
375374
#' plot(mesh)
@@ -607,14 +606,10 @@ sdmTMB <- function(
607606
spatiotemporal <- rep("iid", n_m)
608607
}
609608

610-
611609
if (is.null(time)) {
612610
spatial_only <- rep(TRUE, n_m)
613611
} else {
614612
spatial_only <- ifelse(spatiotemporal == "off", TRUE, FALSE)
615-
# if(all(spatiotemporal == "off")) {
616-
# cli_abort("Time needs to be null if spatiotemporal fields are not included")
617-
# }
618613
}
619614

620615
if (is.list(spatial)) {
@@ -634,8 +629,6 @@ sdmTMB <- function(
634629
}
635630

636631
if (!include_spatial && all(spatiotemporal == "off") || !include_spatial && all(spatial_only)) {
637-
# message("Both spatial and spatiotemporal fields are set to 'off'.")
638-
# control$map_rf <- TRUE
639632
no_spatial <- TRUE
640633
if (missing(mesh)) {
641634
mesh <- sdmTMB::pcod_mesh_2011 # internal data; fake!
@@ -787,20 +780,6 @@ sdmTMB <- function(
787780
# "As of version 0.3.1, sdmTMB turns off the constant spatial field `omega_s` when `spatial_varying` is specified so that the intercept or factor-level means are fully described by the spatially varying random fields `zeta_s`.")
788781
cli_inform(paste(msg, collapse = " "))
789782
}
790-
# if (!omit_spatial_intercept & !length(attr(z_i, "contrasts"))) {
791-
# msg <- c("The spatial intercept is now in the first element of the spatially varying random fields `zeta_s` instead of the constant spatial random field `omega_s`. This change in the output format occurred in version 0.3.1.")
792-
# cli_inform(msg)
793-
# }
794-
# .int <- sum(grep("(Intercept)", colnames(z_i)) > 0)
795-
# if (.int && !omit_spatial_intercept) {
796-
# # msg <- c("Detected an intercept in `spatial_varying`.",
797-
# # "Make sure you have `spatial = 'off'` set since this also represents a spatial intercept.")
798-
# # cli_warn(msg)
799-
# # actually, just do it!
800-
# omit_spatial_intercept <- TRUE
801-
# include_spatial <- TRUE
802-
# spatial <- "on"
803-
# }
804783
.int <- grep("(Intercept)", colnames(z_i))
805784
if (sum(.int) > 0) z_i <- z_i[,-.int,drop=FALSE]
806785
spatial_varying <- colnames(z_i)
@@ -1042,7 +1021,6 @@ sdmTMB <- function(
10421021

10431022
# TODO: make this cleaner
10441023
X_ij_list <- list()
1045-
#X_ij_array <- array(data = NA, dim = c(nrow(X_ij[[1]]), ncol(X_ij[[1]]), n_m))
10461024
for (i in seq_len(n_m)) X_ij_list[[i]] <- X_ij[[i]]
10471025

10481026
n_t <- length(unique(data[[time]]))
@@ -1174,8 +1152,6 @@ sdmTMB <- function(
11741152
tmb_params$b_j <- stats::coef(temp)
11751153
}
11761154

1177-
#if (delta && !is.null(thresh$threshold_parameter)) cli_abort("Thresholds not implemented with delta models yet.") # TODO DELTA
1178-
11791155
# Map off parameters not needed
11801156
tmb_map <- map_all_params(tmb_params)
11811157
tmb_map$b_j <- NULL
@@ -1203,14 +1179,6 @@ sdmTMB <- function(
12031179
tmb_map$ln_phi <- as.factor(tmb_map$ln_phi)
12041180
if (!is.null(thresh[[1]]$threshold_parameter)) tmb_map$b_threshold <- NULL
12051181

1206-
# optional models on spatiotemporal sd parameter
1207-
# if (est_epsilon_re == 0L) {
1208-
# tmb_map <- c(tmb_map,
1209-
# list(
1210-
# ln_epsilon_re_sigma = factor(rep(NA, n_m)),
1211-
# epsilon_re = factor(rep(NA, tmb_data$n_t))
1212-
# ))
1213-
# }
12141182
if (est_epsilon_re == 1L) {
12151183
tmb_map <- unmap(tmb_map, c("ln_epsilon_re_sigma","epsilon_re"))
12161184
}
@@ -1222,7 +1190,7 @@ sdmTMB <- function(
12221190

12231191

12241192
original_tmb_data <- tmb_data
1225-
# # much faster on first phase!?
1193+
# much faster on first phase!?
12261194
tmb_data$no_spatial <- 1L
12271195
# tmb_data$include_spatial <- 0L
12281196
tmb_data$include_spatial <- rep(0L, length(spatial)) # for 1st phase
@@ -1287,11 +1255,6 @@ sdmTMB <- function(
12871255
if (reml) tmb_random <- c(tmb_random, "b_j")
12881256
if (reml && delta) tmb_random <- c(tmb_random, "b_j2")
12891257

1290-
## if (est_epsilon_model >= 2) {
1291-
## # model 2 = re model, model 3 = loglinear-re
1292-
## tmb_random <- c(tmb_random, "epsilon_rw")
1293-
## }
1294-
12951258
if (sm$has_smooths) {
12961259
if (reml) tmb_random <- c(tmb_random, "bs")
12971260
tmb_random <- c(tmb_random, "b_smooth") # smooth random effects
@@ -1380,8 +1343,6 @@ sdmTMB <- function(
13801343
prof <- c("b_j")
13811344
if (delta) prof <- c(prof, "b_j2")
13821345

1383-
1384-
13851346
out_structure <- structure(list(
13861347
data = data,
13871348
spde = spde,
@@ -1445,7 +1406,6 @@ sdmTMB <- function(
14451406
out_structure$do_index <- FALSE
14461407
}
14471408

1448-
14491409
tmb_obj <- TMB::MakeADFun(
14501410
data = tmb_data, parameters = tmb_params, map = tmb_map,
14511411
profile = if (control$profile) prof else NULL,
@@ -1546,23 +1506,6 @@ set_limits <- function(tmb_obj, lower, upper, loc = NULL, silent = TRUE) {
15461506
.upper["ar1_phi"] <- stats::qlogis((0.999 + 1) / 2)
15471507
}
15481508

1549-
# if (!is.null(loc) && !"ln_kappa" %in% union(names(lower), names(upper))) {
1550-
# .dist <- stats::dist(loc)
1551-
# range_limits <- c(min(.dist) * 0.25, max(.dist) * 4)
1552-
# .upper[names(.upper) == "ln_kappa"] <- log(sqrt(8) / range_limits[1])
1553-
# .lower[names(.upper) == "ln_kappa"] <- log(sqrt(8) / range_limits[2])
1554-
# message("Setting limits on range to ",
1555-
# round(range_limits[1], 1), " and ",
1556-
# round(range_limits[2], 1), ":\n",
1557-
# "half the minimum and twice the maximum knot distance."
1558-
# )
1559-
# if (.upper[names(.upper) == "ln_kappa"][1] <= tmb_obj$par[["ln_kappa"]][1]) {
1560-
# .upper[names(.upper) == "ln_kappa"] <- tmb_obj$par[["ln_kappa"]][1] + 0.1
1561-
# }
1562-
# if (.lower[names(.lower) == "ln_kappa"][1] >= tmb_obj$par[["ln_kappa"]][1]) {
1563-
# .lower[names(.lower) == "ln_kappa"] <- tmb_obj$par[["ln_kappa"]][1] - 0.1
1564-
# }
1565-
# }
15661509
list(lower = .lower, upper = .upper)
15671510
}
15681511

@@ -1624,7 +1567,6 @@ check_irregalar_time <- function(data, time, spatiotemporal, time_varying) {
16241567
find_missing_time <- function(x) {
16251568
if (!is.factor(x)) {
16261569
ti <- sort(unique(x))
1627-
# mindiff <- min(diff(ti))
16281570
mindiff <- 1L
16291571
allx <- seq(min(ti), max(ti), by = mindiff)
16301572
setdiff(allx, ti)
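
The visible lines of `find_missing_time()` can be exercised with a small base R sketch. The name `find_missing_time_sketch` is hypothetical, and the handling of factors is an assumption, since the remainder of the function is not shown in this diff.

```r
# Sketch mirroring the visible lines of find_missing_time().
# The factor branch is an assumption: the real function body is truncated here.
find_missing_time_sketch <- function(x) {
  if (is.factor(x)) return(NULL)  # assumed behaviour for factor time columns
  ti <- sort(unique(x))
  # Step of 1 between time slices, as in the retained `mindiff <- 1L` line
  allx <- seq(min(ti), max(ti), by = 1L)
  setdiff(allx, ti)
}

# Example: a survey with gaps in 2012 and 2014
years <- c(2011, 2013, 2013, 2015, 2016)
missing_years <- find_missing_time_sketch(years)
```

Values found this way are the kind of time steps the documentation above suggests supplying through `extra_time` when a random walk or AR(1) process needs evenly spaced time slices.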

R/predict.R

Lines changed: 16 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -4,8 +4,8 @@
44

55
#' Predict from an sdmTMB model
66
#'
7-
#' Make predictions from an sdmTMB model; can predict on the original or new
8-
#' data.
7+
#' Make predictions from an \pkg{sdmTMB} model; can predict on the original or
8+
#' new data.
99
#'
1010
#' @param object A model fitted with [sdmTMB()].
1111
#' @param newdata A data frame to make predictions on. This should be a data
@@ -29,32 +29,36 @@
2929
#' predictions. `~0` or `NA` for population-level predictions. No other
3030
#' options (e.g., some but not all random intercepts) are implemented yet.
3131
#' Only affects predictions with `newdata`. This *does* affects [get_index()].
32-
#' @param nsim Experimental: If `> 0`, simulate from the joint precision
32+
#' @param nsim If `> 0`, simulate from the joint precision
3333
#' matrix with `nsim` draws. Returns a matrix of `nrow(data)` by `nsim`
3434
#' representing the estimates of the linear predictor (i.e., in link space).
35-
#' Can be useful for deriving uncertainty on predictions (e.g., `apply(x, 1,
36-
#' sd)`) or propagating uncertainty. This is currently the fastest way to
37-
#' characterize uncertainty on predictions in space with sdmTMB.
35+
#' Can be useful for deriving uncertainty on predictions
36+
#' (e.g., `apply(x, 1, sd)`) or propagating uncertainty. This is currently
37+
#' the fastest way to characterize uncertainty on predictions in space with
38+
#' sdmTMB.
3839
#' @param sims_var Experimental: Which TMB reported variable from the model
3940
#' should be extracted from the joint precision matrix simulation draws?
40-
#' Defaults to the link-space predictions. Options include: `"omega_s"`,
41+
#' Defaults to link-space predictions. Options include: `"omega_s"`,
4142
#' `"zeta_s"`, `"epsilon_st"`, and `"est_rf"` (as described below).
4243
#' Other options will be passed verbatim.
4344
#' @param tmbstan_model Deprecated. See `mcmc_samples`.
4445
#' @param mcmc_samples See `extract_mcmc()` in the
4546
#' \href{https://github.com/pbs-assess/sdmTMBextra}{sdmTMBextra} package for
46-
#' more details and the Bayesian vignette. If specified, the predict function
47-
#' will return a matrix of a similar form as if `nsim > 0` but representing
48-
#' Bayesian posterior samples from the Stan model.
47+
#' more details and the
48+
#' \href{https://pbs-assess.github.io/sdmTMB/articles/web_only/bayesian.html}{Bayesian vignette}.
49+
#' If specified, the predict function will return a matrix of a similar form
50+
#' as if `nsim > 0` but representing Bayesian posterior samples from the Stan
51+
#' model.
4952
#' @param model Type of prediction if a delta/hurdle model *and* `nsim > 0` or
5053
#' `mcmc_samples` is supplied: `NA` returns the combined prediction from both
5154
#' components on the link scale for the positive component; `1` or `2` return
5255
#' the first or second model component only on the link or response scale
53-
#' depending on the argument `type`.
56+
#' depending on the argument `type`. For regular prediction from delta models,
57+
#' both sets of predictions are returned.
5458
#' @param offset A numeric vector of optional offset values. If left at default
5559
#' `NULL`, the offset is implicitly left at 0.
5660
#' @param return_tmb_report Logical: return the output from the TMB
57-
#' report? For regular prediction this is all the reported variables
61+
#' report? For regular prediction, this is all the reported variables
5862
#' at the MLE parameter values. For `nsim > 0` or when `mcmc_samples`
5963
#' is supplied, this is a list where each element is a sample and the
6064
#' contents of each element is the output of the report for that sample.
@@ -499,8 +503,6 @@ predict.sdmTMB <- function(object, newdata = NULL,
499503
tmb_data$calc_se <- as.integer(se_fit)
500504
tmb_data$pop_pred <- as.integer(pop_pred)
501505
tmb_data$exclude_RE <- exclude_RE
502-
# tmb_data$calc_index_totals <- as.integer(!se_fit)
503-
# tmb_data$calc_cog <- as.integer(!se_fit)
504506
tmb_data$proj_spatial_index <- newdata$sdm_spatial_id
505507
tmb_data$proj_Zs <- sm$Zs
506508
tmb_data$proj_Xs <- sm$Xs
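
The `nsim` documentation in this file suggests summarizing the returned matrix with `apply(x, 1, sd)`. The sketch below shows that summary on a simulated stand-in matrix so it runs without a fitted model; the object names (`x`, `est_sd`, `est_ci`) and the fake draws are assumptions for illustration, not package output.

```r
set.seed(1)
n_rows <- 5    # stands in for nrow(data)
nsim <- 200    # number of simulated draws

# Fake link-space draws with one row per observation and one column per draw.
# In practice a matrix of this shape would come from predict(..., nsim = nsim).
x <- matrix(rnorm(n_rows * nsim, mean = 1, sd = 0.3),
            nrow = n_rows, ncol = nsim)

# Row-wise standard deviation across draws, as suggested in the documentation
est_sd <- apply(x, 1, sd)

# Row-wise 95% interval across draws (one row per observation)
est_ci <- t(apply(x, 1, quantile, probs = c(0.025, 0.975)))
```

Because the draws are on the link scale, any back-transformation (for example `exp()` for a log link) would typically be applied to each draw before summarizing.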
