-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathsstmodels.qmd
409 lines (319 loc) · 12.7 KB
/
sstmodels.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
---
title: "Spatial and Spatio-temporal Modelling"
author:
- name: "Garyfallos Konstantinoudis"
email: "[email protected]"
institute: "Imperial College London"
date: "08-22-2024"
date-format: medium
title-slide-attributes:
data-background-color: "#f3f4f4"
data-background-image: "../../assets/bmeh_normal.png"
data-background-size: 80%
data-background-position: 60% 120%
subparagraph: yes
format:
revealjs:
slide-number: true
incremental: false
chalkboard:
buttons: false
preview-links: auto
logo: "../../assets/bmeh_normal.png"
theme: [default, ../../assets/style.scss]
---
# Introduction
## What is spatial modelling?
<center>
data:image/s3,"s3://crabby-images/fd7c2/fd7c2f68a48f7bda311b746bf22320d005574066" alt=""{width=90%}
</center>
## It is also geostatistics
<center>
data:image/s3,"s3://crabby-images/cc882/cc882946b74b40732eb84a2ebf2b2bd59ab978f8" alt=""{width=85%}
</center>
## Type of data
<center>
data:image/s3,"s3://crabby-images/804a0/804a0adde6ac7c5e8895900a7505ee0fe1bcba8b" alt=""
</center>
## Why do we need more complex models?
- In disease mapping estimates of disease risk can be unstable, leading to non-meaningful maps
- Accounting for residual spatial autocorrelation is important to have the correct uncertainty
- Sometimes they pick up unknown spatial confounding, leading to unbiased estimators
## Structure of the talk
- Disease mapping and smoothing
- Global smoothing
- Local smoothing
- Examples
# Disease mapping and smoothing
## Disease mapping - background
- To summarise spatial and spatio-temporal variation in disease risk
- Rare disease and/or small areas: Poisson framework
$$O_i \sim \text{Poisson}(\lambda_i E_i)$$
where $E_i$ is the expected nb of cases and $\lambda_i=\text{SMR}$ in area $i$.
- Non smoothed estimates of the SMR:
$$\text{SMR}_i = \frac{O_i}{E_i},\;\text{and} \;\text{Var(SMR)}_i = \frac{O_i}{E_i^2}$$
## Issues with crude RR estimates
Calculating crude (non-smoothed) estimates of the SMR can be problematic:
- SMR is estimated independently: does not account for spatial correlation, ie makes no use of SMR estimates in other areas of the map, even though these are likely to be similar.
- They are very imprecise: areas with small $E_i$ have high associated variance.
## Example with childhood cancers in Switzerland
- Rare cancers (6,000 cases in 31 years)
- Little known about causes, potential clusters
<center>
data:image/s3,"s3://crabby-images/c692e/c692e202f7915a75143042dcf5eac09de0bab221" alt=""{width=75%}
</center>
## Example with childhood cancers in Switzerland
- Quartiles and 0s a separate category
- Is this map informative?
<center>
data:image/s3,"s3://crabby-images/4612a/4612a40f6ae84fed45b9ef03a9aebd8b96ebc0a1" alt=""{width=75%}
</center>
## Hierarchical modelling to smooth rates
- **Globally**: Smooth the rates towards the global mean.
- **Locally**: Smooth the rates towards the local mean, using information about the adjacency structure.
- **Globally + Locally**: Smooth the rates using information about the global mean and the adjacency structure
## Global smoothing
$$
O_i \sim \text{Poisson}(\lambda_i E_i) \\
\log \lambda_i = \alpha + \theta_i \\
\theta_i \sim \text{Normal}(0,\sigma_{\theta}^2)
$$
$\theta_i$: area specific random effects to account for overdispersion
- excess variation in the observed counts due to random noise
- latent variable which captures the effects of unknown or unmeasured area level covariates
$\exp(\theta_i)$: RR in area i, relative to the overall SMR/SIR
## Example with childhood cancers in Switzerland
- Plot $exp(\alpha + \theta_i)$
<center>
data:image/s3,"s3://crabby-images/c44df/c44df1859d04271048f9fd4c603363dff78308fb" alt=""{width=75%}
</center>
## Issues with global smoothing
- Poisson-logNormal model assumes identically distributed and independent data.
- Data that occur close together in space are likely to be correlated (Dependence assumption more realistic).
- Ignoring spatial dependence can lead to biased and inefficient inference.
- Interest in estimating the relationship between location and outcome.
**Local smoothing**: prior distribution for the random effects allowing for spatial correlation to help uncover spatial patterns
## Specify a distribution which takes into account information
- Specify a distribution which takes into account information on neighbouring areas (**adjacency matrix**).
- Rule for determining the neighbours of each area: most common based on common boundary.
- Estimate **spatial random effect** for each area as if we knew the values of the spatial random effects in neighbouring areas
- Use of conditional autoregressive distributions (we are conditioning on knowing the neighbours)
## Example: Queen contiguity in Switzerland (Cantons)
Let $\partial_i=$ set if area adjacent to $i$
:::: {.columns}
::: {.column width="30%"}
$$
w = \begin{cases} 1, \; \text{if} \; j\in \partial_i \\\\0 \; \text{otherwise} \end{cases}
$$
:::
::: {.column width="70%"}
<center>
data:image/s3,"s3://crabby-images/fa949/fa949d8b82173e1c0838f88270f5a6d4a3c8920c" alt=""{width=80%}
</center>
:::
::::
## Intrinsic ICAR model (Besag et al. 1991)
$$
\phi \sim \text{ICAR}(W, \sigma^2_{\phi}) \Leftrightarrow \phi_i | \phi_{j\; j\ne i} \sim \text{Normal}(\phi_i, \sigma_{\phi}^2)
$$
:::: {.columns}
::: {.column width="30%"}
$$
E(\phi_i|\phi_j) = \sum_{j \in \partial_i \phi_j} \phi_j/n_i \\
Var(\phi_i|\phi_j) = \sigma_{\phi}^2/n_i
$$
:::
::: {.column width="70%"}
<center>
data:image/s3,"s3://crabby-images/e3800/e3800d9e7441d1bf56fcda2849296956afd0ba15" alt=""{width=60%}
</center>
:::
::::
## Example with childhood cancers in Switzerland (cont)
- Plot $exp(\alpha + \phi_i)$
<center>
data:image/s3,"s3://crabby-images/aecc9/aecc956245f243bbac254d4ad002754cff859c0a" alt=""{width=75%}
</center>
## The Besag-York-Mollie model
Local+Global smoothing:
$$
O_i \sim \text{Poisson}(\lambda_i E_i) \\
\log \lambda_i = \alpha + \theta_i + \phi_i \\
\theta_i \sim \text{Normal}(0,\sigma_{\theta}^2)\\
\phi \sim \text{ICAR}(W, \sigma^2_{\phi})
$$
priors for $\alpha, \sigma_{\theta}^2 \; \text{and} \; \sigma^2_{\phi}$
## Example with childhood cancers in Switzerland (cont)
- Plot $exp(\alpha + \theta_i + \phi_i)$
<center>
data:image/s3,"s3://crabby-images/14b06/14b06ec7326a7e0ebd1cc4796cca0aa7ef6be46d" alt=""{width=75%}
</center>
## Use BYM with care
- ICAR model defined above is improper: the overall mean of the $\phi_i$ is not defined: **sum-to-zero contraints**
- Inference may be sensitive to choice of hyperprior for the random effects variance or precision parameters
- Other known issues we have not discussed about: different interpretation of the variances (marginal conditional), scalling the covariance matrix, etc.
## Other CAR models
Let $\log \lambda_i = \alpha + b_i$:
- The proper CAR
$$
b_i|b_j \sim \text{Normal}(\rho \sum_{j \in \partial_i}b_j/n_i, \sigma_b^2/n_i)
$$
- Leroux CAR prior
$$
b_i|b_j \sim \text{Normal}\Bigg(\frac{\rho \sum_{j \in \partial_i}b_j}{\rho n_i + (1 - \rho)}, \frac{\sigma_b^2}{{\rho n_i + (1 - \rho)}}\Bigg)
$$
## Other CAR models
- BYM2
$$
b = \sigma_b*(\sqrt{1 - \rho}\theta + \sqrt{\rho}\phi) \\
\theta \sim \text{Normal}(\sigma_{\theta}^2)\\
\phi \sim \text{ICAR}(W^*, \sigma_{\phi}^2)
$$
- Adaptive CARs
## Posterior Probability
- Mapping the posterior mean RR does (or residual RR) not make full use of the output of the Bayesian analysis that provides, for each area, samples from the whole posterior distribution of the relative risk.
- Mapping the probability that a RR is greater than a specified threshold of interest has been proposed by several authors (e.g. Clayton and Bernardinelli (1992)).
- Very effective method to identify areas characterised by elevated risk
## How to classify areas as having elevated risk
- We define the decision rule $D(c, \text{RR}_0)$, which depends:
+ on a cutoff probability $c$
+ a reference threshold $\text{RR}_0$
- Area $i$ is classified as having an elevated risk according to $D(c, \text{RR}_0) : Prob(\text{RR}_i > \text{RR}_0) > c$
- Recommended values $c = 0.8$ and $\text{RR}_0 = 1$ (Richardson et al. 2004)
- Posterior probability of interest: $\text{Prob}(\exp(\theta_i+ \phi_i) > 1)$
## Example with childhood cancers in Switzerland
<center>
data:image/s3,"s3://crabby-images/3ca7d/3ca7dcb2699b9f170dd7d9967b8a2ff2788f6b42" alt=""{width=75%}
</center>
## Example with childhood cancers in Switzerland
<center>
data:image/s3,"s3://crabby-images/925f7/925f7008e031726c89cfb8d6abb5adf8058b5362" alt=""{width=75%}
</center>
# Ecological regression
## Extending to ecological regression
Straightforward extension of the BYM (read any CAR) model:
$$
O_i \sim \text{Poisson}(\lambda_i E_i) \\
\log \lambda_i = \alpha + \color{red}{\beta X_i} + \theta_i + \phi_i \\
\theta_i \sim \text{Normal}(0,\sigma_{\theta}^2)\\
\phi \sim \text{ICAR}(W, \sigma^2_{\phi})
$$
- $\color{red}{X}$ area-level covariate of interest
- $\color{red}{\beta}$: parameter associated with the covariate (assigned a prior)
## Interpretation of the parameters
- $\exp(\beta)$ is the relative risk of disease associated with a unit increase in exposure $X$ (when X continuous)
- $\exp(\theta_i + \phi_i)$ is the residual or adjusted relative risk of disease in area $i$ after accounting for the effect of $X$.
## Extension to several variables
$$
O_i \sim \text{Poisson}(\lambda_i E_i) \\
\log \lambda_i = \alpha + \color{red}{\beta_1 X_{1i} + \dots + \beta_k X_{ki}} + \theta_i + \phi_i \\
\theta_i \sim \text{Normal}(0,\sigma_{\theta}^2)\\
\phi \sim \text{ICAR}(W, \sigma^2_{\phi})
$$
- $\exp(\beta_1)$ is the relative risk of disease associated with a unit increase in exposure $X$, after adjustment for $X_2, \dots ,X_k$
- Same for $\exp(\beta_k)$, after adjustment for $X_1, \dots ,X_{k-1}$
- $\exp(\theta_i + \phi_i)$ is the adjusted relative risk of disease in area $i$ after accounting for the effects of measured covariates
## Example: HPV vaccination in Switzerland
$$
Y_{kit} \sim \text{Binomial}(p_{kit}) \\
\text{logit}(p_{kit}) = \alpha + \beta_1 X_{1it} + \dots + \beta_k X_{kit} + \theta_i + \phi_i \\
\theta_i \sim \text{Normal}(0,\sigma_{\theta}^2)\\
\phi \sim \text{ICAR}(W, \sigma^2_{\phi})
$$
- Covariates related with age, sex, period of survey, deprivation, religion, language region, etc.
## HPV vaccination uptake
- Model without any covariates just the BYM prior
- Plot $\theta_s + \phi_s$ (on the log-odds)
<center>
data:image/s3,"s3://crabby-images/adac2/adac2fc866719909a7251981a5e5860d986b061e" alt=""{width=75%}
</center>
## Effect of covariates
<center>
data:image/s3,"s3://crabby-images/c2625/c26250035184b9ff0515ed8fbc500bad2e9e988f" alt=""{width=100%}
</center>
## Residual spatial autocorrelation
<center>
data:image/s3,"s3://crabby-images/018ef/018ef696d2a5d7aeb50a72e35c880d509f7a14b9" alt=""{width=75%}
</center>
# Building complex spatiotemporal models
## Cervical cancers among HIV positive women
- Global cohort composed of 1,700,000 persons living with HIV/AIDS
- SAM study, evaluating HIV related cancers in South Africa
- 17% of adult women in South Africa over the age of 25 years are living with HIV
- High burden of cervical cancer: 43.1 cases per 100,000
- 5 times more likely to develop cervical cancer
## Exploratory analysis of the crude rates
<center>
data:image/s3,"s3://crabby-images/aa7ee/aa7eeba3b6ced32969dabb0bfa4c6266a594e898" alt=""{width=100%}
</center>
## Exploratory analysis of the crude rates (cont)
<center>
data:image/s3,"s3://crabby-images/038e4/038e4ea5f2d644862ef915737f461fc62de7c9c7" alt=""{width=75%}
</center>
## A spatiotemporal model
$$
O_{it} \sim \text{Poisson}(\lambda_{it} E_{it}) \\
\log \lambda_{it} = \alpha + \theta_i + \phi_i + \color{red}{\gamma_t + \xi_t} \\
\theta_i \sim \text{Normal}(0,\sigma_{\theta}^2)\\
\phi \sim \text{ICAR}(W, \sigma^2_{\phi})\\
\gamma_t \sim \text{Normal}(0,\sigma_{\gamma}^2)\\
\xi_t \sim \text{RW1}(\sigma^2_{\xi})
$$
Similar with the BYM model in space $\gamma_t$ is the temporal unstructured and $\xi_t$ the temporal structured random effect, capturing the temporal autocorrelation.
## Space and time seperately
<center>
data:image/s3,"s3://crabby-images/861fb/861fb7c9a8ba780221c59e5f5cb284904de4e91c" alt=""{width=100%}
</center>
## Spacetime (Spaghetti plot)
<center>
data:image/s3,"s3://crabby-images/a580e/a580e9a894e65d950830f54c88c0729147ed49e2" alt=""{width=75%}
</center>
## Spacetime (Space by year)
<center>
data:image/s3,"s3://crabby-images/60ae2/60ae210304287fbbfc9971c142c974929a9a50b6" alt=""{width=90%}
</center>
## Spacetime interactions
```{r}
data.frame(
Interaction = paste("type", 1:4),
Parameters = c(
"$\\nu_{it} = \\theta_i \\otimes \\gamma_t$",
"$\\nu_{it} = \\theta_i \\otimes \\xi_t$",
"$\\nu_{it} = \\phi_i \\otimes \\gamma_t$",
"$\\nu_{it} = \\phi_i \\otimes \\xi_t$"
)
) -> tab
```
::::: {style="font-size:78%"}
$$
O_{it} \sim \text{Poisson}(\lambda_{it} E_{it}) \\
\log \lambda_{it} = \alpha + \theta_i + \phi_i + \gamma_t + \xi_t + \color{red}{\nu_{it}} \\
\theta_i \sim \text{Normal}(0,\sigma_{\theta}^2)\\
\phi \sim \text{ICAR}(W, \sigma^2_{\phi})\\
\gamma_t \sim \text{Normal}(0,\sigma_{\gamma}^2)\\
\xi_t \sim \text{RW1}(\sigma^2_{\xi})
$$
```{r}
knitr::kable(tab, escape = FALSE, align = "c")
```
:::::
## Summary
- Introduction to disease mapping and CAR priors
- Extension to ecological regression and relevant interpretations
- Separable spatiotemporal models
- Higher spatiotemporal interactions
- Examples using childhood cancers, HPV vaccination and cervical cancers in South Africa
::::: {style="color: #58b364"}
Next : Coding these models in NIMBLE.
:::::
# Getting ready for the lab
## The lab for this session {.smaller}
::: incremental
- This goal of this lab is to use `NIMBLE` to carry out a disease mapping study.
- During this lab session, we will:
1. Explore ways of visualizing spatial data;
2. Define the neighborhood matrix in `R`;
3. Fit and interpret the BYM model in `NIMBLE`; and
4. Perform spatial ecological regression
:::
# Questions?