The documentation for the stable version is at <https://cmu-delphi.github.io/epipredict>, while the development version is at <https://cmu-delphi.github.io/epipredict/dev>.
108
+
109
+
The documentation for the stable version is at
110
+
<https://cmu-delphi.github.io/epipredict>, while the development version is at
111
+
<https://cmu-delphi.github.io/epipredict/dev>.
109
112
110
113
111
114
## Motivating example
112
115
113
-
To demonstrate the kind of forecast epipredict can make, say we're predicting COVID deaths per 100k for each state on
116
+
To demonstrate the kind of forecast epipredict can make, say we're predicting
117
+
COVID deaths per 100k for each state on
118
+
114
119
```{r fc_date}
115
120
forecast_date <- as.Date("2021-08-01")
116
121
```
117
-
Below the fold, we construct this dataset as an `epiprocess::epi_df` from JHU data.
122
+
123
+
Below the fold, we construct this dataset as an `epiprocess::epi_df` from JHU
124
+
data.
118
125
119
126
<details>
120
127
<summary> Creating the dataset using `{epidatr}` and `{epiprocess}` </summary>
121
-
This dataset can be found in the package as <TODO DOESN'T EXIST>; we demonstrate some of the typically ubiquitous cleaning operations needed to be able to forecast.
122
-
First we pull both jhu-csse cases and deaths from [`{epidatr}`](https://cmu-delphi.github.io/epidatr/) package:
123
-
```{r case_death}
128
+
129
+
This dataset can be found in the package as <TODO DOESN'T EXIST>; we demonstrate
130
+
some of the typically ubiquitous cleaning operations needed to be able to
As with basically any dataset, there is some cleaning that we will need to do to make it actually usable; we'll use some utilities from [`{epiprocess}`](https://cmu-delphi.github.io/epiprocess/) for this.
158
-
First, to eliminate some of the noise coming from daily reporting, we do 7 day averaging over a trailing window[^1]:
159
169
160
-
[^1]: This makes it so that any given day of the processed timeseries only depends on the previous week, which means that we avoid leaking future values when making a forecast.
170
+
As with basically any dataset, there is some cleaning that we will need to do to
171
+
make it actually usable; we'll use some utilities from
172
+
[`{epiprocess}`](https://cmu-delphi.github.io/epiprocess/) for this. First, to
173
+
eliminate some of the noise coming from daily reporting, we do 7 day averaging
174
+
over a trailing window[^1]:
175
+
176
+
[^1]: This makes it so that any given day of the processed timeseries only
177
+
depends on the previous week, which means that we avoid leaking future
178
+
values when making a forecast.
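
To illustrate the footnote above, here is a minimal base R sketch of a trailing 7-day mean. The helper `trailing_mean()` and the toy `daily` vector are hypothetical and exist only for this illustration; this is not the `{epiprocess}` utility used in the `smooth` chunk, and it assumes one observation per day with no gaps.

```r
# Hypothetical helper (not part of epiprocess): mean of the current value
# and the `width - 1` values before it.
trailing_mean <- function(x, width = 7) {
  vapply(seq_along(x), function(i) {
    # Not enough history yet: return NA rather than a partial average.
    if (i < width) {
      return(NA_real_)
    }
    mean(x[(i - width + 1):i])
  }, numeric(1))
}

# Toy daily series, made up for the example.
daily <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
smoothed <- trailing_mean(daily)

# Revising a later observation leaves every earlier smoothed value
# unchanged, which is the "no leakage" property the footnote describes.
revised <- daily
revised[10] <- 1000
smoothed_revised <- trailing_mean(revised)
identical(smoothed[1:9], smoothed_revised[1:9])
```

A centered window would not have this property, because each smoothed value would also depend on the days after it.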
161
179
162
180
```{r smooth}
163
181
cases_deaths <-
@@ -174,13 +192,18 @@ cases_deaths <-
174
192
```
175
193
176
194
Then trimming outliers, most especially negative values:
To make a forecast, we will use a "canned" simple auto-regressive forecaster to predict the death rate four weeks into the future using lagged[^3] deaths and cases
253
+
To make a forecast, we will use a "canned" simple auto-regressive forecaster to
254
+
predict the death rate four weeks into the future using lagged[^3] deaths and
255
+
cases
229
256
230
257
[^3]: lagged by 3 in this context meaning using the value from 3 days ago.
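
As a rough sketch of what the footnote means by a lag, the base R snippet below shifts a daily series by three days. `lag_by()` and the `deaths` values are hypothetical, written only for this illustration and assuming one row per day with no gaps; this is not how epipredict constructs lagged predictors internally.

```r
# Hypothetical helper: shift `x` back by `k` days, padding the start with NA.
lag_by <- function(x, k) {
  stopifnot(k >= 0, k <= length(x))
  c(rep(NA_real_, k), x[seq_len(length(x) - k)])
}

# Toy daily values, made up for the example.
deaths <- c(10, 12, 11, 15, 14, 18)

# Each row pairs a day's value with the value from 3 days earlier.
lagged <- data.frame(
  day = seq_along(deaths),
  deaths = deaths,
  deaths_lag3 = lag_by(deaths, 3)
)
lagged
```

The first three rows have no value three days back, so their lagged entries are `NA`.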
231
258
@@ -251,11 +278,16 @@ predicted values (and prediction intervals) for each location 28 days after the
251
278
forecast date.
252
279
Plotting the prediction intervals on our subset above[^2]:
253
280
254
-
[^2]: Alternatively, you could call `auto_plot(four_week_ahead)` to get the full collection of forecasts. This is too busy for the space we have for plotting here.
281
+
[^2]: Alternatively, you could call `auto_plot(four_week_ahead)` to get the full
282
+
collection of forecasts. This is too busy for the space we have for plotting
283
+
here.
255
284
256
285
<details>
257
286
<summary> Plot </summary>
258
-
This is the same kind of plot as `processed_data_plot` above, but with the past data narrowed somewhat
287
+
288
+
This is the same kind of plot as `processed_data_plot` above, but with the past
289
+
data narrowed somewhat
290
+
259
291
```{r}
260
292
narrow_data_plot <-
261
293
cases_deaths |>
@@ -267,13 +299,17 @@ narrow_data_plot <-
267
299
facet_grid(source ~ geo_value, scale = "free") +
268
300
geom_vline(aes(xintercept = forecast_date)) +
269
301
geom_text(
270
-
data = forecast_date_label, aes(x = dates, label = "forecast\ndate", y = heights), size = 3, hjust = "right"
302
+
data = forecast_date_label,
303
+
aes(x = dates, label = "forecast\ndate", y = heights),