Carefully consider the metadata for the epi_df resulting from sliding a forecaster over an epi_archive  #213

Open
@rachlobay

As originally raised in Issue #208: now that epix_slide can return an epi_df, which lets us slide a forecaster built with arx_forecaster() over an epi_archive, we should consider whether the resulting metadata (namely, as_of) is what we want.

In the example below, as_of is related to the first element of fc_time_values (2020-08-01), but do we really want that? Is that the most sensible value to use here?

library(epipredict)
library(epiprocess)
library(covidcast)
library(data.table)
library(dplyr)
library(tidyr)
library(ggplot2)

y <- covidcast_signals(
  c("doctor-visits", "jhu-csse"),
  c("smoothed_adj_cli", "confirmed_7dav_incidence_prop"),
  start_day = "2020-06-01",
  end_day = "2021-12-01",
  issues = c("2020-06-01", "2021-12-01"),
  geo_type = "state",
  geo_values = c("ca", "fl"))

z <- y[[1]] %>%
  select(geo_value, time_value, version = issue, percent_cli = value) %>%
  as_epi_archive()

z <- epix_merge(
  z, y[[2]] %>%
    select(geo_value, time_value, version = issue, case_rate = value) %>%
    as_epi_archive(), sync = "locf")

fc_time_values <- seq(as.Date("2020-08-01"), as.Date("2021-12-01"),
                      by = "1 month")
ahead <- 7

# Slide arx_forecaster over archive z
z %>%
  epix_slide(
    function(x, ...) {
      arx_forecaster(
        x,
        outcome = "case_rate",
        predictors = "case_rate",
        args_list = arx_args_list(ahead = ahead)
      )$predictions %>%
        select(-c(geo_value, time_value))
    },
    n = 120, ref_time_values = fc_time_values, new_col_name = "fc"
  )

Labels: P1 (medium priority), bug (Something isn't working)
