forecast error on exogenous categorical variables #356

Closed
robjhyndman opened this issue May 23, 2022 · 1 comment
Comments

@robjhyndman
Member

MRE

library(fpp3)
#> ── Attaching packages ─────────────────────────────────────── fpp3 0.4.0.9000 ──
#> ✔ tibble      3.1.7     ✔ tsibble     1.1.1
#> ✔ dplyr       1.0.9     ✔ tsibbledata 0.4.0
#> ✔ tidyr       1.2.0     ✔ feasts      0.2.2
#> ✔ lubridate   1.8.0     ✔ fable       0.3.1
#> ✔ ggplot2     3.3.6     ✔ fabletools  0.3.2
#> ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
#> ✖ lubridate::date()    masks base::date()
#> ✖ dplyr::filter()      masks stats::filter()
#> ✖ tsibble::intersect() masks base::intersect()
#> ✖ tsibble::interval()  masks lubridate::interval()
#> ✖ dplyr::lag()         masks stats::lag()
#> ✖ tsibble::setdiff()   masks base::setdiff()
#> ✖ tsibble::union()     masks base::union()
elec <- vic_elec %>%
  mutate(
    Day_Type = case_when(
      Holiday ~ "Holiday",
      wday(Date) %in% 2:6 ~ "Weekday",
      TRUE ~ "Weekend"
    )
  )
fit <- elec %>%
  model(shf = TSLM(log(Demand) ~ Day_Type))
fit %>% report()
#> Series: Demand 
#> Model: TSLM 
#> Transformation: log(Demand) 
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.47374 -0.11822  0.01978  0.10979  0.66244 
#> 
#> Coefficients:
#>                 Estimate Std. Error  t value Pr(>|t|)    
#> (Intercept)     8.293077   0.004406 1882.134  < 2e-16 ***
#> Day_TypeWeekday 0.187077   0.004496   41.610  < 2e-16 ***
#> Day_TypeWeekend 0.032076   0.004620    6.943 3.88e-12 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.17 on 52605 degrees of freedom
#> Multiple R-squared: 0.1572,  Adjusted R-squared: 0.1571
#> F-statistic:  4905 on 2 and 52605 DF, p-value: < 2.22e-16
newdata <- tail(elec, 48)
fit %>%
  forecast(new_data = newdata)
#> Error in `mutate()`:
#> ! Problem while computing `shf = (function (object, ...) ...`.
#> Caused by error:
#> ! contrasts can be applied only to factors with 2 or more levels
#>   Unable to compute required variables from provided `new_data`.
#>   Does your model require extra variables to produce forecasts?

Created on 2022-05-23 by the reprex package (v2.0.1)
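
The error message can be reproduced outside fable. The following is a minimal base R sketch, not the fable internals: it assumes the forecast step ends up building a design matrix from `new_data`, and the data frames `new_chr` and `new_fct` are hypothetical stand-ins for a forecast window in which only one day type occurs.

```r
# A forecast window in which every row has the same day type,
# stored as a character column (as in the MRE above).
new_chr <- data.frame(Day_Type = rep("Weekday", 3))

# model.matrix() turns the character column into a one-level factor,
# and contrasts cannot be assigned to a factor with a single level.
res_chr <- tryCatch(
  model.matrix(~ Day_Type, data = new_chr),
  error = function(e) conditionMessage(e)
)
res_chr

# The same window stored as a factor that declares all three levels.
new_fct <- data.frame(
  Day_Type = factor(
    rep("Weekday", 3),
    levels = c("Holiday", "Weekday", "Weekend")
  )
)

# The design matrix now has a column for each non-baseline level,
# even though only "Weekday" is observed.
mm <- model.matrix(~ Day_Type, data = new_fct)
colnames(mm)
```

In the character case the set of levels is inferred from whatever values happen to be present, so a short window can lose levels that the fitted model expects.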

@mitchelloharawild added and then removed the bug label on Mar 2, 2024
@mitchelloharawild
Member

mitchelloharawild commented Mar 2, 2024

Best practice is to use factors here, so that all possible values of Day_Type are known in both the modelling and forecasting stages, even if some of them are not observed in the forecast window.

That said, this behaviour is likely to cause issues (but is hard to fix), so I've opened another issue for it here: #398

library(fpp3)
#> -- Attaching packages ---------------------------------------------- fpp3 0.5 --
#> v tibble      3.2.1          v tsibble     1.1.4     
#> v dplyr       1.1.3          v tsibbledata 0.4.1     
#> v tidyr       1.3.0          v feasts      0.3.1.9000
#> v lubridate   1.9.3          v fable       0.3.3.9000
#> v ggplot2     3.5.0          v fabletools  0.4.0
#> -- Conflicts ------------------------------------------------- fpp3_conflicts --
#> x lubridate::date()    masks base::date()
#> x dplyr::filter()      masks stats::filter()
#> x tsibble::intersect() masks base::intersect()
#> x tsibble::interval()  masks lubridate::interval()
#> x dplyr::lag()         masks stats::lag()
#> x tsibble::setdiff()   masks base::setdiff()
#> x tsibble::union()     masks base::union()
elec <- tsibbledata::vic_elec %>%
  mutate(
    Day_Type = factor(case_when(
      Holiday ~ "Holiday",
      wday(Date) %in% 2:6 ~ "Weekday",
      TRUE ~ "Weekend"
    ))
  )
fit <- elec %>%
  model(shf = fable::TSLM(log(Demand) ~ Day_Type))
fit %>% report()
#> Series: Demand 
#> Model: TSLM 
#> Transformation: log(Demand) 
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.47374 -0.11822  0.01978  0.10979  0.66244 
#> 
#> Coefficients:
#>                 Estimate Std. Error  t value Pr(>|t|)    
#> (Intercept)     8.293077   0.004406 1882.134  < 2e-16 ***
#> Day_TypeWeekday 0.187077   0.004496   41.610  < 2e-16 ***
#> Day_TypeWeekend 0.032076   0.004620    6.943 3.88e-12 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.17 on 52605 degrees of freedom
#> Multiple R-squared: 0.1572,  Adjusted R-squared: 0.1571
#> F-statistic:  4905 on 2 and 52605 DF, p-value: < 2.22e-16

newdata <- tail(elec, 48)
fit %>%
  forecast(new_data = newdata)
#> # A fable: 48 x 8 [30m] <Australia/Melbourne>
#> # Key:     .model [1]
#>    .model Time                          Demand .mean Temperature Date      
#>    <chr>  <dttm>                        <dist> <dbl>       <dbl> <date>    
#>  1 shf    2014-12-31 00:00:00 t(N(8.5, 0.029)) 4888.        16.2 2014-12-31
#>  2 shf    2014-12-31 00:30:00 t(N(8.5, 0.029)) 4888.        16   2014-12-31
#>  3 shf    2014-12-31 01:00:00 t(N(8.5, 0.029)) 4888.        15.5 2014-12-31
#>  4 shf    2014-12-31 01:30:00 t(N(8.5, 0.029)) 4888.        15   2014-12-31
#>  5 shf    2014-12-31 02:00:00 t(N(8.5, 0.029)) 4888.        14.4 2014-12-31
#>  6 shf    2014-12-31 02:30:00 t(N(8.5, 0.029)) 4888.        14.3 2014-12-31
#>  7 shf    2014-12-31 03:00:00 t(N(8.5, 0.029)) 4888.        14   2014-12-31
#>  8 shf    2014-12-31 03:30:00 t(N(8.5, 0.029)) 4888.        13.8 2014-12-31
#>  9 shf    2014-12-31 04:00:00 t(N(8.5, 0.029)) 4888.        13.6 2014-12-31
#> 10 shf    2014-12-31 04:30:00 t(N(8.5, 0.029)) 4888.        13.3 2014-12-31
#> # i 38 more rows
#> # i 2 more variables: Holiday <lgl>, Day_Type <fct>

Created on 2024-03-02 with reprex v2.0.2
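
In the reprex above, `newdata` is a subset of `elec`, so it keeps the factor levels defined on the full data. When the forecast data is built separately, the levels would need to be declared explicitly. A minimal base R sketch of that idea follows; `day_levels`, `make_day_type()` and the `future` data frame are hypothetical names used for illustration and are not part of fable or fpp3.

```r
# Declare the full set of day types once and reuse it everywhere.
day_levels <- c("Holiday", "Weekday", "Weekend")

# Hypothetical helper: classify dates and always return all three levels.
make_day_type <- function(holiday, date) {
  wd <- as.POSIXlt(date)$wday  # 0 = Sunday, ..., 6 = Saturday
  label <- ifelse(
    holiday, "Holiday",
    ifelse(wd %in% 1:5, "Weekday", "Weekend")
  )
  factor(label, levels = day_levels)
}

# Hypothetical future data containing no weekend days at all.
future <- data.frame(
  Date = as.Date("2015-01-01") + 0:1,
  Holiday = c(TRUE, FALSE)
)
future$Day_Type <- make_day_type(future$Holiday, future$Date)

# "Weekend" is still a known level although it never occurs in `future`.
levels(future$Day_Type)
```

Using the same helper for both the training data and the forecast data keeps the two sets of levels identical, which is the property the advice above relies on.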
