Skip to content

Ndefries/epix slide pass reftimevalue v1 #307

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

nmdefries
Copy link
Contributor

@nmdefries nmdefries commented Apr 27, 2023

#170

  • comp_value calculation in comp_one_grp passes ref_time_value to f
  • Add ref_time_value to f's environment manually. This doesn't take advantage of comp_one_grp passing ref_time_value to f. In fact, if we add ref_time_value to the execution env, users could use ref_time_value in all computation approaches (func, formula, dots) without taking it as an arg.
  • Modify check_sufficient_f_args to take a # of args arg. Consider this a temporary change. Until epi_slide also passes ref_time_value to user computations, it and epix_slide will expect different numbers of arguments from and pass different numbers of arguments to user function f.

@nmdefries
Copy link
Contributor Author

One negative with the implementation so far is that ref_time_value is referenced inconsistently. It's called t in a function computation, but called ref_time_value in formula and dots computations.

@brookslogan
Copy link
Contributor

The environment approach is nice for saving implementation effort (and I'd expect there's some refactor we can do, especially to the tidy approach, to deduplicate things), but I don't think it quite follows existing patterns.

  • Most concerning: it sounds like user would do something like f = function(x, g) {... ref_time_value ...}. (a) they may not realize that they even have access to ref_time_value. (b) it just seems unnatural. (c) they can't choose their own name.
  • Medium concerning: for tidy interface, the way tidyverse allowed reference to things other than columns was through cur_data() functions etc., now superseded by pick(); so just having ref_time_value available in environment is throwing yet more in the mix.
  • Less concerning: for formula, we might want to access it via ..3 or .ref_time_value (note prefixing dot on name), just like we can access the group data with ..1 or .x. Might be able to mirror/update rlang::as_function for formula.

Ideally I'd hope that we could follow the convention for functions and formulas, and maybe just not allow them to use ref_time_value in tidyeval approach because I think it's not going to be the effort to figure out how to do it.

Consistency with names: good point, though given the tidyverse conventions for the above, I'm not sure we can be fully consistent. But maybe we shouldn't say in the docs (I think we do now) that the names of the parameters of an f function HAVE to be x and g. We could have some example using ref_time_value instead to establish the connection.

@nmdefries
Copy link
Contributor Author

nmdefries commented Apr 28, 2023

it sounds like user would do something like f = function(x, g) {... ref_time_value ...}

9bda33c is all that's needed for user-provided f functions to have access to ref_time_value (with whatever name they want). fs are now required to take 3 args, so ref_time_value is discoverable and the specific example above isn't possible.

The environment stuff introduces the problems with having this extra global floating around, so a user could do something weird like f = function(x, g, t) {... t, ref_time_value ...}, using both the arg t and the global (with the same value). I definitely agree that that is odd and undesirable.

So removing the environment handling (introduced only to support formula and tidy computations) will fix this particular issue.

for formula, we might want to access it via ..3 or .ref_time_value (note prefixing dot on name), just like we can access the group data with ..1 or .x. Might be able to mirror/update rlang::as_function for formula.

Formulas can reference ref_time_value as ..3 without doing anything more than 9bda33c.

It sounds like you're saying that to allow .ref_time_value (or .t, etc) to access ref_time_value too, I should make a function that acts like rlang::as_function but creates a function that takes 3 args (.x, .y, .ref_time_value) instead of the current two (.x, .y). Is that the correct understanding?

@brookslogan
Copy link
Contributor

brookslogan commented Apr 28, 2023 via email

@nmdefries
Copy link
Contributor Author

Not sure on naming; I already do not like .y but that is what rlang and purrr and dplyr do already so I'm not sure changing is a good idea.

I'm not a big fan of the .x/.y naming either. It would make more sense for our formulas to use the same arg names as we say f does (so .x, .g, and .t), for clarity and consistency. And we're going to be adding the custom arg .t, which already doesn't fit into the existing naming scheme.

To resolve this, I think we should allow the user to reference each arg with a one of xyz and with a descriptive name. That is, we don't need to completely remove .y. Currently, formulas processed with as_function can use either .x or . to reference the first arg. We can easily make a modified version of as_function to extend that behavior, and reference the second argument with .y or .g and the third argument with .z or .t.

@brookslogan
Copy link
Contributor

I like that approach. I'm not sure about g and t as the more descriptive names now that we're discussing/committing to them more. Especially with g and t together, users might think:

  • "g must be geo_value" (not the correct group / group key)
  • "t probably refers to a time_value in the object", not ref_time_value which could lie outside in the case of epix_slide

Plus, there's a bit of mismatch between functions and formulas when we consider that these two args are probably substantially less used (except the ref time value in epix_slide to get the forecast date if not using an epipredict canned forecaster):

  • for functions, we want short names for things we're ignoring most the time
  • for formulas, we want longer, descriptive names for things we don't use often

But we also would want the names to match for consistency.

I think we should probably get some feedback from a broader group or maybe some potential public health users about preferences. Maybe post something on Slack to see if people would correctly interpret .x .g .t? Or just present a few naming options?

@nmdefries
Copy link
Contributor Author

Superseded by #313

@nmdefries nmdefries closed this May 5, 2023
@nmdefries nmdefries deleted the ndefries/epix-slide-pass-reftimevalue branch November 15, 2023 18:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants