Skip to content

Add complete.epi_df method so grouped completions don't drop epi_df… #488

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 5, 2024

Conversation

brookslogan
Copy link
Contributor

@brookslogan brookslogan commented Jul 20, 2024

…ness

Requesting review from @dshemetov since we talked about S3 stuff recently + to point at #391 where David's already done some of the work we were talking about. @nmdefries I'm not sure this really needs 2 people to review, but GitHub's not letting me unrequest.

[Actually this isn't ready for review yet. I should rebase on top of the time type changes and re-run tests first. And test the time type reinference.]

Checklist

Please:

  • Make sure this PR is against "dev", not "main" (unless this is a release
    PR).
  • Request a review from one of the current main reviewers:
    brookslogan, nmdefries.
  • Makes sure to bump the version number in DESCRIPTION. Always increment
    the patch version number (the third number), unless you are making a
    release PR from dev to main, in which case increment the minor version
    number (the second number).
  • Describe changes made in NEWS.md, making sure breaking changes
    (backwards-incompatible changes to the documented interface) are noted.
    Collect the changes under the next release number (e.g. if you are on
    1.7.2, then write your changes under the 1.8 heading).
  • See DEVELOPMENT.md for more information on the development
    process.

Change explanations for reviewer

Adds a complete.epi_df method that doesn't drop epi_dfness when grouped.

Too much context:

We've had to basically always recommend complete()ing before putting things into an epi_df, because of various things that could go wrong down the line. While epi_slide_opt() avoids bad behaviors when running on uncomplete()d data, we still aren't free: there is still cmu-delphi/epipredict#362 to avoid and complete()ion for other purposes, such as sliding functions epi_slide_opt() doesn't handle (and epi_slide() does bad things on), filling with 0s, making dplyr::lag actually work, naturally, maybe fixing some other 3rd-party library stuff, and standardizing time availability across epikeys (though we probably actually work for that last case already).

Grouped completions drop epi_dfness; it's annoying to have to carefully place the complete() before as_epi_df() when tweaking epi_df construction (especially if using the new as_epi_df renaming conveniences where you need to remember what the old names were when writing the complete()), and even more annoying/error-inducing to try to convert back to an epi_df after losing epi_dfness (since it's easy to forget and annoying to provide metadata).

Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch

@brookslogan brookslogan force-pushed the lcb/complete-grouped-epi_dfs branch from b414f70 to 1e0bf61 Compare July 22, 2024 18:35
@brookslogan brookslogan marked this pull request as ready for review July 22, 2024 19:16
@brookslogan
Copy link
Contributor Author

brookslogan commented Jul 22, 2024

Okay, think this is probably ready now.

failed brainstorm, feel free to disregard: I realize now I'm not taking advantage of the time_type attribute to e.g., set a default period, and there may be some overlap with the reasoning used to transform before for epi_slide; we might want to export some helper function that gives an object representing a time difference of 1 period given the time type (such that adding/subtracting it from corresponding time_values "does the right thing"). (or alternatively a function taking an integerish (vector) number of time periods and a time_type, and outputs an object (vector) representing time differences). For the sake of keeping this PR simple, I won't add this unless you think the period default or such a helper function would be a very handy feature. [Nevermind, a period default in complete is a bad idea, because it assumes they want full_seq which they might not. The helper I mention here sounds useful but wouldn't be used in this PR.]

@nmdefries nmdefries removed their request for review July 22, 2024 20:41
* add tests and documentation examples
* complete.epi_df now has its own docs page
* bump version and changelog

Co-authored-by: Dmitry Shemetov <[email protected]>
@dshemetov dshemetov force-pushed the lcb/complete-grouped-epi_dfs branch from 17b6a8b to 4335dc1 Compare August 5, 2024 21:04
@dshemetov dshemetov merged commit cfeda76 into dev Aug 5, 2024
5 checks passed
@dshemetov dshemetov deleted the lcb/complete-grouped-epi_dfs branch August 5, 2024 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants