Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collapse dimensions common functionality for plot wrappers #405

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

willGraham01
Copy link
Contributor

Description

What is this PR

  • Bug fix
  • Addition of a new feature
  • Other

Why is this PR needed?

See the discussion on Zulip.

What does this PR do?

  • Introduces the collapse_extra_dimensions function that can be used by the plotting wrappers to "remove" superfluous dimensions prior to creating plots.
  • Also introduces the coord_of_dimension function which is a short wrapper that saves us from writing the same logic in each of our plotting wrappers when asking the user to identify one individual, or one keypoint (etc). The user may specify this either by its index (slice-style) or coordinate (DataArray.sel style), and this small tidbit of code was appearing in a number of places across the plotting wrappers. Now we can run individual = coord_of_dim(da, "individuals", individual) to ensure that da.sel(individuals=individual) will work.

References

How has this PR been tested?

Local tests added and are passing.

Is this a breaking change?

No

Does this PR require an update to the documentation?

No

Checklist:

  • The code has been tested locally
  • Tests have been added to cover all new functionality
  • The documentation has been updated to reflect any changes
  • The code has been formatted with pre-commit

This comment was marked as resolved.

@willGraham01 willGraham01 requested a review from niksirbi February 5, 2025 09:24
Copy link
Member

@niksirbi niksirbi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @willGraham01, I found some issues with this implementation, which we'll have to clarify. The main problem in my opinion is that **selection is too flexible, allowing coordinates to be specified both by name and index, which leads to some ambiguities.

I am of the opinion that we should restrict it to only accept coordinate names, in which case there is no more need for coord_of_dimension (see comments for more details).

Another thing I realised while reviewing this PR is that the collapse_extra_dimensions utility may not be as useful as we thought for plot.trajectory, at least not in the current form of that function (see PR 394). The reason is that in plot.trajectory we plan to also accept lists of keypoints as a valid selection, while collapse_extra_dimensions will always collapse to max 1 keypoint. Personally, I think collapse_extra_dimensions should keep its behaviour—i.e. collapse all but the preserved dims to 1—but this means that we either have to work around that in plot.trajectory or make plot.trajectory less flexible in handling keypoints, cc @stellaprins.

Let me know what you think about it.

movement/utils/dimensions.py Outdated Show resolved Hide resolved
movement/utils/dimensions.py Show resolved Hide resolved
movement/utils/dimensions.py Show resolved Hide resolved
movement/utils/dimensions.py Outdated Show resolved Hide resolved
movement/utils/dimensions.py Outdated Show resolved Hide resolved
tests/test_unit/test_collapse_dimensions.py Show resolved Hide resolved
@willGraham01
Copy link
Contributor Author

The isel and sel issue is something I overlooked, I think your decision to enforce a sel-like behaviour is the sensible one (we still default to the 0th-indexed-coordinate if the user gives no selection).

The reason is that in plot.trajectory we plan to also accept lists of keypoints as a valid selection, while collapse_extra_dimensions will always collapse to max 1 keypoint. Personally, I think collapse_extra_dimensions should keep its behaviour—i.e. collapse all but the preserved dims to 1—but this means that we either have to work around that in plot.trajectory or make plot.trajectory less flexible in handling keypoints, cc @stellaprins.

Let me know what you think about it.

I think it still has merit in here, at least at removing dimensions that aren't time, space, nor keypoints. For example (inside plot_trajectory):

keypoints = selection.pop("keypoints", None) # Be harsher here since we now are enforcing selection by coordinate label only, hence 'None' default so we know to convert it to a coordinate label, not a coordinate index. This also removes keypoints from the dict, so we can pass selection to collapse_extra_dimensions as normal.
if keypoints is None:
    # Use default 0th keypoint
    keypoint = coord_of_dimension(da, "keypoints", 0)

if keypoints in da.keypoints:
    # If keypoints is a coordinate label, we have been given a single centroid. Otherwise, keypoints is an iterable so we'll drop into the else block
    # Centroid of one point it itself
    centroid = collapse_extra_dimensions(keypoints=the_single_keypoint_given, **selections)
else:
    # Centroid of many keypoints
    centroid = collapse_extra_dimensions(da, preserve_dims=["time", "space", "keypoints"], **selections)
    centroid = centroid.mean(dim="keypoints") # not sure if this is the precise syntax but you get the idea

Though I don't know if this is going to cause a mass re-write for @stellaprins though.

@niksirbi
Copy link
Member

niksirbi commented Feb 6, 2025

The isel and sel issue is something I overlooked, I think your decision to enforce a sel-like behaviour is the sensible one (we still default to the 0th-indexed-coordinate if the user gives no selection).

Agreed, let's go for sel-like behaviour, but still default to the 0-th index (first listed keypoint) if no selection.

I think it still has merit in here, at least at removing dimensions that aren't time, space, or keypoints. For example (inside plot_trajectory):

I agree that is still has merit, and we could go with your solution for plot.trajectory. @stellaprins I think you can forgo my suggestion for defaulting to plot the centroid of all keypoints if keypoints=None. It complicates the mental model too much, for us and for users. Instead we go for Will's mental model, i.e.:

  • If no keypoint is explicitly selected, plot the first one
  • If 1 keypoint is requested (by its label) we plot that one
  • If multiple keypoints are selected by label, plot their centroid

For individuals we only allow 1, which is the 0-th by default or the selected one if it's specified by label.

This way the mental models is the same for keypoints and individuals, except for the fact that you can explicitly request to see the centroid of several keypoints.

Are we all fine with this compromise?

@stellaprins
Copy link
Contributor

stellaprins commented Feb 6, 2025

* If no keypoint is explicitly selected, plot the first one

* If 1 keypoint is requested (by its label) we plot that one

* If multiple keypoints are selected by label, plot their centroid

For individuals we only allow 1, which is the 0-th by default or the selected one if it's specified by label.

This way the mental models is the same for keypoints and individuals, except for the fact that you can explicitly request to see the centroid of several keypoints.

Are we all fine with this compromise?

Sounds good to me! I quite liked the default of selecting the midpoint of all keypoints, because it would give a more sensible plot for the user if they do not specify any keypoints and makes it very easy for them to plot the centroid. As user I'd rather have the centroid plotted by default than the left paw if that happens to be the first keypoint. But I can see advantages to the above too.

Will have a look at this PR right now and address your comments @niksirbi and then after it is merged see how I can apply it to plot.trajectory.

@niksirbi
Copy link
Member

niksirbi commented Feb 6, 2025

Sounds good to me! I quite liked the default of selecting the midpoint of all keypoints, because it would give a more sensible plot for the user (? as user I'd rather have the centroid plotted by default than the left paw if that happens to be the first keypoint) if they do not specify any keypoints and makes it very easy for them to plot the centroid. But I can see advantages to the above too.

Actually this made me reconsider, from the user's perspective what you are describing makes sense. Perhaps try to implement plot.trajectory in that way (after this is merged), and we see how it looks when we have the code at hand?

Copy link

sonarqubecloud bot commented Feb 6, 2025

@stellaprins stellaprins requested a review from niksirbi February 6, 2025 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants