Skip to content

[BUG] Incorrect usage in cdist fuction #775

@HARSHDIPSAHA

Description

@HARSHDIPSAHA

small Bug: Incorrect coordinate assignment in _cdist function

Describe the bug

The _cdist function in movement/kinematics/distances.py incorrectly assigns coordinates for the second element (elem2) in pairwise distance calculations. The function uses coordinates from array a for both elem1 and elem2, when elem2 should use coordinates from array b.

Location: movement/kinematics/distances.py:102

Current (incorrect) code:

result = result.assign_coords(
    {
        elem1: getattr(a, labels_dim).values,
        elem2: getattr(a, labels_dim).values,  # BUG: Should be from b
    }
)

Fixed code:

result = result.assign_coords(
    {
        elem1: getattr(a, labels_dim).values,
        elem2: getattr(b, labels_dim).values,  # Fixed
    }
)

To Reproduce

The bug is semantic in nature. In practice, when a and b come from the same dataset (which is the typical use case), they have identical labels_dim coordinates after xarray alignment, so the bug may not cause functional issues. However, the code is semantically incorrect and could cause problems if:

  1. Arrays a and b have different labels_dim values (though this would be caught by xarray alignment)
  2. The code is refactored in the future
  3. Edge cases are encountered where alignment doesn't ensure identical coordinates

example:

import numpy as np
import xarray as xr
from movement.kinematics.distances import _cdist

# Create two arrays with different keypoints (would fail alignment, but demonstrates the issue)
a = xr.DataArray(
    [[[1.0, 2.0], [3.0, 4.0]]],
    dims=["time", "keypoints", "space"],
    coords={"time": [0], "keypoints": ["kp1", "kp2"], "space": ["x", "y"], "individuals": "ind1"}
)

b = xr.DataArray(
    [[[5.0, 6.0], [7.0, 8.0]]],
    dims=["time", "keypoints", "space"],
    coords={"time": [0], "keypoints": ["kp3", "kp4"], "space": ["x", "y"], "individuals": "ind2"}
)

# Before fix: elem2 would incorrectly get coordinates from a (["kp1", "kp2"])
# After fix: elem2 correctly gets coordinates from b (["kp3", "kp4"])
result = _cdist(a, b, dim="individuals")

Expected behaviour

The elem2 dimension should be assigned coordinates from array b's labels_dim, not from array a's labels_dim. This ensures semantic correctness and proper coordinate assignment for the second element in pairwise distance calculations.

Computer used

  • OS: Any (bug is platform-independent)
  • Version: N/A
  • Hardware specs: N/A

Additional context

  • Testing: The fix should not break existing tests since both arrays have the same coordinates in typical use cases. However, the fix ensures semantic correctness and future-proofs the code.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    Status

    🤔 Triage

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions