Skip to content

Add set data test cases #61

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Aug 13, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions src/diffpy/srmise/dataclusters.py
Original file line number Diff line number Diff line change
Expand Up @@ -140,6 +140,8 @@ def setdata(self, x, y, res):
self.x = x
self.y = y
self.res = res
if x.size > 0 and res == 0:
raise ValueError("Make trivial clustering, please make positive resolution.")
# If x sequence size is empty, set the object into Initialized state.
if x.size == 0 and res == 0:
self.data_order = np.array([])
Expand Down
88 changes: 86 additions & 2 deletions src/diffpy/srmise/tests/test_dataclusters.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
from copy import copy

import numpy as np
import pytest

from diffpy.srmise.dataclusters import DataClusters

Expand All @@ -15,8 +16,8 @@ def test_clear():


def test___eq__():
actual = DataClusters(np.array([1, 2, 3]), np.array([3, 2, 1]), 0)
expected = DataClusters(np.array([1, 2, 3]), np.array([3, 2, 1]), 0)
actual = DataClusters(np.array([1, 2, 3]), np.array([3, 2, 1]), 1)
expected = DataClusters(np.array([1, 2, 3]), np.array([3, 2, 1]), 1)
assert expected == actual
attributes = vars(actual)
for attr_key, attr_val in attributes.items():
Expand All @@ -32,3 +33,86 @@ def test___eq__():
print(f"not-equal test failed on {attr_key}")
assert not expected == actual
attributes.update({attr_key: reset})

# In the set data test, we test for these cases.
# (1) x and y are non-empty array values, and res is positive (the most generic case)
# (2) x and y are non-empty array values, and res is 0 (will produce a msg that makes trivial clustering)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not raise errors or warnings for this. It is just the identity (gives back the input unchanged) which is not that useful but not anything invalide. Please make an issue to address this in the docs, but not in the code.

# (3) x and y are non-empty array values, and res is negative (will produce a ValueError,
# msg = please enter a non-negative res value)
# (4, 5) One of x and y is empty array, and res is positive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I woud test for len(x) != len(y) (ValueError) and not worry about the other cases. The user won't give a len(0) array on purpose and the code will error in that case anyway (on sort for example) in case the user does it inadvertently.

# (produce ValueError & msg "Sequences x and y must have the same length.", something like that)
# (6) Both x and y are empty array, and res is zero.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove these blank lines so the tests are as compact as possible.


@pytest.mark.parametrize(
"inputs, expected",
[
(
# case (1)
{
"input_x": np.array([1, 2, 3]),
"input_y": np.array([3, 2, 1]),
"input_res": 4,
},
DataClusters(np.array([1, 2, 3]), np.array([3, 2, 1]), 4),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't appear to be testing anything as your actual and expected are both just running the same functions.

),
(
# case (6)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not needed (because we are not catching res 0 any more)

{
"input_x": np.array([]),
"input_y": np.array([]),
"input_res": 0,
},
DataClusters(np.array([]), np.array([]), 0),
),
],
)
def test_set_data(inputs, expected):
actual = DataClusters(x=inputs["input_x"], y=inputs["input_y"], res=inputs["input_res"])
assert actual == expected
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a note, the expected we would like to explicitly set all its attributes. this will also implicitly test other parts of the code that generate those attributes.



@pytest.mark.parametrize(
"inputs, msg",
[
(
# case (4)
{
"input_x": np.array([]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as I mentioned before, please use "x" here, not "input_x" to make the code more readable. Also, don't make the x-array empty, but just make them have different lengths.

"input_y": np.array([3, 2]),
"input_res": 4,
},
"Sequences x and y must have the same length.",
),
(
# case (5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed. Please delete.

{
"input_x": np.array([1, 2]),
"input_y": np.array([]),
"input_res": 4,
},
"Sequences x and y must have the same length.",
),
(
# case (3)
{
"input_x": np.array([1]),
"input_y": np.array([3]),
"input_res": -1,
},
"Resolution res must be non-negative.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please expand a bit. Will the user know what "resolution" means?

),
(
# case (2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed, please remove. This will be handled in docs not in code.

{
"input_x": np.array([1, 2, 3]),
"input_y": np.array([3, 2, 1]),
"input_res": 0,
},
"Make trivial clustering, please make positive resolution.",
),
],
)
def test_set_data_order_bad(inputs, msg):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a good test, nice job.

with pytest.raises(ValueError, match=msg):
DataClusters(x=inputs["input_x"], y=inputs["input_y"], res=inputs["input_res"])
Loading