
tests for scale_to #211

Merged: 9 commits into diffpy:main from scaleto on Dec 14, 2024

Conversation

@yucongalicechen (Contributor) commented on Dec 8, 2024:

closes #49, closes #186

codecov bot commented on Dec 8, 2024:

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (5f67f6c) to head (53413fb).
Report is 39 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main      #211      +/-   ##
===========================================
+ Coverage   99.69%   100.00%   +0.30%     
===========================================
  Files           8         8              
  Lines         325       352      +27     
===========================================
+ Hits          324       352      +28     
+ Misses          1         0       -1     
Files with missing lines | Coverage | Δ
tests/test_diffraction_objects.py | 100.00% <100.00%> | (ø)

... and 1 file with indirect coverage changes

@@ -390,14 +390,15 @@ def scale_to(self, target_diff_object, xtype=None, xvalue=None):

data = self.on_xtype(xtype)
target = target_diff_object.on_xtype(xtype)
if len(data[0]) == 0 or len(target[0]) == 0 or len(data[0]) != len(target[0]):
raise ValueError("I cannot scale two diffraction objects with empty or different lengths.")
@yucongalicechen (Contributor Author) commented on Dec 8, 2024:

@sbillinge I added this error message here but I'm not sure if this is appropriate. Does it make more sense that we find xindex for each diffraction object? (see comment below)

Contributor:

if the arrays are empty I think it will blow up anyway, right? so we don't need to blow it up?

If they are different lengths, I think we can still handle it, maybe? If the xvalue given is outside one of the two arrays it will blow up. But in general I think it would be handy to be able to do this on different-length arrays.

It does make it more complicated, because we would have to interpolate one onto the other before doing the comparison. We could move that more awkward case to an issue on a later release, or just go for it. We can check; I think we kept adding, subtracting, etc. to only work between arrays on the same grids. But we often want to compare data on different grids... that is kind of the point of these diffraction objects: making those tricky things easy. For example, I may have a diffraction pattern I got from a paper for my sample, and then I am at the synchrotron and I measure something and I want to know if it is what I am expecting, and I just want to scale them and plot them on top of each other, but it is a super hassle because they are on different-length arrays and one is on tth and the other on q and so on. The DOs are supposed to make those hassles all go away.
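A minimal sketch of the interpolation approach described above (illustrative only, not the merged implementation; the function and array names are assumed):

import numpy as np

def scale_factor_across_grids(x_self, y_self, x_target, y_target, xvalue):
    # Resample the target curve onto self's grid so the two can be compared
    # even when the arrays have different lengths.
    y_target_on_self = np.interp(x_self, x_target, y_target)
    # Pick the grid point of self closest to the requested x-value.
    idx = np.abs(x_self - xvalue).argmin()
    return y_target_on_self[idx] / y_self[idx]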

@sbillinge (Contributor) commented:
@yucongalicechen I was explaining this functionality to colleagues in an email and the syntax that came most naturally to me was:

pyplot.plot(xpd_data.on_q[0], xpd_data.on_q[1]) 
pyplot.plot(lab_data.on_q[0], lab_data.scaled_to(xpd_data, q=10, offset=5).on_q[1])
pyplot.show()

which looks more intuitive to me (though less extensible) than lab_data.scaled_to(xpd_data, xvalue=10, xtype="q", offset=5).on_q[1]. Shall we do it this way? We will have to define tth=None etc. as options, plus logic to figure out which one to use. It is a tradeoff, easier for the user vs. easier for us....
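For illustration, a rough sketch of the argument handling this syntax would need (a sketch only, under the assumptions above; not the merged code):

def scale_to(self, target_diff_object, q=None, tth=None, d=None, offset=0):
    # Exactly one of q, tth, d selects both the x-type and the x-value.
    given = [(name, value) for name, value in (("q", q), ("tth", tth), ("d", d)) if value is not None]
    if len(given) != 1:
        raise ValueError("You must specify exactly one of 'q', 'tth', or 'd'. Please rerun specifying only one.")
    xtype, xvalue = given[0]
    ...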

Also, please let's be in the habit of making the news early in the process to make it easier to see when tests are passing.

@yucongalicechen (Contributor Author) replied, quoting the above:

@yucongalicechen I was explaining this functionality to colleagues in an email and the syntax that came most naturally to me was:

pyplot.plot(xpd_data.on_q[0], xpd_data.on_q[1]) 
pyplot.plot(lab_data.on_q[0], lab_data.scaled_to(xpd_data, q=10, offset=5).on_q[1])
pyplot.show()

which looks more intuitive to me (though less extensible) than lab_data.scaled_to(xpd_data, xvalue=10, xtype="q", offset=5).on_q[1]. Shall we do it this way? We will have to define tth=None etc. as options, plus logic to figure out which one to use. It is a tradeoff, easier for the user vs. easier for us....

Also, please let's be in the habit of making the news early in the process to make it easier to see when tests are passing.

Yeah I also like this better!

yself = data[1][xindex]
scaled.on_tth[1] = data[1] * ytarget / yself
scaled.on_q[1] = data[1] * ytarget / yself
x_data, x_target = (np.abs(data[0] - xvalue)).argmin(), (np.abs(target[0] - xvalue)).argmin()
@yucongalicechen (Contributor Author):

compute different indices for the two diffraction objects

Contributor:

let's change the variable name to xindex_data to remind us that these are indices.
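For example, a small sketch of how the renamed indices might read (assuming numpy arrays, as in the diff above):

import numpy as np

def nearest_indices(data, target, xvalue):
    # The xindex_ prefix makes clear these are array indices, not x-values.
    xindex_data = np.abs(data[0] - xvalue).argmin()
    xindex_target = np.abs(target[0] - xvalue).argmin()
    return xindex_data, xindex_target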

@yucongalicechen (Contributor Author) commented:

@sbillinge ready for another review

@sbillinge (Contributor) left a review:

few comments



Returns
-------
the rescaled DiffractionObject as a new object

"""
scaled = deepcopy(self)
Contributor:
I think we can use @bobleesj's nice copy method to do this now. Actually, it just does a deepcopy, but let's model the syntax that we would like users to use... so scaled = self.copy() or something like that?
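A minimal sketch of what such a convenience method might look like, assuming it simply wraps deepcopy as described (not necessarily the actual diffpy.utils implementation):

from copy import deepcopy

class DiffractionObject:
    def copy(self):
        """Return a deep copy of this DiffractionObject."""
        return deepcopy(self)

# so that inside scale_to the line above would read:
#     scaled = self.copy()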

@sbillinge (Contributor) commented:

@yucongalicechen some of the comments might be out of date... I may have made them but not submitted them. Please just use common sense and ask if not sure. If they don't make sense it is because we changed our desired API.

@yucongalicechen (Contributor Author) commented:

@sbillinge ready for review

["q", np.array([1, 2, 4, 6])],
),
# UC4: different x-array lengths with approximate x-value match
(
@yucongalicechen (Contributor Author):

A test example for scaling DOs with different array lengths. Here I think it makes more sense to scale them on q=61 (for self) & q=62 (for target).

@bobleesj (Contributor) commented:

@yucongalicechen this one just needs to be reviewed right?

@sbillinge (Contributor) left a review:

Sorry, forgot to finish it up

)
with pytest.raises(
ValueError, match="You can only specify one of 'q', 'tth', or 'd'. Please rerun specifying only one."
):
@yucongalicechen (Contributor Author):

added a test for the error message

# scaling factor is calculated at index = 5 for self and index = 6 for target
["tth", np.array([1, 2, 3, 4, 5, 6, 10])],
),
# UC5: user did not specify anything, use the midpoint of the current object's q-array
@yucongalicechen (Contributor Author):

added a test for specifying nothing

@yucongalicechen (Contributor Author) commented:

@sbillinge ready for review!

@sbillinge (Contributor) left a review:

please see inline

One other thought circling in my mind is to have the DOs carry around a kind of master array that goes from 0 to q=45 (about the widest range we will ever encounter) with a fine grid, and we interpolate the values onto there (linear or quadratic, whichever gives us sufficient accuracy). We could use it for various things, but this would be one of them, because we could compute the scaling for the arrays on this grid (but apply the scaling to the data on all the grids). Don't make this change, but let's think about it. The pro is it is cleaner and nicer and could be used for other things. The con is that we have to carry around another largish array in memory.
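A sketch of the master-grid idea, for discussion only (the grid spacing and the names here are assumptions, and the comment above explicitly says not to implement this yet):

import numpy as np

def scale_factor_on_master_grid(q_self, y_self, q_target, y_target, qvalue):
    # Fine common grid from 0 to q = 45 inverse angstroms.
    q_master = np.linspace(0, 45, 45001)
    # Interpolate both curves onto the master grid before comparing them.
    y_self_master = np.interp(q_master, q_self, y_self)
    y_target_master = np.interp(q_master, q_target, y_target)
    idx = np.abs(q_master - qvalue).argmin()
    # The same factor would then be applied to the data on every grid (q, tth, d).
    return y_target_master[idx] / y_self_master[idx]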


scaled.on_tth[1] = data[1] * ytarget / yself
scaled.on_q[1] = data[1] * ytarget / yself
x_data, x_target = (np.abs(data[0] - xvalue)).argmin(), (np.abs(target[0] - xvalue)).argmin()
y_data, y_target = data[1][x_data], target[1][x_target]
Contributor:

I think this line makes things less readable. I would put the data[1][xindex_data] (and so on) directly in the expression below.


params_scale_to_bad = [
# UC1: user did not specify anything
(
@yucongalicechen (Contributor Author):

added the bad test case for specifying nothing

@yucongalicechen (Contributor Author) replied, quoting the review above:

please see inline

One other thought circling in my mind is to have the DOs carry around a kind of master array that goes from 0 to q=45 (about the widest range we will ever encounter) with a fine grid, and we interpolate the values onto there (linear or quadratic, whichever gives us sufficient accuracy). We could use it for various things, but this would be one of them, because we could compute the scaling for the arrays on this grid (but apply the scaling to the data on all the grids). Don't make this change, but let's think about it. The pro is it is cleaner and nicer and could be used for other things. The con is that we have to carry around another largish array in memory.

@sbillinge the new commit is ready for review. I like this idea and I think it will be very helpful for comparing absorption-corrected curves. We can also have a function that allows the user to specify which x-array to interpolate on?

with pytest.raises(
ValueError, match="You must specify exactly one of 'q', 'tth', or 'd'. Please rerun specifying only one."
):
orig_diff_object.scale_to(target_diff_object, q=inputs[8], tth=inputs[9], d=inputs[10], offset=inputs[11])
@bobleesj (Contributor) commented on Dec 13, 2024:

having inputs up to inputs[10], etc., does not appear scalable to me, and I found this very hard to read and maintain in diffpy.snmf, which I had to refactor: https://github.com/diffpy/diffpy.snmf/pull/120/files#diff-1bd6af744434d75c63490430b955f577f60277dfe95e9ad716e3f808a2ed9d48L85-L87

Discussion here:
#225 (comment)

Contributor:

One way to resolve this future nightmare could be having reusable instances of DiffractionObject defined under conftest.py with specific UC cases. Then we import these instances through the parameters in each test function. Thoughts?

Contributor:

yes, I agree in this case, this would be helpful.

Contributor:

btw, to make it more readable we could also pass the inputs as a dict, so it would read inputs["wavelength"] instead of inputs[0]. The intent of the former is much clearer.
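A sketch of the dict-based parametrization idea (the keys and test names here are illustrative, not the merged tests):

import pytest

params_scale_to_bad = [
    # UC1: user did not specify anything
    {"q": None, "tth": None, "d": None, "offset": 0},
]

@pytest.mark.parametrize("inputs", params_scale_to_bad)
def test_scale_to_bad(inputs):
    # inputs["q"] reads much more clearly than inputs[8]
    q, tth, d, offset = inputs["q"], inputs["tth"], inputs["d"], inputs["offset"]
    # ... build the DiffractionObjects here and call scale_to(q=q, tth=tth, d=d, offset=offset)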

@sbillinge (Contributor) replied, quoting the exchange above:

please see inline
One other thought circling in my mind is to have the DOs carry around a kind of master array that goes from 0 to q=45 (about the widest range we will ever encounter) with a fine grid, and we interpolate the values onto there (linear or quadratic, whichever gives us sufficient accuracy). We could use it for various things, but this would be one of them, because we could compute the scaling for the arrays on this grid (but apply the scaling to the data on all the grids). Don't make this change, but let's think about it. The pro is it is cleaner and nicer and could be used for other things. The con is that we have to carry around another largish array in memory.

@sbillinge the new commit is ready for review. I like this idea and I think it will be very helpful for comparing absorption-corrected curves. We can also have a function that allows the user to specify which x-array to interpolate on?

yes, that had occurred to me too. I think that was where my functions came from for building arrays using arange etc. But for the most flexibility for the least code, I would suggest we just allow the user to pass in a master array. We can force them to do it on "q". If they want to specify a tth value to scale to, we will have to convert.

@yucongalicechen will you put these on new issues? In any case, let's put them on milestone 3.7.

Are you finished working on this PR so it can be merged?

@sbillinge (Contributor) left a review:

please see inline.

the diffraction object you want to scale the current one onto

q, tth, d : float, optional, must specify exactly one of them
the xvalue (in `q`, `tth`, or `d` space) to align the current and target objects
Contributor:

"The value of the x-array where you want the curves to line up vertically. Specify a value on one of the allowed grids, q, tth, or d), e.g., q=10."

"You must specify exactly one of 'q', 'tth', or 'd'. Please rerun specifying only one."
)

xtype = "q" if q is not None else "tth" if tth is not None else "d" if d is not None else "q"
Contributor:

can we drop the last 'else "q"', given our validation above?
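i.e., with the exactly-one validation already done, the trailing fallback is unreachable and the selection could read (sketch only, with a hypothetical helper name):

def _choose_xtype(q=None, tth=None, d=None):
    # the validation above guarantees exactly one of q, tth, d is not None
    return "q" if q is not None else "tth" if tth is not None else "d"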

)

xtype = "q" if q is not None else "tth" if tth is not None else "d" if d is not None else "q"
data, target = self.on_xtype(xtype), target_diff_object.on_xtype(xtype)
Contributor:

split into two lines for greater readability


@yucongalicechen (Contributor Author) commented:

Thanks for the suggestions @sbillinge @bobleesj. I've edited the docstring and tests to make the code more readable. Also created a new issue #230 for the comment above. This PR is ready to be reviewed again.

@sbillinge merged commit 75e0ef8 into diffpy:main on Dec 14, 2024
5 checks passed
@yucongalicechen deleted the scaleto branch on December 15, 2024 at 17:36
Successfully merging this pull request may close these issues: write tests for diffraction object
3 participants