-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tests for scale_to #211
tests for scale_to #211
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #211 +/- ##
===========================================
+ Coverage 99.69% 100.00% +0.30%
===========================================
Files 8 8
Lines 325 352 +27
===========================================
+ Hits 324 352 +28
+ Misses 1 0 -1
|
@@ -390,14 +390,15 @@ def scale_to(self, target_diff_object, xtype=None, xvalue=None): | |||
|
|||
data = self.on_xtype(xtype) | |||
target = target_diff_object.on_xtype(xtype) | |||
if len(data[0]) == 0 or len(target[0]) == 0 or len(data[0]) != len(target[0]): | |||
raise ValueError("I cannot scale two diffraction objects with empty or different lengths.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sbillinge I added this error message here but I'm not sure if this is appropriate. Does it make more sense that we find xindex
for each diffraction object? (see comment below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the arrays are empty I think it will blow up anyway, right? so we don't need to blow it up?
If they are different lengths, I think we can still handle it maybe? If the xvalue given is outside one of the two different array lengths it will blow up. But in general I think it would be handy to be able to do this on different length arrays.
It does make it more complicated because we would have to interpolate one on to the other before doing the comparison. we could move that more awkward case to an issue on a later release, or just go for it. We can check. I think it kept adding and subtracting etc. to only between arrays on the same grids. But we often wan to compare data on different grids...that is kind of the point of these diffraction objects....makinng those tricky things easy. For example, I may have a diffraction pattern I got from a paper from my sample, and then I am at the synchrotron and I measure something and I want to know if it is what I am expecting and I just want to scale them and plot them on top of each other but it is a super hassle because they are on different length arrays and one is on tth and the other on q and so on on. The DO's are supposed to make those hassles all go away.
@yucongalicechen I was explaining this functionality to colleagues in an email and the syntax that came most naturally to me was:
which looks more intuitive to me (though less extensible) than Also, please let's be in the habit of making the news early in the process to make it easier to see when tests are passing. |
Yeah I also like this better! |
yself = data[1][xindex] | ||
scaled.on_tth[1] = data[1] * ytarget / yself | ||
scaled.on_q[1] = data[1] * ytarget / yself | ||
x_data, x_target = (np.abs(data[0] - xvalue)).argmin(), (np.abs(target[0] - xvalue)).argmin() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compute different indices for the two diffraction objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's change the variable name to xindex_data
to remind us that these are indices.
@sbillinge ready for another review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few comments
@@ -390,14 +390,15 @@ def scale_to(self, target_diff_object, xtype=None, xvalue=None): | |||
|
|||
data = self.on_xtype(xtype) | |||
target = target_diff_object.on_xtype(xtype) | |||
if len(data[0]) == 0 or len(target[0]) == 0 or len(data[0]) != len(target[0]): | |||
raise ValueError("I cannot scale two diffraction objects with empty or different lengths.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the arrays are empty I think it will blow up anyway, right? so we don't need to blow it up?
If they are different lengths, I think we can still handle it maybe? If the xvalue given is outside one of the two different array lengths it will blow up. But in general I think it would be handy to be able to do this on different length arrays.
It does make it more complicated because we would have to interpolate one on to the other before doing the comparison. we could move that more awkward case to an issue on a later release, or just go for it. We can check. I think it kept adding and subtracting etc. to only between arrays on the same grids. But we often wan to compare data on different grids...that is kind of the point of these diffraction objects....makinng those tricky things easy. For example, I may have a diffraction pattern I got from a paper from my sample, and then I am at the synchrotron and I measure something and I want to know if it is what I am expecting and I just want to scale them and plot them on top of each other but it is a super hassle because they are on different length arrays and one is on tth and the other on q and so on on. The DO's are supposed to make those hassles all go away.
|
||
Returns | ||
------- | ||
the rescaled DiffractionObject as a new object | ||
|
||
""" | ||
scaled = deepcopy(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can use @bobleesj nice copy method to do this now. ACtually, it just does a deepcopy, but let's model syntax that we would like users to use..... so scaled = self.copy()
or sthg like that?
@yucongalicechen some of the comments might be out of date...i may have made them but not submitted. Please just use common sense and ask if not sure. If they don't make sense it is because we changed our desired api. |
@sbillinge ready for review |
["q", np.array([1, 2, 4, 6])], | ||
), | ||
# UC4: different x-array lengths with approximate x-value match | ||
( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A test example for scaling DOs with different array lengths. Here I think it makes more sense to scale them on q=61 (for self) & q=62 (for target).
@yucongalicechen this one just needs to be reviewed right? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, forgot to finish it up
) | ||
with pytest.raises( | ||
ValueError, match="You can only specify one of 'q', 'tth', or 'd'. Please rerun specifying only one." | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a test for error message
tests/test_diffraction_objects.py
Outdated
# scaling factor is calculated at index = 5 for self and index = 6 for target | ||
["tth", np.array([1, 2, 3, 4, 5, 6, 10])], | ||
), | ||
# UC5: user did not specify anything, use the midpoint of the current object's q-array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a test for specifying nothing
@sbillinge ready for review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please see inline
One other thought circling in my mind is to have the DO's carry around a kind of master array that kind of goes from 0 to q=45 (about the widest range we will ever encounter) with a fine grid and we interpolate the values onto there (linear or quadratic, whichever gives us sufficient accuracy. We could use it for various things, but this would be one of them because we could compute the scaling for the arrays on this grid (but apply the scaling to the data on all the grids). Don't make this change, but let's thin about it. The pro is it is cleaner and nicer and could be used for other things. The con is that we have to carry around another largish array in memory.
yself = data[1][xindex] | ||
scaled.on_tth[1] = data[1] * ytarget / yself | ||
scaled.on_q[1] = data[1] * ytarget / yself | ||
x_data, x_target = (np.abs(data[0] - xvalue)).argmin(), (np.abs(target[0] - xvalue)).argmin() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's change the variable name to xindex_data
to remind us that these are indices.
scaled.on_tth[1] = data[1] * ytarget / yself | ||
scaled.on_q[1] = data[1] * ytarget / yself | ||
x_data, x_target = (np.abs(data[0] - xvalue)).argmin(), (np.abs(target[0] - xvalue)).argmin() | ||
y_data, y_target = data[1][x_data], target[1][x_target] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line makes things less readable. I would put the data[1][xindex_data]
(and so on) directly in the expression below.
|
||
params_scale_to_bad = [ | ||
# UC1: user did not specify anything | ||
( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add the bad test case for specifying nothing
@sbillinge the new commit is ready for review. I like this idea and I think it will be very helpful for comparing absorption corrected curves. We can also have a function that allows user to specify which xarray to interpolate on? |
tests/test_diffraction_objects.py
Outdated
with pytest.raises( | ||
ValueError, match="You must specify exactly one of 'q', 'tth', or 'd'. Please rerun specifying only one." | ||
): | ||
orig_diff_object.scale_to(target_diff_object, q=inputs[8], tth=inputs[9], d=inputs[10], offset=inputs[11]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
having inputs up to ,etc inputs[10]
does not appear scalable to me and I found this was very hard to read and maintain in diffpy.snmf which I had to refactor: https://github.com/diffpy/diffpy.snmf/pull/120/files#diff-1bd6af744434d75c63490430b955f577f60277dfe95e9ad716e3f808a2ed9d48L85-L87
Discussion here:
#225 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way to resolve this future nightware could be having reusable instances of DiffractionObject
defined under conftest.py
with specific UC cases. Then, we import these instances through the parameters in each test func. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I agree in this case, this would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, to make it more readable we could also pass the inputs as a dict so it would read input["wavelenght"]
instead of input[0]
. The intent of the former is much clearer.
yes, that had occured to me too. I think that was where my functions came from for building arrays using arange etc.. But for most flexibility for hte least code, I would suggest we just allow the user to pass in a master array. We can force them to do it on "q". If they want to specify a tth value to scale to we will have to convert. @yucongalicechen will you put these on new issues? IN any case, let's put them on milestone 3.7. Are you finished working on this PR and it can be merged? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please see inline.
the diffraction object you want to scale the current one onto | ||
|
||
q, tth, d : float, optional, must specify exactly one of them | ||
the xvalue (in `q`, `tth`, or `d` space) to align the current and target objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The value of the x-array where you want the curves to line up vertically. Specify a value on one of the allowed grids, q
, tth
, or d
), e.g., q=10."
"You must specify exactly one of 'q', 'tth', or 'd'. Please rerun specifying only one." | ||
) | ||
|
||
xtype = "q" if q is not None else "tth" if tth is not None else "d" if d is not None else "q" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we drop the last "else "q""? given our validation above?
) | ||
|
||
xtype = "q" if q is not None else "tth" if tth is not None else "d" if d is not None else "q" | ||
data, target = self.on_xtype(xtype), target_diff_object.on_xtype(xtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
split to two lines for greater readability
tests/test_diffraction_objects.py
Outdated
with pytest.raises( | ||
ValueError, match="You must specify exactly one of 'q', 'tth', or 'd'. Please rerun specifying only one." | ||
): | ||
orig_diff_object.scale_to(target_diff_object, q=inputs[8], tth=inputs[9], d=inputs[10], offset=inputs[11]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, to make it more readable we could also pass the inputs as a dict so it would read input["wavelenght"]
instead of input[0]
. The intent of the former is much clearer.
thanks for the suggestions @sbillinge @bobleesj . I've edited the docstring and tests to make the codes more readable. Also created a new issue #230 for the comment above. This PR is ready to be reviewed again. |
closes #49, closes #186