
test: adding a test to unsqueeze squeezed data #180


Open
wants to merge 10 commits into main

Conversation

Luiskitsu
Contributor

I have added a test that unsqueezes squeezed data and compares it with the expected uniform data. I have also plotted the results to make sure it was doing the correct thing. When the input data is squeezed (positive squeeze parameters), the unsqueezing works great:
[figure: unsqueezed data overlaid on the expected uniform data]
However, when the input data is unsqueezed (negative squeeze parameters), the test fails because the transformation pushes data outside the original domain. This results in missing values during interpolation, and recovering the signal would require extrapolation, which is not a good idea.
[figure: unsqueezed data with missing values at the edge of the original domain]
I don't see an easy way to handle this, but in practice, our squeeze/unsqueeze parameters will be quite small — so the extrapolated region would be minimal, and I expect the error to be negligible. Thoughts?
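For reference, a minimal standalone sketch of the behavior being tested here: squeeze a uniform sine with a small polynomial and recover it by interpolating back onto the original grid. The grid and coefficient values are illustrative only, not the PR's test code.

    import numpy as np
    from numpy.polynomial import Polynomial

    x = np.linspace(0, 10, 1001)               # uniform grid
    y_expected = np.sin(x)                      # the uniform signal we want back
    squeeze = Polynomial([0.2, 0.001, 0.001])   # hypothetical squeeze coefficients
    x_squeezed = x + squeeze(x)                 # distorted x positions
    y_squeezed = np.sin(x_squeezed)             # squeezed data, sampled on x
    # Unsqueeze: the squeezed values live at positions x_squeezed; read them
    # back off on the original grid x.  Points of x outside the range of
    # x_squeezed cannot be interpolated (np.interp clamps them), which is the
    # out-of-domain / extrapolation problem described above.
    y_recovered = np.interp(x, x_squeezed, y_squeezed)
    print(np.max(np.abs(y_recovered - y_expected)))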

@sbillinge
Contributor

thanks @Luiskitsu this is good progress. I think I would prefer if we have a squeeze_0 as a general matter so we are not confined to squeezes with zero offset. Also, I don't see what the squeeze parameters are that you used. As we devise tests we are thinking about what behavior we want the program to have. As you may see already, this is a valuable exercise (we found the problem with the unsqueeze going out of range, for example), but it also raises the question about what we want to support and what we don't, i.e., the ranges of the squeeze parameter domains.

For the stretch we had a well defined use case in mind (thermal expansion). Do we have any thoughts about that here? What kinds of squeeze distortions might we encounter in practice, and so what ranges of squeeze parameters? Whatever we decide, we want to capture it in the docstring eventually and maybe even raise warnings or exceptions if people try to use it out of range.

@Luiskitsu
Contributor Author

Luiskitsu commented Apr 1, 2025

Thanks Simon. I have added squeeze_0 for a nonzero offset. I have also included "safe" bounds in the docstring.
For the squeeze morph, my only use case in mind is to correct for non-linear distortions from the detector that have not been corrected during calibration, for example in pyFAI. These distortions can be due to misalignment or tilt of each module of the detector. I expect these non-linear distortions to be very small, at least based on our recent experiment. I have created bounds that I think are safe for an x-axis going from 0 to 10. It might be better to write them more generally as a percentage depending on detector size.
Here I am showing an example for squeeze parameters of:
squeeze_0 = 0.2
squeeze_1 = 0.001
squeeze_2 = 0.001
[figure]
And now one for parameters of:
squeeze_0 = -0.2
squeeze_1 = -0.001
squeeze_2 = -0.001
[figure]

@sbillinge
Contributor

These look good. Sorry to go so slowly with this, but I think it can be very useful so I would like to get it right. If this is blocking data analysis we can pivot, but in the long run, having robust code will always win the race.

I am not yet reviewing code, just the Use Case, but I wonder why we can't actually keep the original domain. This would make it much more useful, I think. It means extrapolating a bit at one or the other end in general, but the extrapolation should be very well controlled because we are only 3rd order and fitting over a fairly wide range, no? If we put constraints on how far the extrapolation can go, we may want it to be a percentage of the range, something like that? Could you think a bit about that maybe?

btw these tests don't need to do the regression, since the squeeze morph module only computes the morphed data. The regression is done elsewhere, though the fact that you did the regression allowed us to figure out that the extrapolation is going to be something we have to deal with. But for the tests we just need 2 or 3 sets of curves with different morph parameters that we will then write tests against. I imagine the most commonly used case will be a mixture of positive and negative coefficients tbh, so it might be good to have an example of that. Do we want to test only very small morphs because of your UC, or experiment with larger ones in case someone else has another UC that requires larger ones, and we show that the morph still works?

@Luiskitsu
Contributor Author

Hi Simon,
This has already been super helpful for me and will definitely lead to more robust and professional code. I completely agree that it's worth taking the time to get this right.
In the meantime, Mike and Sam asked to look at the data, so I’ve updated the Jupyter notebook we had during beamtime. I’ve added a full workflow from I(Q) to G(r), using the same morphing code we had and adding the G(r). While it’s not yet perfect or fully robust, it works well for exploring G(r), dG(r), and the figure of merit. I plan to send this to Sam, Reegan, and Clara so they can start data analysis while you and I can focus on refining the code, making it more robust, and taking the time we need. I’ll email you the notebook with an example shortly.

I don't fully understand your comment on the original domain. Right now, in the tests, all data is on the same x-axis grid. I’ve played a bit with extrapolation using interp1d from SciPy, and for small to moderate squeezing cases it seems to behave reasonably well. I’ll think more about how to implement a controlled extrapolation in a clean, constrained way. These are the examples I tested:
Squeeze parameters of:
squeeze_0 = 0.2
squeeze_1 = -0.001
squeeze_2 = -0.001
[figure]
Squeeze parameters of:
squeeze_0 = 0.2
squeeze_1 = -0.001
squeeze_2 = 0.001
[figure]
Squeeze parameters of:
squeeze_0 = 0.2
squeeze_1 = 0.001
squeeze_2 = -0.001
[figure]
Squeeze parameters of:
squeeze_0 = -0.2
squeeze_1 = 0.001
squeeze_2 = -0.001
[figure]

Also, for the real squeeze function, I don't think we need extrapolation because the parameters will be refined?
In the tests I am using squeezing parameters that are one to two orders of magnitude larger than what we should see in reality.

@sbillinge
Contributor

In the meantime, Mike and Sam asked to look at the data, so I’ve updated the Jupyter notebook we had during beamtime. [...]

yes, this is great!

@sbillinge
Contributor

Also, for the real squeeze function, I don't think we need extrapolation because the parameters will be refined?
In the tests I am using squeezing parameters that are one to two orders of magnitude larger than what we should see in reality.

OK, this is great. What I meant about the original domain is I think what you are already doing. I think that the way morph works is that the data are always on the original grid and the y-array is transformed. OK, these three cases look good and we can use them for the tests. I will now look at the testing code.

@sbillinge sbillinge left a comment

nice work on simulating the different cases. Now we can work on the tests. Please see my comments. @pytest.mark.parametrize has a slightly tricky syntax (and I always have to look it up) but it works well for what we want to do here.
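For reference, a minimal sketch of the @pytest.mark.parametrize syntax mentioned above; the coefficient lists are placeholders, not the final test values.

    import pytest

    @pytest.mark.parametrize(
        "squeeze_coeffs",
        [
            [0.2, 0.001, 0.001],     # all positive
            [-0.2, -0.001, -0.001],  # all negative
            [0.2, -0.001, 0.001],    # mixed signs
        ],
    )
    def test_morphsqueeze(squeeze_coeffs):
        # each parameter set runs as its own test case
        assert len(squeeze_coeffs) == 3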

@@ -0,0 +1,23 @@
**Added:**

* Squeeze test: given a squeezed signal, unsqueezing recovers the original
Contributor

you don't need this. This "news" is user-facing documentation that will become the changelog at the next release. It tells users how the code has changed since the last release. So we want to say something like, under Added, "* Polynomial squeeze of x-axis of morphed data", or something even more descriptive.

@@ -0,0 +1,74 @@
"""
Contributor

we don't need docstrings in test functions, so please delete this (but we do like comments on each of the tests that capture the intended behavior of each test).



def test_morphsqueeze():
"""
Contributor

see above, no docstrings here.

import numpy as np
from scipy.interpolate import interp1d


Contributor

we are now going to test 3 or 4 different variants of the same thing (i.e., different representative numbers), so it is time to use the @pytest.mark.parametrize decorator. Please see diffpy.utils tests for examples.

assert np.allclose(y_actual, y_expected, atol=100)

# Plot to verify what we are doing
plt.figure(figsize=(7, 4))
Contributor

We don't want plotting code in the tests. You may want to leave these but commented out, with a comment above saying that this plotting code was used for the figures in the GitHub PR, and give the URL.

# Check that the unsqueezed (actual) data matches the expected data
# Including tolerance error because I was having issues
# with y_actual == y_expected. I think is because interpolation?
assert np.allclose(y_actual, y_expected, atol=100)
Contributor

this tolerance seems high?

Contributor Author

Yes sorry, it should be like 1 due to the extrapolation error. I will change this.

@sbillinge
Contributor

@Luiskitsu btw, one more thought is whether we want to extend the UC to allow the users to specify the order of the polynomial. If one of the arguments of the function is a list of initial values for the coefficients, the order would just be the length of that. This may be a desirable UC if we can figure out how to implement it (for example, using numpy.polynomial). We could write tests for 2 or 3 different orders.
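For reference, a small sketch of the numpy.polynomial idea: the polynomial order is implied by the length of the coefficient list, so users could pass whatever order they need. The coefficient values are made up.

    import numpy as np
    from numpy.polynomial import Polynomial

    x = np.linspace(0, 10, 101)
    for coeffs in ([0.1, 0.01], [0.2, 0.01, 0.001], [0.2, -0.01, 0.001, -0.001, 0.0001]):
        squeeze = Polynomial(coeffs)      # order = len(coeffs) - 1
        x_squeezed = x + squeeze(x)
        print(len(coeffs) - 1, x_squeezed.min(), x_squeezed.max())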

@Luiskitsu
Contributor Author

@sbillinge Thanks so much for your feedback! I have followed all your suggestions and modified the test to use the pytest.mark.parametrize decorator as well as numpy.polynomial. I have used diffpy.utils tests as a reference. I have used different polynomial degrees as well as positive and negative values of coefficients for the tests. I could include more tests, but the ones I am using already have much larger squeezing values than what we should expect from experiments. I really like the numpy.polynomial idea. Now this squeeze morph can do the same as the stretch morph and the shift morph and also include higher degrees. Is it better to have different morphs that can do different corrections to the data or have one morph that can do all?

@sbillinge
Contributor

Is it better to have different morphs that can do different corrections to the data or have one morph that can do all?

this did occur to me (that this supersedes the other morphs) but I think we should leave them as they are, as they have a physical interpretation whereas this is more ad hoc. Later we can maybe keep the same interface of "shift", "stretch", etc., but use this morph underneath. But for now we don't have to refactor those, I think.


codecov bot commented Apr 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.92%. Comparing base (d1c2695) to head (d42d234).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #180      +/-   ##
==========================================
+ Coverage   96.85%   96.92%   +0.07%     
==========================================
  Files          18       19       +1     
  Lines         795      814      +19     
==========================================
+ Hits          770      789      +19     
  Misses         25       25              
Files with missing lines Coverage Δ
tests/test_morphsqueeze.py 100.00% <100.00%> (ø)

@sbillinge sbillinge left a comment

This is fantastic! you are a quick learner! Please see my comment.

],
)
def test_morphsqueeze(squeeze_coeffs):
# Uniform x-axis grid. This is the same x-axis for all data.
Contributor

The comments in the test are not needed, just the comments summarizing the behavior above are useful.

# Apply squeeze parameters to uniform data to get the squeezed data
x_squeezed = x + squeeze_polynomial(x)
y_squeezed = np.sin(x_squeezed)

Contributor

I don't see where the function is called, so we don't seem to be testing anything. I would expect to see something like:

morphed = SqueezeMorph(x, y)
actual_x, actual_y = morphed.xyallout

or sthg like that (not sure I have the syntax right), then

assert np.allclose(actual_y, expected_y)

I expect this to fail initially because the function is not written yet, but write an empty function with just a docstring and a signature and then import it.

Then write the code to make the test pass.

@Luiskitsu
Contributor Author

Luiskitsu commented Apr 8, 2025

@sbillinge I have added a MorphSqueeze class following the same architecture as MorphStretch and integrated it with the tests. The current tests pass, which suggests that the basic functionality is working.
Note that the test and the current function implementation are set up to "unsqueeze" squeezed data (we know the squeezing parameters). For our actual application, we could either choose to unsqueeze or to squeeze the data. I would need to change self.y_morph_out = interp1d(x_squeezed, self.y_morph_in)(self.x_morph_in) to self.y_morph_out = interp1d(self.x_morph_in, self.y_morph_in)(x_squeezed). I think both approaches would work as we are refining the parameters. The point is that they represent opposite mappings (inverse versus forward) and will yield different numerical values for the squeezing parameters.
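For illustration, a standalone sketch (not the actual diffpy.morph class) of the two mapping choices described above; squeeze is assumed to hold the list of polynomial coefficients. The two methods mirror the two interp1d lines quoted in the comment and would generally refine to different coefficient values.

    import numpy as np
    from numpy.polynomial import Polynomial
    from scipy.interpolate import interp1d

    class SqueezeSketch:
        def __init__(self, squeeze):
            self.squeeze = squeeze

        def unsqueeze(self, x, y):
            # inverse mapping: treat (x + p(x), y) as the known samples and
            # read them off on the original grid x
            x_squeezed = x + Polynomial(self.squeeze)(x)
            return interp1d(x_squeezed, y, bounds_error=False,
                            fill_value="extrapolate")(x)

        def squeeze_forward(self, x, y):
            # forward mapping: evaluate the data at the squeezed positions
            x_squeezed = x + Polynomial(self.squeeze)(x)
            return interp1d(x, y, bounds_error=False,
                            fill_value="extrapolate")(x_squeezed)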

@sbillinge sbillinge left a comment

Still some issues here. Please see my comments. This is how I think we should build the y_expected in the test:

    # build the expected y_array
    x_make = np.linspace(-2., 12., 1401)
    x_test = np.linspace(0., 10., 1001)
    squeeze_polynomial = Polynomial([0.2, -0.01, 0.001, -0.001, 0.0001])
    y_init = np.sin(x_make)
    y_squeezed = np.sin(x_make + squeeze_polynomial(x_make))
    lower_idx = np.where(x_make == 0.0)[0][0]
    upper_idx = np.where(x_make == 10.0)[0][0]
    y_expected = y_squeezed[lower_idx:upper_idx+1]

    plt.plot(x_make, y_init, 'o')
    plt.plot(x_make + squeeze_polynomial(x_make), y_squeezed, '-')
    plt.plot(x_make, y_squeezed, '-')
    plt.plot(x_test, y_expected, '-')

    plt.show()

You can delete the plotting code, but it shows that it builds the squeezed target correctly over the range of the testing data

],
)
def test_morphsqueeze(squeeze_coeffs):

Contributor

please close this code up. Use empty lines only where they are needed for syntactic clarity.

)

# Check that the morphed (actual) data matches the expected data
# Including tolerance error because of extrapolation error
Contributor

this is a no-no. You are updating the test to make it pass, but the test should express the behavior we want, then write whatever code we need to make the test pass. We don't want a loose test to pass because we haven't coded the edge-case correctly.

[0.1, 0.3],
# 4th order squeeze coefficients
[0.2, -0.01, 0.001, -0.001, 0.0001],
# Testing zeros
Contributor

This needs a statement of the desired behavior when it is fed zeros. Also, do we need a test for some zeros and others not zero? What behavior do we want then?

@Luiskitsu
Contributor Author

Still some issues here. Please see my comments. This is how I think we should build the y_expected in the test: [quoted code snippet from the review comment above]

Thank you for the feedback. I believe the code you provided is not correct, as the y_expected is squeezed when it should be a uniform sine. That said, I think I have interpreted the point you made in your comments and changed the code accordingly. Now there is no need for a tolerance, as I use an extended x-axis as you suggest. Please see my new changes. Also, I have plotted all the test results for clarity and confirmation:
[figures: morphed vs expected data for each parametrized test case]

@sbillinge sbillinge left a comment

awesome, this is getting pretty close now....

[0.1, 0.3],
# 4th order squeeze coefficients
[0.2, -0.01, 0.001, -0.001, 0.0004],
# Zeros and non-zeros, expect 0 + a1x + 0 + a3x**3
Contributor

This is not quite what I am after. I have a feeling that if a morph parameter is zero the morph is not refined (though I am not sure). Otherwise there would be no need, e.g., in MorphStretch to avoid instantiating the morph object for a zero stretch; you could just pass it to the morph with a value of zero and it would return an unstretched morph.

This is bad coding tbh (wrist slap to my former students); it is what we call "magic", which is always to be avoided. Kind of "hidden behavior" if you like, which is why it is like magic. We need to check if this is the case, but if it is the case, then we need to handle it here. So the comment would be something like # Zeros and non-zeros, zero coefficients will not be refined or sthg like that. This will be hard in our case tbh and may break the existing structure of the code, in which case we may need to scale back. argh, it is hard to write good code.....

Contributor Author

Sorry, I didn’t consider the implications for refinement. I think the cleanest and most consistent behavior is to only skip the morphing if squeeze is None. If the coefficients are all zero (e.g., [0, 0, 0]), the code should still run. This way, those coefficients remain part of the refinement process and aren’t treated as a special case.
This should avoid any hidden or "magic" behavior and keep everything explicit. So the comment should be like # Zeros and non-zeros, the full polynomial is applied ...
Let me know what you think, I can change the code accordingly.

Contributor

You don't think any user will ever want to fix one of the coefficients but refine the others? Remember, the test captures desired behavior, not behavior that is ready to code.... We can decide not to implement things that are too hard, but that decision comes later, when we figure out something is desired but too hard.

Contributor Author

I think someone might want to fix some coefficients and refine others. I need to think more about how to implement it. So far, what I thought we could do is, instead of having a squeeze parameter that is an array of coefficients, we could have different input squeeze parameters, squeeze_0, squeeze_1, etc., one for each coefficient. Then we could either give them an initial value if we want to refine that coefficient, or set it to None so it won't be refined. This is similar to just having different morphs for each coefficient, like shift, stretch, and more.
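For illustration, a toy sketch of the per-coefficient idea floated here: coefficients passed as None are held fixed (not refined). The names squeeze_0, squeeze_1, squeeze_2 and the helper are hypothetical, not part of the package.

    from numpy.polynomial import Polynomial

    def build_squeeze(squeeze_0=None, squeeze_1=None, squeeze_2=None):
        supplied = [squeeze_0, squeeze_1, squeeze_2]
        # only coefficients given a value would be exposed to the refiner
        refinable = {f"squeeze_{i}": c for i, c in enumerate(supplied) if c is not None}
        coeffs = [0.0 if c is None else c for c in supplied]
        return Polynomial(coeffs), refinable

    poly, refinable = build_squeeze(squeeze_0=0.2, squeeze_2=0.001)
    print(refinable)   # {'squeeze_0': 0.2, 'squeeze_2': 0.001}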

def test_morphsqueeze(squeeze_coeffs):
x_target = np.linspace(0, 10, 1001)
y_target = np.sin(x_target)

Contributor

close up. no blank line here (and everywhere below)

morph = MorphSqueeze()
morph.squeeze = squeeze_coeffs

x_actual, y_actual, x_expected, y_expected = morph(
Contributor

this is not right. You can't have an "expected" returned by the function. x_expected and y_expected are what you are calling x_target and y_target I think. I would just rename them from the beginning.

x_morph, y_morph, x_target, y_target
)
y_actual = y_actual[lower_idx : upper_idx + 1]
assert np.allclose(y_actual, y_expected)
Contributor

we probably want four asserts, one each for the four quantities returned by the function. I think (but again, not sure) that it should simply return a copy of x_target and y_target in those places. We are not actually testing all the possibilities in principle because in our tests the morph and the target are on the same x-grid. Again, a conversation about behavior....do we want to be able to morph things onto each other that are on different grids? If yes, we need to work harder to build our test, but it should be done in the same way.....the expecteds are hard-coded and not coming from the function. Remember that the expecteds encode what behavior we want.

Contributor Author

I’ll add the four asserts to check all outputs from the morph.
If we want to keep behavior consistent across all morphs, I think it makes sense for MorphSqueeze to assume same x-grids. But I do think it’s very important to support morphing between datasets on different grids more generally. One idea would be to create a dedicated morph that regrids the data onto a common x-grid before other morphs are applied. This would allow flexibility without overcomplicating individual morph classes? What do you think?
This is especially relevant for workflows like ours where, for example, XFEL data and synchrotron data come in on different grids

Contributor Author

It looks like there is already a MorphRGrid

Contributor

When we make tests it is very important to keep the discussion to desired behavior and not get distracted by how it will be implemented. If the other morphs are all on the same grid we can stick to that but make an issue to address this in the future. Personally I think it would be more useful if they could be on different grids. It won't be hard to implement but some decisions are needed about which grid to use ...

Contributor Author

That's a great point. I have focused on the desired behavior in the test and used different grids. I am keeping the target grid fixed and interpolating the morphed data onto it. This works well when the input morph data have a finer step in the x-axis compared to the target data. However, in the scenario where the input morph data have much lower resolution in the x-axis, the interpolation won't be very accurate. So far I have implemented the interpolation in the test function, but we could instead implement it in the MorphSqueeze class.

Contributor Author

See my last commit for the updated test that uses different x-grids. I still need to think through the best way to allow selective refinement of squeeze coefficients. Apologies if I’m moving a bit slowly, I’m new to this but learning a lot! Thank you again for all your mentorship and patience!

Contributor

See my last commit for the updated test that uses different x-grids. I still need to think through the best way to allow selective refinement of squeeze coefficients. Apologies if I’m moving a bit slowly, I’m new to this but learning a lot! Thank you again for all your mentorship and patience!

You are doing really well. Writing good code is hard, but worth it...... Sorry I am making it so slow by being so picky!

Contributor Author

Please keep being as picky as needed! So that I can learn the right/best way to write good code :)

@sbillinge sbillinge left a comment

please see latest round of comments. The first one is the hardest one but once the logic becomes clear, future cases will be much quicker.

I am not even looking at your code in the morph, just focusing on the test. In general, at this point in the process I want the function to be empty and the tests failing. When we have the tests testing what we want we will turn to the code itself.

def test_morphsqueeze(squeeze_coeffs):
x_expected = np.linspace(0, 10, 1001)
y_expected = np.sin(x_expected)
x_make = np.linspace(-3, 13, 3250)
Contributor

Let's add a comment here on why this grid was chosen for the morph. (a) we want it to be different than the target grid in general (b) we want to support the case where the morph grid is finer than the target but not vice versa.

In general we want to test the behavior but at minimal computational cost, so we may want to coarsen all the grids...maybe even by 10x.
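For reference, a sketch of the grid setup being discussed, coarsened for cheap tests: the morph grid is wider (and finer) than the target grid so the squeezed data always cover the target range. The exact numbers are illustrative, not the PR's final values.

    import numpy as np
    from numpy.polynomial import Polynomial

    x_target = np.linspace(0, 10, 101)        # coarse target grid
    y_target = np.sin(x_target)
    x_make = np.linspace(-3, 13, 321)         # wider, finer grid for the morph input
    squeeze = Polynomial([0.2, -0.01, 0.001])
    x_squeezed = x_make + squeeze(x_make)
    y_morph = np.sin(x_squeezed)              # squeezed data handed to the morph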

y_morph = np.sin(x_squeezed)
morph = MorphSqueeze()
morph.squeeze = squeeze_coeffs
x_actual, y_actual, x_target, y_target = morph(
Contributor

good names may be x_morph_actual and x_target_actual etc. to make it more readable to future researchers (including ourselves)

x_actual, y_actual, x_target, y_target = morph(
x_make, y_morph, x_expected, y_expected
)
y_actual = interp1d(x_actual, y_actual)(x_target)
Contributor

interp1d is deprecated. We shouldn't use it. If you google this you will find recommendations for alternative approaches. Basically, using cubic splines is probably best.

Also, this is not y_actual. The actuals have to be what are returned by the function, by definition.

Of course the test will fail until you implement it in the function, but that is the point of the test. We first write tests that capture the behavior that we want but that fail. Then we write the code until the tests pass. When the tests pass, then the code we have written captures the behavior we want by definition. But ONLY if we were very strict that we wrote the test that captured the behavior we want.
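For reference, a minimal sketch of the cubic-spline alternative to the interp1d call flagged above, using scipy.interpolate.CubicSpline; the grids here are placeholders.

    import numpy as np
    from scipy.interpolate import CubicSpline

    x_actual = np.linspace(0, 10, 2001)       # grid the morphed data come back on
    y_actual = np.sin(x_actual)
    x_target = np.linspace(0, 10, 1001)       # grid we want the data on
    y_on_target = CubicSpline(x_actual, y_actual)(x_target)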

assert np.allclose(y_actual, y_expected)
assert np.allclose(x_actual, x_expected)
assert np.allclose(x_target, x_expected)
assert np.allclose(y_target, y_expected)
Contributor

I don't think these tests are quite right. They should all look like:

assert np.allclose(y_target_actual, y_target_expected)

Then we hard-code the expecteds to be what we want.

@Luiskitsu
Contributor Author

@sbillinge I think I finally understand what you meant about focusing the test on the desired behavior, not on getting the current implementation to pass. I had been doing it the wrong way around. The structure of the test makes much more sense now ... Please see my last commit. For future tests I have now a much clearer idea on how to prepare them.

@sbillinge sbillinge left a comment

this seems to be getting there, modulo one comment I made.

x_morph_actual, y_morph_actual, x_target_actual, y_target_actual = morph(
x_morph, y_morph, x_target_expected, y_target_expected
)
assert np.allclose(y_morph_actual, y_target_expected)
Contributor

sorry, I may not have been clear enough.....this should be

    assert np.allclose(y_morph_actual, y_morph_expected)
    assert np.allclose(x_morph_actual, x_morph_expected)

for the sake of readability, then earlier in the code y_morph_expected = y_target_expected if the behavior we want is the morph to reappear on the target grid (I am honestly not sure what is desired behavior there tbh).

This may not support our UC, though, where we want to morph XFEL data onto synchrotron data, because the synchrotron data extend much further than the XFEL data. So how do we handle that UC?

Contributor Author

We can trim or not trim the target depending on whether the target data extend further than the morph data, and then interpolate the morph data onto the target grid. This should work for both scenarios: when the target data extend much further than the morph data and when the morph data extend much further. I will implement this now.

Contributor

This reply is too implementational again 😂. Don't forget that at this stage we discuss only behavior. Otherwise we end up with a program that is what can be programmed, not a program that solves the user's problems....

Contributor

Usually, once we understand the behavior we can find a way to do it....

Contributor Author

From a user’s perspective, I think the desired behavior is for the morph to only be applied over the overlapping region between the morph and the target data. I have implemented this in my last commit, but happy to revise it depending on what we decide is the best desired behavior.

@sbillinge
Contributor

Another final step for this PR is to work on the documentation. This means editing the docstring of the function itself to summarize how it is used, and capturing the discussions that have taken place here insofar as they are relevant to users (e.g., that the target grid is used).

We may also need tests for a case where the target grid extends further than the morph grid (or vice versa). What do we want to happen then?

@Luiskitsu
Contributor Author

I’ve updated the docstring for the MorphSqueeze function and added functionality to handle cases where the target grid extends beyond the morph grid (or vice versa). The target data are now trimmed so that the morph is only applied over the overlapping region, and the morphed data are interpolated onto the trimmed target grid. This works both when the target extends beyond the morph and when the morph extends beyond the target. So far I have added two different tests where I just change x_target and x_morph so that either the x_target range is larger than the morph or the opposite: test_morphsqueeze_target_extends_beyond_morph(squeeze_coeffs) and test_morphsqueeze_morph_extends_beyond_target(squeeze_coeffs). Currently, there’s some duplicated code between the tests, which I plan to clean up next. But before doing that, I wanted to confirm with you whether the tests capture the behavior you were looking for. I am attaching example figures for the test where the x_target goes beyond the x_morph, similar to our synchrotron vs. XFEL or synchrotron vs. lab-based data.
[figures: example results for the test where x_target extends beyond x_morph]
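For illustration, a standalone sketch (not the PR implementation) of the overlap behavior described above: trim the target to the region covered by the squeezed morph, then interpolate the morph onto the trimmed target grid. Grids and coefficients are made up.

    import numpy as np
    from numpy.polynomial import Polynomial
    from scipy.interpolate import CubicSpline

    x_target = np.linspace(0, 10, 1001)
    y_target = np.sin(x_target)
    x_morph = np.linspace(2, 8, 601)              # morph covers a narrower range
    squeeze = Polynomial([0.2, -0.01, 0.001])
    x_squeezed = x_morph + squeeze(x_morph)
    y_morph = np.sin(x_squeezed)

    # keep only the target points inside the squeezed morph's range
    overlap = (x_target >= x_squeezed.min()) & (x_target <= x_squeezed.max())
    x_target_trim = x_target[overlap]
    y_target_trim = y_target[overlap]

    # morphed data interpolated onto the trimmed target grid
    y_morph_on_target = CubicSpline(x_squeezed, y_morph)(x_target_trim)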

@sbillinge sbillinge left a comment

Please see comment. I am just about to board my flight. I will try and seed some UCs in an issue so you can see what I mean.

squeeze_polynomial = Polynomial(squeeze_coeffs)
x_squeezed = x_morph + squeeze_polynomial(x_morph)
y_morph = np.sin(x_squeezed)
# Trim target data to the region overlapping with the squeezed morph
Contributor

I am not sure we have this quite right. Let's take a step back and start over. Don't modify any code; let's just discuss behavior again. Things got a bit more complicated when we went to different grids. This wasn't a Use Case (UC) before, so we need to move carefully, as it will have knock-on effects for the rest of the package.

Since you learned everything else, we may as well learn about Use Cases which is what we use when we develop new functionality, as here. I actually suggest we close this PR and open a new one. We will reuse code from here because the code is good....

The approach is first UCs (what are some little scenarios that exemplify user behavior). Then tests that capture the UCs we want to implement (no code so all tests will fail, but we write an empty function so we get the signature and docstring right)

Only when this is fully covered do we write the code to pass the tests.

Contributor Author

Hi Simon, hope you have a great flight!
I like your suggestion, let's close this PR and open a new one. I am going to create a new issue with the UCs you provided. Then I will create a test for each UC with the function left empty, capturing just the behavior we want.
