Remove cvxpy for weight minimization, fix test values and refactor tests codes #120

Merged
merged 14 commits into from
Oct 30, 2024

Conversation

bobleesj
Contributor

Closes #112

@@ -1,5 +1,4 @@
 numpy
 scipy
-cvxpy
Contributor Author

As intended

tm = [
([[[1, 0], [0, 1]], [1, 1], [0, 0], [1, 1]], [0, 0]),
test_matrix = [
# ([stretched_component_gram_matrix, linear_coefficient, lower_bound, upper_bound], expected)
Contributor Author

Added an inline comment showing the order of the input data in the test matrix.

([[[1, 0], [0, 1]], [1, 1], [0, 0], [1, 1]], [0, 0]),
test_matrix = [
# ([stretched_component_gram_matrix, linear_coefficient, lower_bound, upper_bound], expected)
([[[1, 0], [0, 1]], [1, 1], 0, 0], [0, 0]),
Contributor Author

@bobleesj bobleesj Oct 26, 2024

Fixing a syntax bug: changed [0, 0] to 0, 0].

The values 0, 0 are lower_bound and upper_bound, respectively.

expected = tm[1]
actual = get_weights(tm[0][0], tm[0][1], tm[0][2], tm[0][3])
actual = get_weights(stretched_component_gram_matrix, linear_coefficient, lower_bound, upper_bound)
Contributor Author

Pass named variables instead of tm[0][0], etc., which are much harder to read for future generations.

Contributor Author

@bobleesj bobleesj Oct 26, 2024

Problem: the previous code on line 212 of this file had an absolute tolerance of 0.5!

np.testing.assert_allclose(actual, expected, rtol=1e-03, atol=0.5)

When I modified atol to 1e-05

np.testing.assert_allclose(actual, expected, rtol=1e-03, atol=1e-05)

the original code started to fail (3-4 out of 7-8 test cases failed).

Since our new scipy minimization function has been implemented correctly per our tests above, I've updated the numbers for the failing cases. CI is also passing with the new implementation.
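For readers following along, here is a minimal sketch of how a bounded weight minimization could be written with scipy instead of cvxpy. The actual implementation is not shown in this conversation, so the objective (a quadratic form built from the Gram matrix and the linear coefficient), the function name `minimize_bounded_quadratic`, and the choice of the L-BFGS-B method are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize


def minimize_bounded_quadratic(gram_matrix, linear_coefficient, lower_bound, upper_bound):
    """Hypothetical helper: minimize 0.5 * x^T G x + c^T x with lower_bound <= x <= upper_bound.

    The objective is an assumption; the PR does not show the real one.
    """
    gram = np.asarray(gram_matrix, dtype=float)
    coeff = np.asarray(linear_coefficient, dtype=float)
    n = coeff.size

    def objective(x):
        return 0.5 * x @ gram @ x + coeff @ x

    def gradient(x):
        # Valid for a symmetric Gram matrix.
        return gram @ x + coeff

    # The same scalar bounds are applied to every weight.
    bounds = [(lower_bound, upper_bound)] * n
    x0 = np.full(n, lower_bound, dtype=float)  # start from a feasible point
    result = minimize(objective, x0, jac=gradient, bounds=bounds, method="L-BFGS-B")
    return result.x


# Unconstrained minimum would be [0.5, 2.0]; the upper bound of 1 caps the second weight.
weights = minimize_bounded_quadratic([[1, 0], [0, 1]], [-0.5, -2.0], 0, 1)
```

Because the Gram matrix here is diagonal, the bounded solution is simply the unconstrained one clipped to the bounds, which makes the result easy to check by hand.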

expected = tuwm[1]
np.testing.assert_allclose(actual, expected, rtol=1e-03, atol=0.5)
Contributor Author

atol=0.5 is way too high considering we are comparing numbers between 0 and 1.2, etc.
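A small illustration of the tolerance problem, using made-up numbers rather than values from the test suite: `np.testing.assert_allclose` passes when `|actual - expected| <= atol + rtol * |expected|`, so with `atol=0.5` a result that is off by 0.4 still passes. The helper name `is_close` is hypothetical.

```python
import numpy as np

# Made-up values in the 0 to 1.2 range discussed above; each entry is off by 0.4.
expected = np.array([0.5, 0.5])
actual = np.array([0.1, 0.9])


def is_close(actual, expected, rtol, atol):
    """Return True if assert_allclose passes, False if it raises."""
    try:
        np.testing.assert_allclose(actual, expected, rtol=rtol, atol=atol)
    except AssertionError:
        return False
    return True


loose = is_close(actual, expected, rtol=1e-03, atol=0.5)    # wrong values still pass
tight = is_close(actual, expected, rtol=1e-03, atol=1e-05)  # wrong values are caught
```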

Contributor Author

@bobleesj bobleesj left a comment

Ready for review, @sbillinge. Please find my interesting comments.

@bobleesj bobleesj marked this pull request as ready for review October 26, 2024 13:26
Contributor

@sbillinge sbillinge left a comment

The changes look like a step in the right direction.

Before I merge, let me think a bit about this. I have hardly looked at the tests in this code before, so I want to understand what behavior they are testing. It would be good to collaborate on this if you are interested. People tend to be bad at testing until they are good. They don't really realize that the tests are supposed to define the behavior we want, and then we code to the tests. Most people's workflow tends to be the other way around: write the code, then come up with some tests that pass when the code runs.

That's how you might end up with a test with ridiculous tolerances.

So the next step is to think about what each function is supposed to be doing and what behavior we want it to have. This takes time, so we will just do it for the bit of code that we have been touching here.

@sbillinge
Contributor

I took a quick look at those tests and they are a bit hard to understand. There are also inconsistencies in that function. For example, the docstring says that the inputs need to be 1D array-like, but then there is a line of code that takes the array and does np.asarray(variable), which of course is not needed if the input is 1D array-like as suggested by the docstring. It is not a huge deal, just technical debt that makes the code harder to maintain in the future, so if we can figure it out and fix it now, all the better.

But for the tests, we may want to start by putting a comment on each line of tm describing what behavior it is testing. Usually we would want to start with the simplest case, then get more complicated, and then do a few edge cases.

This may be more trouble than it is worth, but let's give it a bit of thought at least since we are working on it.

@bobleesj
Contributor Author

Yes, the tests are hard to interpret and understand using unknown keywords such as taw, etc. And a lot of unnecessary casting, as you've mentioned, np.asarray, etc.

I will take some time to refactor the code and keep you posted.

Would you want me to make a separate PR for refactoring?

@sbillinge
Contributor

> Yes, the tests are hard to interpret and understand using unknown keywords such as taw, etc. And a lot of unnecessary casting, as you've mentioned, np.asarray, etc.
>
> I will take some time to refactor the code and keep you posted.
>
> Would you want me to make a separate PR for refactoring?

It may make sense to do it on the same PR. First work on the tests. Then the code refactor will simply include the removal of cvxpy.

Let's update and then review the tests before doing the refactor. We may also want to do it systematically for any functions that call this one, unless this gets to be too much. Those functions control whether the inputs here are array-like or not, for example. Is there a reason we don't make these arrays at the beginning and just pass them around as such?
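One way to picture the "make these arrays at the beginning" idea is a single conversion at the entry point, sketched below. The helper name `as_1d_float_array` is hypothetical and not part of the code base. The sketch relies on the documented behavior that `np.asarray` returns its input unchanged when it is already an ndarray of the requested dtype, so repeated conversions further down the call chain add nothing.

```python
import numpy as np


def as_1d_float_array(values):
    """Hypothetical helper: convert array-like input to a 1D float array once, at the boundary."""
    arr = np.asarray(values, dtype=float)
    if arr.ndim != 1:
        raise ValueError(f"expected 1D input, got {arr.ndim}D")
    return arr


raw = [0.2, 0.8]                  # array-like input from a caller
weights = as_1d_float_array(raw)  # converted once
again = as_1d_float_array(weights)  # already an ndarray: no copy is made
```

With this pattern, inner functions can document and assume ndarray inputs instead of calling np.asarray defensively.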

@bobleesj
Contributor Author

@sbillinge ready for review

Mostly, I tried to leverage the power of @pytest.mark.parametrize where we can assign variable names instead of passing some random matrix names. The variables are nicely passed along within the test functions.

I think it's best to continue further work in a separate PR.
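A minimal sketch of the parametrize pattern described above, with named arguments and a comment stating the intent of each case. The function `get_weights_stub` is a hypothetical stand-in, not the project's get_weights: it is only valid for a diagonal Gram matrix with positive entries, where the bounded minimum of an assumed quadratic objective is the unconstrained minimum clipped to the bounds. Only the first case is taken from the test matrix in this PR; the others are made up.

```python
import numpy as np
import pytest


def get_weights_stub(stretched_component_gram_matrix, linear_coefficient, lower_bound, upper_bound):
    """Hypothetical stand-in, valid only for a diagonal Gram matrix with positive entries."""
    gram = np.asarray(stretched_component_gram_matrix, dtype=float)
    coeff = np.asarray(linear_coefficient, dtype=float)
    unconstrained = -coeff / np.diag(gram)
    return np.clip(unconstrained, lower_bound, upper_bound)


@pytest.mark.parametrize(
    "stretched_component_gram_matrix, linear_coefficient, lower_bound, upper_bound, expected",
    [
        # Simplest case: both bounds are 0, so the only feasible weights are 0.
        ([[1, 0], [0, 1]], [1, 1], 0, 0, [0, 0]),
        # Unconstrained minimum lies inside the bounds and is returned unchanged.
        ([[1, 0], [0, 1]], [-0.25, -0.5], 0, 1, [0.25, 0.5]),
        # Edge case: one weight exceeds the upper bound and is clipped to it.
        ([[2, 0], [0, 2]], [-4, -1], 0, 1, [1, 0.5]),
    ],
)
def test_get_weights_stub(
    stretched_component_gram_matrix, linear_coefficient, lower_bound, upper_bound, expected
):
    actual = get_weights_stub(
        stretched_component_gram_matrix, linear_coefficient, lower_bound, upper_bound
    )
    np.testing.assert_allclose(actual, expected, rtol=1e-03, atol=1e-05)
```

Ordering the cases from simplest to edge case, with one comment each, keeps the intended behavior readable without opening the implementation.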

@bobleesj bobleesj changed the title from "Remove cvxpy for weight minimization, fix test values with improved accuracy" to "Remove cvxpy for weight minimization, fix test values and refactor tests codes" Oct 29, 2024
@sbillinge sbillinge merged commit 726463f into diffpy:main Oct 30, 2024
3 checks passed
@sbillinge
Contributor

Thanks for this, @bobleesj. I agree this is a nice way to do the mark.parametrize.

Of greater utility would be some comments that explain the intent of each test within the parametrize. It may be hard to reverse engineer this for other people's tests, but it is good practice moving forward.

@bobleesj bobleesj deleted the rm-cvxpy branch October 30, 2024 03:24
@bobleesj
Contributor Author

> Of greater utility would be some comments that explain the intent of each test within the parametrize. It may be hard to reverse engineer this for other people's tests, but it is good practice moving forward.

Got it. Now that we have slightly more readable test functions, I will try to reverse engineer them.

Development

Successfully merging this pull request may close these issues.

try and remove cvxpy dependency
2 participants