-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add scaling/normalisation of KPI outputs #77
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good. I like the normaliser interface. I'm very glad to see increasing unit test coverage.
There are a couple of things I've raised that are quick fixes (the changelog comments, for example). And I'm not keen on some aspects of some of the builders in the tests.
I only skimmed quite a lot of the tests. They're big and complicated. That is largely because of the awkward design of the code under test (which is definitely not the fault of this PR) - we have a God class, and a strange API. Those things are now starting to impact the maintainability of the code, and one symptom of that is how difficult it is to write unit tests around the KPI calculations.
The fixtures and setup of each test are very involved, there are large numbers of assertions in each test, and it's difficult to match the expectations to the fixtures. Given that the KPI calculations are arguably the most important area of the code, that is something that we should address soon.
src/test/java/com/arup/cml/abm/kpi/builders/ScoringConfigBuilder.java
Outdated
Show resolved
Hide resolved
src/test/java/com/arup/cml/abm/kpi/builders/ScoringConfigBuilder.java
Outdated
Show resolved
Hide resolved
src/test/java/com/arup/cml/abm/kpi/builders/ScoringConfigBuilder.java
Outdated
Show resolved
Hide resolved
...est/java/com/arup/cml/abm/kpi/tablesaw/TestTablesawAccessToMobilityWithLinearNormaliser.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Kasia! It's looking very good and a straight forward implementation.
I have two comments:
- How should a user interact with the normalisation feature to adjust the bounds of the KPI's? Should they modify the
MatsimKpiGenerator.java
file? It'll be helpful to maybe just add a couple of sentences in the README on this feature and the expected functionality. - I don't think the Affordability KPI writes correctly in the case where it needs to output -1/-1. Do you mind checking this?
...est/java/com/arup/cml/abm/kpi/tablesaw/TestTablesawAffordabilityKpiWithLinearNormaliser.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Tidied up changelog and javadoc in
LinearNormaliser
- Refactored test fixture builders - changed
Legs/TripsBuilder
->Legs/TripsTableBuilder
which rely onLeg/TripBuilder
s which build the Leg/Trips with a nicer.withX
methods - Also refactored
ScoringConfigBuilder
to move away from use of defaulted values and nicer.withX
methods. This still remains in a couple of places (plans, network & transit schedule builders), I just ran out of time today
src/main/java/com/arup/cml/abm/kpi/matsim/run/MatsimKpiGenerator.java
Outdated
Show resolved
Hide resolved
src/test/java/com/arup/cml/abm/kpi/builders/PersonsBuilder.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
… fewer things at any one time
A few somewhat small changes:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Kasia!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests are looking better now 👍
Resolves: #69
Fixes: #73
This PR makes updates to KPI outputs to give a scaled/normalised value alongside the actual KPI value.
The output types of the methods change, majority of them are now maps with two keys:
"actual"
and"normalised"
. They are still saved to csv, now with two columns:"actual"
and"normalised"
(examples below for the integration test files).[Bonus] When working on this, adding tests specifically, I found a bug in GHG KPI and fixed it right there and then.
Normalisation
We normalise the KPI outputs using an affine transformation. It maps linearly the given interval (unique to each KPI) to
[0,10]
. I madeNormaliser
an interface, for if/when we want to change the type of transformation. The bounds of the mapped interval ([0,10]
) were deliberately kept changeable - this made it easier to include the normaliser in tests in a simpler capacity (multiplicative constant). Thinking about it now, maybe I should have mocked this, the linear normaliser is tested separately - let me know what you think. I think being able to change scale bounds is still useful if one day we decide we want to map to[0,1]
for example.LinearNormaliser
defaults to[0,10]
right now, but I still wanted to specify the mapped interval inMatsimKpiGenerator
. Example:maps values between
0
and100
to values between0
and10
.One funny thing you will notice is that sometimes the interval for a KPI is flipped, big values are bad and small values are good. This is handled by the same normaliser (using factors of
-1
under the hood) and the normaliser object is specified like this:i.e. mapping values between
100
and0
to values between0
and10
.Let me know if that API is clear, I think of it as
[0, 10] -> [100, 0]
. I briefly considered a dedicated class for the reverse normaliser, but I thought it might look more confusingnew ReverseLinearNormaliser normaliser = ReverseLinearNormaliser(0, 10, 0, 100)
, because you need to pay attention to the name of the class as well as the interval values. That class would be almost identical to theLinearNormaliser
class too, with only a couple of small changes. I think what I've got here fells more natural but keen to get some opinions on that.Tests
Majority of the new code is tests for the KPIs that were being normalised. The more complicated KPIs get more tests.
This also meant more content being added to the builders that set up the inputs to produce different behaviour in the tests. Very happy to be plugging those holes.
Integration tests
Outputs that are benchmarks for integration tests changed for all of the KPIs being scaled. Below are the details.
smol
drt