Initial refactoring to use sane OSITrace reader #70

pmai · 2024-04-14T13:28:47Z

This is currently a mostly minimally invasive attempt to sanitize the trace input handling of osi-validation. It rips out all of the old complexity, which as far as I can tell gives no performance benefit, and replaces it with streamlined sequential I/O. In my limited testing this simpler code is around 4x as fast as the original, and it just uses the new OSI osi3trace module.
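To illustrate what "streamlined sequential I/O" can look like, here is a minimal sketch. It is not the osi3trace implementation and not the code in this PR. It assumes the binary OSI trace layout in which every serialized message is preceded by a 4-byte little-endian length, and the function names `iter_raw_messages` and `write_raw_message` are hypothetical.

```python
import io
import struct
from typing import BinaryIO, Iterator


def iter_raw_messages(stream: BinaryIO) -> Iterator[bytes]:
    """Yield serialized messages one at a time, in file order.

    Assumption: each message is stored as a 4-byte little-endian
    unsigned length followed by that many payload bytes.
    """
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack("<L", header)
        payload = stream.read(size)
        if len(payload) < size:
            raise ValueError("truncated message payload")
        yield payload


def write_raw_message(stream: BinaryIO, payload: bytes) -> None:
    """Hypothetical helper that writes one length-prefixed message."""
    stream.write(struct.pack("<L", len(payload)))
    stream.write(payload)


if __name__ == "__main__":
    # Round-trip two dummy payloads through an in-memory buffer.
    buffer = io.BytesIO()
    write_raw_message(buffer, b"first")
    write_raw_message(buffer, b"second")
    buffer.seek(0)
    for raw in iter_raw_messages(buffer):
        print(len(raw), raw)
```

Only one message is held in memory at a time, and there is no index, seek, or worker pool involved. In a real reader each payload would then be handed to the matching protobuf message class for parsing.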

It is for the most part bug-for-bug, and to some extent performance-bug-for-performance-bug, compatible with the old code (except for the 4x speed-up), as it tries not to touch the other code that desperately needs refactoring. That will have to be handled in separate PRs, with the log handling an obvious next candidate, as it is currently the main performance limit.

This also does not update the referenced submodule right now, as that needs other fixes that are part of e.g. PR #61.

Signed-off-by: Pierre R. Mai <[email protected]>

jdsika · 2024-04-15T11:07:40Z

@masipp can you tell us where the requirements for parallelization came from?

pmai · 2024-04-15T12:43:27Z

> @masipp can you tell us where the requirements for parallelization came from?

And just to be clear: There is of course potential for single-trace speed-ups from proper parallelization, however:

- Currently that speed-up simply was not there, likely due in part to overly fine-grained parallelization and other problems with the approach taken.
- It only makes sense to introduce this once the rest of the code is refactored, as the log handling is currently fully memory-bound and memory-limited, and is the principal bottleneck.
- It really should be handled transparently to the user, i.e. if anything there should only be an option to disable parallelization, for cases where multiple traces are validated in parallel, as that is a much better use of multiple cores than intra-file parallelization.
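The last point, using the cores across traces rather than inside one trace, could be sketched roughly as follows. This is an illustration under assumptions, not part of osi-validation: `validate_many`, its parameters, and the per-file callable `validate_one` are hypothetical names, and each file is assumed to be processed sequentially by that callable.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional, TypeVar

Result = TypeVar("Result")


def validate_many(
    paths: Iterable[str],
    validate_one: Callable[[str], Result],
    max_workers: Optional[int] = None,
    use_processes: bool = True,
) -> Dict[str, Result]:
    """Run one sequential validation per trace file, several files at once.

    `validate_one` is a hypothetical per-file function. With
    `use_processes=True` it must be picklable, i.e. defined at module
    top level, because it is sent to worker processes.
    """
    path_list = list(paths)
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with path_list.
        results = pool.map(validate_one, path_list)
        return dict(zip(path_list, results))
```

Each worker handles a whole file, so the work units are coarse and no state is shared between time steps of different traces. On platforms that spawn worker processes, the call with `use_processes=True` should sit under an `if __name__ == "__main__":` guard.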

The same is btw. true for most of the other argument options: This really should be handled internally (e.g. limiting memory use, buffering where useful, etc.), and should not be exposed as user arguments. But that has to be handled in a separate PR.

pmai · 2024-04-15T12:46:40Z

And as another aside: The code retains the influence of the blast argument on the resetting of the LOGS variable. However, given that all log entries are currently retained in the logger component, I do not see the point of the LOGS variable, and especially of resetting it. It definitely looks like a bug somewhere, but since this is mostly related to logging and not to the I/O, I retained the resetting for now.

Similarly I did not rewrite the use of global variables, and various other suspect aspects of data flow, as that should be handled in a separate PR.

ClemensLinnhoff · 2024-04-15T13:28:35Z

I fully agree. We should limit the user options to the things that actually should be set by the user. In my (somewhat limited) tests with a couple of trace files I did not notice any performance issues from a user perspective. I can really only see the need for performance improvements if a big database of trace files has to be tested. In that case I also agree that testing multiple trace files should be parallelized, rather than the processing within an individual trace file. And if at some point we want to implement cross-time-step rules (see #65), parallelizing time steps within one trace file will not work anyway.

.github/workflows/ci.yml

doc/usage.adoc

osivalidator/osi_general_validator.py

pmai self-assigned this Apr 14, 2024

pmai force-pushed the refactor/simplify-trace-handling branch from 958206e to 2db9449 Compare April 14, 2024 13:41

Initial refactoring to use sane OSITrace reader

ce4ec62

Signed-off-by: Pierre R. Mai <[email protected]>

pmai force-pushed the refactor/simplify-trace-handling branch from 2db9449 to ce4ec62 Compare April 14, 2024 14:27

jdsika requested a review from ClemensLinnhoff April 15, 2024 10:09

jdsika added the quality Quality improvements. label Apr 15, 2024

TimmRuppert reviewed Apr 22, 2024


This was referenced Apr 22, 2024

Update OSITrace to be used in osi-validation OpenSimulationInterface/open-simulation-interface#762

Closed

Update OSITrace OpenSimulationInterface/open-simulation-interface#761

Closed

Refactor left-over manual indexing after code review

75ebad2

pmai marked this pull request as ready for review May 10, 2024 11:26

ClemensLinnhoff requested a review from TimmRuppert May 10, 2024 11:37

TimmRuppert approved these changes May 15, 2024


ClemensLinnhoff approved these changes May 15, 2024


jdsika merged commit 71e8e24 into master May 15, 2024

jdsika deleted the refactor/simplify-trace-handling branch May 15, 2024 07:16

ClemensLinnhoff mentioned this pull request May 16, 2024

Remove default check for is_set #61

Merged

6 tasks
