To ease my benchmarking of pandas code, I created a simple benchmark suite. It's based on unittest and is timeit-like, but does not use timeit.
I've included some basic tests; see if this is something we want included with Orange. The only thing I'm not sure about is the location; for now, I've put it in /benchmark/.
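The gist of the approach, as a minimal sketch (the `benchmark` decorator, method names, and output format here are my own illustration, not the actual PR code): a unittest `TestCase` whose methods are timed with `time.perf_counter()` in a number-by-repeat loop, printing timeit-style results.

```python
# Hypothetical sketch of a unittest-based, timeit-like benchmark harness.
import time
import unittest


def benchmark(number=100, repeat=3):
    def decorator(func):
        def wrapper(self):
            per_loop = []
            for _ in range(repeat):
                start = time.perf_counter()
                for _ in range(number):
                    func(self)
                per_loop.append((time.perf_counter() - start) / number)
            print("[{}] with {} loops, best of {}:".format(
                func.__name__, number, repeat))
            print("    min {:.3g} sec per loop".format(min(per_loop)))
            print("    avg {:.3g} sec per loop".format(sum(per_loop) / repeat))
        return wrapper
    return decorator


class BenchExample(unittest.TestCase):
    @benchmark(number=100, repeat=3)
    def test_sum(self):
        sum(range(1000))  # stand-in for e.g. reading or filtering a table


if __name__ == "__main__":
    unittest.main()
```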
Example output on current master:
```
Running benchmark with CPython v3.5.2 on Darwin-15.6.0-x86_64-i386-64bit (OSX 10.11.6 x86_64)
[adult_create_X] with 50 loops, best of 3:
    min 0.518 usec per loop
    avg 0.528 usec per loop
[adult_filter] skipped: Not a pandas environment.
[adult_read] with 5 loops, best of 3:
    min 748 msec per loop
    avg 786 msec per loop
[iris_basic_stats] with 50 loops, best of 3:
    min 206 usec per loop
    avg 247 usec per loop
[iris_contingency] with 10 loops, best of 3:
    min 200 usec per loop
    avg 211 usec per loop
[iris_create_X] with 100 loops, best of 3:
    min 0.491 usec per loop
    avg 0.525 usec per loop
[iris_discretize] with 10 loops, best of 3:
    min 662 usec per loop
    avg 991 usec per loop
[iris_distributions] with 20 loops, best of 3:
    min 110 usec per loop
    avg 115 usec per loop
[iris_iteration_pandas] skipped: Not a pandas environment.
[iris_iteration_pre_pandas] with 10 loops, best of 3:
    min 681 usec per loop
    avg 782 usec per loop
[iris_read] with 100 loops, best of 3:
    min 2.49 msec per loop
    avg 2.87 msec per loop
```
Could this be shaped into a set of performance-guarantee tests that always run and fail if some change degrades performance beyond some threshold?
I don't think it can be done, at least not without significant effort. There is no Travis-like tool that integrates with GitHub; the only similar one I found was asv, which seems local-only. Running on Travis is impossible because, as far as I'm aware, it does not provide a consistent performance environment. The threshold is the other problem: you have to allow for some noise, and even then you'd get a bazillion PRs that each worsen performance slightly, followed by one that finally crosses the threshold and fails, perhaps only because of noise, without even being at fault.
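To make the noise problem concrete, such a guarantee test would have to look roughly like this (the baseline, threshold, and workload are all made up for illustration):

```python
# Hypothetical performance-guarantee test; BASELINE, THRESHOLD, and
# workload() are invented, not from the PR.
import time
import unittest

BASELINE = 0.786   # seconds; a previously recorded timing
THRESHOLD = 1.20   # fail when more than 20% slower than the baseline


def workload():
    # Stand-in for the real benchmarked operation.
    sum(range(10 ** 6))


class PerformanceGuard(unittest.TestCase):
    def test_workload_speed(self):
        start = time.perf_counter()
        workload()
        elapsed = time.perf_counter() - start
        # On shared CI hardware, elapsed time easily varies by more than
        # 20% between runs, so this assertion fails spuriously.
        self.assertLess(elapsed, BASELINE * THRESHOLD)
```

And because each PR is compared against a fixed baseline, a series of small regressions passes quietly until one unlucky run finally trips the assertion.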
timeit-likeness is a feature only in that the output is similar and it runs tests number × repeat times :)
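For reference, timeit's own API uses the same number × repeat scheme, so the min/avg figures above correspond to something like this (the timed statement is an arbitrary stand-in):

```python
# The suite's numbers expressed in timeit terms.
import timeit

number, repeat = 100, 3
totals = timeit.repeat("sum(range(1000))", number=number, repeat=repeat)
per_loop = [t / number for t in totals]  # seconds per single loop
print("min {:.3g} usec per loop".format(min(per_loop) * 1e6))
print("avg {:.3g} usec per loop".format(sum(per_loop) / repeat * 1e6))
```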