DES: How much perf penalty will we accept to get rid of libreduction? #40263
The referenced benchmark is:

```python
import numpy as np
from pandas import DataFrame

N = 10 ** 4
labels = np.random.randint(0, 2000, size=N)
labels2 = np.random.randint(0, 3, size=N)
df = DataFrame(
    {
        "key": labels,
        "key2": labels2,
        "value1": np.random.randn(N),
        "value2": ["foo", "bar", "baz", "qux"] * (N // 4),
    }
)

df.groupby("key").apply(lambda x: 1)
```

Running this with current master, I get:
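To reproduce such a timing outside IPython's `%timeit`, the stdlib `timeit` module can be used. This is only a sketch of the measurement setup (the setup is trimmed to the columns the benchmark actually exercises, and absolute numbers will vary by machine and pandas version):

```python
import timeit

import numpy as np
import pandas as pd

# same shape as the issue's benchmark: many small groups
N = 10 ** 4
labels = np.random.randint(0, 2000, size=N)
df = pd.DataFrame({"key": labels, "value1": np.random.randn(N)})

# the trivial apply from the benchmark: measures groupby overhead only
t = timeit.timeit(lambda: df.groupby("key").apply(lambda x: 1), number=10)
print(f"{t / 10 * 1000:.2f} ms per loop")
```

Because the applied function returns a constant, essentially all of the measured time is per-group dispatch overhead, which is exactly the cost libreduction's fast path tries to reduce.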
When disabling the usage of the libreduction fast_apply with this patch:

```diff
--- a/pandas/core/groupby/ops.py
+++ b/pandas/core/groupby/ops.py
@@ -390,6 +390,7 @@ class BaseGrouper:
             # for now -> relies on BlockManager internals
             pass
         elif (
+            False and
             com.get_callable_name(f) not in base.plotting_methods
             and isinstance(splitter, FrameSplitter)
             and axis == 0
```

I get the following timing:
So a 3-4x slowdown. However, the applied function does nothing useful (it just returns a constant), so this benchmark is essentially only measuring the overhead. Whether a 3-4x slowdown in the overhead is significant in a real use case depends on how much of the total time that overhead accounts for. So consider a slightly more complex example: calculating the mean of one of the columns (which is still a relatively simple/fast function, I think). With master, this gives

With libreduction disabled, I get:

So still slower, but no longer a 3-4x slowdown. (It would probably be useful to check whether those numbers are similar on different machines.)
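The "mean of one of the columns" variant described above can be reproduced with a sketch like the following (reusing the benchmark's setup; the choice of `value1` as the column is taken from that setup, and timings will differ across machines):

```python
import numpy as np
import pandas as pd

# benchmark setup from the issue
N = 10 ** 4
labels = np.random.randint(0, 2000, size=N)
labels2 = np.random.randint(0, 3, size=N)
df = pd.DataFrame(
    {
        "key": labels,
        "key2": labels2,
        "value1": np.random.randn(N),
        "value2": ["foo", "bar", "baz", "qux"] * (N // 4),
    }
)

# a slightly more substantive applied function than `lambda x: 1`:
# the per-group work now dilutes the fixed per-group dispatch overhead
result = df.groupby("key").apply(lambda x: x["value1"].mean())
print(result.head())
```

Since the per-group computation is no longer free, the fixed overhead is a smaller fraction of the total, which is why the relative slowdown shrinks in this variant.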
@jbrockmendel did #42992 close this?

Only half of it.

Closed by #43189.
xref some discussion #40171 (comment)
libreduction and the associated callers are a disproportionate maintenance headache [citation needed]. It would be nice to rip it out and have just one path for those methods, but that would entail a non-trivial performance hit. Recently, though, we've managed to optimize the pure-Python path a bit, and I'm optimistic we can shave off some more of the difference.
The question: how much of a perf penalty are we willing to accept in order to remove libreduction?
Copying from #40171 (comment)