Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cqlengine: Remove deepcopy on UserType deserialization #277

Merged

Conversation

k0machi
Copy link

@k0machi k0machi commented Dec 1, 2023

This change makes it so newly instanced UserType during deserialization
isn't immediately copied by deepcopy, which could cause huge slowdown if
that UserType contains a lot of data or nested UserTypes, in which case the
deepcopy calls would cascade as each to_python call would eventually clone
parts of source object. As there isn't a lot of information on why this
deepcopy is here in the first place this change could potentially break
something. Running integration tests against this commit does not produce
regressions, so this call looks safe to remove, but I'm leaving this
warning here for the future reference.

Fixes #152

This change makes it so newly instanced UserType during deserialization
isn't immediately copied by deepcopy, which could cause huge slowdown if
that UserType contains a lot of data or nested UserTypes, in which case the
deepcopy calls would cascade as each to_python call would eventually clone
parts of source object. As there isn't a lot of information on why this
deepcopy is here in the first place this change could potentially break
something. Running integration tests against this commit does not produce
regressions, so this call looks safe to remove, but I'm leaving this
warning here for the future reference.

Fixes scylladb#152
@k0machi k0machi self-assigned this Dec 1, 2023
@k0machi k0machi requested review from fruch and Lorak-mmk December 1, 2023 15:01
@Lorak-mmk
Copy link

Do you think you could open this PR against upstream too (datastax/python-driver)? I'd like to see if they see any reason not to merge this.

@fruch
Copy link

fruch commented Dec 3, 2023

Do you think you could open this PR against upstream too (datastax/python-driver)? I'd like to see if they see any reason not to merge this.

FYI, no one responded on the issue that was opened:
https://datastax-oss.atlassian.net/browse/PYTHON-1309

@k0machi
Copy link
Author

k0machi commented Dec 4, 2023

Opened in upstream: datastax#1192

@fruch
Copy link

fruch commented Dec 19, 2023

@Lorak-mmk

There was no response from upstream people yet

I'd say we will merge this performance improvement, since our testing didn't show it breaking anything.

@mykaul
Copy link

mykaul commented Dec 19, 2023

@Lorak-mmk

There was no response from upstream people yet

I'd say we will merge this performance improvement, since our testing didn't show it breaking anything.

@fruch
Copy link

fruch commented Dec 19, 2023

@Lorak-mmk
There was no response from upstream people yet
I'd say we will merge this performance improvement, since our testing didn't show it breaking anything.

@k0machi was talking about other places there a deepcopy, he wanted to remove, not this PR

  • Do we happen to have performance numbers for this change?

@k0machi did a manual test, with Argus calls, and shown a big improvement across the board.

we don't have a specific performance suite for this driver, especially not for cqlengine.

@mykaul
Copy link

mykaul commented Dec 19, 2023

we don't have a specific performance suite for this driver, especially not for cqlengine.

A before/after would hopefully suffice, showing reduced latency (I've seen elsewhere in the threads a reduction of seconds - #152 (comment) ?)

@k0machi
Copy link
Author

k0machi commented Dec 19, 2023

we don't have a specific performance suite for this driver, especially not for cqlengine.

A before/after would hopefully suffice, showing reduced latency (I've seen elsewhere in the threads a reduction of seconds - #152 (comment) ?)

  • Before
    profile-1

  • After
    profile2

It doesn't quite show the latency numbers, but this request went down from ~9s avg to 750ms average

@mykaul
Copy link

mykaul commented Dec 19, 2023

this request went down from ~9s avg to 750ms average

Thanks - this is what I was hoping to see. Flamegraphs are great, but perf results are more important.

Copy link

@fruch fruch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the information we do have, I think it's safe enough to merge
And it improve perf in quite a lot

@mykaul
Copy link

mykaul commented Dec 19, 2023

I believe we have a release pending this week - if so, I'd wait after the release, instead of sending it in the last minute.

@mykaul
Copy link

mykaul commented Jan 16, 2024

@avelanarius - can we merge it? The lack of response from usptream is disappointing, but should not hold us back.

@avelanarius avelanarius merged commit 70860b0 into scylladb:master Jan 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Speed bottleneck on deepcopy
5 participants