Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize XML serialization #55

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ThomasSevestre
Copy link

This PR improves performance of XML serialization :

  • by 15% without active support
  • by 25% when active support is loaded

This has a visible impact in our application using ActiveRecord. I have a 15% overall gain.

I'm using this script to benchmark :

# require "active_support/all"
require "benchmark"
module Xlsxtream
  n = 500000
  Benchmark.bm do |x|
    x.report { for i in 1..n; Row.new(['foo', nil, 23, 'foo', nil, 23, 'foo', nil, 23, 'foo', nil, 23], 1).to_xml; end }
  end
end

@julik
Copy link
Collaborator

julik commented Jan 1, 2025

@ThomasSevestre this looks amazing but I believe extending core classes in a "leaf" library is not a great idea. Can we achieve the same with a module in the gem that would contain all those helper methods, without core namespace alterations?

@felixbuenemann
Copy link
Owner

felixbuenemann commented Jan 3, 2025

I’m against patching core classes, too. I’m also curious as to why this is faster than the existing code. Would you care to explain?

@ThomasSevestre
Copy link
Author

ThomasSevestre commented Jan 3, 2025

I do not like patching core classes as well... I'm not sure it is possible to achieve the same performance in a different way.

In current code, the expensive part is the big case when :

case value
when Numeric
when TrueClass, FalseClass
when Time
when DateTime
when Date
else
end

It is using === operator multiple times on value.
This operator is overriden by active support, which explains why it has more impact when the gem is loaded.
Switching to a simple method call is faster because method calls are highly optimized by ruby core team.

This trick is also used by rails teams, here for instance : rails/rails@d36eb239

@julik
Copy link
Collaborator

julik commented Jan 3, 2025

The Rails team is at liberty adding methods to core classes, but this is not the liberty authors of "leaf" libraries have - just due to the adoption.

@ThomasSevestre
Copy link
Author

Yes. I'm not trying to advocate for core class patches, just explaining the performance gain.

It may be acceptable to load this optimisation explicitly ?

Like this for instance :

require "xlsxstream"
require "xlsxstream/core_extension"

@ThomasSevestre
Copy link
Author

Another solution would be to optimize for String and nil which are probably the most common types.

It would look like this :

case value
when nil
  # noop
when String
when Numeric
when TrueClass, FalseClass
when Time
when DateTime
when Date
else
  # recursive call to handle value.to_s
end

It will ensure good performance for common types.

What do you think ?

@felixbuenemann
Copy link
Owner

@ThomasSevestre I like the idea with the optimized case statement. Could you benchmark that version?
Also is “when nil” faster then “when NilClass”, which would look nicer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants