Support persistent connections #251

Open
wants to merge 10 commits into
base: master

hennevogel
Collaborator

Rebase of #241

This InfluxDB client does not use persistent connections, which causes
a significant performance loss.  The benchmark script at the end of this
description writes one point per request to InfluxDB.  Running it for
1000 points takes about 44.6 seconds:

    $ ruby -Ilib t.rb
    Writing 1000 points
                       user     system      total        real
    write_points  19.270388   1.394931  20.665319 ( 44.622170)

This works out to a write speed of about 22 points per second, or about
45ms per point.

This is:
* Pretty slow.  If you are generating points quickly, a maximum
  resolution of about 45ms is not great.  A user may lose data because
  they cannot sample quickly enough.
* Pretty inefficient.  For each point written, the client library must
  establish a TCP connection, establish a TLS session, and only then
  write some data.  Any data written may also be throttled by the TCP
  slow-start algorithm.

  Much more Ruby code must also be run: almost half of the elapsed time
  is spent in the "user" category, time that could be spent doing
  anything else in a Ruby application or used by other processes.

This commit caches HTTP connections across requests.  Running the same
benchmark takes 4.26 seconds:

    $ ruby -Ilib t.rb
    Writing 1000 points
                       user     system      total        real
    write_points   0.551663   0.084603   0.636266 (  4.261201)

This works out to a speed of about 234 points per second, or about 4.3ms
per point.  Writing a point no longer needs to recreate a TCP
connection, renegotiate a TLS session, or be held up by TCP window
sizing limitations.
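
The idea of caching a connection across requests can be sketched with
Ruby's standard Net::HTTP.  This is a minimal illustration, not the code
in this pull request: the `ConnectionCache` class, its method names, and
the 30 second keep-alive value are all assumptions made for the example.
The key detail is that `Net::HTTP#start` called without a block leaves
the socket open, whereas calling `request` on an unstarted object opens
and closes a connection around that single request.

```ruby
require "net/http"
require "uri"

# Hypothetical helper for illustration only; not the gem's implementation.
class ConnectionCache
  def initialize
    @connections = {}
  end

  # Returns a started Net::HTTP object for the URI's scheme, host and port,
  # creating it on first use and reusing it on later calls.
  def connection_for(uri)
    key = [uri.scheme, uri.host, uri.port]
    http = (@connections[key] ||= build(uri))
    # start without a block keeps the TCP (and TLS) session open
    http.start unless http.started?
    http
  end

  # Sends a POST over the cached connection instead of opening a new one.
  def post(uri, body)
    request = Net::HTTP::Post.new(uri)
    request.body = body
    connection_for(uri).request(request)
  end

  # Closes every cached connection, for example at shutdown.
  def close_all
    @connections.each_value { |http| http.finish if http.started? }
    @connections.clear
  end

  private

  def build(uri)
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = uri.scheme == "https"
    # Assumed value: how long an idle connection may be reused
    http.keep_alive_timeout = 30
    http
  end
end
```

With a cache like this, repeated calls to `post` for the same host pay
the TCP and TLS setup cost only once, which is the effect the second
benchmark run measures.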

This is also much more efficient in terms of CPU time used per point.
Instead of ~43% of the elapsed time falling in the "user" category, now
only about 13% does.  The balance of the time is now spent waiting for
IO to complete.
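
The throughput and CPU-share figures can be rechecked from the two
Benchmark.bm outputs.  The short script below is only a worked
calculation; the `RUNS` table and `summarize` helper are names invented
for this example, and the numbers are copied from the runs quoted in
this description.  It reports the "user" share and the combined
user-plus-system share separately, since the two are easy to conflate.

```ruby
# Figures copied from the two benchmark runs quoted in this description.
RUNS = {
  "before" => { user: 19.270388, system: 1.394931, real: 44.622170 },
  "after"  => { user: 0.551663,  system: 0.084603, real: 4.261201 },
}.freeze

POINTS = 1_000

# Derives throughput, latency per point, and CPU shares of elapsed time.
def summarize(run, points)
  {
    points_per_second: points / run[:real],
    ms_per_point: run[:real] * 1000.0 / points,
    user_share: run[:user] / run[:real],
    cpu_share: (run[:user] + run[:system]) / run[:real],
  }
end

RUNS.each do |label, run|
  s = summarize(run, POINTS)
  puts format(
    "%-6s %6.1f points/s  %5.1f ms/point  user %4.1f%%  user+system %4.1f%%",
    label,
    s[:points_per_second],
    s[:ms_per_point],
    s[:user_share] * 100,
    s[:cpu_share] * 100
  )
end
```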

Benchmark script:

    require "benchmark"
    require "influxdb"

    n = Integer ENV["N"] rescue 1_000

    influxdb =
      InfluxDB::Client.new url: ENV["INFLUX_URL"],
                           username: ENV["INFLUX_USER"],
                           password: ENV["INFLUX_PASS"],
                           time_precision: "u"

    def write_point counter, influxdb
      points = [
        {
          series: "test",
          values: {
            counter: counter,
          },
        },
      ]

      influxdb.write_points points
    end

    puts "Writing #{n} points"

    Benchmark.bm 12 do |bm|
      bm.report "write_points" do
        n.times do |i|
          write_point i, influxdb
        end
      end
    end

Previously, setting the log level to Logger::DEBUG had no effect
because the log level of the Logger object itself was not changed.

If you set:

    InfluxDB::Logging.log_level = Logger::DEBUG

InfluxDB::Logging::log? would return true, allowing the log statement to
proceed, but #log would not do anything because the Logger object was
still at its default from initialization, Logger::INFO.

By setting the log level directly on the Logger object and removing
::log?, we let the Logger object decide whether a message at a given
level should be logged.
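
The approach of forwarding the level to the Logger can be sketched as
follows.  `SketchLogging` is a hypothetical module written for this
example, not the gem's InfluxDB::Logging; it only relies on the standard
library's `Logger#level=` and `Logger#add`.

```ruby
require "logger"
require "stringio"

# Hypothetical module for illustration; the gem's own module differs.
module SketchLogging
  class << self
    attr_writer :logger

    # Lazily builds a logger that starts at INFO, as described above.
    def logger
      @logger ||= Logger.new($stdout).tap { |l| l.level = Logger::INFO }
    end

    # Forward the level to the Logger so that it does the filtering.
    def log_level=(level)
      logger.level = level
    end

    # No separate log? check: Logger#add drops messages below its level.
    def log(level, message)
      logger.add(level, message)
    end
  end
end

# Usage: debug messages appear only after the level is lowered.
io = StringIO.new
SketchLogging.logger = Logger.new(io)
SketchLogging.log_level = Logger::INFO
SketchLogging.log(Logger::DEBUG, "dropped at INFO")
SketchLogging.log_level = Logger::DEBUG
SketchLogging.log(Logger::DEBUG, "written at DEBUG")
```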

This allows debug-level log messages to be displayed.

With the addition of persistent connections, we need to make a separate
InfluxDB::Client per worker, as the persistent connection is attached to
the client.

This is done by copying the InfluxDB::Config object, creating a new host
queue (so threads don't have to communicate to share hosts), and
creating a new client without async enabled (so it will use the HTTP or
UDP write methods).
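
The per-worker arrangement can be sketched with stand-in classes.
Everything named here (`SketchConfig`, `SketchClient`, `start_workers`,
the `:stop` sentinel) is hypothetical and chosen for the example; the
gem's real InfluxDB::Config and InfluxDB::Client have different
interfaces.  The sketch only shows the shape: each thread copies the
config, gets its own host list, disables async, and owns its client.

```ruby
# Hypothetical stand-ins; not the influxdb gem's real classes.
SketchConfig = Struct.new(:hosts, :async)

class SketchClient
  attr_reader :config, :written

  def initialize(config)
    @config = config
    @written = []
  end

  # A real client would send the point over its own persistent connection.
  def write(point)
    @written << point
  end
end

# Starts `count` worker threads that drain `queue` until they see :stop.
def start_workers(config, queue, count)
  count.times.map do
    Thread.new do
      worker_config = config.dup
      worker_config.hosts = config.hosts.dup # own host list, no sharing
      worker_config.async = false            # write directly, not via async
      client = SketchClient.new(worker_config)

      while (point = queue.pop) != :stop
        client.write(point)
      end

      client # returned as the thread's value for inspection
    end
  end
end
```

Because no client is shared between threads in this arrangement, a
connection attached to one client is only ever used by one worker.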

hennevogel force-pushed the feature/persistent-connections branch from 20a444d to 47b8674 on February 6, 2021 18:39

Let's not split this central method.

hennevogel force-pushed the feature/persistent-connections branch from 47b8674 to c3f1aa6 on February 6, 2021 18:41