Benchmarking performance #247

michaellenaghan · 2025-03-15T00:10:35Z

michaellenaghan
Mar 15, 2025

I've created a benchmark for Go SQLite drivers.

At the moment, it includes ncruces, modernc, mattn, and tailscale.

For ncruces it benchmarks both "direct" and "driver" performance.

(All of the "driver" implementations use exactly the same code for everything other than opening the database.)

I hope to push the repo next week, but in the meantime here are some results.

I've broken them into sections:

Baseline
Populate
ReadWrite
Query

"Baseline" measures the performance of a "SELECT 1" query.

"Populate" measures populating a database of posts and comments with no transactions; one transaction for the entire population; and a transaction per post + comments.

"ReadWrite" measures various combinations of reading and writing posts and comments, with and without transactions.

"Query" measures various complicated queries.

TL;DR

ncruces does very, very well on all of the "normal" database stuff
ncruces sometimes struggles on "complicated queries"
other drivers sometimes struggle on "complicated queries" too
the "driver" version of ncruces typically uses a lot more memory than the "direct" version of ncruces

Baseline

goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                   │ gsb_ncruces_direct_bench.txt │    gsb_ncruces_driver_bench.txt     │     gsb_modernc_driver_bench.txt     │     gsb_mattn_driver_bench.txt     │   gsb_tailscale_driver_bench.txt   │
                   │            sec/op            │    sec/op     vs base               │    sec/op     vs base                │   sec/op     vs base               │   sec/op     vs base               │
Baseline/Select1                      1.980µ ± 5%    1.946µ ± 1%        ~ (p=0.190 n=6)    1.497µ ± 1%   -24.44% (p=0.002 n=6)   1.198µ ± 0%  -39.54% (p=0.002 n=6)   1.117µ ± 1%  -43.63% (p=0.002 n=6)
Baseline/Select1-2                   1092.5n ± 3%   1014.0n ± 1%   -7.19% (p=0.002 n=6)   1072.0n ± 1%         ~ (p=0.327 n=6)   736.4n ± 1%  -32.60% (p=0.002 n=6)   643.2n ± 0%  -41.12% (p=0.002 n=6)
Baseline/Select1-4                    645.9n ± 5%   1041.5n ± 2%  +61.26% (p=0.002 n=6)    989.7n ± 1%   +53.24% (p=0.002 n=6)   711.1n ± 1%  +10.11% (p=0.002 n=6)   635.9n ± 1%        ~ (p=0.310 n=6)
Baseline/Select1-8                    1.174µ ± 0%    1.279µ ± 1%   +8.95% (p=0.002 n=6)    3.337µ ± 1%  +184.32% (p=0.002 n=6)   1.276µ ± 1%   +8.73% (p=0.002 n=6)   1.101µ ± 2%   -6.18% (p=0.002 n=6)
geomean                               1.132µ         1.273µ       +12.51%                  1.517µ        +34.06%                 945.8n       -16.42%                 842.1n       -25.59%

                   │ gsb_ncruces_direct_bench.txt │    gsb_ncruces_driver_bench.txt     │    gsb_modernc_driver_bench.txt    │     gsb_mattn_driver_bench.txt     │   gsb_tailscale_driver_bench.txt    │
                   │             B/op             │    B/op      vs base                │    B/op      vs base               │    B/op      vs base               │    B/op      vs base                │
Baseline/Select1                       80.00 ± 0%    32.00 ± 0%   -60.00% (p=0.002 n=6)    68.00 ± 0%  -15.00% (p=0.002 n=6)   128.00 ± 0%  +60.00% (p=0.002 n=6)   240.00 ± 0%  +200.00% (p=0.002 n=6)
Baseline/Select1-2                     84.00 ± 5%    32.00 ± 0%   -61.90% (p=0.002 n=6)    68.00 ± 0%  -19.05% (p=0.002 n=6)   128.00 ± 0%  +52.38% (p=0.002 n=6)   240.00 ± 0%  +185.71% (p=0.002 n=6)
Baseline/Select1-4                     92.00 ± 1%   375.00 ± 1%  +307.61% (p=0.002 n=6)   134.00 ± 1%  +45.65% (p=0.002 n=6)   150.00 ± 0%  +63.04% (p=0.002 n=6)   254.00 ± 0%  +176.09% (p=0.002 n=6)
Baseline/Select1-8                     111.0 ± 0%   1054.5 ± 1%  +850.00% (p=0.002 n=6)    196.5 ± 1%  +77.03% (p=0.002 n=6)    155.0 ± 0%  +39.64% (p=0.002 n=6)    273.0 ± 0%  +145.95% (p=0.002 n=6)
geomean                                91.02         141.9        +55.86%                  105.0       +15.41%                  139.7       +53.49%                  251.4       +176.21%

                   │ gsb_ncruces_direct_bench.txt │     gsb_ncruces_driver_bench.txt     │    gsb_modernc_driver_bench.txt    │     gsb_mattn_driver_bench.txt     │   gsb_tailscale_driver_bench.txt   │
                   │          allocs/op           │ allocs/op   vs base                  │ allocs/op   vs base                │ allocs/op   vs base                │ allocs/op   vs base                │
Baseline/Select1                       2.000 ± 0%   1.000 ± 0%   -50.00% (p=0.002 n=6)     4.000 ± 0%  +100.00% (p=0.002 n=6)   7.000 ± 0%  +250.00% (p=0.002 n=6)   5.000 ± 0%  +150.00% (p=0.002 n=6)
Baseline/Select1-2                     2.000 ± 0%   1.000 ± 0%   -50.00% (p=0.002 n=6)     4.000 ± 0%  +100.00% (p=0.002 n=6)   7.000 ± 0%  +250.00% (p=0.002 n=6)   5.000 ± 0%  +150.00% (p=0.002 n=6)
Baseline/Select1-4                     1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=6) ¹   5.000 ± 0%  +400.00% (p=0.002 n=6)   7.000 ± 0%  +600.00% (p=0.002 n=6)   5.000 ± 0%  +400.00% (p=0.002 n=6)
Baseline/Select1-8                     1.000 ± 0%   3.000 ± 0%  +200.00% (p=0.002 n=6)     6.000 ± 0%  +500.00% (p=0.002 n=6)   7.000 ± 0%  +600.00% (p=0.002 n=6)   5.000 ± 0%  +400.00% (p=0.002 n=6)
geomean                                1.414        1.316         -6.94%                   4.681       +230.98%                 7.000       +394.97%                 5.000       +253.55%
¹ all samples are equal

Populate

goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                             │ gsb_ncruces_direct_bench.txt │   gsb_ncruces_driver_bench.txt    │   gsb_modernc_driver_bench.txt    │     gsb_mattn_driver_bench.txt     │   gsb_tailscale_driver_bench.txt   │
                             │            sec/op            │   sec/op     vs base              │   sec/op    vs base               │   sec/op     vs base               │   sec/op     vs base               │
Populate/PopulateDB-4                           5.042 ±  5%   4.942 ±  5%  -1.98% (p=0.041 n=6)   5.979 ± 2%  +18.58% (p=0.002 n=6)   5.029 ±  6%        ~ (p=0.818 n=6)   4.893 ±  4%   -2.96% (p=0.041 n=6)
Populate/PopulateDBWithTx-4                     6.566 ± 12%   6.258 ±  8%  -4.68% (p=0.009 n=6)   7.096 ± 3%        ~ (p=0.065 n=6)   5.779 ± 22%  -11.99% (p=0.041 n=6)   5.667 ±  7%  -13.69% (p=0.002 n=6)
Populate/PopulateDBWithTxs-4                    5.119 ± 11%   4.710 ± 31%       ~ (p=0.310 n=6)   5.453 ± 4%   +6.53% (p=0.041 n=6)   4.626 ± 23%        ~ (p=0.310 n=6)   4.355 ± 13%  -14.91% (p=0.026 n=6)
geomean                                         5.534         5.262        -4.92%                 6.139       +10.93%                 5.123         -7.42%                 4.943        -10.68%

                             │ gsb_ncruces_direct_bench.txt │      gsb_ncruces_driver_bench.txt      │      gsb_modernc_driver_bench.txt       │         gsb_mattn_driver_bench.txt          │    gsb_tailscale_driver_bench.txt     │
                             │             B/op             │      B/op       vs base                │      B/op       vs base                 │       B/op        vs base                   │     B/op       vs base                │
Populate/PopulateDB-4                        5.557Mi ±   1%   12.466Mi ±  0%  +124.33% (p=0.002 n=6)    10.720Mi ± 1%    +92.91% (p=0.002 n=6)    2043.990Mi ± 0%   +36683.12% (p=0.002 n=6)    7.605Mi ± 1%   +36.86% (p=0.002 n=6)
Populate/PopulateDBWithTx-4                  715.4Ki ± 192%   7789.7Ki ± 18%  +988.91% (p=0.002 n=6)   10975.9Ki ± 0%  +1434.30% (p=0.002 n=6)   2093051.8Ki ± 0%  +292484.26% (p=0.002 n=6)   7788.1Ki ± 0%  +988.69% (p=0.002 n=6)
Populate/PopulateDBWithTxs-4                 979.2Ki ±   0%   9222.6Ki ±  1%  +841.88% (p=0.002 n=6)   11965.0Ki ± 1%  +1121.95% (p=0.002 n=6)   2094523.7Ki ± 0%  +213807.67% (p=0.002 n=6)   9180.2Ki ± 1%  +837.55% (p=0.002 n=6)
geomean                                      1.548Mi           9.488Mi        +512.76%                   11.03Mi        +612.48%                     1.997Gi       +131940.95%                  8.034Mi       +418.87%

                             │ gsb_ncruces_direct_bench.txt │     gsb_ncruces_driver_bench.txt     │     gsb_modernc_driver_bench.txt      │      gsb_mattn_driver_bench.txt      │    gsb_tailscale_driver_bench.txt    │
                             │          allocs/op           │  allocs/op    vs base                │  allocs/op    vs base                 │  allocs/op    vs base                │  allocs/op    vs base                │
Populate/PopulateDB-4                           204.6k ± 0%    411.6k ± 2%  +101.21% (p=0.002 n=6)    562.5k ± 1%   +174.95% (p=0.002 n=6)    411.0k ± 2%  +100.91% (p=0.002 n=6)    252.4k ± 3%   +23.40% (p=0.002 n=6)
Populate/PopulateDBWithTx-4                     45.41k ± 1%   252.45k ± 0%  +455.93% (p=0.002 n=6)   562.45k ± 0%  +1138.59% (p=0.002 n=6)   411.01k ± 0%  +805.09% (p=0.002 n=6)   252.44k ± 0%  +455.90% (p=0.002 n=6)
Populate/PopulateDBWithTxs-4                    53.90k ± 0%   285.96k ± 0%  +430.57% (p=0.002 n=6)   583.48k ± 0%   +982.57% (p=0.002 n=6)   449.34k ± 0%  +733.69% (p=0.002 n=6)   281.41k ± 0%  +422.12% (p=0.002 n=6)
geomean                                         79.41k         309.7k       +290.06%                  569.4k        +617.05%                  423.4k       +433.21%                  261.7k       +229.63%

ReadWrite

goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                                                           │ gsb_ncruces_direct_bench.txt │    gsb_ncruces_driver_bench.txt     │     gsb_modernc_driver_bench.txt      │      gsb_mattn_driver_bench.txt      │    gsb_tailscale_driver_bench.txt    │
                                                           │            sec/op            │    sec/op     vs base               │    sec/op      vs base                │    sec/op      vs base               │    sec/op      vs base               │
ReadWrite/ReadPost                                                        22.88µ ±     2%   29.55µ ±  2%  +29.13% (p=0.002 n=6)    54.11µ ± 32%  +136.45% (p=0.002 n=6)    21.88µ ±  1%   -4.41% (p=0.002 n=6)   19.97µ ±   1%  -12.76% (p=0.002 n=6)
ReadWrite/ReadPost-2                                                      15.72µ ±     1%   17.99µ ±  0%  +14.45% (p=0.002 n=6)    32.58µ ±  9%  +107.28% (p=0.002 n=6)    14.08µ ±  2%  -10.43% (p=0.002 n=6)   14.24µ ±   1%   -9.42% (p=0.002 n=6)
ReadWrite/ReadPost-4                                                      11.80µ ±     1%   13.40µ ±  9%  +13.55% (p=0.002 n=6)    22.96µ ± 10%   +94.62% (p=0.002 n=6)    13.33µ ±  0%  +13.02% (p=0.002 n=6)   13.42µ ±   0%  +13.77% (p=0.002 n=6)
ReadWrite/ReadPost-8                                                      21.76µ ±     1%   21.77µ ±  1%        ~ (p=0.394 n=6)    30.75µ ±  1%   +41.32% (p=0.002 n=6)    22.49µ ±  1%   +3.37% (p=0.002 n=6)   22.27µ ±   0%   +2.34% (p=0.002 n=6)
ReadWrite/ReadPostWithTx                                                  25.65µ ±     1%   31.63µ ±  2%  +23.30% (p=0.002 n=6)    39.56µ ±  2%   +54.20% (p=0.002 n=6)    22.10µ ±  2%  -13.85% (p=0.002 n=6)   33.87µ ±   3%  +32.04% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-2                                                16.39µ ±     1%   18.90µ ±  2%  +15.36% (p=0.002 n=6)    30.85µ ±  4%   +88.27% (p=0.002 n=6)    19.35µ ±  1%  +18.12% (p=0.002 n=6)   19.93µ ±   2%  +21.66% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-4                                                11.66µ ±     2%   13.41µ ± 14%  +15.00% (p=0.002 n=6)    21.74µ ±  2%   +86.40% (p=0.002 n=6)    12.70µ ±  1%   +8.92% (p=0.002 n=6)   15.77µ ±   1%  +35.19% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-8                                                21.89µ ±     0%   21.06µ ±  1%   -3.82% (p=0.002 n=6)    33.76µ ±  1%   +54.22% (p=0.002 n=6)    24.42µ ±  1%  +11.55% (p=0.002 n=6)   29.43µ ±   1%  +34.44% (p=0.002 n=6)
ReadWrite/ReadPostAndComments                                             518.4µ ±     0%   559.2µ ±  0%   +7.87% (p=0.002 n=6)   1243.0µ ±  5%  +139.78% (p=0.002 n=6)    516.8µ ±  1%        ~ (p=0.394 n=6)   498.1µ ±   4%   -3.92% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-2                                           342.7µ ±     0%   382.4µ ±  0%  +11.57% (p=0.002 n=6)    608.4µ ±  1%   +77.52% (p=0.002 n=6)    387.0µ ±  0%  +12.92% (p=0.002 n=6)   354.6µ ±   2%   +3.48% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-4                                           228.8µ ±     9%   237.0µ ±  5%   +3.60% (p=0.041 n=6)    332.8µ ±  1%   +45.48% (p=0.002 n=6)    248.4µ ±  3%   +8.55% (p=0.026 n=6)   258.2µ ±   1%  +12.87% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-8                                           395.2µ ±     1%   383.0µ ±  1%   -3.08% (p=0.002 n=6)    456.1µ ±  3%   +15.39% (p=0.002 n=6)    397.3µ ±  1%        ~ (p=0.132 n=6)   389.5µ ±   1%   -1.44% (p=0.004 n=6)
ReadWrite/ReadPostAndCommentsWithTx                                       517.9µ ±     1%   499.2µ ±  1%   -3.62% (p=0.002 n=6)    997.7µ ±  3%   +92.63% (p=0.002 n=6)    412.9µ ±  2%  -20.28% (p=0.002 n=6)   525.7µ ±   4%   +1.49% (p=0.004 n=6)
ReadWrite/ReadPostAndCommentsWithTx-2                                     342.3µ ±     0%   386.0µ ±  0%  +12.78% (p=0.002 n=6)    613.6µ ±  1%   +79.26% (p=0.002 n=6)    390.6µ ±  0%  +14.11% (p=0.002 n=6)   382.7µ ±   4%  +11.81% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx-4                                     226.5µ ±     1%   241.0µ ±  6%   +6.40% (p=0.002 n=6)    338.3µ ±  1%   +49.36% (p=0.002 n=6)    251.1µ ±  2%  +10.88% (p=0.002 n=6)   299.2µ ±  13%  +32.10% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx-8                                     395.0µ ±     1%   388.7µ ±  1%   -1.61% (p=0.002 n=6)    466.0µ ±  2%   +17.97% (p=0.002 n=6)    405.4µ ±  6%   +2.62% (p=0.002 n=6)   391.8µ ±   0%   -0.81% (p=0.004 n=6)
ReadWrite/WritePost                                                       213.1µ ±     3%   231.4µ ±  5%   +8.60% (p=0.002 n=6)    233.2µ ± 13%    +9.45% (p=0.002 n=6)    220.0µ ±  2%   +3.23% (p=0.015 n=6)   215.2µ ±   6%        ~ (p=0.589 n=6)
ReadWrite/WritePost-2                                                     199.0µ ±    11%   188.2µ ± 14%   -5.43% (p=0.041 n=6)    221.8µ ± 16%   +11.42% (p=0.015 n=6)    212.0µ ±  7%        ~ (p=0.065 n=6)   200.5µ ±   1%        ~ (p=0.818 n=6)
ReadWrite/WritePost-4                                                     199.1µ ±     4%   199.1µ ± 18%        ~ (p=0.589 n=6)    222.9µ ± 14%   +11.96% (p=0.002 n=6)    215.0µ ± 20%   +7.99% (p=0.002 n=6)   200.9µ ±   8%        ~ (p=0.485 n=6)
ReadWrite/WritePost-8                                                     199.8µ ±     3%   212.1µ ± 23%   +6.17% (p=0.004 n=6)    252.9µ ± 11%   +26.57% (p=0.002 n=6)    223.5µ ± 16%  +11.89% (p=0.002 n=6)   204.4µ ±   3%        ~ (p=0.240 n=6)
ReadWrite/WritePostWithTx                                                 223.9µ ±     1%   233.3µ ±  5%   +4.19% (p=0.015 n=6)    239.3µ ± 20%    +6.89% (p=0.002 n=6)    223.0µ ± 20%        ~ (p=0.589 n=6)   204.7µ ±   2%   -8.60% (p=0.002 n=6)
ReadWrite/WritePostWithTx-2                                               199.1µ ±     1%   194.5µ ±  2%   -2.32% (p=0.026 n=6)    234.3µ ±  1%   +17.63% (p=0.002 n=6)    218.8µ ±  9%   +9.85% (p=0.002 n=6)   203.0µ ±   9%   +1.94% (p=0.002 n=6)
ReadWrite/WritePostWithTx-4                                               205.0µ ±    15%   201.7µ ±  1%   -1.60% (p=0.009 n=6)    248.6µ ± 16%   +21.25% (p=0.004 n=6)    229.7µ ±  4%        ~ (p=0.065 n=6)   218.8µ ±   7%        ~ (p=0.180 n=6)
ReadWrite/WritePostWithTx-8                                               207.5µ ±     2%   216.9µ ±  4%   +4.50% (p=0.009 n=6)    326.4µ ± 27%   +57.29% (p=0.002 n=6)    227.7µ ± 11%   +9.73% (p=0.002 n=6)   236.2µ ±   8%  +13.83% (p=0.002 n=6)
ReadWrite/WritePostAndComments                                            5.004m ±     1%   5.133m ±  8%        ~ (p=0.394 n=6)    5.885m ±  2%   +17.61% (p=0.002 n=6)    5.565m ±  3%  +11.23% (p=0.002 n=6)   5.211m ±  10%   +4.15% (p=0.002 n=6)
ReadWrite/WritePostAndComments-2                                          4.217m ±     3%   4.339m ±  2%   +2.89% (p=0.009 n=6)    5.752m ± 12%   +36.41% (p=0.002 n=6)    5.427m ±  5%  +28.71% (p=0.002 n=6)   5.188m ±   3%  +23.04% (p=0.002 n=6)
ReadWrite/WritePostAndComments-4                                          4.416m ±     3%   4.355m ±  3%        ~ (p=0.310 n=6)    5.688m ±  5%   +28.81% (p=0.002 n=6)    5.244m ±  5%  +18.76% (p=0.002 n=6)   5.042m ±   5%  +14.18% (p=0.002 n=6)
ReadWrite/WritePostAndComments-8                                          4.633m ±     2%   4.780m ±  2%   +3.18% (p=0.026 n=6)    5.721m ± 25%   +23.48% (p=0.002 n=6)    5.709m ±  9%  +23.23% (p=0.002 n=6)   5.465m ±  30%  +17.97% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx                                      4.347m ±     1%   4.192m ±  3%   -3.57% (p=0.004 n=6)    5.230m ±  2%   +20.29% (p=0.002 n=6)    4.685m ±  5%   +7.77% (p=0.002 n=6)   4.499m ±   8%   +3.49% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-2                                    3.257m ±    15%   3.169m ±  3%        ~ (p=0.485 n=6)    4.460m ± 14%   +36.95% (p=0.002 n=6)    3.897m ± 13%  +19.67% (p=0.002 n=6)   4.049m ± 243%  +24.32% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-4                                    3.302m ±     6%   3.134m ±  4%   -5.08% (p=0.009 n=6)    4.508m ±  8%   +36.51% (p=0.002 n=6)    4.292m ±  5%  +29.98% (p=0.002 n=6)   4.192m ± 188%  +26.94% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-8                                    3.507m ±     8%   3.305m ±  1%   -5.77% (p=0.002 n=6)    6.780m ± 24%   +93.32% (p=0.002 n=6)    5.221m ± 24%  +48.88% (p=0.002 n=6)   4.515m ±  78%  +28.75% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10                        1.082m ±     5%   1.032m ±  9%        ~ (p=0.485 n=6)    1.776m ±  7%   +64.15% (p=0.002 n=6)    1.041m ±  7%        ~ (p=0.310 n=6)   1.097m ±   9%        ~ (p=0.485 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-2                      737.7µ ±     7%   780.5µ ±  9%        ~ (p=0.240 n=6)   1045.1µ ±  8%   +41.66% (p=0.002 n=6)    765.0µ ± 11%        ~ (p=0.240 n=6)   697.3µ ±   6%   -5.48% (p=0.026 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-4                      571.3µ ±     8%   592.5µ ± 11%        ~ (p=0.589 n=6)    831.8µ ±  8%   +45.58% (p=0.002 n=6)    733.6µ ±  7%  +28.40% (p=0.002 n=6)   767.2µ ±   8%  +34.28% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-8                      645.5µ ±     6%   648.3µ ±  5%        ~ (p=0.699 n=6)    887.2µ ±  7%   +37.44% (p=0.002 n=6)    760.7µ ± 14%  +17.84% (p=0.002 n=6)   723.7µ ±  85%  +12.10% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90                        4.624m ±     4%   4.636m ±  2%        ~ (p=0.818 n=6)    5.745m ±  3%   +24.23% (p=0.002 n=6)    5.106m ±  5%  +10.42% (p=0.002 n=6)   4.790m ±  13%   +3.57% (p=0.026 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-2                      3.941m ±     4%   3.970m ±  4%        ~ (p=0.485 n=6)    5.312m ±  3%   +34.80% (p=0.002 n=6)    5.018m ±  8%  +27.33% (p=0.002 n=6)   5.149m ±   7%  +30.66% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-4                      3.987m ±     3%   4.420m ± 15%  +10.84% (p=0.002 n=6)    5.262m ±  4%   +31.97% (p=0.002 n=6)    5.195m ±  8%  +30.29% (p=0.002 n=6)   5.229m ±  52%  +31.14% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-8                      4.229m ±     4%   4.319m ±  6%   +2.12% (p=0.004 n=6)    5.348m ± 12%   +26.46% (p=0.002 n=6)    5.151m ± 16%  +21.81% (p=0.002 n=6)   6.043m ±  99%  +42.91% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10                  963.4µ ±     7%   978.2µ ±  4%        ~ (p=0.699 n=6)   1621.4µ ±  7%   +68.30% (p=0.002 n=6)   1017.5µ ± 15%        ~ (p=0.180 n=6)   958.0µ ±  13%        ~ (p=0.937 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-2                605.9µ ±     7%   636.2µ ±  5%        ~ (p=0.093 n=6)    967.2µ ±  5%   +59.64% (p=0.002 n=6)    684.6µ ±  7%  +12.99% (p=0.002 n=6)   619.3µ ±   8%        ~ (p=0.394 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-4                487.8µ ±    18%   441.6µ ±  8%        ~ (p=0.240 n=6)    641.6µ ±  8%   +31.55% (p=0.002 n=6)    573.4µ ±  7%  +17.55% (p=0.015 n=6)   639.8µ ±  16%  +31.17% (p=0.009 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-8                568.5µ ±     6%   534.7µ ± 14%        ~ (p=0.310 n=6)    733.4µ ±  3%   +29.01% (p=0.002 n=6)    723.9µ ± 45%  +27.34% (p=0.002 n=6)   641.6µ ±  64%  +12.86% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90                  4.420m ± 26125%   4.035m ± 14%        ~ (p=0.093 n=6)    5.053m ±  2%         ~ (p=0.065 n=6)    4.634m ±  5%        ~ (p=0.093 n=6)   4.296m ±  11%        ~ (p=0.394 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-2                2.974m ±     6%   2.909m ± 19%        ~ (p=0.240 n=6)    4.004m ±  6%   +34.63% (p=0.002 n=6)    3.622m ±  5%  +21.80% (p=0.002 n=6)   4.212m ±  53%  +41.64% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-4                2.885m ±     2%   2.902m ±  4%        ~ (p=0.937 n=6)    3.999m ± 17%   +38.63% (p=0.002 n=6)    4.079m ±  8%  +41.40% (p=0.002 n=6)   4.162m ± 140%  +44.28% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-8                3.048m ±     5%   3.067m ± 20%        ~ (p=0.699 n=6)    4.801m ± 47%   +57.51% (p=0.002 n=6)    4.357m ± 18%  +42.96% (p=0.002 n=6)   4.756m ±  62%  +56.03% (p=0.002 n=6)
geomean                                                                   486.6µ            500.0µ         +2.75%                  702.4µ         +44.35%                  544.9µ        +11.98%                 549.0µ         +12.81%

                                                           │ gsb_ncruces_direct_bench.txt │      gsb_ncruces_driver_bench.txt       │      gsb_modernc_driver_bench.txt       │         gsb_mattn_driver_bench.txt         │     gsb_tailscale_driver_bench.txt     │
                                                           │             B/op             │      B/op       vs base                 │      B/op       vs base                 │      B/op        vs base                   │     B/op       vs base                 │
ReadWrite/ReadPost                                                         80.12Ki ±   0%    80.80Ki ±  0%     +0.85% (p=0.002 n=6)   160.62Ki ±  0%   +100.49% (p=0.002 n=6)      80.74Ki ± 0%       +0.78% (p=0.002 n=6)   80.88Ki ±  0%     +0.95% (p=0.002 n=6)
ReadWrite/ReadPost-2                                                       80.12Ki ±   0%    80.80Ki ±  0%     +0.85% (p=0.002 n=6)   160.63Ki ±  0%   +100.49% (p=0.002 n=6)      80.74Ki ± 0%       +0.78% (p=0.002 n=6)   80.88Ki ±  0%     +0.95% (p=0.002 n=6)
ReadWrite/ReadPost-4                                                       80.12Ki ±   0%    81.15Ki ±  0%     +1.29% (p=0.002 n=6)   160.64Ki ±  0%   +100.50% (p=0.002 n=6)      80.75Ki ± 0%       +0.78% (p=0.002 n=6)   80.88Ki ±  0%     +0.95% (p=0.002 n=6)
ReadWrite/ReadPost-8                                                       80.12Ki ±   0%    82.05Ki ±  0%     +2.41% (p=0.002 n=6)   160.67Ki ±  0%   +100.54% (p=0.002 n=6)      80.81Ki ± 0%       +0.86% (p=0.002 n=6)   80.90Ki ±  0%     +0.98% (p=0.002 n=6)
ReadWrite/ReadPostWithTx                                                   80.15Ki ±   0%    81.52Ki ±  0%     +1.71% (p=0.002 n=6)   161.17Ki ±  0%   +101.09% (p=0.002 n=6)      81.48Ki ± 0%       +1.66% (p=0.002 n=6)   81.42Ki ±  0%     +1.59% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-2                                                 80.15Ki ±   0%    81.52Ki ±  0%     +1.72% (p=0.002 n=6)   161.18Ki ±  0%   +101.11% (p=0.002 n=6)      81.48Ki ± 0%       +1.67% (p=0.002 n=6)   81.43Ki ±  0%     +1.60% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-4                                                 80.15Ki ±   0%    81.73Ki ±  0%     +1.97% (p=0.002 n=6)   161.20Ki ±  0%   +101.12% (p=0.002 n=6)      81.49Ki ± 0%       +1.68% (p=0.002 n=6)   81.45Ki ±  0%     +1.63% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-8                                                 80.12Ki ±   0%    82.48Ki ±  0%     +2.95% (p=0.002 n=6)   161.23Ki ±  0%   +101.24% (p=0.002 n=6)      81.57Ki ± 0%       +1.81% (p=0.002 n=6)   81.53Ki ±  0%     +1.76% (p=0.002 n=6)
ReadWrite/ReadPostAndComments                                              2.033Mi ±   0%    2.037Mi ±  0%     +0.17% (p=0.002 n=6)    4.069Mi ±  0%   +100.10% (p=0.002 n=6)      2.037Mi ± 0%       +0.16% (p=0.002 n=6)   2.037Mi ±  0%     +0.19% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-2                                            2.033Mi ±   0%    2.037Mi ±  0%     +0.17% (p=0.002 n=6)    4.069Mi ±  0%   +100.10% (p=0.002 n=6)      2.037Mi ± 0%       +0.16% (p=0.002 n=6)   2.037Mi ±  0%     +0.19% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-4                                            2.033Mi ±   0%    2.038Mi ±  0%     +0.22% (p=0.002 n=6)    4.069Mi ±  0%   +100.10% (p=0.002 n=6)      2.037Mi ± 0%       +0.16% (p=0.002 n=6)   2.037Mi ±  0%     +0.19% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-8                                            2.033Mi ±   0%    2.039Mi ±  0%     +0.30% (p=0.002 n=6)    4.069Mi ±  0%   +100.11% (p=0.002 n=6)      2.037Mi ± 0%       +0.16% (p=0.002 n=6)   2.037Mi ±  0%     +0.19% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx                                        2.033Mi ±   0%    2.038Mi ±  0%     +0.21% (p=0.002 n=6)    4.070Mi ±  0%   +100.14% (p=0.002 n=6)      2.038Mi ± 0%       +0.21% (p=0.002 n=6)   2.038Mi ±  0%     +0.23% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx-2                                      2.033Mi ±   0%    2.038Mi ±  0%     +0.21% (p=0.002 n=6)    4.070Mi ±  0%   +100.15% (p=0.002 n=6)      2.038Mi ± 0%       +0.21% (p=0.002 n=6)   2.038Mi ±  0%     +0.23% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx-4                                      2.033Mi ±   0%    2.038Mi ±  0%     +0.22% (p=0.002 n=6)    4.070Mi ±  0%   +100.15% (p=0.002 n=6)      2.038Mi ± 0%       +0.21% (p=0.002 n=6)   2.038Mi ±  0%     +0.24% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx-8                                      2.033Mi ±   0%    2.039Mi ±  0%     +0.25% (p=0.002 n=6)    4.070Mi ±  0%   +100.15% (p=0.002 n=6)      2.038Mi ± 0%       +0.22% (p=0.002 n=6)   2.038Mi ±  0%     +0.24% (p=0.002 n=6)
ReadWrite/WritePost                                                          260.0 ±   0%      646.0 ±  0%   +148.46% (p=0.002 n=6)      344.0 ±  0%    +32.31% (p=0.002 n=6)      83264.0 ± 0%   +31924.62% (p=0.002 n=6)     432.0 ±  0%    +66.15% (p=0.002 n=6)
ReadWrite/WritePost-2                                                        362.5 ± 108%      710.5 ± 11%    +96.00% (p=0.041 n=6)      344.0 ±  0%          ~ (p=1.000 n=6)      83264.0 ± 0%   +22869.38% (p=0.002 n=6)     432.0 ±  0%          ~ (p=0.299 n=6)
ReadWrite/WritePost-4                                                        461.5 ±   2%     1115.5 ±  7%   +141.71% (p=0.002 n=6)      347.0 ±  0%    -24.81% (p=0.002 n=6)      83265.0 ± 0%   +17942.25% (p=0.002 n=6)     432.0 ±  0%     -6.39% (p=0.002 n=6)
ReadWrite/WritePost-8                                                        477.5 ±   1%     1857.0 ±  6%   +288.90% (p=0.002 n=6)      354.5 ±  0%    -25.76% (p=0.002 n=6)      83268.0 ± 0%   +17338.32% (p=0.002 n=6)     433.0 ±  0%     -9.32% (p=0.002 n=6)
ReadWrite/WritePostWithTx                                                    260.0 ±   0%     1046.0 ±  0%   +302.31% (p=0.002 n=6)      736.0 ±  0%   +183.08% (p=0.002 n=6)      83848.0 ± 0%   +32149.23% (p=0.002 n=6)     856.0 ±  0%   +229.23% (p=0.002 n=6)
ReadWrite/WritePostWithTx-2                                                  334.0 ±   1%     1120.5 ±  1%   +235.48% (p=0.002 n=6)      736.0 ±  0%   +120.36% (p=0.002 n=6)      83848.0 ± 0%   +25004.19% (p=0.002 n=6)     856.0 ±  0%   +156.29% (p=0.002 n=6)
ReadWrite/WritePostWithTx-4                                                  459.5 ±  92%     1576.0 ±  3%   +242.98% (p=0.002 n=6)      741.0 ±  0%          ~ (p=0.061 n=6)      83855.0 ± 0%   +18149.18% (p=0.002 n=6)     858.5 ±  0%          ~ (p=0.061 n=6)
ReadWrite/WritePostWithTx-8                                                  487.5 ±   1%     2172.0 ±  7%   +345.54% (p=0.002 n=6)      751.5 ±  1%    +54.15% (p=0.002 n=6)      83859.0 ± 0%   +17101.85% (p=0.002 n=6)     859.0 ±  0%    +76.21% (p=0.002 n=6)
ReadWrite/WritePostAndComments                                             5.036Ki ±   0%   13.232Ki ±  0%   +162.75% (p=0.002 n=6)   11.313Ki ±  0%   +124.65% (p=0.002 n=6)   2112.137Ki ± 0%   +41839.65% (p=0.002 n=6)   8.462Ki ±  0%    +68.02% (p=0.002 n=6)
ReadWrite/WritePostAndComments-2                                           6.607Ki ±   1%   14.869Ki ±  4%   +125.05% (p=0.002 n=6)   11.315Ki ±  0%    +71.26% (p=0.002 n=6)   2112.137Ki ± 0%   +31868.49% (p=0.002 n=6)   8.462Ki ±  0%    +28.08% (p=0.002 n=6)
ReadWrite/WritePostAndComments-4                                           8.132Ki ±   1%   23.381Ki ±  9%   +187.52% (p=0.002 n=6)   11.399Ki ±  0%    +40.18% (p=0.002 n=6)   2112.169Ki ± 0%   +25874.07% (p=0.002 n=6)   8.475Ki ±  0%     +4.22% (p=0.002 n=6)
ReadWrite/WritePostAndComments-8                                          10.662Ki ±   2%   45.283Ki ±  6%   +324.71% (p=0.002 n=6)   11.586Ki ±  0%     +8.67% (p=0.002 n=6)   2112.255Ki ± 0%   +19710.86% (p=0.002 n=6)   8.505Ki ±  0%    -20.23% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx                                         336.0 ±   0%     8576.0 ±  0%  +2452.38% (p=0.002 n=6)    11425.0 ±  0%  +3300.30% (p=0.002 n=6)    2162864.0 ± 0%  +643609.52% (p=0.002 n=6)    8536.0 ±  0%  +2440.48% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-2                                       411.5 ±   9%     8651.0 ± 25%  +2002.31% (p=0.002 n=6)    11426.0 ±  0%  +2676.67% (p=0.002 n=6)    2162863.0 ± 0%  +525504.62% (p=0.002 n=6)    8537.5 ±  0%  +1974.73% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-4                                     1.255Ki ±   2%   15.523Ki ±  3%  +1136.52% (p=0.002 n=6)   11.229Ki ±  0%   +794.52% (p=0.002 n=6)   2112.191Ki ± 0%  +168152.31% (p=0.002 n=6)   8.359Ki ±  1%   +565.89% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-8                                     2.903Ki ±  10%   27.293Ki ±  4%   +840.08% (p=0.002 n=6)   11.451Ki ±  1%   +294.42% (p=0.002 n=6)   2112.250Ki ± 0%   +72652.91% (p=0.002 n=6)   8.377Ki ±  0%   +188.55% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10                         1.826Mi ±   1%    1.841Mi ±  2%          ~ (p=0.699 n=6)    3.663Mi ±  3%   +100.56% (p=0.002 n=6)      2.039Mi ± 0%      +11.65% (p=0.002 n=6)   1.817Mi ±  1%          ~ (p=0.240 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-2                       1.837Mi ±   1%    1.829Mi ±  2%          ~ (p=0.485 n=6)    3.662Mi ±  2%    +99.31% (p=0.002 n=6)      2.039Mi ± 0%      +10.99% (p=0.002 n=6)   1.836Mi ±  1%          ~ (p=0.818 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-4                       1.829Mi ±   1%    1.839Mi ±  1%          ~ (p=0.394 n=6)    3.652Mi ±  1%    +99.64% (p=0.002 n=6)      2.039Mi ± 0%      +11.47% (p=0.002 n=6)   1.829Mi ±  1%          ~ (p=0.818 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-8                       1.829Mi ±   1%    1.840Mi ±  1%          ~ (p=0.132 n=6)    3.680Mi ±  1%   +101.19% (p=0.002 n=6)      2.039Mi ± 0%      +11.51% (p=0.002 n=6)   1.833Mi ±  1%          ~ (p=0.699 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90                         198.2Ki ±  15%    213.9Ki ± 25%          ~ (p=0.699 n=6)    397.3Ki ± 23%   +100.49% (p=0.002 n=6)     2109.5Ki ± 0%     +964.50% (p=0.002 n=6)   220.0Ki ± 15%          ~ (p=0.132 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-2                       209.3Ki ±  41%    217.0Ki ± 35%          ~ (p=0.818 n=6)    426.8Ki ± 37%   +103.87% (p=0.002 n=6)     2109.2Ki ± 0%     +907.58% (p=0.002 n=6)   218.3Ki ± 22%          ~ (p=0.699 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-4                       218.8Ki ±  28%    256.8Ki ± 24%          ~ (p=0.240 n=6)    459.9Ki ± 29%   +110.16% (p=0.002 n=6)     2110.1Ki ± 0%     +864.17% (p=0.002 n=6)   221.5Ki ± 22%          ~ (p=0.699 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-8                       229.7Ki ±  35%    270.4Ki ± 12%          ~ (p=0.132 n=6)    442.9Ki ± 20%    +92.81% (p=0.002 n=6)     2109.8Ki ± 0%     +818.43% (p=0.002 n=6)   219.7Ki ± 30%          ~ (p=0.394 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10                   1.829Mi ±   2%    1.824Mi ±  1%          ~ (p=0.818 n=6)    3.630Mi ±  2%    +98.48% (p=0.002 n=6)      2.040Mi ± 0%      +11.56% (p=0.002 n=6)   1.840Mi ±  1%          ~ (p=0.589 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-2                 1.824Mi ±   1%    1.828Mi ±  1%          ~ (p=0.589 n=6)    3.658Mi ±  1%   +100.50% (p=0.002 n=6)      2.040Mi ± 0%      +11.84% (p=0.002 n=6)   1.832Mi ±  1%          ~ (p=0.589 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-4                 1.822Mi ±   1%    1.844Mi ±  2%          ~ (p=0.132 n=6)    3.666Mi ±  2%   +101.15% (p=0.002 n=6)      2.040Mi ± 0%      +11.96% (p=0.002 n=6)   1.832Mi ±  1%          ~ (p=0.589 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-8                 1.837Mi ±   1%    1.842Mi ±  1%          ~ (p=0.818 n=6)    3.664Mi ±  1%    +99.48% (p=0.002 n=6)      2.040Mi ± 0%      +11.08% (p=0.002 n=6)   1.836Mi ±  1%          ~ (p=0.818 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90                   234.5Ki ±  99%    226.0Ki ± 22%          ~ (p=0.699 n=6)    418.6Ki ± 39%    +78.49% (p=0.002 n=6)     2109.7Ki ± 0%     +799.53% (p=0.002 n=6)   188.9Ki ± 14%          ~ (p=0.065 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-2                 219.1Ki ±  13%    201.5Ki ± 38%          ~ (p=0.589 n=6)    434.6Ki ± 16%    +98.42% (p=0.002 n=6)     2109.6Ki ± 0%     +863.06% (p=0.002 n=6)   211.5Ki ± 17%          ~ (p=0.485 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-4                 215.4Ki ±  21%    226.3Ki ± 38%          ~ (p=0.699 n=6)    445.7Ki ± 11%   +106.88% (p=0.002 n=6)     2109.8Ki ± 0%     +879.38% (p=0.002 n=6)   203.6Ki ± 19%          ~ (p=0.485 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-8                 221.0Ki ±  40%    230.3Ki ± 28%          ~ (p=0.485 n=6)    439.8Ki ± 39%    +99.02% (p=0.002 n=6)     2109.9Ki ± 0%     +854.69% (p=0.002 n=6)   231.5Ki ± 30%          ~ (p=0.589 n=6)
geomean                                                                    62.89Ki           106.0Ki          +68.56%                  135.2Ki         +114.94%                    710.4Ki         +1029.56%                 83.55Ki          +32.85%

                                                           │ gsb_ncruces_direct_bench.txt │     gsb_ncruces_driver_bench.txt      │     gsb_modernc_driver_bench.txt      │      gsb_mattn_driver_bench.txt       │    gsb_tailscale_driver_bench.txt     │
                                                           │          allocs/op           │  allocs/op    vs base                 │  allocs/op    vs base                 │  allocs/op    vs base                 │  allocs/op    vs base                 │
ReadWrite/ReadPost                                                            5.000 ±  0%    25.000 ± 0%   +400.00% (p=0.002 n=6)    30.000 ± 0%   +500.00% (p=0.002 n=6)    28.000 ± 0%   +460.00% (p=0.002 n=6)    25.000 ± 0%   +400.00% (p=0.002 n=6)
ReadWrite/ReadPost-2                                                          5.000 ±  0%    25.000 ± 0%   +400.00% (p=0.002 n=6)    30.000 ± 0%   +500.00% (p=0.002 n=6)    28.000 ± 0%   +460.00% (p=0.002 n=6)    25.000 ± 0%   +400.00% (p=0.002 n=6)
ReadWrite/ReadPost-4                                                          5.000 ±  0%    25.000 ± 4%   +400.00% (p=0.002 n=6)    30.000 ± 0%   +500.00% (p=0.002 n=6)    28.000 ± 0%   +460.00% (p=0.002 n=6)    25.000 ± 0%   +400.00% (p=0.002 n=6)
ReadWrite/ReadPost-8                                                          5.000 ±  0%    27.000 ± 0%   +440.00% (p=0.002 n=6)    30.000 ± 3%   +500.00% (p=0.002 n=6)    29.000 ± 0%   +480.00% (p=0.002 n=6)    25.000 ± 0%   +400.00% (p=0.002 n=6)
ReadWrite/ReadPostWithTx                                                      4.000 ±  0%    43.000 ± 0%   +975.00% (p=0.002 n=6)    41.000 ± 0%   +925.00% (p=0.002 n=6)    51.000 ± 0%  +1175.00% (p=0.002 n=6)    36.000 ± 0%   +800.00% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-2                                                    4.000 ±  0%    43.000 ± 0%   +975.00% (p=0.002 n=6)    41.000 ± 0%   +925.00% (p=0.002 n=6)    51.000 ± 0%  +1175.00% (p=0.002 n=6)    36.000 ± 0%   +800.00% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-4                                                    4.000 ±  0%    43.500 ± 1%   +987.50% (p=0.002 n=6)    41.000 ± 0%   +925.00% (p=0.002 n=6)    51.000 ± 0%  +1175.00% (p=0.002 n=6)    36.000 ± 0%   +800.00% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-8                                                    5.000 ±  0%    45.000 ± 0%   +800.00% (p=0.002 n=6)    42.000 ± 2%   +740.00% (p=0.002 n=6)    52.000 ± 0%   +940.00% (p=0.002 n=6)    37.000 ± 0%   +640.00% (p=0.002 n=6)
ReadWrite/ReadPostAndComments                                                 86.00 ±  0%    241.00 ± 0%   +180.23% (p=0.002 n=6)    397.00 ± 0%   +361.63% (p=0.002 n=6)    248.00 ± 0%   +188.37% (p=0.002 n=6)    240.00 ± 0%   +179.07% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-2                                               86.00 ±  0%    241.00 ± 0%   +180.23% (p=0.002 n=6)    398.00 ± 0%   +362.79% (p=0.002 n=6)    248.00 ± 0%   +188.37% (p=0.002 n=6)    241.00 ± 0%   +180.23% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-4                                               86.00 ±  0%    243.50 ± 0%   +183.14% (p=0.002 n=6)    398.00 ± 0%   +362.79% (p=0.002 n=6)    248.00 ± 0%   +188.37% (p=0.002 n=6)    240.00 ± 0%   +179.07% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-8                                               87.00 ±  0%    247.00 ± 2%   +183.91% (p=0.002 n=6)    398.00 ± 0%   +357.47% (p=0.002 n=6)    248.00 ± 0%   +185.06% (p=0.002 n=6)    240.00 ± 0%   +175.86% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx                                           85.00 ±  0%    265.00 ± 0%   +211.76% (p=0.002 n=6)    415.00 ± 0%   +388.24% (p=0.002 n=6)    277.00 ± 0%   +225.88% (p=0.002 n=6)    258.00 ± 0%   +203.53% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx-2                                         85.00 ±  0%    265.00 ± 0%   +211.76% (p=0.002 n=6)    416.00 ± 0%   +389.41% (p=0.002 n=6)    278.00 ± 0%   +227.06% (p=0.002 n=6)    259.00 ± 0%   +204.71% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx-4                                         85.00 ±  0%    265.00 ± 0%   +211.76% (p=0.002 n=6)    416.00 ± 0%   +389.41% (p=0.002 n=6)    278.00 ± 0%   +227.06% (p=0.002 n=6)    259.00 ± 0%   +204.71% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx-8                                         86.00 ±  0%    267.00 ± 0%   +210.47% (p=0.002 n=6)    416.00 ± 0%   +383.72% (p=0.002 n=6)    279.00 ± 0%   +224.42% (p=0.002 n=6)    259.00 ± 0%   +201.16% (p=0.002 n=6)
ReadWrite/WritePost                                                           6.000 ±  0%    16.000 ± 0%   +166.67% (p=0.002 n=6)    17.000 ± 0%   +183.33% (p=0.002 n=6)    16.000 ± 0%   +166.67% (p=0.002 n=6)    10.000 ± 0%    +66.67% (p=0.002 n=6)
ReadWrite/WritePost-2                                                         8.000 ± 12%    18.000 ± 0%   +125.00% (p=0.002 n=6)    17.000 ± 0%   +112.50% (p=0.002 n=6)    16.000 ± 0%   +100.00% (p=0.002 n=6)    10.000 ± 0%    +25.00% (p=0.002 n=6)
ReadWrite/WritePost-4                                                         12.00 ±  0%     20.00 ± 0%    +66.67% (p=0.002 n=6)     17.00 ± 0%    +41.67% (p=0.002 n=6)     16.00 ± 0%    +33.33% (p=0.002 n=6)     10.00 ± 0%    -16.67% (p=0.002 n=6)
ReadWrite/WritePost-8                                                         14.00 ±  7%     25.00 ± 4%    +78.57% (p=0.002 n=6)     17.00 ± 0%    +21.43% (p=0.002 n=6)     16.00 ± 0%    +14.29% (p=0.002 n=6)     10.00 ± 0%    -28.57% (p=0.002 n=6)
ReadWrite/WritePostWithTx                                                     6.000 ±  0%    24.000 ± 0%   +300.00% (p=0.002 n=6)    25.000 ± 0%   +316.67% (p=0.002 n=6)    36.000 ± 0%   +500.00% (p=0.002 n=6)    20.000 ± 0%   +233.33% (p=0.002 n=6)
ReadWrite/WritePostWithTx-2                                                   8.000 ±  0%    26.000 ± 0%   +225.00% (p=0.002 n=6)    25.000 ± 0%   +212.50% (p=0.002 n=6)    36.000 ± 0%   +350.00% (p=0.002 n=6)    20.000 ± 0%   +150.00% (p=0.002 n=6)
ReadWrite/WritePostWithTx-4                                                   12.00 ±  0%     28.00 ± 0%   +133.33% (p=0.002 n=6)     25.00 ± 0%   +108.33% (p=0.002 n=6)     36.00 ± 0%   +200.00% (p=0.002 n=6)     20.00 ± 0%    +66.67% (p=0.002 n=6)
ReadWrite/WritePostWithTx-8                                                   15.00 ±  0%     33.00 ± 0%   +120.00% (p=0.002 n=6)     25.00 ± 0%    +66.67% (p=0.002 n=6)     36.00 ± 0%   +140.00% (p=0.002 n=6)     20.00 ± 0%    +33.33% (p=0.002 n=6)
ReadWrite/WritePostAndComments                                                159.0 ±  0%     384.0 ± 0%   +141.51% (p=0.002 n=6)     531.0 ± 0%   +233.96% (p=0.002 n=6)     381.0 ± 0%   +139.62% (p=0.002 n=6)     227.0 ± 0%    +42.77% (p=0.002 n=6)
ReadWrite/WritePostAndComments-2                                              209.0 ±  0%     434.0 ± 1%   +107.66% (p=0.002 n=6)     531.0 ± 0%   +154.07% (p=0.002 n=6)     381.0 ± 0%    +82.30% (p=0.002 n=6)     227.0 ± 0%     +8.61% (p=0.002 n=6)
ReadWrite/WritePostAndComments-4                                              258.0 ±  1%     487.5 ± 1%    +88.95% (p=0.002 n=6)     532.0 ± 0%   +106.20% (p=0.002 n=6)     381.0 ± 0%    +47.67% (p=0.002 n=6)     227.0 ± 0%    -12.02% (p=0.002 n=6)
ReadWrite/WritePostAndComments-8                                              340.5 ±  2%     600.0 ± 1%    +76.21% (p=0.002 n=6)     536.0 ± 0%    +57.42% (p=0.002 n=6)     383.0 ± 0%    +12.48% (p=0.002 n=6)     228.0 ± 0%    -33.04% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx                                          8.000 ±  0%   236.000 ± 0%  +2850.00% (p=0.002 n=6)   534.000 ± 0%  +6575.00% (p=0.002 n=6)   396.000 ± 0%  +4850.00% (p=0.002 n=6)   232.000 ± 0%  +2800.00% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-2                                        10.50 ± 14%    238.00 ± 0%  +2166.67% (p=0.002 n=6)    534.00 ± 0%  +4985.71% (p=0.002 n=6)    396.00 ± 0%  +3671.43% (p=0.002 n=6)    232.00 ± 0%  +2109.52% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-4                                        38.00 ±  3%    269.00 ± 0%   +607.89% (p=0.002 n=6)    535.00 ± 0%  +1307.89% (p=0.002 n=6)    396.00 ± 0%   +942.11% (p=0.002 n=6)    232.00 ± 0%   +510.53% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-8                                        92.50 ±  9%    331.00 ± 0%   +257.84% (p=0.002 n=6)    539.50 ± 0%   +483.24% (p=0.002 n=6)    397.00 ± 0%   +329.19% (p=0.002 n=6)    232.00 ± 0%   +150.81% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10                            97.00 ±  1%    258.00 ± 1%   +165.98% (p=0.002 n=6)    410.00 ± 1%   +322.68% (p=0.002 n=6)    260.50 ± 1%   +168.56% (p=0.002 n=6)    238.50 ± 0%   +145.88% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-2                          103.5 ±  1%     265.5 ± 1%   +156.52% (p=0.002 n=6)     411.0 ± 0%   +297.10% (p=0.002 n=6)     259.5 ± 1%   +150.72% (p=0.002 n=6)     239.0 ± 0%   +130.92% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-4                          106.5 ±  2%     270.0 ± 1%   +153.52% (p=0.002 n=6)     412.0 ± 0%   +286.85% (p=0.002 n=6)     261.5 ± 1%   +145.54% (p=0.002 n=6)     239.0 ± 0%   +124.41% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-8                          115.0 ±  3%     288.0 ± 3%   +150.43% (p=0.002 n=6)     412.0 ± 0%   +258.26% (p=0.002 n=6)     261.5 ± 1%   +127.39% (p=0.002 n=6)     240.0 ± 0%   +108.70% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90                            152.5 ±  0%     370.0 ± 1%   +142.62% (p=0.002 n=6)     518.5 ± 0%   +240.00% (p=0.002 n=6)     367.5 ± 1%   +140.98% (p=0.002 n=6)     228.0 ± 0%    +49.51% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-2                          199.0 ±  2%     416.0 ± 2%   +109.05% (p=0.002 n=6)     517.0 ± 1%   +159.80% (p=0.002 n=6)     366.0 ± 1%    +83.92% (p=0.002 n=6)     228.0 ± 0%    +14.57% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-4                          240.5 ±  1%     467.0 ± 3%    +94.18% (p=0.002 n=6)     517.5 ± 1%   +115.18% (p=0.002 n=6)     370.5 ± 2%    +54.05% (p=0.002 n=6)     228.5 ± 0%     -4.99% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-8                          316.5 ±  2%     564.0 ± 2%    +78.20% (p=0.002 n=6)     521.5 ± 0%    +64.77% (p=0.002 n=6)     370.5 ± 1%    +17.06% (p=0.002 n=6)     229.0 ± 0%    -27.65% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10                      78.00 ±  1%    263.00 ± 0%   +237.18% (p=0.002 n=6)    427.50 ± 0%   +448.08% (p=0.002 n=6)    290.00 ± 1%   +271.79% (p=0.002 n=6)    259.00 ± 0%   +232.05% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-2                    79.00 ±  1%    264.00 ± 0%   +234.18% (p=0.002 n=6)    428.00 ± 0%   +441.77% (p=0.002 n=6)    289.00 ± 1%   +265.82% (p=0.002 n=6)    260.00 ± 0%   +229.11% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-4                    80.50 ±  1%    266.00 ± 0%   +230.43% (p=0.002 n=6)    428.00 ± 0%   +431.68% (p=0.002 n=6)    290.00 ± 0%   +260.25% (p=0.002 n=6)    260.00 ± 0%   +222.98% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-8                    84.00 ±  1%    271.00 ± 2%   +222.62% (p=0.002 n=6)    428.50 ± 0%   +410.12% (p=0.002 n=6)    290.00 ± 0%   +245.24% (p=0.002 n=6)    260.00 ± 0%   +209.52% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90                      17.50 ± 94%    239.00 ± 0%  +1265.71% (p=0.002 n=6)    522.00 ± 1%  +2882.86% (p=0.002 n=6)    384.00 ± 1%  +2094.29% (p=0.002 n=6)    234.00 ± 0%  +1237.14% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-2                    19.00 ±  5%    241.50 ± 1%  +1171.05% (p=0.002 n=6)    521.50 ± 0%  +2644.74% (p=0.002 n=6)    384.00 ± 1%  +1921.05% (p=0.002 n=6)    235.00 ± 0%  +1136.84% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-4                    43.00 ±  2%    268.00 ± 1%   +523.26% (p=0.002 n=6)    522.50 ± 0%  +1115.12% (p=0.002 n=6)    384.50 ± 0%   +794.19% (p=0.002 n=6)    235.00 ± 0%   +446.51% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-8                    92.00 ±  2%    325.00 ± 6%   +253.26% (p=0.002 n=6)    525.00 ± 1%   +470.65% (p=0.002 n=6)    386.00 ± 0%   +319.57% (p=0.002 n=6)    236.00 ± 0%   +156.52% (p=0.002 n=6)
geomean                                                                       36.06           136.4        +278.13%                   180.3        +400.11%                   145.9        +304.61%                   105.9        +193.56%

Query

Correlated Aggregation

goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                              │ gsb_ncruces_direct_bench.txt │   gsb_ncruces_driver_bench.txt    │   gsb_modernc_driver_bench.txt    │     gsb_mattn_driver_bench.txt     │  gsb_tailscale_driver_bench.txt   │
                              │            sec/op            │   sec/op    vs base               │   sec/op    vs base               │   sec/op     vs base               │   sec/op    vs base               │
Query/CorrelatedAggregation                      7.530 ± 20%   6.269 ± 3%  -16.75% (p=0.002 n=6)   2.891 ± 1%  -61.60% (p=0.002 n=6)   1.703 ± 12%  -77.38% (p=0.002 n=6)   1.658 ± 6%  -77.98% (p=0.002 n=6)
Query/CorrelatedAggregation-2                    6.559 ± 24%   6.242 ± 1%   -4.83% (p=0.002 n=6)   2.893 ± 1%  -55.89% (p=0.002 n=6)   1.673 ±  2%  -74.49% (p=0.002 n=6)   1.659 ± 7%  -74.71% (p=0.002 n=6)
Query/CorrelatedAggregation-4                    6.299 ±  1%   6.282 ± 1%        ~ (p=0.310 n=6)   2.875 ± 1%  -54.36% (p=0.002 n=6)   1.668 ±  2%  -73.53% (p=0.002 n=6)   1.642 ± 1%  -73.94% (p=0.002 n=6)
Query/CorrelatedAggregation-8                    6.400 ±  4%   6.238 ± 1%   -2.53% (p=0.002 n=6)   3.037 ± 6%  -52.55% (p=0.002 n=6)   1.643 ±  2%  -74.33% (p=0.002 n=6)   1.663 ± 2%  -74.02% (p=0.002 n=6)
geomean                                          6.680         6.258        -6.32%                 2.923       -56.24%                 1.672        -74.98%                 1.655       -75.22%

                              │ gsb_ncruces_direct_bench.txt │       gsb_ncruces_driver_bench.txt        │      gsb_modernc_driver_bench.txt       │       gsb_mattn_driver_bench.txt        │     gsb_tailscale_driver_bench.txt      │
                              │             B/op             │      B/op        vs base                  │     B/op       vs base                  │     B/op       vs base                  │     B/op       vs base                  │
Query/CorrelatedAggregation                    392.0 ± 6641%    47160.0 ± 146%  +11930.61% (p=0.002 n=6)    71016.0 ± 1%  +18016.33% (p=0.002 n=6)    47008.0 ± 1%  +11891.84% (p=0.002 n=6)    47200.0 ± 2%  +11940.82% (p=0.002 n=6)
Query/CorrelatedAggregation-2                  472.0 ±   20%    47240.0 ±   0%   +9908.47% (p=0.002 n=6)    71096.0 ± 0%  +14962.71% (p=0.002 n=6)    47088.0 ± 0%   +9876.27% (p=0.002 n=6)    47280.0 ± 0%   +9916.95% (p=0.002 n=6)
Query/CorrelatedAggregation-4                1.109Ki ±  167%   46.344Ki ±   0%   +4077.46% (p=0.002 n=6)   69.594Ki ± 0%   +6173.24% (p=0.002 n=6)   46.141Ki ± 0%   +4059.15% (p=0.002 n=6)   46.328Ki ± 0%   +4076.06% (p=0.002 n=6)
Query/CorrelatedAggregation-8                1.414Ki ±  220%   46.922Ki ±   7%   +3218.23% (p=0.002 n=6)   69.953Ki ± 0%   +4846.96% (p=0.002 n=6)   46.453Ki ± 0%   +3185.08% (p=0.002 n=6)   46.656Ki ± 0%   +3199.45% (p=0.002 n=6)
geomean                                        742.8            46.36Ki          +6291.74%                  69.58Ki        +9492.92%                  46.12Ki        +6258.45%                  46.31Ki        +6284.84%

                              │ gsb_ncruces_direct_bench.txt │      gsb_ncruces_driver_bench.txt      │      gsb_modernc_driver_bench.txt      │       gsb_mattn_driver_bench.txt       │     gsb_tailscale_driver_bench.txt     │
                              │          allocs/op           │  allocs/op    vs base                  │  allocs/op    vs base                  │  allocs/op    vs base                  │  allocs/op    vs base                  │
Query/CorrelatedAggregation                     10.00 ± 260%   4772.00 ± 1%  +47620.00% (p=0.002 n=6)   7777.00 ± 0%  +77670.00% (p=0.002 n=6)   4773.00 ± 0%  +47630.00% (p=0.002 n=6)   3772.00 ± 0%  +37620.00% (p=0.002 n=6)
Query/CorrelatedAggregation-2                   12.00 ±   8%   4774.00 ± 0%  +39683.33% (p=0.002 n=6)   7779.00 ± 0%  +64725.00% (p=0.002 n=6)   4775.00 ± 0%  +39691.67% (p=0.002 n=6)   3774.00 ± 0%  +31350.00% (p=0.002 n=6)
Query/CorrelatedAggregation-4                   18.00 ±  22%   4779.00 ± 0%  +26450.00% (p=0.002 n=6)   7783.50 ± 0%  +43141.67% (p=0.002 n=6)   4779.00 ± 0%  +26450.00% (p=0.002 n=6)   3778.00 ± 0%  +20888.89% (p=0.002 n=6)
Query/CorrelatedAggregation-8                   25.50 ±  29%   4788.00 ± 0%  +18676.47% (p=0.002 n=6)   7791.50 ± 0%  +30454.90% (p=0.002 n=6)   4787.00 ± 0%  +18672.55% (p=0.002 n=6)   3786.00 ± 0%  +14747.06% (p=0.002 n=6)
geomean                                         15.32           4.778k       +31090.33%                  7.783k       +50702.43%                  4.778k       +31091.97%                  3.777k       +24557.87%

CTE

goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
            │ gsb_ncruces_direct_bench.txt │    gsb_ncruces_driver_bench.txt    │    gsb_modernc_driver_bench.txt     │     gsb_mattn_driver_bench.txt     │   gsb_tailscale_driver_bench.txt   │
            │            sec/op            │    sec/op     vs base              │    sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │
Query/CTE                      436.7µ ± 0%    457.7µ ± 1%  +4.81% (p=0.002 n=6)    365.8µ ± 3%  -16.23% (p=0.002 n=6)   186.5µ ± 1%  -57.29% (p=0.002 n=6)   187.3µ ± 5%  -57.11% (p=0.002 n=6)
Query/CTE-2                   218.62µ ± 0%   230.29µ ± 0%  +5.34% (p=0.002 n=6)   184.12µ ± 1%  -15.78% (p=0.002 n=6)   89.03µ ± 0%  -59.27% (p=0.002 n=6)   88.80µ ± 1%  -59.38% (p=0.002 n=6)
Query/CTE-4                   110.72µ ± 0%   116.50µ ± 0%  +5.22% (p=0.002 n=6)    98.61µ ± 1%  -10.93% (p=0.002 n=6)   45.82µ ± 0%  -58.62% (p=0.002 n=6)   44.45µ ± 1%  -59.85% (p=0.002 n=6)
Query/CTE-8                    55.91µ ± 1%    59.05µ ± 0%  +5.62% (p=0.002 n=6)    67.78µ ± 0%  +21.22% (p=0.002 n=6)   28.49µ ± 1%  -49.05% (p=0.002 n=6)   22.41µ ± 1%  -59.92% (p=0.002 n=6)
geomean                        155.9µ         164.1µ       +5.25%                  145.7µ        -6.58%                 68.23µ       -56.24%                 63.80µ       -59.08%

            │ gsb_ncruces_direct_bench.txt │      gsb_ncruces_driver_bench.txt      │    gsb_modernc_driver_bench.txt     │     gsb_mattn_driver_bench.txt      │    gsb_tailscale_driver_bench.txt    │
            │             B/op             │     B/op       vs base                 │    B/op      vs base                │    B/op      vs base                │    B/op      vs base                 │
Query/CTE                       64.00 ± 0%    752.00 ±  0%  +1075.00% (p=0.002 n=6)   568.00 ± 0%  +787.50% (p=0.002 n=6)   600.00 ± 0%  +837.50% (p=0.002 n=6)   816.00 ± 0%  +1175.00% (p=0.002 n=6)
Query/CTE-2                     64.00 ± 0%    752.00 ±  0%  +1075.00% (p=0.002 n=6)   568.00 ± 0%  +787.50% (p=0.002 n=6)   600.00 ± 0%  +837.50% (p=0.002 n=6)   816.00 ± 0%  +1175.00% (p=0.002 n=6)
Query/CTE-4                     64.00 ± 0%    835.50 ±  5%  +1205.47% (p=0.002 n=6)   569.00 ± 0%  +789.06% (p=0.002 n=6)   600.00 ± 0%  +837.50% (p=0.002 n=6)   816.00 ± 0%  +1175.00% (p=0.002 n=6)
Query/CTE-8                     67.00 ± 3%   1048.00 ± 15%  +1464.18% (p=0.002 n=6)   568.00 ± 0%  +747.76% (p=0.002 n=6)   600.00 ± 0%  +795.52% (p=0.002 n=6)   816.00 ± 0%  +1117.91% (p=0.002 n=6)
geomean                         64.74          838.9        +1195.78%                  568.2       +777.78%                  600.0       +826.82%                  816.0       +1160.48%

            │ gsb_ncruces_direct_bench.txt │     gsb_ncruces_driver_bench.txt     │     gsb_modernc_driver_bench.txt     │      gsb_mattn_driver_bench.txt      │    gsb_tailscale_driver_bench.txt    │
            │          allocs/op           │  allocs/op   vs base                 │  allocs/op   vs base                 │  allocs/op   vs base                 │  allocs/op   vs base                 │
Query/CTE                       2.000 ± 0%   19.000 ± 0%   +850.00% (p=0.002 n=6)   22.000 ± 0%  +1000.00% (p=0.002 n=6)   18.000 ± 0%   +800.00% (p=0.002 n=6)   20.000 ± 0%   +900.00% (p=0.002 n=6)
Query/CTE-2                     2.000 ± 0%   19.000 ± 0%   +850.00% (p=0.002 n=6)   22.000 ± 0%  +1000.00% (p=0.002 n=6)   18.000 ± 0%   +800.00% (p=0.002 n=6)   20.000 ± 0%   +900.00% (p=0.002 n=6)
Query/CTE-4                     2.000 ± 0%   19.000 ± 0%   +850.00% (p=0.002 n=6)   22.000 ± 0%  +1000.00% (p=0.002 n=6)   18.000 ± 0%   +800.00% (p=0.002 n=6)   20.000 ± 0%   +900.00% (p=0.002 n=6)
Query/CTE-8                     1.000 ± 0%   19.000 ± 0%  +1800.00% (p=0.002 n=6)   22.000 ± 0%  +2100.00% (p=0.002 n=6)   18.000 ± 0%  +1700.00% (p=0.002 n=6)   20.000 ± 0%  +1900.00% (p=0.002 n=6)
geomean                         1.682         19.00       +1029.75%                  22.00       +1208.13%                  18.00        +970.29%                  20.00       +1089.21%

CTE (Recursive)

goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                     │ gsb_ncruces_direct_bench.txt │    gsb_ncruces_driver_bench.txt    │    gsb_modernc_driver_bench.txt     │     gsb_mattn_driver_bench.txt     │   gsb_tailscale_driver_bench.txt   │
                     │            sec/op            │    sec/op     vs base              │    sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │
Query/CTERecursive                      6.516m ± 1%    6.514m ± 0%       ~ (p=0.937 n=6)    5.462m ± 1%  -16.18% (p=0.002 n=6)   2.396m ± 0%  -63.23% (p=0.002 n=6)   2.382m ± 1%  -63.44% (p=0.002 n=6)
Query/CTERecursive-2                    3.269m ± 0%    3.275m ± 1%  +0.19% (p=0.015 n=6)    2.756m ± 1%  -15.69% (p=0.002 n=6)   1.213m ± 1%  -62.88% (p=0.002 n=6)   1.199m ± 1%  -63.33% (p=0.002 n=6)
Query/CTERecursive-4                   1655.4µ ± 0%   1663.2µ ± 0%  +0.47% (p=0.002 n=6)   1424.4µ ± 1%  -13.96% (p=0.002 n=6)   606.1µ ± 1%  -63.39% (p=0.002 n=6)   601.9µ ± 1%  -63.64% (p=0.002 n=6)
Query/CTERecursive-8                    840.9µ ± 0%    844.3µ ± 1%       ~ (p=0.180 n=6)    742.9µ ± 1%  -11.65% (p=0.002 n=6)   313.6µ ± 3%  -62.70% (p=0.002 n=6)   316.2µ ± 3%  -62.39% (p=0.002 n=6)
geomean                                 2.334m         2.340m       +0.26%                  1.998m       -14.39%                 862.2µ       -63.05%                 858.7µ       -63.20%

                     │ gsb_ncruces_direct_bench.txt │     gsb_ncruces_driver_bench.txt      │     gsb_modernc_driver_bench.txt      │      gsb_mattn_driver_bench.txt       │    gsb_tailscale_driver_bench.txt     │
                     │             B/op             │     B/op      vs base                 │     B/op      vs base                 │     B/op      vs base                 │     B/op      vs base                 │
Query/CTERecursive                       64.00 ± 0%   1648.00 ± 0%  +2475.00% (p=0.002 n=6)   1928.00 ± 0%  +2912.50% (p=0.002 n=6)   1480.00 ± 0%  +2212.50% (p=0.002 n=6)   1456.00 ± 0%  +2175.00% (p=0.002 n=6)
Query/CTERecursive-2                     64.00 ± 2%   1648.00 ± 0%  +2475.00% (p=0.002 n=6)   1928.00 ± 0%  +2912.50% (p=0.002 n=6)   1480.00 ± 0%  +2212.50% (p=0.002 n=6)   1456.00 ± 0%  +2175.00% (p=0.002 n=6)
Query/CTERecursive-4                     64.00 ± 3%   2693.00 ± 0%  +4107.81% (p=0.002 n=6)   1947.00 ± 0%  +2942.19% (p=0.002 n=6)   1482.00 ± 0%  +2215.62% (p=0.002 n=6)   1457.00 ± 0%  +2176.56% (p=0.002 n=6)
Query/CTERecursive-8                     64.50 ± 2%   3338.50 ± 1%  +5075.97% (p=0.002 n=6)   1959.00 ± 0%  +2937.21% (p=0.002 n=6)   1483.00 ± 0%  +2199.22% (p=0.002 n=6)   1457.00 ± 0%  +2158.91% (p=0.002 n=6)
geomean                                  64.12        2.171Ki       +3366.58%                 1.895Ki       +2926.07%                 1.447Ki       +2209.95%                 1.422Ki       +2171.36%

                     │ gsb_ncruces_direct_bench.txt │     gsb_ncruces_driver_bench.txt     │     gsb_modernc_driver_bench.txt      │      gsb_mattn_driver_bench.txt      │    gsb_tailscale_driver_bench.txt    │
                     │          allocs/op           │  allocs/op   vs base                 │  allocs/op    vs base                 │  allocs/op   vs base                 │  allocs/op   vs base                 │
Query/CTERecursive                       2.000 ± 0%   77.000 ± 0%  +3750.00% (p=0.002 n=6)   108.000 ± 0%  +5300.00% (p=0.002 n=6)   76.000 ± 0%  +3700.00% (p=0.002 n=6)   48.000 ± 0%  +2300.00% (p=0.002 n=6)
Query/CTERecursive-2                     2.000 ± 0%   77.000 ± 0%  +3750.00% (p=0.002 n=6)   108.000 ± 0%  +5300.00% (p=0.002 n=6)   76.000 ± 0%  +3700.00% (p=0.002 n=6)   48.000 ± 0%  +2300.00% (p=0.002 n=6)
Query/CTERecursive-4                     2.000 ± 0%   79.000 ± 0%  +3850.00% (p=0.002 n=6)   108.000 ± 0%  +5300.00% (p=0.002 n=6)   76.000 ± 0%  +3700.00% (p=0.002 n=6)   48.000 ± 0%  +2300.00% (p=0.002 n=6)
Query/CTERecursive-8                     2.000 ± 0%   80.000 ± 0%  +3900.00% (p=0.002 n=6)   108.000 ± 0%  +5300.00% (p=0.002 n=6)   76.000 ± 0%  +3700.00% (p=0.002 n=6)   48.000 ± 0%  +2300.00% (p=0.002 n=6)
geomean                                  2.000         78.24       +3811.96%                   108.0       +5300.00%                  76.00       +3700.00%                  48.00       +2300.00%

Group By Aggregation

goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                           │ gsb_ncruces_direct_bench.txt │    gsb_ncruces_driver_bench.txt    │     gsb_modernc_driver_bench.txt      │      gsb_mattn_driver_bench.txt      │   gsb_tailscale_driver_bench.txt   │
                           │            sec/op            │    sec/op     vs base              │    sec/op      vs base                │    sec/op     vs base                │   sec/op     vs base               │
Query/GroupByAggregation                      577.9µ ± 1%    588.1µ ± 1%  +1.77% (p=0.002 n=6)    475.6µ ±  1%   -17.69% (p=0.002 n=6)    271.6µ ± 0%   -53.00% (p=0.002 n=6)   251.1µ ± 2%  -56.55% (p=0.002 n=6)
Query/GroupByAggregation-2                    289.3µ ± 0%    293.7µ ± 1%  +1.54% (p=0.002 n=6)    366.8µ ±  0%   +26.81% (p=0.002 n=6)    189.7µ ± 1%   -34.44% (p=0.002 n=6)   121.7µ ± 1%  -57.93% (p=0.002 n=6)
Query/GroupByAggregation-4                   147.23µ ± 5%   147.56µ ± 3%       ~ (p=0.699 n=6)   377.75µ ± 13%  +156.57% (p=0.002 n=6)   333.88µ ± 4%  +126.78% (p=0.002 n=6)   60.72µ ± 0%  -58.76% (p=0.002 n=6)
Query/GroupByAggregation-8                    74.44µ ± 1%    75.47µ ± 2%  +1.38% (p=0.026 n=6)   657.89µ ±  1%  +783.75% (p=0.002 n=6)   432.03µ ± 2%  +480.35% (p=0.002 n=6)   30.98µ ± 6%  -58.38% (p=0.002 n=6)
geomean                                       206.9µ         209.4µ       +1.23%                  456.3µ        +120.56%                  293.6µ        +41.90%                 87.08µ       -57.91%

                           │ gsb_ncruces_direct_bench.txt │     gsb_ncruces_driver_bench.txt      │    gsb_modernc_driver_bench.txt     │     gsb_mattn_driver_bench.txt      │    gsb_tailscale_driver_bench.txt    │
                           │             B/op             │     B/op      vs base                 │    B/op      vs base                │    B/op      vs base                │    B/op      vs base                 │
Query/GroupByAggregation                       64.00 ± 0%   672.00 ±  0%   +950.00% (p=0.002 n=6)   472.00 ± 0%  +637.50% (p=0.002 n=6)   520.00 ± 0%  +712.50% (p=0.002 n=6)   744.00 ± 0%  +1062.50% (p=0.002 n=6)
Query/GroupByAggregation-2                     64.00 ± 0%   672.00 ±  0%   +950.00% (p=0.002 n=6)   472.00 ± 0%  +637.50% (p=0.002 n=6)   520.00 ± 0%  +712.50% (p=0.002 n=6)   744.00 ± 0%  +1062.50% (p=0.002 n=6)
Query/GroupByAggregation-4                     64.00 ± 0%   768.50 ±  6%  +1100.78% (p=0.002 n=6)   477.00 ± 0%  +645.31% (p=0.002 n=6)   521.00 ± 0%  +714.06% (p=0.002 n=6)   744.00 ± 0%  +1062.50% (p=0.002 n=6)
Query/GroupByAggregation-8                     65.00 ± 0%   922.00 ± 11%  +1318.46% (p=0.002 n=6)   497.00 ± 1%  +664.62% (p=0.002 n=6)   523.00 ± 1%  +704.62% (p=0.002 n=6)   744.00 ± 0%  +1044.62% (p=0.002 n=6)
geomean                                        64.25         752.1        +1070.62%                  479.4       +646.15%                  521.0       +710.91%                  744.0       +1058.00%

                           │ gsb_ncruces_direct_bench.txt │     gsb_ncruces_driver_bench.txt     │     gsb_modernc_driver_bench.txt     │     gsb_mattn_driver_bench.txt      │    gsb_tailscale_driver_bench.txt    │
                           │          allocs/op           │  allocs/op   vs base                 │  allocs/op   vs base                 │  allocs/op   vs base                │  allocs/op   vs base                 │
Query/GroupByAggregation                      2.000 ±  0%   17.000 ± 0%   +750.00% (p=0.002 n=6)   18.000 ± 0%   +800.00% (p=0.002 n=6)   16.000 ± 0%  +700.00% (p=0.002 n=6)   18.000 ± 0%   +800.00% (p=0.002 n=6)
Query/GroupByAggregation-2                    2.000 ±  0%   17.000 ± 0%   +750.00% (p=0.002 n=6)   18.000 ± 0%   +800.00% (p=0.002 n=6)   16.000 ± 0%  +700.00% (p=0.002 n=6)   18.000 ± 0%   +800.00% (p=0.002 n=6)
Query/GroupByAggregation-4                    2.000 ±  0%   17.000 ± 0%   +750.00% (p=0.002 n=6)   18.000 ± 0%   +800.00% (p=0.002 n=6)   16.000 ± 0%  +700.00% (p=0.002 n=6)   18.000 ± 0%   +800.00% (p=0.002 n=6)
Query/GroupByAggregation-8                    1.500 ± 33%   17.000 ± 0%  +1033.33% (p=0.002 n=6)   18.000 ± 0%  +1100.00% (p=0.002 n=6)   16.000 ± 0%  +966.67% (p=0.002 n=6)   18.000 ± 0%  +1100.00% (p=0.002 n=6)
geomean                                       1.861          17.00        +813.38%                  18.00        +867.11%                  16.00       +759.66%                  18.00        +867.11%

JSON

goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
             │ gsb_ncruces_direct_bench.txt │   gsb_ncruces_driver_bench.txt    │    gsb_modernc_driver_bench.txt     │     gsb_mattn_driver_bench.txt      │   gsb_tailscale_driver_bench.txt   │
             │            sec/op            │   sec/op     vs base              │   sec/op     vs base                │    sec/op     vs base               │   sec/op     vs base               │
Query/JSON                      17.76m ± 3%   17.53m ± 3%       ~ (p=0.180 n=6)   17.62m ± 2%         ~ (p=0.394 n=6)   13.84m ± 70%        ~ (p=0.065 n=6)   13.53m ± 2%  -23.80% (p=0.002 n=6)
Query/JSON-2                    13.35m ± 2%   14.65m ± 5%  +9.68% (p=0.002 n=6)   14.70m ± 8%   +10.13% (p=0.002 n=6)   12.01m ±  9%  -10.03% (p=0.002 n=6)   11.75m ± 3%  -11.99% (p=0.002 n=6)
Query/JSON-4                    12.31m ± 5%   13.42m ± 7%  +8.96% (p=0.015 n=6)   15.60m ± 1%   +26.71% (p=0.002 n=6)   10.86m ±  5%  -11.82% (p=0.002 n=6)   12.12m ± 8%        ~ (p=0.589 n=6)
Query/JSON-8                    13.49m ± 1%   14.17m ± 5%  +5.07% (p=0.002 n=6)   34.77m ± 3%  +157.76% (p=0.002 n=6)   13.58m ±  0%   +0.67% (p=0.015 n=6)   15.30m ± 1%  +13.42% (p=0.002 n=6)
geomean                         14.09m        14.86m       +5.52%                 19.36m        +37.44%                 12.51m        -11.17%                 13.10m        -6.98%

             │ gsb_ncruces_direct_bench.txt │       gsb_ncruces_driver_bench.txt       │     gsb_modernc_driver_bench.txt      │     gsb_mattn_driver_bench.txt      │    gsb_tailscale_driver_bench.txt    │
             │             B/op             │      B/op       vs base                  │     B/op      vs base                 │    B/op      vs base                │    B/op      vs base                 │
Query/JSON                      66.00 ±  2%     802.00 ±  0%   +1115.15% (p=0.002 n=6)    618.00 ± 0%   +836.36% (p=0.002 n=6)   649.00 ± 0%  +883.33% (p=0.002 n=6)   865.00 ± 0%  +1210.61% (p=0.002 n=6)
Query/JSON-2                    69.00 ± 19%     802.00 ±  0%   +1062.32% (p=0.002 n=6)    618.00 ± 0%   +795.65% (p=0.002 n=6)   649.00 ± 5%  +840.58% (p=0.002 n=6)   867.50 ± 1%  +1157.25% (p=0.002 n=6)
Query/JSON-4                    71.00 ± 17%   15422.00 ± 19%  +21621.13% (p=0.002 n=6)    830.00 ± 3%  +1069.01% (p=0.002 n=6)   716.00 ± 5%  +908.45% (p=0.002 n=6)   908.50 ± 2%  +1179.58% (p=0.002 n=6)
Query/JSON-8                    97.50 ± 37%   25766.00 ±  2%  +26326.67% (p=0.002 n=6)   1381.00 ± 8%  +1316.41% (p=0.002 n=6)   814.50 ± 6%  +735.38% (p=0.002 n=6)   966.00 ± 9%   +890.77% (p=0.002 n=6)
geomean                         74.93          3.905Ki         +5236.05%                   813.4        +985.55%                  704.0       +839.53%                  900.8       +1102.22%

             │ gsb_ncruces_direct_bench.txt │     gsb_ncruces_driver_bench.txt      │     gsb_modernc_driver_bench.txt     │      gsb_mattn_driver_bench.txt      │    gsb_tailscale_driver_bench.txt    │
             │          allocs/op           │  allocs/op    vs base                 │  allocs/op   vs base                 │  allocs/op   vs base                 │  allocs/op   vs base                 │
Query/JSON                       2.000 ± 0%   20.000 ±  0%   +900.00% (p=0.002 n=6)   24.000 ± 0%  +1100.00% (p=0.002 n=6)   19.000 ± 0%   +850.00% (p=0.002 n=6)   21.000 ± 0%   +950.00% (p=0.002 n=6)
Query/JSON-2                     2.000 ± 0%   20.000 ±  0%   +900.00% (p=0.002 n=6)   24.000 ± 0%  +1100.00% (p=0.002 n=6)   19.000 ± 0%   +850.00% (p=0.002 n=6)   21.000 ± 0%   +950.00% (p=0.002 n=6)
Query/JSON-4                     2.000 ± 0%   52.500 ± 12%  +2525.00% (p=0.002 n=6)   28.000 ± 0%  +1300.00% (p=0.002 n=6)   20.000 ± 5%   +900.00% (p=0.002 n=6)   22.000 ± 5%  +1000.00% (p=0.002 n=6)
Query/JSON-8                     2.000 ± 0%   76.000 ±  1%  +3700.00% (p=0.002 n=6)   38.500 ± 4%  +1825.00% (p=0.002 n=6)   22.000 ± 5%  +1000.00% (p=0.002 n=6)   22.000 ± 0%  +1000.00% (p=0.002 n=6)
geomean                          2.000         35.54        +1677.17%                  28.07       +1303.56%                  19.96        +898.18%                  21.49        +974.71%

michaellenaghan · 2025-03-15T02:01:15Z

michaellenaghan
Mar 15, 2025
Author

Btw, here's the SQL for the various queries:

const SQLForQueryCorrelatedAggregation = `
	SELECT
		id,
		title,
		(SELECT COUNT(*) FROM comments WHERE post_id = posts.id) as comment_count,
		(SELECT AVG(LENGTH(content)) FROM comments WHERE post_id = posts.id) AS avg_comment_length,
		(SELECT MAX(LENGTH(content)) FROM comments WHERE post_id = posts.id) AS max_comment_length
	FROM posts`
const SQLForQueryCTE = `
	WITH day_totals AS (
		SELECT date(created) as day, COUNT(*) as day_total
		FROM posts
		GROUP BY day
	)
	SELECT day, day_total,
		SUM(day_total) OVER (ORDER BY day) as running_total
	FROM day_totals
	ORDER BY day`
const SQLForQueryCTERecursive = `
	WITH RECURSIVE dates(day) AS (
		SELECT date('now', '-30 days')
		UNION ALL
		SELECT date(day, '+1 day')
		FROM dates
		WHERE day < date('now')
	)
	SELECT day,
		(SELECT COUNT(*) FROM posts WHERE date(created) = day) as day_total
	FROM dates
	ORDER BY day`
const SQLForQueryGroupByAggregation = `
	SELECT
		strftime('%Y-%m', created) AS month,
		COUNT(*) as month_total
	FROM posts
	GROUP BY month
	ORDER BY month`
const SQLForQueryJSON = `
	SELECT
		date(created) as day,
		SUM(json_extract(stats, '$.lorem')) as sum_lorem,
		AVG(json_extract(stats, '$.ipsum.dolor')) as avg_dolor,
		MAX(json_extract(stats, '$.lorem.sit')) as max_sit
	FROM posts
	GROUP BY day
	ORDER BY day`

These results used a database with 1,000 posts and 25 comments per post. The post and comments both had 100 "paragraphs" of content. (The correlated aggregation query is the only one that uses comments; the rest only use posts.)

0 replies

daenney · 2025-03-15T11:05:30Z

daenney
Mar 15, 2025

This is great! I've been meaning to do a benchmark along the same lines, but I'm glad someone beat me to the punch. One less thing to do. Also, thanks for doing a set of benchmarks that are more representative of interactive use of SQLite, instead of focussing on raw insert/query performance which is more useful in the "using SQLite to exchange datasettes" approach.

Looking at the numbers, especially memory, I suspect most of the difference is down to database/sql overhead. That package isn't super optimised and the runtime reflection for struct decoding is inherently costly. You can sort of see it in the benchmarks. Whenever the ncruces-driver sees a huge jump in memory allocation and the like, most of the other drivers tend to follow suite. It's not a perfect correlation, but it tends to end in the same ballpark.

One other interesting thing here could be to add the https://github.com/zombiezen/go-sqlite driver. This one uses the ModernC transpile under the hood, but does not use the provided database/sql driver. Its approach is comparable to the native approach for ncruces. I suspect it'll get you numbers in the same ballpark as that driver.

2 replies

michaellenaghan Mar 18, 2025
Author

Sorry, I was out of town over the weekend!

Whenever the ncruces-driver sees a huge jump in memory allocation and the like, most of the other drivers tend to follow suite.

What I find curious is that memory allocation doesn't seem to have much of an impact on performance. The populate category is a dramatic example; the driver version of ncruces is ~5% faster — probably within the margin of error — but it allocates 500% more memory. Hmmm.

One other interesting thing here could be to add the https://github.com/zombiezen/go-sqlite driver.

Yes, that was on my to do list, and I've done it. Generally speaking, your suspicion was correct — performance is about the same, modernc uses more memory.

michaellenaghan Mar 18, 2025
Author

Btw, I hope to publish the repo tomorrow. I have a few more tweaks I'd like to add today, and tomorrow I'd like to write some documentation.

ncruces · 2025-03-15T13:17:39Z

ncruces
Mar 15, 2025
Maintainer

Few notes.

Maybe call sqlite3.Initialize to avoid measuring the overhead of compilation. If you have tests in multiple packages, configuring a compilation cache avoids compiling the Wasm repeatedly for each test binary which makes collecting results faster.

It's expected that for pure CPU stuff (complex queries) Wasm is slower; speedtest1 shows the same. One thing that should be particularly noticeable is large sorts, for which SQLite can use worker threads, but not with Wasm.

Once you push the repo I'll run profiling to figure out if there are any allocations I can avoid; but there are a few almost insurmountable challenges. It shouldn't be much worse than other database/sql drivers (unless they're cheating).

I know of a few cheats (not necessarily from these drivers), but I'd rather not go there. Like for instance TEXT comes out as a []byte and is copied only if you need a string. I don't think this is appropriate for SQLite because, due to dynamic typing, it's common for you not to know if you got TEXT or a BLOB, and get the result as an any, and use the type to make the distinction.

40 replies

michaellenaghan Mar 21, 2025
Author

Here's a quick version of original (the tagged version); modified (this morning's change); and simpler (this morning's change + simpler query).

These changes are particularly dramatic:

Baseline/Select1-4                                                                7.913µ ± ∞ ¹    6.412µ ± ∞ ¹        ~ (p=1.000 n=1) ²       2.047µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Baseline/Select1PrePrepared-4                                                    4638.0n ± ∞ ¹   3519.0n ± ∞ ¹        ~ (p=1.000 n=1) ²       642.6n ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/JSON-4                                                                     1474.3µ ± ∞ ¹   1073.9µ ± ∞ ¹        ~ (p=1.000 n=1) ²       461.2µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/NonRecursiveCTE-4                                                          1596.6µ ± ∞ ¹   1109.6µ ± ∞ ¹        ~ (p=1.000 n=1) ²       542.8µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/OrderBy-4                                                                   9.694m ± ∞ ¹    7.298m ± ∞ ¹        ~ (p=1.000 n=1) ²       2.139m ± ∞ ¹        ~ (p=1.000 n=1) ²

Here's everything:

goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                                                           │ bench_ncruces_driver_original.txt │   bench_ncruces_driver_modified.txt    │ bench_ncruces_driver_modified_simpler.txt │
                                                           │              sec/op               │    sec/op      vs base                 │      sec/op       vs base                 │
Baseline/Conn-4                                                                   218.1n ± ∞ ¹    219.9n ± ∞ ¹        ~ (p=1.000 n=1) ²       216.4n ± ∞ ¹        ~ (p=1.000 n=1) ²
Baseline/Select1-4                                                                7.913µ ± ∞ ¹    6.412µ ± ∞ ¹        ~ (p=1.000 n=1) ²       2.047µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Baseline/Select1PrePrepared-4                                                    4638.0n ± ∞ ¹   3519.0n ± ∞ ¹        ~ (p=1.000 n=1) ²       642.6n ± ∞ ¹        ~ (p=1.000 n=1) ²
Populate/PopulateDBWithTxs-4                                                      53.49m ± ∞ ¹    50.05m ± ∞ ¹        ~ (p=1.000 n=1) ²       42.46m ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/ReadPostWithTx-4                                                        72.79µ ± ∞ ¹    69.34µ ± ∞ ¹        ~ (p=1.000 n=1) ²       37.84µ ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/ReadPostAndCommentsWithTx-4                                            163.70µ ± ∞ ¹   153.13µ ± ∞ ¹        ~ (p=1.000 n=1) ²       73.44µ ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/WritePostWithTx-4                                                       95.53µ ± ∞ ¹    93.87µ ± ∞ ¹        ~ (p=1.000 n=1) ²       82.10µ ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/WritePostAndCommentsWithTx-4                                            305.1µ ± ∞ ¹    276.8µ ± ∞ ¹        ~ (p=1.000 n=1) ²       239.9µ ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-4                        48.28µ ± ∞ ¹    46.85µ ± ∞ ¹        ~ (p=1.000 n=1) ²       33.35µ ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-4                        307.2µ ± ∞ ¹    306.1µ ± ∞ ¹        ~ (p=1.000 n=1) ²       242.2µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/Correlated-4                                                                8.973m ± ∞ ¹    6.993m ± ∞ ¹        ~ (p=1.000 n=1) ²       6.256m ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/GroupBy-4                                                                   217.1µ ± ∞ ¹    165.2µ ± ∞ ¹        ~ (p=1.000 n=1) ²       136.2µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/JSON-4                                                                     1474.3µ ± ∞ ¹   1073.9µ ± ∞ ¹        ~ (p=1.000 n=1) ²       461.2µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/NonRecursiveCTE-4                                                          1596.6µ ± ∞ ¹   1109.6µ ± ∞ ¹        ~ (p=1.000 n=1) ²       542.8µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/OrderBy-4                                                                   9.694m ± ∞ ¹    7.298m ± ∞ ¹        ~ (p=1.000 n=1) ²       2.139m ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/RecursiveCTE-4                                                              1.870m ± ∞ ¹    1.394m ± ∞ ¹        ~ (p=1.000 n=1) ²       1.318m ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                                           246.3µ          209.5µ        -14.94%                       124.2µ        -49.56%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

michaellenaghan Mar 21, 2025
Author

That's because database/sql calls SetInterrupt twice...
Whereas when using the low level API...

Not sure what you mean; all of the results I've been showing are for driver, not direct?

michaellenaghan Mar 21, 2025
Author

Not sure what you mean...

Ah, yes, now I understand. :-) But the real problem seems to be that recursive CTEs are particularly expensive to reset and/or step? The times for "modified" above just change the query and leave everything else in place, including the double calls. That one line change (combined with your change from this morning) results in a 2x average overall improvement — and far more on certain benchmarks. That's pretty remarkable.

ncruces Mar 21, 2025
Maintainer

Right, my point is that I probably didn't benchmark this, so I thought I was making an improvement, and failed.

This, by the way, is the reason good benchmarks are important. I don't want to be over fitting to the other go-sqlite-bench. So thanks.

Those two lines are all over the codebase, database/sql requires that dance all the time, so I should support them well. I'll fix it, don't worry. Just not immediately.

I'm all for optimizing realistic usages, as long as I can get sensible, testable semantics as well, playing nice with the Go scheduler, etc.

michaellenaghan Mar 24, 2025
Author

I tried the most recent change. It does resolve the performance issue. The difference from v0.24.1 is dramatic. (Again, a caveat: these are quick benchmarks with a single run, etc.)

I'm going to open a separate discussion for context cancellation. I read a lot of code over the weekend, and I have some thoughts. It might be easiest to express the thoughts in a pull request, to give us something concrete to talk about, but I'll start a discussion.

To be clear: I don't think that what I have in mind would result in (noticeably) better performance; rather, it would just simplify the code.

goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                                                           │ bench_ncruces_driver_original.txt │ bench_ncruces_driver_modified_v2.txt  │
                                                           │              sec/op               │    sec/op     vs base                 │
Baseline/Conn-4                                                                   215.3n ± ∞ ¹   216.1n ± ∞ ¹        ~ (p=1.000 n=1) ²
Baseline/Select1-4                                                                7.884µ ± ∞ ¹   1.891µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Baseline/Select1PrePrepared-4                                                    4521.0n ± ∞ ¹   513.6n ± ∞ ¹        ~ (p=1.000 n=1) ²
Populate/PopulateDBWithTxs-4                                                      50.27m ± ∞ ¹   40.63m ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/ReadPostWithTx-4                                                        68.29µ ± ∞ ¹   25.77µ ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/ReadPostAndCommentsWithTx-4                                            153.77µ ± ∞ ¹   61.15µ ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/WritePostWithTx-4                                                       85.27µ ± ∞ ¹   75.52µ ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/WritePostAndCommentsWithTx-4                                            282.5µ ± ∞ ¹   224.8µ ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-4                        48.97µ ± ∞ ¹   32.10µ ± ∞ ¹        ~ (p=1.000 n=1) ²
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-4                        287.3µ ± ∞ ¹   216.1µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/Correlated-4                                                                9.053m ± ∞ ¹   6.177m ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/GroupBy-4                                                                   196.7µ ± ∞ ¹   120.8µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/JSON-4                                                                     1478.8µ ± ∞ ¹   436.1µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/NonRecursiveCTE-4                                                          1638.2µ ± ∞ ¹   527.7µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/OrderBy-4                                                                   9.671m ± ∞ ¹   1.890m ± ∞ ¹        ~ (p=1.000 n=1) ²
Query/RecursiveCTE-4                                                              1.855m ± ∞ ¹   1.282m ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                                           238.0µ         112.5µ        -52.74%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

danp · 2025-03-15T16:05:23Z

danp
Mar 15, 2025

I'd be curious to hear if golang/go#67546 would help at all. I think it'd be great for at least pgx to have that support so I'm hoping the related CL makes it in for 1.25.

1 reply

ncruces Mar 15, 2025
Maintainer

It might, yes, for somethings at least. But it'd take some effort/imagination, so I'll get there once it's the standard. From an initial look, that helps with features more than raw performance. But it might be very useful to bridge the gap between SQLite's dynamic typing and Go.

I went the opposite way and added Stmt.Columns(dest ...any) error to the low level interface (implemented in C) which matches pretty much exactly what database/sql currently needs in a single call. The other SQLite drivers would probably benefit from something like this as well (if the authors are listening).

So I'm actually a bit surprised my driver generates more garbage than others, but a benchmark I can profile will help there.

michaellenaghan · 2025-03-19T02:11:22Z

michaellenaghan
Mar 19, 2025
Author

In case it's of interest, here's a table comparing the compile-time options of each of the implementations.

mattn	modernc	ncruces	tailscale	zombiezen
		ALLOW_URI_AUTHORITY
ATOMIC_INTRINSICS=1	ATOMIC_INTRINSICS=0	ATOMIC_INTRINSICS=1	ATOMIC_INTRINSICS=1	ATOMIC_INTRINSICS=0
COMPILER=clang-16.0.0	COMPILER=clang-16.0.0	COMPILER=clang-19.1.5	COMPILER=clang-16.0.0	COMPILER=clang-16.0.0
DEFAULT_AUTOVACUUM	DEFAULT_AUTOVACUUM	DEFAULT_AUTOVACUUM	DEFAULT_AUTOVACUUM	DEFAULT_AUTOVACUUM
DEFAULT_CACHE_SIZE=-2000	DEFAULT_CACHE_SIZE=-2000	DEFAULT_CACHE_SIZE=-2000	DEFAULT_CACHE_SIZE=-2000	DEFAULT_CACHE_SIZE=-2000
DEFAULT_FILE_FORMAT=4	DEFAULT_FILE_FORMAT=4	DEFAULT_FILE_FORMAT=4	DEFAULT_FILE_FORMAT=4	DEFAULT_FILE_FORMAT=4
		DEFAULT_FOREIGN_KEYS
DEFAULT_JOURNAL_SIZE_LIMIT=-1	DEFAULT_JOURNAL_SIZE_LIMIT=-1	DEFAULT_JOURNAL_SIZE_LIMIT=-1	DEFAULT_JOURNAL_SIZE_LIMIT=-1	DEFAULT_JOURNAL_SIZE_LIMIT=-1
	DEFAULT_MEMSTATUS=0		DEFAULT_MEMSTATUS=0	DEFAULT_MEMSTATUS=0
DEFAULT_MMAP_SIZE=0	DEFAULT_MMAP_SIZE=0	DEFAULT_MMAP_SIZE=0	DEFAULT_MMAP_SIZE=0	DEFAULT_MMAP_SIZE=0
DEFAULT_PAGE_SIZE=4096	DEFAULT_PAGE_SIZE=4096	DEFAULT_PAGE_SIZE=4096	DEFAULT_PAGE_SIZE=4096	DEFAULT_PAGE_SIZE=4096
DEFAULT_PCACHE_INITSZ=20	DEFAULT_PCACHE_INITSZ=20	DEFAULT_PCACHE_INITSZ=20	DEFAULT_PCACHE_INITSZ=20	DEFAULT_PCACHE_INITSZ=20
DEFAULT_RECURSIVE_TRIGGERS	DEFAULT_RECURSIVE_TRIGGERS	DEFAULT_RECURSIVE_TRIGGERS	DEFAULT_RECURSIVE_TRIGGERS	DEFAULT_RECURSIVE_TRIGGERS
DEFAULT_SECTOR_SIZE=4096	DEFAULT_SECTOR_SIZE=4096	DEFAULT_SECTOR_SIZE=4096	DEFAULT_SECTOR_SIZE=4096	DEFAULT_SECTOR_SIZE=4096
DEFAULT_SYNCHRONOUS=2	DEFAULT_SYNCHRONOUS=2	DEFAULT_SYNCHRONOUS=2	DEFAULT_SYNCHRONOUS=2	DEFAULT_SYNCHRONOUS=2
DEFAULT_WAL_AUTOCHECKPOINT=1000	DEFAULT_WAL_AUTOCHECKPOINT=1000	DEFAULT_WAL_AUTOCHECKPOINT=1000	DEFAULT_WAL_AUTOCHECKPOINT=1000	DEFAULT_WAL_AUTOCHECKPOINT=1000
DEFAULT_WAL_SYNCHRONOUS=1	DEFAULT_WAL_SYNCHRONOUS=2	DEFAULT_WAL_SYNCHRONOUS=1	DEFAULT_WAL_SYNCHRONOUS=1	DEFAULT_WAL_SYNCHRONOUS=2
DEFAULT_WORKER_THREADS=0	DEFAULT_WORKER_THREADS=0	DEFAULT_WORKER_THREADS=0	DEFAULT_WORKER_THREADS=0	DEFAULT_WORKER_THREADS=0
DIRECT_OVERFLOW_READ	DIRECT_OVERFLOW_READ	DIRECT_OVERFLOW_READ	DIRECT_OVERFLOW_READ	DIRECT_OVERFLOW_READ
		DQS=0	DQS=0
		ENABLE_ATOMIC_WRITE
		ENABLE_BATCH_ATOMIC_WRITE
	ENABLE_COLUMN_METADATA	ENABLE_COLUMN_METADATA	ENABLE_COLUMN_METADATA	ENABLE_COLUMN_METADATA
	ENABLE_DBSTAT_VTAB		ENABLE_DBSTAT_VTAB	ENABLE_DBSTAT_VTAB
ENABLE_FTS3
ENABLE_FTS3_PARENTHESIS
	ENABLE_FTS5	ENABLE_FTS5	ENABLE_FTS5	ENABLE_FTS5
	ENABLE_GEOPOLY	ENABLE_GEOPOLY		ENABLE_GEOPOLY
	ENABLE_MATH_FUNCTIONS	ENABLE_MATH_FUNCTIONS		ENABLE_MATH_FUNCTIONS
	ENABLE_MEMORY_MANAGEMENT			ENABLE_MEMORY_MANAGEMENT
	ENABLE_OFFSET_SQL_FUNC			ENABLE_OFFSET_SQL_FUNC
	ENABLE_PREUPDATE_HOOK		ENABLE_PREUPDATE_HOOK	ENABLE_PREUPDATE_HOOK
	ENABLE_RBU			ENABLE_RBU
ENABLE_RTREE	ENABLE_RTREE	ENABLE_RTREE	ENABLE_RTREE	ENABLE_RTREE
	ENABLE_SESSION		ENABLE_SESSION	ENABLE_SESSION
	ENABLE_SNAPSHOT		ENABLE_SNAPSHOT	ENABLE_SNAPSHOT
	ENABLE_STAT4	ENABLE_STAT4	ENABLE_STAT4	ENABLE_STAT4
	ENABLE_UNLOCK_NOTIFY			ENABLE_UNLOCK_NOTIFY
ENABLE_UPDATE_DELETE_LIMIT
		HAVE_ISNAN
	LIKE_DOESNT_MATCH_BLOBS	LIKE_DOESNT_MATCH_BLOBS	LIKE_DOESNT_MATCH_BLOBS	LIKE_DOESNT_MATCH_BLOBS
MALLOC_SOFT_LIMIT=1024	MALLOC_SOFT_LIMIT=1024	MALLOC_SOFT_LIMIT=1024	MALLOC_SOFT_LIMIT=1024	MALLOC_SOFT_LIMIT=1024
MAX_ATTACHED=10	MAX_ATTACHED=10	MAX_ATTACHED=10	MAX_ATTACHED=10	MAX_ATTACHED=10
MAX_COLUMN=2000	MAX_COLUMN=2000	MAX_COLUMN=2000	MAX_COLUMN=2000	MAX_COLUMN=2000
MAX_COMPOUND_SELECT=500	MAX_COMPOUND_SELECT=500	MAX_COMPOUND_SELECT=500	MAX_COMPOUND_SELECT=500	MAX_COMPOUND_SELECT=500
MAX_DEFAULT_PAGE_SIZE=8192	MAX_DEFAULT_PAGE_SIZE=8192	MAX_DEFAULT_PAGE_SIZE=8192	MAX_DEFAULT_PAGE_SIZE=8192	MAX_DEFAULT_PAGE_SIZE=8192
MAX_EXPR_DEPTH=1000	MAX_EXPR_DEPTH=1000	MAX_EXPR_DEPTH=0	MAX_EXPR_DEPTH=0	MAX_EXPR_DEPTH=1000
MAX_FUNCTION_ARG=127	MAX_FUNCTION_ARG=1000	MAX_FUNCTION_ARG=1000	MAX_FUNCTION_ARG=127	MAX_FUNCTION_ARG=1000
MAX_LENGTH=1000000000	MAX_LENGTH=1000000000	MAX_LENGTH=1000000000	MAX_LENGTH=1000000000	MAX_LENGTH=1000000000
MAX_LIKE_PATTERN_LENGTH=50000	MAX_LIKE_PATTERN_LENGTH=50000	MAX_LIKE_PATTERN_LENGTH=50000	MAX_LIKE_PATTERN_LENGTH=50000	MAX_LIKE_PATTERN_LENGTH=50000
MAX_MMAP_SIZE=0x7fff0000	MAX_MMAP_SIZE=0x7fff0000	MAX_MMAP_SIZE=0	MAX_MMAP_SIZE=0x7fff0000	MAX_MMAP_SIZE=0x7fff0000
MAX_PAGE_COUNT=0xfffffffe	MAX_PAGE_COUNT=0xfffffffe	MAX_PAGE_COUNT=0xfffffffe	MAX_PAGE_COUNT=0xfffffffe	MAX_PAGE_COUNT=0xfffffffe
MAX_PAGE_SIZE=65536	MAX_PAGE_SIZE=65536	MAX_PAGE_SIZE=65536	MAX_PAGE_SIZE=65536	MAX_PAGE_SIZE=65536
MAX_SQL_LENGTH=1000000000	MAX_SQL_LENGTH=1000000000	MAX_SQL_LENGTH=1000000000	MAX_SQL_LENGTH=1000000000	MAX_SQL_LENGTH=1000000000
MAX_TRIGGER_DEPTH=1000	MAX_TRIGGER_DEPTH=1000	MAX_TRIGGER_DEPTH=1000	MAX_TRIGGER_DEPTH=1000	MAX_TRIGGER_DEPTH=1000
MAX_VARIABLE_NUMBER=32766	MAX_VARIABLE_NUMBER=32766	MAX_VARIABLE_NUMBER=32766	MAX_VARIABLE_NUMBER=32766	MAX_VARIABLE_NUMBER=32766
MAX_VDBE_OP=250000000	MAX_VDBE_OP=250000000	MAX_VDBE_OP=250000000	MAX_VDBE_OP=250000000	MAX_VDBE_OP=250000000
MAX_WORKER_THREADS=8	MAX_WORKER_THREADS=8	MAX_WORKER_THREADS=0	MAX_WORKER_THREADS=8	MAX_WORKER_THREADS=8
	MUTEX_NOOP			MUTEX_NOOP
		MUTEX_OMIT
MUTEX_PTHREADS			MUTEX_PTHREADS
		OMIT_AUTOINIT	OMIT_AUTOINIT
OMIT_DEPRECATED		OMIT_DEPRECATED	OMIT_DEPRECATED
		OMIT_DESERIALIZE
		OMIT_LOAD_EXTENSION	OMIT_LOAD_EXTENSION
			OMIT_PROGRESS_CALLBACK
		OMIT_SHARED_CACHE	OMIT_SHARED_CACHE	SOUNDEX
	SOUNDEX	SOUNDEX
SYSTEM_MALLOC	SYSTEM_MALLOC	SYSTEM_MALLOC	SYSTEM_MALLOC	SYSTEM_MALLOC
TEMP_STORE=1	TEMP_STORE=1	TEMP_STORE=1	TEMP_STORE=1	TEMP_STORE=1
THREADSAFE=1	THREADSAFE=1	THREADSAFE=0	THREADSAFE=2	THREADSAFE=1
		UNTESTABLE
		USE_ALLOCA	USE_ALLOCA

1 reply

michaellenaghan Mar 19, 2025
Author

Ah, and here are the compile-time options from the sqlite3 that ships with macOS:

ATOMIC_INTRINSICS=1
BUG_COMPATIBLE_20160819
CCCRYPT256
COMPILER=clang-16.0.0
DEFAULT_AUTOVACUUM
DEFAULT_CACHE_SIZE=2000
DEFAULT_CKPTFULLFSYNC
DEFAULT_FILE_FORMAT=4
DEFAULT_JOURNAL_SIZE_LIMIT=32768
DEFAULT_LOOKASIDE=1200,102
DEFAULT_MEMSTATUS=0
DEFAULT_MMAP_SIZE=0
DEFAULT_PAGE_SIZE=4096
DEFAULT_PCACHE_INITSZ=20
DEFAULT_RECURSIVE_TRIGGERS
DEFAULT_SECTOR_SIZE=4096
DEFAULT_SYNCHRONOUS=2
DEFAULT_WAL_AUTOCHECKPOINT=1000
DEFAULT_WAL_SYNCHRONOUS=1
DEFAULT_WORKER_THREADS=0
DQS=3
ENABLE_API_ARMOR
ENABLE_BYTECODE_VTAB
ENABLE_COLUMN_METADATA
ENABLE_DBPAGE_VTAB
ENABLE_DBSTAT_VTAB
ENABLE_EXPLAIN_COMMENTS
ENABLE_FTS3
ENABLE_FTS3_PARENTHESIS
ENABLE_FTS3_TOKENIZER
ENABLE_FTS4
ENABLE_FTS5
ENABLE_LOCKING_STYLE=1
ENABLE_MATH_FUNCTIONS
ENABLE_NORMALIZE
ENABLE_PREUPDATE_HOOK
ENABLE_RTREE
ENABLE_SESSION
ENABLE_SNAPSHOT
ENABLE_SQLLOG
ENABLE_STMT_SCANSTATUS
ENABLE_UNKNOWN_SQL_FUNCTION
ENABLE_UPDATE_DELETE_LIMIT
HAS_CODEC_RESTRICTED
HAVE_ISNAN
MALLOC_SOFT_LIMIT=1024
MAX_ATTACHED=10
MAX_COLUMN=2000
MAX_COMPOUND_SELECT=500
MAX_DEFAULT_PAGE_SIZE=8192
MAX_EXPR_DEPTH=1000
MAX_FUNCTION_ARG=127
MAX_LENGTH=2147483645
MAX_LIKE_PATTERN_LENGTH=50000
MAX_MMAP_SIZE=1073741824
MAX_PAGE_COUNT=1073741823
MAX_PAGE_SIZE=65536
MAX_SQL_LENGTH=1000000000
MAX_TRIGGER_DEPTH=1000
MAX_VARIABLE_NUMBER=500000
MAX_VDBE_OP=250000000
MAX_WORKER_THREADS=8
MUTEX_UNFAIR
OMIT_AUTORESET
OMIT_LOAD_EXTENSION
STMTJRNL_SPILL=131072
SYSTEM_MALLOC
TEMP_STORE=1
THREADSAFE=2
USE_URI

michaellenaghan · 2025-03-20T02:55:10Z

michaellenaghan
Mar 20, 2025
Author

There's now a repo at https://github.com/michaellenaghan/go-sqlite-bench.

I haven't written any docs yet. I will; there's lots to explain in terms of goals, rules, etc.

As far as the benchmarks themselves go: I've made various changes.

One important change is that I used to run many benchmarks in parallel. I think that made results hard to interpret. Now only the "ReadOrWrite" benchmarks run in parallel; everything else runs in a b.Loop(). (With that change, running multiple -cpu values became less important. It's still possible, but it's no longer done by default.)

The results below were generated using:

make benchstat-by-category BENCH_BIG=1 BENCH_SLOW=1

To run a subset of implementations, use TAGS:

make benchstat-by-category BENCH_BIG=1 BENCH_SLOW=1 TAGS="ncruces_direct ncruces_driver"

The first tag listed becomes the baseline in the benchstats. For example, to make everything relative to tailscale, list tailscale_driver first. (ncruces_direct is first by default because that's what I'm interested in. :-)

Benchmarks tee to bench_*.txt files.

Benchstat runs off of the bench_*.txt files.

Tests tee to test_*.txt files.

The tests log useful info like compile-time options and the values of selected PRAGMAs.

I haven't had time to look into the "why" of any of the differences.

I'd like to add tailscale_direct, but I need to finish some other (unrelated) work first.

On to the results...

Baseline

benchstat bench_ncruces_direct.txt bench_ncruces_driver.txt bench_modernc_driver.txt bench_zombiezen_direct.txt bench_mattn_driver.txt bench_tailscale_driver.txt
goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                   │ bench_ncruces_direct.txt │      bench_ncruces_driver.txt       │      bench_modernc_driver.txt      │     bench_zombiezen_direct.txt     │       bench_mattn_driver.txt       │     bench_tailscale_driver.txt     │
                   │          sec/op          │   sec/op     vs base                │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │
Baseline/Select1-4                1.876µ ± 4%   7.734µ ± 3%  +312.23% (p=0.002 n=6)   2.093µ ± 1%  +11.57% (p=0.002 n=6)   2.741µ ± 1%  +46.11% (p=0.002 n=6)   2.650µ ± 2%  +41.26% (p=0.002 n=6)   1.107µ ± 2%  -40.99% (p=0.002 n=6)

                   │ bench_ncruces_direct.txt │     bench_ncruces_driver.txt      │      bench_modernc_driver.txt       │     bench_zombiezen_direct.txt      │       bench_mattn_driver.txt        │     bench_tailscale_driver.txt      │
                   │           B/op           │    B/op     vs base               │    B/op      vs base                │    B/op      vs base                │    B/op      vs base                │    B/op      vs base                │
Baseline/Select1-4                 80.00 ± 0%   32.00 ± 0%  -60.00% (p=0.002 n=6)   252.00 ± 0%  +215.00% (p=0.002 n=6)   713.00 ± 0%  +791.25% (p=0.002 n=6)   288.00 ± 0%  +260.00% (p=0.002 n=6)   240.00 ± 0%  +200.00% (p=0.002 n=6)

                   │ bench_ncruces_direct.txt │     bench_ncruces_driver.txt      │      bench_modernc_driver.txt      │     bench_zombiezen_direct.txt     │       bench_mattn_driver.txt       │     bench_tailscale_driver.txt     │
                   │        allocs/op         │ allocs/op   vs base               │ allocs/op   vs base                │ allocs/op   vs base                │ allocs/op   vs base                │ allocs/op   vs base                │
Baseline/Select1-4                 2.000 ± 0%   1.000 ± 0%  -50.00% (p=0.002 n=6)   7.000 ± 0%  +250.00% (p=0.002 n=6)   9.000 ± 0%  +350.00% (p=0.002 n=6)   9.000 ± 0%  +350.00% (p=0.002 n=6)   5.000 ± 0%  +150.00% (p=0.002 n=6)

Populate

benchstat bench_ncruces_direct.txt bench_ncruces_driver.txt bench_modernc_driver.txt bench_zombiezen_direct.txt bench_mattn_driver.txt bench_tailscale_driver.txt
goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                             │ bench_ncruces_direct.txt │      bench_ncruces_driver.txt      │      bench_modernc_driver.txt      │    bench_zombiezen_direct.txt     │       bench_mattn_driver.txt       │     bench_tailscale_driver.txt     │
                             │          sec/op          │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base              │   sec/op     vs base               │   sec/op     vs base               │
Populate/PopulateDB-4                       2.576 ± 11%   2.763 ±  6%        ~ (p=0.093 n=6)   2.765 ±  6%        ~ (p=0.065 n=6)   2.461 ±  6%       ~ (p=0.394 n=6)   2.229 ±  9%  -13.47% (p=0.009 n=6)   2.050 ± 46%        ~ (p=0.394 n=6)
Populate/PopulateDBWithTx-4                 1.277 ±  4%   1.557 ± 80%  +21.99% (p=0.002 n=6)   1.556 ±  9%  +21.88% (p=0.002 n=6)   1.305 ±  9%       ~ (p=0.818 n=6)   1.214 ±  8%   -4.92% (p=0.026 n=6)   1.085 ±  8%  -14.99% (p=0.002 n=6)
Populate/PopulateDBWithTxs-4                1.599 ± 14%   1.806 ±  9%  +12.94% (p=0.041 n=6)   1.811 ± 11%  +13.23% (p=0.002 n=6)   1.656 ± 10%       ~ (p=0.394 n=6)   1.475 ± 22%        ~ (p=0.699 n=6)   1.484 ± 10%        ~ (p=0.240 n=6)
geomean                                     1.739         1.981        +13.90%                 1.982        +13.99%                 1.746        +0.38%                 1.586         -8.78%                 1.489        -14.37%

                             │ bench_ncruces_direct.txt │       bench_ncruces_driver.txt        │        bench_modernc_driver.txt        │      bench_zombiezen_direct.txt      │          bench_mattn_driver.txt          │      bench_tailscale_driver.txt       │
                             │           B/op           │     B/op       vs base                │     B/op       vs base                 │     B/op      vs base                │      B/op       vs base                  │     B/op       vs base                │
Populate/PopulateDB-4                     11.668Mi ± 1%   28.351Mi ± 0%  +142.99% (p=0.002 n=6)   34.817Mi ± 0%   +198.41% (p=0.002 n=6)   6.161Mi ± 2%   -47.19% (p=0.002 n=6)   273.194Mi ± 0%   +2241.45% (p=0.002 n=6)   18.956Mi ± 0%   +62.47% (p=0.002 n=6)
Populate/PopulateDBWithTx-4                2.274Mi ± 4%   18.959Mi ± 0%  +733.91% (p=0.002 n=6)   34.811Mi ± 0%  +1431.16% (p=0.002 n=6)   6.161Mi ± 0%  +170.98% (p=0.002 n=6)   273.193Mi ± 0%  +11916.30% (p=0.002 n=6)   18.956Mi ± 0%  +733.77% (p=0.002 n=6)
Populate/PopulateDBWithTxs-4               2.470Mi ± 0%   20.169Mi ± 0%  +716.55% (p=0.002 n=6)   35.833Mi ± 0%  +1350.71% (p=0.002 n=6)   6.207Mi ± 0%  +151.28% (p=0.002 n=6)   274.462Mi ± 0%  +11011.59% (p=0.002 n=6)   20.204Mi ± 0%  +717.95% (p=0.002 n=6)
geomean                                    4.031Mi         22.13Mi       +448.99%                  35.15Mi        +771.91%                 6.176Mi        +53.20%                   273.6Mi        +6686.99%                  19.36Mi       +380.30%

                             │ bench_ncruces_direct.txt │      bench_ncruces_driver.txt       │       bench_modernc_driver.txt       │     bench_zombiezen_direct.txt      │        bench_mattn_driver.txt        │     bench_tailscale_driver.txt      │
                             │        allocs/op         │  allocs/op   vs base                │  allocs/op    vs base                │  allocs/op   vs base                │  allocs/op    vs base                │  allocs/op   vs base                │
Populate/PopulateDB-4                       447.8k ± 0%   905.8k ± 1%  +102.30% (p=0.002 n=6)   1463.3k ± 1%  +226.80% (p=0.002 n=6)   445.1k ± 3%    -0.60% (p=0.002 n=6)   1057.5k ± 1%  +136.17% (p=0.002 n=6)   598.1k ± 1%   +33.57% (p=0.002 n=6)
Populate/PopulateDBWithTx-4                 140.0k ± 0%   598.1k ± 0%  +327.14% (p=0.002 n=6)   1463.2k ± 0%  +944.96% (p=0.002 n=6)   445.1k ± 0%  +217.86% (p=0.002 n=6)   1057.5k ± 0%  +655.21% (p=0.002 n=6)   598.1k ± 0%  +327.12% (p=0.002 n=6)
Populate/PopulateDBWithTxs-4                146.5k ± 0%   625.5k ± 0%  +327.09% (p=0.002 n=6)   1484.3k ± 0%  +913.37% (p=0.002 n=6)   448.1k ± 0%  +205.93% (p=0.002 n=6)   1092.5k ± 0%  +645.87% (p=0.002 n=6)   623.2k ± 0%  +325.46% (p=0.002 n=6)
geomean                                     209.4k        697.2k       +232.94%                  1.470M       +602.08%                 446.1k       +113.02%                  1.069M       +410.48%                 606.3k       +189.54%

ReadWrite

benchstat bench_ncruces_direct.txt bench_ncruces_driver.txt bench_modernc_driver.txt bench_zombiezen_direct.txt bench_mattn_driver.txt bench_tailscale_driver.txt
goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                                                           │ bench_ncruces_direct.txt │       bench_ncruces_driver.txt       │       bench_modernc_driver.txt        │     bench_zombiezen_direct.txt      │        bench_mattn_driver.txt        │     bench_tailscale_driver.txt     │
                                                           │          sec/op          │    sec/op     vs base                │    sec/op      vs base                │    sec/op     vs base               │    sec/op     vs base                │   sec/op     vs base               │
ReadWrite/ReadPost-4                                                     13.59µ ±  1%    37.17µ ± 1%  +173.48% (p=0.002 n=6)    25.29µ ±  2%   +86.05% (p=0.002 n=6)    20.81µ ± 2%  +53.13% (p=0.002 n=6)    18.16µ ± 3%   +33.62% (p=0.002 n=6)   16.64µ ± 1%  +22.44% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-4                                               15.59µ ±  1%    78.76µ ± 1%  +405.28% (p=0.002 n=6)    27.34µ ±  4%   +75.38% (p=0.002 n=6)    21.33µ ± 1%  +36.85% (p=0.002 n=6)    21.39µ ± 1%   +37.24% (p=0.002 n=6)   20.56µ ± 1%  +31.92% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-4                                          80.04µ ±  3%   360.06µ ± 1%  +349.83% (p=0.002 n=6)   169.35µ ±  1%  +111.58% (p=0.002 n=6)    91.31µ ± 3%  +14.07% (p=0.002 n=6)   174.62µ ± 2%  +118.15% (p=0.002 n=6)   87.00µ ± 1%   +8.69% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx-4                                    80.25µ ±  2%   397.80µ ± 1%  +395.69% (p=0.002 n=6)   169.06µ ±  1%  +110.66% (p=0.002 n=6)    90.02µ ± 1%  +12.17% (p=0.002 n=6)   176.94µ ± 1%  +120.48% (p=0.002 n=6)   92.07µ ± 1%  +14.73% (p=0.002 n=6)
ReadWrite/WritePost-4                                                    137.4µ ±  8%    152.6µ ± 3%   +11.08% (p=0.002 n=6)    139.7µ ± 15%         ~ (p=0.180 n=6)    179.1µ ± 3%  +30.34% (p=0.002 n=6)    129.1µ ± 3%         ~ (p=0.093 n=6)   116.7µ ± 4%  -15.04% (p=0.002 n=6)
ReadWrite/WritePostWithTx-4                                              141.0µ ±  2%    164.6µ ± 2%   +16.78% (p=0.002 n=6)    145.1µ ±  1%    +2.97% (p=0.015 n=6)    181.6µ ± 3%  +28.80% (p=0.002 n=6)    130.7µ ± 5%    -7.25% (p=0.002 n=6)   131.6µ ± 5%   -6.61% (p=0.002 n=6)
ReadWrite/WritePostAndComments-4                                         2.557m ±  5%    2.863m ± 2%   +11.96% (p=0.002 n=6)    2.906m ±  5%   +13.63% (p=0.002 n=6)    2.776m ± 1%   +8.55% (p=0.002 n=6)    2.412m ± 9%    -5.66% (p=0.026 n=6)   2.131m ± 5%  -16.68% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-4                                   979.1µ ±  6%   1246.0µ ± 1%   +27.26% (p=0.002 n=6)   1221.3µ ±  2%   +24.73% (p=0.002 n=6)   1275.3µ ± 5%  +30.26% (p=0.002 n=6)    953.6µ ± 2%    -2.61% (p=0.015 n=6)   804.9µ ± 2%  -17.79% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-4                     341.3µ ±  5%    425.6µ ± 8%   +24.72% (p=0.002 n=6)    393.8µ ±  7%   +15.38% (p=0.002 n=6)    380.8µ ± 6%  +11.59% (p=0.002 n=6)    305.7µ ± 7%   -10.43% (p=0.004 n=6)   264.3µ ± 6%  -22.55% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-4                     2.667m ±  1%    3.125m ± 3%   +17.18% (p=0.002 n=6)    2.557m ±  3%    -4.10% (p=0.002 n=6)    2.583m ± 2%   -3.12% (p=0.002 n=6)    2.184m ± 2%   -18.10% (p=0.002 n=6)   1.965m ± 5%  -26.32% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-4               100.3µ ± 14%    159.7µ ± 4%   +59.23% (p=0.002 n=6)    226.5µ ±  5%  +125.85% (p=0.002 n=6)    199.0µ ± 5%  +98.43% (p=0.002 n=6)    138.4µ ± 2%   +38.02% (p=0.002 n=6)   113.1µ ± 6%        ~ (p=0.065 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-4               756.4µ ±  3%   1030.8µ ± 4%   +36.28% (p=0.002 n=6)   1158.1µ ±  3%   +53.11% (p=0.002 n=6)   1168.4µ ± 5%  +54.47% (p=0.002 n=6)    937.8µ ± 4%   +23.98% (p=0.002 n=6)   730.8µ ± 5%   -3.38% (p=0.041 n=6)
geomean                                                                  201.8µ          376.9µ        +86.81%                  291.5µ         +44.46%                  260.1µ       +28.93%                  242.0µ        +19.95%                 195.4µ        -3.15%

                                                           │ bench_ncruces_direct.txt │        bench_ncruces_driver.txt        │        bench_modernc_driver.txt         │       bench_zombiezen_direct.txt        │          bench_mattn_driver.txt           │       bench_tailscale_driver.txt        │
                                                           │           B/op           │     B/op       vs base                 │     B/op       vs base                  │      B/op       vs base                 │      B/op       vs base                   │      B/op       vs base                 │
ReadWrite/ReadPost-4                                                    40.30Ki ±  0%    41.43Ki ± 0%     +2.81% (p=0.002 n=6)    81.32Ki ± 0%    +101.81% (p=0.002 n=6)    40.99Ki ±  0%     +1.73% (p=0.002 n=6)     41.43Ki ± 0%       +2.81% (p=0.002 n=6)    41.41Ki ±  0%     +2.75% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-4                                              40.30Ki ±  0%    41.98Ki ± 0%     +4.16% (p=0.002 n=6)    81.87Ki ± 0%    +103.16% (p=0.002 n=6)    41.04Ki ±  0%     +1.84% (p=0.002 n=6)     42.16Ki ± 0%       +4.61% (p=0.002 n=6)    41.85Ki ±  0%     +3.86% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-4                                         250.4Ki ±  0%    259.4Ki ± 0%     +3.62% (p=0.002 n=6)    505.2Ki ± 0%    +101.77% (p=0.002 n=6)    251.5Ki ±  0%     +0.44% (p=0.002 n=6)     265.0Ki ± 0%       +5.84% (p=0.002 n=6)    258.6Ki ±  0%     +3.29% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx-4                                   250.3Ki ±  0%    259.9Ki ± 0%     +3.86% (p=0.002 n=6)    505.7Ki ± 0%    +102.07% (p=0.002 n=6)    251.5Ki ±  0%     +0.50% (p=0.002 n=6)     265.7Ki ± 0%       +6.17% (p=0.002 n=6)    259.1Ki ±  0%     +3.51% (p=0.002 n=6)
ReadWrite/WritePost-4                                                     226.0 ±  0%      644.0 ± 0%   +184.96% (p=0.002 n=6)      528.0 ± 0%    +133.63% (p=0.002 n=6)      649.5 ±  1%   +187.39% (p=0.002 n=6)     42464.0 ± 0%   +18689.38% (p=0.002 n=6)      432.0 ±  0%    +91.15% (p=0.002 n=6)
ReadWrite/WritePostWithTx-4                                               226.0 ±  0%     1045.5 ± 0%   +362.61% (p=0.002 n=6)     1113.0 ± 0%    +392.48% (p=0.002 n=6)      697.5 ±  0%   +208.63% (p=0.002 n=6)     43219.0 ± 0%   +19023.45% (p=0.002 n=6)      857.0 ±  0%   +279.20% (p=0.002 n=6)
ReadWrite/WritePostAndComments-4                                        9.647Ki ±  0%   24.711Ki ± 0%   +156.14% (p=0.002 n=6)   30.447Ki ± 0%    +215.60% (p=0.002 n=6)   17.058Ki ±  0%    +76.81% (p=0.002 n=6)   309.984Ki ± 0%    +3113.12% (p=0.002 n=6)   15.297Ki ±  0%    +58.56% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-4                                    237.0 ±  0%    15510.5 ± 0%  +6444.51% (p=0.002 n=6)    31208.0 ± 0%  +13067.93% (p=0.002 n=6)    17511.5 ±  0%  +7288.82% (p=0.002 n=6)    317627.0 ± 0%  +133919.83% (p=0.002 n=6)    15551.5 ±  0%  +6461.81% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-4                    226.5Ki ±  0%    238.0Ki ± 1%     +5.05% (p=0.002 n=6)    456.8Ki ± 1%    +101.63% (p=0.002 n=6)    227.2Ki ±  1%          ~ (p=0.180 n=6)     269.5Ki ± 0%      +18.95% (p=0.002 n=6)    234.1Ki ±  1%     +3.34% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-4                    38.47Ki ± 15%    54.39Ki ± 6%    +41.38% (p=0.002 n=6)    80.62Ki ± 3%    +109.57% (p=0.002 n=6)    40.81Ki ± 11%     +6.08% (p=0.026 n=6)    305.65Ki ± 0%     +694.52% (p=0.002 n=6)    39.44Ki ±  9%          ~ (p=0.394 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-4              225.6Ki ±  0%    235.6Ki ± 0%     +4.41% (p=0.002 n=6)    458.7Ki ± 1%    +103.32% (p=0.002 n=6)    227.9Ki ±  1%     +1.03% (p=0.002 n=6)     270.1Ki ± 0%      +19.72% (p=0.002 n=6)    234.7Ki ±  0%     +4.02% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-4              25.95Ki ± 12%    40.64Ki ± 7%    +56.61% (p=0.002 n=6)    78.64Ki ± 5%    +203.06% (p=0.002 n=6)    40.42Ki ±  9%    +55.75% (p=0.002 n=6)    305.81Ki ± 0%    +1078.51% (p=0.002 n=6)    39.40Ki ± 10%    +51.86% (p=0.002 n=6)
geomean                                                                 16.96Ki          35.10Ki        +106.99%                  57.00Ki         +236.14%                  31.98Ki          +88.62%                   151.0Ki          +790.70%                  31.08Ki          +83.31%

                                                           │ bench_ncruces_direct.txt │       bench_ncruces_driver.txt        │        bench_modernc_driver.txt         │       bench_zombiezen_direct.txt       │         bench_mattn_driver.txt         │      bench_tailscale_driver.txt       │
                                                           │        allocs/op         │  allocs/op    vs base                 │   allocs/op    vs base                  │  allocs/op    vs base                  │  allocs/op    vs base                  │  allocs/op    vs base                 │
ReadWrite/ReadPost-4                                                       9.000 ± 0%    41.000 ± 0%   +355.56% (p=0.002 n=6)     49.000 ± 0%    +444.44% (p=0.002 n=6)    26.000 ± 0%    +188.89% (p=0.002 n=6)    43.000 ± 0%    +377.78% (p=0.002 n=6)    34.000 ± 0%   +277.78% (p=0.002 n=6)
ReadWrite/ReadPostWithTx-4                                                 9.000 ± 0%    55.000 ± 0%   +511.11% (p=0.002 n=6)     60.000 ± 0%    +566.67% (p=0.002 n=6)    29.000 ± 0%    +222.22% (p=0.002 n=6)    64.000 ± 0%    +611.11% (p=0.002 n=6)    45.000 ± 0%   +400.00% (p=0.002 n=6)
ReadWrite/ReadPostAndComments-4                                            269.0 ± 0%     695.0 ± 0%   +158.36% (p=0.002 n=6)     1204.0 ± 0%    +347.58% (p=0.002 n=6)     302.0 ± 0%     +12.27% (p=0.002 n=6)     702.0 ± 0%    +160.97% (p=0.002 n=6)     582.0 ± 0%   +116.36% (p=0.002 n=6)
ReadWrite/ReadPostAndCommentsWithTx-4                                      266.0 ± 0%     708.0 ± 0%   +166.17% (p=0.002 n=6)     1217.0 ± 0%    +357.52% (p=0.002 n=6)     305.0 ± 0%     +14.66% (p=0.002 n=6)     725.0 ± 0%    +172.56% (p=0.002 n=6)     595.0 ± 0%   +123.68% (p=0.002 n=6)
ReadWrite/WritePost-4                                                      7.000 ± 0%    16.000 ± 0%   +128.57% (p=0.002 n=6)     20.000 ± 0%    +185.71% (p=0.002 n=6)    18.000 ± 0%    +157.14% (p=0.002 n=6)    18.000 ± 0%    +157.14% (p=0.002 n=6)    10.000 ± 0%    +42.86% (p=0.002 n=6)
ReadWrite/WritePostWithTx-4                                                7.000 ± 0%    24.000 ± 0%   +242.86% (p=0.002 n=6)     32.000 ± 0%    +357.14% (p=0.002 n=6)    21.000 ± 0%    +200.00% (p=0.002 n=6)    40.000 ± 0%    +471.43% (p=0.002 n=6)    20.000 ± 0%   +185.71% (p=0.002 n=6)
ReadWrite/WritePostAndComments-4                                           308.0 ± 0%     732.0 ± 0%   +137.66% (p=0.002 n=6)     1184.0 ± 0%    +284.42% (p=0.002 n=6)     818.0 ± 0%    +165.58% (p=0.002 n=6)     833.0 ± 0%    +170.45% (p=0.002 n=6)     427.0 ± 0%    +38.64% (p=0.002 n=6)
ReadWrite/WritePostAndCommentsWithTx-4                                     7.000 ± 0%   434.000 ± 0%  +6100.00% (p=0.002 n=6)   1191.000 ± 0%  +16914.29% (p=0.002 n=6)   821.000 ± 0%  +11628.57% (p=0.002 n=6)   850.000 ± 0%  +12042.86% (p=0.002 n=6)   432.000 ± 0%  +6071.43% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=10-4                       284.5 ± 0%     715.0 ± 0%   +151.32% (p=0.002 n=6)     1202.0 ± 0%    +322.50% (p=0.002 n=6)     355.0 ± 1%     +24.78% (p=0.002 n=6)     715.0 ± 0%    +151.32% (p=0.002 n=6)     566.0 ± 0%    +98.95% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndComments/write_rate=90-4                       397.0 ± 1%     840.5 ± 1%   +111.71% (p=0.002 n=6)     1186.0 ± 0%    +198.74% (p=0.002 n=6)     765.5 ± 1%     +92.82% (p=0.002 n=6)     820.0 ± 0%    +106.55% (p=0.002 n=6)     442.0 ± 0%    +11.34% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=10-4                 241.0 ± 0%     680.5 ± 0%   +182.37% (p=0.002 n=6)     1215.0 ± 0%    +404.15% (p=0.002 n=6)     356.5 ± 1%     +47.93% (p=0.002 n=6)     737.0 ± 0%    +205.81% (p=0.002 n=6)     578.0 ± 0%   +139.83% (p=0.002 n=6)
ReadWrite/ReadOrWritePostAndCommentsWithTx/write_rate=90-4                 42.00 ± 7%    474.00 ± 0%  +1028.57% (p=0.002 n=6)    1193.50 ± 0%   +2741.67% (p=0.002 n=6)    769.50 ± 1%   +1732.14% (p=0.002 n=6)    837.00 ± 0%   +1892.86% (p=0.002 n=6)    447.50 ± 1%   +965.48% (p=0.002 n=6)
geomean                                                                    54.59          233.5        +327.71%                    376.2         +589.18%                   181.9         +233.27%                   282.5         +417.50%                   181.3        +232.21%

Query/Correlated

benchstat bench_ncruces_direct.txt bench_ncruces_driver.txt bench_modernc_driver.txt bench_zombiezen_direct.txt bench_mattn_driver.txt bench_tailscale_driver.txt
goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                   │ bench_ncruces_direct.txt │     bench_ncruces_driver.txt      │      bench_modernc_driver.txt      │     bench_zombiezen_direct.txt     │       bench_mattn_driver.txt       │     bench_tailscale_driver.txt     │
                   │          sec/op          │   sec/op     vs base              │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │
Query/Correlated-4                644.6m ± 1%   704.9m ± 1%  +9.36% (p=0.002 n=6)   306.8m ± 0%  -52.40% (p=0.002 n=6)   328.4m ± 3%  -49.06% (p=0.002 n=6)   175.8m ± 1%  -72.72% (p=0.002 n=6)   170.8m ± 2%  -73.51% (p=0.002 n=6)

                   │ bench_ncruces_direct.txt │        bench_ncruces_driver.txt         │        bench_modernc_driver.txt        │  bench_zombiezen_direct.txt   │          bench_mattn_driver.txt          │       bench_tailscale_driver.txt       │
                   │           B/op           │     B/op       vs base                  │     B/op      vs base                  │     B/op      vs base         │     B/op       vs base                   │     B/op      vs base                  │
Query/Correlated-4              176.0 ± 4757%   47367.5 ± 33%  +26813.35% (p=0.002 n=6)   71233.0 ± 0%  +40373.30% (p=0.002 n=6)   2469.0 ± 21%  ~ (p=0.052 n=6)   207186.0 ± 0%  +117619.32% (p=0.002 n=6)   47203.5 ± 0%  +26720.17% (p=0.002 n=6)

                   │ bench_ncruces_direct.txt │         bench_ncruces_driver.txt         │         bench_modernc_driver.txt         │     bench_zombiezen_direct.txt      │          bench_mattn_driver.txt          │       bench_tailscale_driver.txt        │
                   │        allocs/op         │   allocs/op    vs base                   │   allocs/op    vs base                   │  allocs/op   vs base                │   allocs/op    vs base                   │   allocs/op    vs base                  │
Query/Correlated-4                4.000 ± 25%   4772.000 ± 0%  +119200.00% (p=0.002 n=6)   7777.000 ± 0%  +194325.00% (p=0.002 n=6)   41.000 ± 5%  +925.00% (p=0.002 n=6)   6772.000 ± 0%  +169200.00% (p=0.002 n=6)   3769.000 ± 0%  +94125.00% (p=0.002 n=6)

Query/GroupBy

benchstat bench_ncruces_direct.txt bench_ncruces_driver.txt bench_modernc_driver.txt bench_zombiezen_direct.txt bench_mattn_driver.txt bench_tailscale_driver.txt
goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                │ bench_ncruces_direct.txt │      bench_ncruces_driver.txt      │      bench_modernc_driver.txt      │     bench_zombiezen_direct.txt     │       bench_mattn_driver.txt       │     bench_tailscale_driver.txt     │
                │          sec/op          │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │
Query/GroupBy-4                569.3µ ± 0%   947.7µ ± 0%  +66.47% (p=0.002 n=6)   507.8µ ± 1%  -10.80% (p=0.002 n=6)   481.0µ ± 1%  -15.51% (p=0.002 n=6)   342.1µ ± 0%  -39.91% (p=0.002 n=6)   266.5µ ± 0%  -53.19% (p=0.002 n=6)

                │ bench_ncruces_direct.txt │       bench_ncruces_driver.txt       │       bench_modernc_driver.txt       │     bench_zombiezen_direct.txt      │        bench_mattn_driver.txt        │      bench_tailscale_driver.txt      │
                │           B/op           │    B/op      vs base                 │    B/op      vs base                 │    B/op      vs base                │    B/op      vs base                 │    B/op      vs base                 │
Query/GroupBy-4                 128.0 ± 2%   1811.5 ± 1%  +1315.23% (p=0.002 n=6)   1992.0 ± 0%  +1456.25% (p=0.002 n=6)   360.0 ± 29%  +181.25% (p=0.002 n=6)   7194.5 ± 0%  +5520.70% (p=0.002 n=6)   1816.0 ± 0%  +1318.75% (p=0.002 n=6)

                │ bench_ncruces_direct.txt │       bench_ncruces_driver.txt       │       bench_modernc_driver.txt        │     bench_zombiezen_direct.txt     │        bench_mattn_driver.txt         │      bench_tailscale_driver.txt      │
                │        allocs/op         │  allocs/op   vs base                 │  allocs/op    vs base                 │  allocs/op   vs base               │  allocs/op    vs base                 │  allocs/op   vs base                 │
Query/GroupBy-4                 4.000 ± 0%   89.000 ± 0%  +2125.00% (p=0.002 n=6)   124.000 ± 0%  +3000.00% (p=0.002 n=6)   6.000 ± 17%  +50.00% (p=0.002 n=6)   156.000 ± 0%  +3800.00% (p=0.002 n=6)   55.000 ± 0%  +1275.00% (p=0.002 n=6)

Query/JSON

benchstat bench_ncruces_direct.txt bench_ncruces_driver.txt bench_modernc_driver.txt bench_zombiezen_direct.txt bench_mattn_driver.txt bench_tailscale_driver.txt
goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
             │ bench_ncruces_direct.txt │      bench_ncruces_driver.txt       │   bench_modernc_driver.txt   │      bench_zombiezen_direct.txt      │      bench_mattn_driver.txt       │     bench_tailscale_driver.txt     │
             │          sec/op          │    sec/op     vs base               │   sec/op     vs base         │    sec/op      vs base               │   sec/op     vs base              │   sec/op     vs base               │
Query/JSON-4                9.657m ± 5%   15.266m ± 3%  +58.09% (p=0.002 n=6)   9.986m ± 1%  ~ (p=0.065 n=6)   11.068m ± 28%  +14.62% (p=0.002 n=6)   8.767m ± 3%  -9.22% (p=0.002 n=6)   7.108m ± 2%  -26.39% (p=0.002 n=6)

             │ bench_ncruces_direct.txt │        bench_ncruces_driver.txt        │        bench_modernc_driver.txt        │      bench_zombiezen_direct.txt      │          bench_mattn_driver.txt          │       bench_tailscale_driver.txt       │
             │           B/op           │     B/op      vs base                  │     B/op      vs base                  │     B/op      vs base                │     B/op       vs base                   │     B/op      vs base                  │
Query/JSON-4                128.0 ± 41%   41130.0 ± 1%  +32032.81% (p=0.002 n=6)   57027.0 ± 0%  +44452.34% (p=0.002 n=6)   371.5 ± 555%  +190.23% (p=0.002 n=6)   201050.0 ± 0%  +156970.31% (p=0.002 n=6)   33123.5 ± 0%  +25777.73% (p=0.002 n=6)

             │ bench_ncruces_direct.txt │        bench_ncruces_driver.txt         │         bench_modernc_driver.txt         │     bench_zombiezen_direct.txt      │          bench_mattn_driver.txt          │       bench_tailscale_driver.txt        │
             │        allocs/op         │   allocs/op    vs base                  │   allocs/op    vs base                   │  allocs/op    vs base               │   allocs/op    vs base                   │   allocs/op    vs base                  │
Query/JSON-4                 4.000 ± 0%   3023.000 ± 0%  +75475.00% (p=0.002 n=6)   4027.000 ± 0%  +100575.00% (p=0.002 n=6)   6.000 ± 500%  +50.00% (p=0.002 n=6)   5022.000 ± 0%  +125450.00% (p=0.002 n=6)   2023.000 ± 0%  +50475.00% (p=0.002 n=6)

Query/NonrecursiveCTE

benchstat bench_ncruces_direct.txt bench_ncruces_driver.txt bench_modernc_driver.txt bench_zombiezen_direct.txt bench_mattn_driver.txt bench_tailscale_driver.txt
goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                        │ bench_ncruces_direct.txt │      bench_ncruces_driver.txt       │     bench_modernc_driver.txt      │     bench_zombiezen_direct.txt     │       bench_mattn_driver.txt       │     bench_tailscale_driver.txt     │
                        │          sec/op          │   sec/op     vs base                │   sec/op     vs base              │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │
Query/NonRecursiveCTE-4                2.157m ± 1%   7.714m ± 1%  +257.61% (p=0.002 n=6)   2.224m ± 2%  +3.09% (p=0.002 n=6)   1.667m ± 2%  -22.70% (p=0.002 n=6)   2.544m ± 0%  +17.94% (p=0.002 n=6)   1.042m ± 1%  -51.68% (p=0.002 n=6)

                        │ bench_ncruces_direct.txt │        bench_ncruces_driver.txt        │        bench_modernc_driver.txt        │     bench_zombiezen_direct.txt      │          bench_mattn_driver.txt          │       bench_tailscale_driver.txt       │
                        │           B/op           │     B/op      vs base                  │     B/op      vs base                  │    B/op      vs base                │     B/op       vs base                   │     B/op      vs base                  │
Query/NonRecursiveCTE-4                128.0 ± 25%   39034.0 ± 0%  +30395.31% (p=0.002 n=6)   54938.0 ± 0%  +42820.31% (p=0.002 n=6)   362.5 ± 94%  +183.20% (p=0.002 n=6)   198962.5 ± 0%  +155339.45% (p=0.002 n=6)   31027.5 ± 0%  +24140.23% (p=0.002 n=6)

                        │ bench_ncruces_direct.txt │        bench_ncruces_driver.txt         │        bench_modernc_driver.txt         │     bench_zombiezen_direct.txt     │          bench_mattn_driver.txt          │       bench_tailscale_driver.txt        │
                        │        allocs/op         │   allocs/op    vs base                  │   allocs/op    vs base                  │  allocs/op   vs base               │   allocs/op    vs base                   │   allocs/op    vs base                  │
Query/NonRecursiveCTE-4                 4.000 ± 0%   2767.000 ± 0%  +69075.00% (p=0.002 n=6)   3770.000 ± 0%  +94150.00% (p=0.002 n=6)   6.000 ± 83%  +50.00% (p=0.002 n=6)   4766.000 ± 0%  +119050.00% (p=0.002 n=6)   1767.000 ± 0%  +44075.00% (p=0.002 n=6)

Query/OrderBy

benchstat bench_ncruces_direct.txt bench_ncruces_driver.txt bench_modernc_driver.txt bench_zombiezen_direct.txt bench_mattn_driver.txt bench_tailscale_driver.txt
goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                │ bench_ncruces_direct.txt │       bench_ncruces_driver.txt       │      bench_modernc_driver.txt       │     bench_zombiezen_direct.txt     │       bench_mattn_driver.txt        │     bench_tailscale_driver.txt     │
                │          sec/op          │    sec/op     vs base                │    sec/op     vs base               │   sec/op     vs base               │    sec/op     vs base               │   sec/op     vs base               │
Query/OrderBy-4               76.95m ± 10%   297.67m ± 2%  +286.84% (p=0.002 n=6)   100.73m ± 1%  +30.90% (p=0.002 n=6)   91.59m ± 5%  +19.02% (p=0.002 n=6)   117.85m ± 1%  +53.15% (p=0.002 n=6)   61.54m ± 2%  -20.02% (p=0.002 n=6)

                │ bench_ncruces_direct.txt │          bench_ncruces_driver.txt          │          bench_modernc_driver.txt          │      bench_zombiezen_direct.txt      │           bench_mattn_driver.txt            │        bench_tailscale_driver.txt         │
                │           B/op           │      B/op       vs base                    │      B/op       vs base                    │     B/op      vs base                │      B/op        vs base                    │      B/op       vs base                   │
Query/OrderBy-4                478.0 ± 83%   5199618.0 ± 0%  +1087686.19% (p=0.002 n=6)   7600842.5 ± 0%  +1590034.41% (p=0.002 n=6)   4065.5 ± 11%  +750.52% (p=0.002 n=6)   11999112.0 ± 0%  +2510174.48% (p=0.002 n=6)   2799071.0 ± 0%  +585479.71% (p=0.002 n=6)

                │ bench_ncruces_direct.txt │          bench_ncruces_driver.txt          │          bench_modernc_driver.txt          │     bench_zombiezen_direct.txt     │           bench_mattn_driver.txt           │         bench_tailscale_driver.txt         │
                │        allocs/op         │   allocs/op     vs base                    │   allocs/op     vs base                    │ allocs/op   vs base                │   allocs/op     vs base                    │   allocs/op     vs base                    │
Query/OrderBy-4                 12.00 ± 0%   299783.00 ± 0%  +2498091.67% (p=0.002 n=6)   549808.00 ± 0%  +4581633.33% (p=0.002 n=6)   64.00 ± 2%  +433.33% (p=0.002 n=6)   349772.00 ± 0%  +2914666.67% (p=0.002 n=6)   149768.00 ± 0%  +1247966.67% (p=0.002 n=6)

Query/RecursiveCTE

benchstat bench_ncruces_direct.txt bench_ncruces_driver.txt bench_modernc_driver.txt bench_zombiezen_direct.txt bench_mattn_driver.txt bench_tailscale_driver.txt
goos: darwin
goarch: arm64
pkg: github.com/michaellenaghan/go-sqlite-bench
cpu: Apple M2 Pro
                     │ bench_ncruces_direct.txt │      bench_ncruces_driver.txt      │      bench_modernc_driver.txt      │     bench_zombiezen_direct.txt     │       bench_mattn_driver.txt       │     bench_tailscale_driver.txt     │
                     │          sec/op          │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │
Query/RecursiveCTE-4                6.193m ± 2%   8.464m ± 1%  +36.68% (p=0.002 n=6)   5.332m ± 0%  -13.90% (p=0.002 n=6)   5.292m ± 1%  -14.55% (p=0.002 n=6)   2.484m ± 1%  -59.89% (p=0.002 n=6)   2.410m ± 0%  -61.08% (p=0.002 n=6)

                     │ bench_ncruces_direct.txt │       bench_ncruces_driver.txt       │       bench_modernc_driver.txt       │      bench_zombiezen_direct.txt      │        bench_mattn_driver.txt        │      bench_tailscale_driver.txt      │
                     │           B/op           │    B/op      vs base                 │    B/op      vs base                 │     B/op      vs base                │    B/op      vs base                 │    B/op      vs base                 │
Query/RecursiveCTE-4                128.0 ± 73%   1999.5 ± 8%  +1462.11% (p=0.002 n=6)   2390.5 ± 0%  +1767.58% (p=0.002 n=6)   368.5 ± 279%  +187.89% (p=0.002 n=6)   6892.5 ± 0%  +5284.77% (p=0.002 n=6)   1753.0 ± 0%  +1269.53% (p=0.002 n=6)

                     │ bench_ncruces_direct.txt │       bench_ncruces_driver.txt       │       bench_modernc_driver.txt        │     bench_zombiezen_direct.txt      │        bench_mattn_driver.txt         │      bench_tailscale_driver.txt      │
                     │        allocs/op         │  allocs/op   vs base                 │  allocs/op    vs base                 │  allocs/op    vs base               │  allocs/op    vs base                 │  allocs/op   vs base                 │
Query/RecursiveCTE-4                 4.000 ± 0%   83.000 ± 0%  +1975.00% (p=0.002 n=6)   114.000 ± 0%  +2750.00% (p=0.002 n=6)   6.000 ± 233%  +50.00% (p=0.002 n=6)   144.000 ± 0%  +3500.00% (p=0.002 n=6)   52.000 ± 0%  +1200.00% (p=0.002 n=6)

5 replies

michaellenaghan Mar 20, 2025
Author

The performance of ncruces_driver on Baseline/Select1 is curious, no? (All of the *_driver implementations run the exact same code.)

                   │ bench_ncruces_direct.txt │    bench_ncruces_driver.txt     │    bench_modernc_driver.txt     │   bench_zombiezen_direct.txt    │     bench_mattn_driver.txt      │   bench_tailscale_driver.txt    │
                   │          sec/op          │    sec/op     vs base           │    sec/op     vs base           │    sec/op     vs base           │    sec/op     vs base           │    sec/op     vs base           │
Baseline/Select1-4               1.755µ ± ∞ ¹   7.882µ ± ∞ ¹  ~ (p=1.000 n=1) ²   2.093µ ± ∞ ¹  ~ (p=1.000 n=1) ²   2.710µ ± ∞ ¹  ~ (p=1.000 n=1) ²   2.721µ ± ∞ ¹  ~ (p=1.000 n=1) ²   1.107µ ± ∞ ¹  ~ (p=1.000 n=1) ²

daenney Mar 20, 2025

It doesn't look like you're calling sqlite3.Initialise for ncruces_driver. So I think that benchmark is skewed because it's also including WASM compilation time.

You do it for ncruces_direct: https://github.com/michaellenaghan/go-sqlite-bench/blob/d53efde5fa7a5b28cdcfcd64ead1344c7739cf83/gsb_ncruces_direct.go#L36. But I don't see the equivalent in https://github.com/michaellenaghan/go-sqlite-bench/blob/d53efde5fa7a5b28cdcfcd64ead1344c7739cf83/gsb_ncruces_driver.go.

If OpenDB isn't part of what the benchmark measures, then you can put it in there. Otherwise, a func init() that does it would probably also do the trick.

michaellenaghan Mar 20, 2025
Author

I'll add an explicit call, but NewDB() in the common driver code gets and immediately closes a connection before timing starts so it won't actually impact any numbers. (I'm re-running the benchmarks now, and there's no change.)

Having said that, the lack of an explicit call may explain an issue I thought I was seeing with ncruces, an issue that looked like a race condition in parallel tests. That's why I added the get and close calls in the first place! I'll test that too.

ncruces Mar 20, 2025
Maintainer

The race you're experiencing from parallel tests is probably due to the fact that creating a new database and converting it to WAL mode (databases are always created in rollback mode) is kinda racy.

I'm not sure if this is due to something wrong on my end (my VFS), or a property the SQLite locking protocol. I had to do this to fix some flakes in my parallel tests (though they mostly affected BSD).

michaellenaghan Mar 20, 2025
Author

What you're saying makes sense, but a) the database always needs to be "prepared" for use before any tests start — so one connection is always opened before the rest anyway — and b) ncruces_driver was the only *_driver that had the problem.

In any event, I don't think my workaround is needed anymore — and I never saw this is a major problem, just a quirk that I didn't understand.

ncruces · 2025-03-26T11:51:37Z

ncruces
Mar 26, 2025
Maintainer

The current state on head should address most issues.

The only remaining one that I see could potentially be fixed, is unnecessary allocations in Rows.Next, particularly visible in the Query/OrderBy benchmark.

These happen because database/sql/Rows.Next calls database/sql/driver.Rows.Next, which does the driver side of scanning, even if the values are then ignored. That's why all drivers consume lots of memory, compared to the lower level interfaces; if the lower level interfaces actually accessed the data, they'd do some allocations, at least though.

Where my driver allocates more in Rows.Next, it's because one of the strings is a timestamp. This can double (or triple) allocations. Stmt.Columns creates a string, driver.Rows.Next converts it to a time.Time, and Rows.Scan can convert it back to another string.

It's not clear how to fix this without breaking some abstractions, or not automatically decoding timestamps, which IMO is a nice feature. This feature means that, if you bind any of the following types into a cell, you can always scan them back with the same type, regardless of the declared the type of the column:

int64
float64
bool
[]byte
string
time.Time

6 replies

ncruces Mar 26, 2025
Maintainer

golang/go#67546 will help a bit if it gets merged.

Otherwise, it's possible to improve matters by breaking some abstractions in Stmt.Columns.

I can either change it, or add an overload that receives a callback (?) that decides whether to allocate the string or return a[]byte owned by SQLite. Or I can create a type sqlite3.RawText []byte and use that for strings. It's just annoying because then it's a bit confusing that all the ColumnRawText and etc methods return a []byte.

Or I could go super hack and start addding unexported go:linkname functions that do exactly what I need. I already have a few internal APIs used for cross package stuff no one is meant to call (they're exported, but documented as "don't use"). It's ugly but go:linkname of this form is accepted, and the standard library does the same.

ncruces Mar 27, 2025
Maintainer

I just had an interesting idea.

I could add a Stmt.ColumnsRaw method that uses []byte for TEXT values, like all Raw methods do. The problem then becomes: how to distinguish a TEXT from a BLOB?

Well, it turns out there is a way, which I could use for this, and I think all the other exiting APIs, consistently: C-strings returned by SQLite are NUL terminated. So I can represent them as a []byte where cap=len+1 and a zero on that last byte. All it takes to make the distinction, if you care, is len!=cap.

Wdyt? Not sure if this is ingenious or batshit crazy...

michaellenaghan Mar 27, 2025
Author

I don't understand the underlying issues here, but it sounds like you're suggesting a heuristic based on the value of the last byte, and it sounds like there would be no way to override the heuristic. I would prefer to opt in to something like that rather than opt out — or, worse, not being able to opt out at all.

Where my driver allocates more in Rows.Next, it's because one of the strings is a timestamp.

Maybe that's the answer there too? Not just because of the performance/memory implications, but because the heuristic might fail — or fail to meet the needs of a particular use case.

I've had problems like that in Excel, where it tries to help by doing something automatically but just creates more work for me.

ncruces Mar 27, 2025
Maintainer

You have 2 options to retrieve TEXT from a column:

call ColumnText and get back a string
this must allocate a string
call ColumnRawText and get back a []byte
no allocation, it gives you a view into C memory that's only valid until the next Step

Similar for BLOB:

call ColumnBlob
append to a []byte, makes a copy, may allocate
call ColumnRawBlob
no allocation, it gives you a view into C memory that's only valid until the next Step

Then you have Columns(dest ...any). This already allocates because values converted to any escape, but to distinguish between TEXT and BLOB, the TEXT become newly allocated string, and the BLOB are (currently) []byte views into C memory.

I'm proposing Columns(dest ...any) always allocs/appends/copies, and ColumnsRaw(dest ...any) always returns views. But this doesn't allow to distinguish TEXT from BLOB.

I'm further proposing that the []byte created from a TEXT "abc" is this []byte{'a','b','c',0}[:3]: the length is 3, and the capacity is 4. For almost all purposes the difference between that and []byte{'a','b','c'} is completely irrelevant. But you can now make the distinction between TEXT from BLOB.

ncruces Mar 27, 2025
Maintainer

About timestamps, database/sql really wants you to support bool and time.Time which SQLite doesn't have natively.

With the current scheme:

Bind `T`	SQLite	Scan `*any`	Scan `*T`
`nil`	`NULL`	`nil`	`nil`
`bool`	`INTEGER`	`int64`	`bool`
`int64`	`INTEGER`	`int64`	`int64`
`float64`	`FLOAT`	`float64`	`float64`
`[]byte`	`BLOB`	`[]byte`	`[]byte`
`string`	`TEXT`	`string` or `time.Time`	`string`
`time.Time`	`TEXT`	`time.Time`	`time.Time`

For bool, database/sql saves our bacon; we don't need to do anything. So only the last two lines matter. Only those have an “heuristic”. If we remove the heuristic they become:

Bind `T`	SQLite	Scan `*any`	Scan `*T`
`string`	`TEXT`	`string`	`string`
`time.Time`	`TEXT`	`string`	`error`

The only surprising behavior here, besides some additional allocs we can optimize away, is that some strings (those that exactly match RFC3339Nano) will come out as time.Time. This does not prevent you from scanning them as strings, losslessly.

I happen to think this is a good heuristic, @daenney seems to agree.

ncruces · 2025-03-29T12:15:06Z

ncruces
Mar 29, 2025
Maintainer

So I guess #256 is what I'm proposing. RFC3339Nano timestamps do not escape; other time formats do, but could be similarly optimized. Without golang/go#67546 it's probably the best we can do.

2 replies

daenney Mar 29, 2025

That looks workable to me. The most likely case for timestamps in a database is either RFC3339(Nano) text or int64 epoch counters in my experience.

ncruces Mar 31, 2025
Maintainer

So #256 improves the allocation issue to my satisfaction. It still allocates twice as much as crawshaw for RFC3339 (if you really want a string), but at least it's not 3 times as much. The additional allocs were a reversion from when I initially introduced Columns, so I'm happy that ColumnsRaw fixes it with a not terrible API. It means that people using low level also get access to nicely optimized APIs.

daenney · 2025-03-31T10:43:03Z

daenney
Mar 31, 2025

Apparently there's also https://pkg.go.dev/modernc.org/sqlite-bench2 for database/sql drivers and https://pkg.go.dev/modernc.org/sqlite-bench for non-database/sql drivers.

3 replies

ncruces Mar 31, 2025
Maintainer

Seems to be the same benchmark as cvilsmeier/go-sqlite-bench, just the modernc maintainer runs it across platforms, with wildly different results.

I'm a bit sceptical, because I can consistently repro the concurrent issues in modernc and glebarez (independently from @cvilsmeier), yet they are totally absent in those runs by modernc; glebarez is a different driver based on the same low level transpilation, and it shows the same issue. I don't know if this is a recent fix, maybe it is.

Anyway, I've profiled cvilsmeier/go-sqlite-bench extensively, so I think there's little value to me in these benchmarks. I'm more concerned with how I can improve my driver, than how it measures against others. That can be informative (esp. regarding number of allocs while keeping similarly useful semantics), but mostly it's informative for users.

daenney Mar 31, 2025

Yeap, I agree. I looked at the results of those benchmarks and I'm honestly a bit surprised by the numbers.

ncruces Mar 31, 2025
Maintainer

It's plausible that the concurrency issues were due to the mutex implementation. It's surprising that they didn't affect zombiezen if that's so. I always assumed it was something at the driver level. I honestly never had the energy to investigate.

I also don't really want to get into a competition.

I mostly don't love transpilation because it produces something akin to a platform specific blob: it's Go source code, but it's not so easy to inspect, full of unsafe code, and very platform specific. The transpiler is getting better (and libc probably as well), so I know it's become easier to compile your own SQLite, but still harder IMO than just using wasi-sdk+binaryen.

My version uses off the shelf tools to compile SQLite (wasi-sdk+binaryen that the SQLite developers themselves use through emscripten), uses some unsafe, yes (the wrapper and certainly wazero), but sandboxes most of the resulting unsafety, with each connection isolated from the rest. OTOH it reimplements the VFS, which is risky business.

One advantage of their approach is memory usage/copies: not using a sandbox allows you to share memory more effectively. It also allows you to corrupt memory more easily. With wazero it's possible to corrupt the C heap, and even the C stack, but it's much harder to corrupt the C call stack, the Go stack and heap, or to do so across connections. There's a price to pay though.

ncruces · 2025-03-31T16:44:04Z

ncruces
Mar 31, 2025
Maintainer

Released v0.25.0 with these improvements.

In the future there might be a small regression: the next time I build SQLite, I'm considering enabling SQLITE_ENABLE_API_ARMOR.

I've tried it and it didn't seem to be catastrophic. I used speedtest1 for impact in internal SQLite usage – which should be mostly optimized away – and on the stats package, which does a large amount of Go-to-C-to-Go API calls. The impact was almost negligible in both cases.

I'll pour over the code and check how useful it is for memory safety, which IMO, beats a few percentage points.

0 replies

ncruces · 2025-04-01T12:28:47Z

ncruces
Apr 1, 2025
Maintainer

@michaellenaghan many of the allocs in the Baseline/Parallel benchmarks go away if you add this to your connections:

[DB].SetMaxIdleConns(runtime.GOMAXPROCS(0))
[DB].SetMaxOpenConns([MAX])

The reason is that the queries you're benchmarking are so fast, that the default for idle connections (2) is insufficient to deal with the churn. So the pool is opening and closing connections left and right, and that's what you're actually benchmarking (opening a connection for every e.g. 4 queries).

Also 256 connections is a lot.

11 replies

michaellenaghan Apr 3, 2025
Author

I added eatonphil. Sometimes faster than tailscale, sometimes slower.

When it's slower it's because it exhibits the same kind of concurrency issue as modernc and mattn.

michaellenaghan Apr 3, 2025
Author

I can't figure out why two CGO implementations, eatonphil and mattn, and all of the modernc implementations, have concurrency problems in benchmarks like this one:

                                 │ bench_ncruces_direct.txt │        bench_ncruces_driver.txt        │       bench_eatonphil_direct.txt       │       bench_glebarez_driver.txt        │         bench_mattn_driver.txt         │        bench_modernc_driver.txt        │       bench_tailscale_driver.txt       │       bench_zombiezen_direct.txt       │
                                 │          sec/op          │    sec/op      vs base                 │    sec/op      vs base                 │    sec/op      vs base                 │    sec/op      vs base                 │    sec/op      vs base                 │    sec/op      vs base                 │    sec/op      vs base                 │
Query/NonRecursiveCTEParallel                 2100.4µ ± ∞ ¹   2377.5µ ± ∞ ¹        ~ (p=0.100 n=3) ²    993.1µ ± ∞ ¹        ~ (p=0.100 n=3) ²   1894.3µ ± ∞ ¹        ~ (p=0.100 n=3) ²   1593.8µ ± ∞ ¹        ~ (p=0.100 n=3) ²   1882.1µ ± ∞ ¹        ~ (p=0.100 n=3) ²   1049.8µ ± ∞ ¹        ~ (p=0.100 n=3) ²   1611.4µ ± ∞ ¹        ~ (p=0.100 n=3) ²
Query/NonRecursiveCTEParallel-2               1060.2µ ± ∞ ¹   1217.4µ ± ∞ ¹        ~ (p=0.100 n=3) ²    617.9µ ± ∞ ¹        ~ (p=0.100 n=3) ²    972.2µ ± ∞ ¹        ~ (p=0.100 n=3) ²    828.8µ ± ∞ ¹        ~ (p=0.100 n=3) ²    975.5µ ± ∞ ¹        ~ (p=0.100 n=3) ²    537.7µ ± ∞ ¹        ~ (p=0.100 n=3) ²    833.0µ ± ∞ ¹        ~ (p=0.100 n=3) ²
Query/NonRecursiveCTEParallel-4                542.4µ ± ∞ ¹    629.4µ ± ∞ ¹        ~ (p=0.100 n=3) ²    888.9µ ± ∞ ¹        ~ (p=0.100 n=3) ²    525.8µ ± ∞ ¹        ~ (p=0.100 n=3) ²    470.2µ ± ∞ ¹        ~ (p=0.100 n=3) ²    531.1µ ± ∞ ¹        ~ (p=0.400 n=3) ²    280.6µ ± ∞ ¹        ~ (p=0.100 n=3) ²    452.7µ ± ∞ ¹        ~ (p=0.100 n=3) ²
Query/NonRecursiveCTEParallel-8                272.6µ ± ∞ ¹    343.8µ ± ∞ ¹        ~ (p=0.100 n=3) ²   1636.5µ ± ∞ ¹        ~ (p=0.100 n=3) ²    877.1µ ± ∞ ¹        ~ (p=0.100 n=3) ²    992.4µ ± ∞ ¹        ~ (p=0.100 n=3) ²    871.7µ ± ∞ ¹        ~ (p=0.100 n=3) ²    149.5µ ± ∞ ¹        ~ (p=0.100 n=3) ²    842.8µ ± ∞ ¹        ~ (p=0.100 n=3) ²
Query/NonRecursiveCTEParallel-16               231.7µ ± ∞ ¹    313.8µ ± ∞ ¹        ~ (p=0.100 n=3) ²   1448.6µ ± ∞ ¹        ~ (p=0.100 n=3) ²    918.0µ ± ∞ ¹        ~ (p=0.100 n=3) ²   1027.8µ ± ∞ ¹        ~ (p=0.100 n=3) ²    906.9µ ± ∞ ¹        ~ (p=0.100 n=3) ²    138.6µ ± ∞ ¹        ~ (p=0.100 n=3) ²    854.7µ ± ∞ ¹        ~ (p=0.100 n=3) ²
geomean                                        597.7µ          722.3µ        +20.84%                    1.053m        +76.13%                    951.4µ        +59.18%                    912.8µ        +52.71%                    949.3µ        +58.82%                    318.6µ        -46.70%                    847.7µ        +41.82%

but not in benchmarks like this one:

                              │ bench_ncruces_direct.txt │       bench_ncruces_driver.txt        │      bench_eatonphil_direct.txt       │       bench_glebarez_driver.txt        │        bench_mattn_driver.txt         │        bench_modernc_driver.txt        │      bench_tailscale_driver.txt       │       bench_zombiezen_direct.txt       │
                              │          sec/op          │    sec/op      vs base                │    sec/op     vs base                 │    sec/op      vs base                 │    sec/op     vs base                 │    sec/op      vs base                 │    sec/op     vs base                 │    sec/op      vs base                 │
Query/RecursiveCTEParallel                  6.134m ± ∞ ¹    6.069m ± ∞ ¹       ~ (p=0.100 n=3) ²   2.505m ± ∞ ¹        ~ (p=0.100 n=3) ²    5.285m ± ∞ ¹        ~ (p=0.100 n=3) ²   2.400m ± ∞ ¹        ~ (p=0.100 n=3) ²    5.304m ± ∞ ¹        ~ (p=0.100 n=3) ²   2.434m ± ∞ ¹        ~ (p=0.100 n=3) ²    5.227m ± ∞ ¹        ~ (p=0.100 n=3) ²
Query/RecursiveCTEParallel-2                3.083m ± ∞ ¹    3.076m ± ∞ ¹       ~ (p=0.700 n=3) ²   1.266m ± ∞ ¹        ~ (p=0.100 n=3) ²    2.671m ± ∞ ¹        ~ (p=0.100 n=3) ²   1.223m ± ∞ ¹        ~ (p=0.100 n=3) ²    2.668m ± ∞ ¹        ~ (p=0.100 n=3) ²   1.231m ± ∞ ¹        ~ (p=0.100 n=3) ²    2.645m ± ∞ ¹        ~ (p=0.100 n=3) ²
Query/RecursiveCTEParallel-4               1577.8µ ± ∞ ¹   1584.4µ ± ∞ ¹       ~ (p=0.400 n=3) ²   646.7µ ± ∞ ¹        ~ (p=0.100 n=3) ²   1374.3µ ± ∞ ¹        ~ (p=0.100 n=3) ²   620.3µ ± ∞ ¹        ~ (p=0.100 n=3) ²   1371.0µ ± ∞ ¹        ~ (p=0.100 n=3) ²   626.8µ ± ∞ ¹        ~ (p=0.100 n=3) ²   1358.6µ ± ∞ ¹        ~ (p=0.100 n=3) ²
Query/RecursiveCTEParallel-8                853.6µ ± ∞ ¹    797.2µ ± ∞ ¹       ~ (p=0.200 n=3) ²   328.9µ ± ∞ ¹        ~ (p=0.100 n=3) ²    762.4µ ± ∞ ¹        ~ (p=0.100 n=3) ²   318.0µ ± ∞ ¹        ~ (p=0.100 n=3) ²    754.8µ ± ∞ ¹        ~ (p=0.100 n=3) ²   318.3µ ± ∞ ¹        ~ (p=0.100 n=3) ²    745.6µ ± ∞ ¹        ~ (p=0.100 n=3) ²
Query/RecursiveCTEParallel-16               717.7µ ± ∞ ¹    705.8µ ± ∞ ¹       ~ (p=0.700 n=3) ²   286.1µ ± ∞ ¹        ~ (p=0.100 n=3) ²    662.0µ ± ∞ ¹        ~ (p=0.100 n=3) ²   280.0µ ± ∞ ¹        ~ (p=0.100 n=3) ²    651.0µ ± ∞ ¹        ~ (p=0.100 n=3) ²   276.9µ ± ∞ ¹        ~ (p=0.100 n=3) ²    662.3µ ± ∞ ¹        ~ (p=0.100 n=3) ²
geomean                                     1.788m          1.755m        -1.86%                   719.6µ        -59.75%                    1.578m        -11.74%                   695.0µ        -61.13%                    1.570m        -12.21%                   697.9µ        -60.97%                    1.561m        -12.69%

while one CGO implementation, tailscale, and one WASM implementation, ncruces, do not.

Here's the SQL for the first benchmark:

	WITH day_totals AS (
		SELECT date(created) as day, COUNT(*) as day_total
		FROM posts
		GROUP BY day
	)
	SELECT day, day_total,
		SUM(day_total) OVER (ORDER BY day) as running_total
	FROM day_totals
	ORDER BY day

Here's the SQL for the second benchmark:

	WITH RECURSIVE dates(day) AS (
		SELECT date('now', '-30 days')
		UNION ALL
		SELECT date(day, '+1 day')
		FROM dates
		WHERE day < date('now')
	)
	SELECT day,
		(SELECT COUNT(*) FROM posts WHERE date(created) = day) as day_total
	FROM dates
	ORDER BY day

What could be causing the difference?

(The compiler options and pragmas are available in the tests/ directory, but so far I haven't found anything that accounts for the difference.)

It doesn't really matter, but I'm curious. And maybe there's an interesting lesson in the answer?

michaellenaghan Apr 5, 2025
Author

The README file now contains the benchmark results and various implementation details (compile-time options, pragmas, SQLite version).

I still need to document more things, but you no longer need to jump in and out of files to review everything.

michaellenaghan Apr 8, 2025
Author

What could be causing the difference?

I tracked the difference down to this:

#cgo CFLAGS: -DSQLITE_DEFAULT_MEMSTATUS=0

For eatonphil, for example, here's the before (without the #define):

$ make bench-all TAGS="eatonphil_direct" BENCH_PATTERN="Query/.*Parallel"
...
BenchmarkQuery/CorrelatedParallel-12         	    2430	    490540 ns/op	       2 B/op	       0 allocs/op
BenchmarkQuery/GroupByParallel-12            	    8054	    152996 ns/op	       1 B/op	       0 allocs/op
BenchmarkQuery/JSONParallel-12               	    4710	    239792 ns/op	       1 B/op	       0 allocs/op
BenchmarkQuery/OrderByParallel-12            	     860	   1502472 ns/op	       7 B/op	       0 allocs/op
BenchmarkQuery/RecursiveCTEParallel-12       	   15930	     76062 ns/op	       0 B/op	       0 allocs/op
BenchmarkQuery/WindowParallel-12             	    3801	    323481 ns/op	       1 B/op	       0 allocs/op

and here's the after (with the #define):

$ make bench-all TAGS="eatonphil_direct" BENCH_PATTERN="Query/.*Parallel"
...
BenchmarkQuery/CorrelatedParallel-12         	    2212	    483308 ns/op	       2 B/op	       0 allocs/op
BenchmarkQuery/GroupByParallel-12            	  131900	     10000 ns/op	       0 B/op	       0 allocs/op
BenchmarkQuery/JSONParallel-12               	   60294	     19672 ns/op	       0 B/op	       0 allocs/op
BenchmarkQuery/OrderByParallel-12            	    2414	    479563 ns/op	       4 B/op	       0 allocs/op
BenchmarkQuery/RecursiveCTEParallel-12       	   20524	     59374 ns/op	       0 B/op	       0 allocs/op
BenchmarkQuery/WindowParallel-12             	   53106	     23863 ns/op	       0 B/op	       0 allocs/op

So:

GroupByParallel: ~15x faster
JSONParallel: ~12x faster
OrderByParallel: ~3x faster
WindowParallel: ~13x faster

"Faster" here means parallel performance, not sequential. But, you know, wow; not bad for a one line change!

ncruces Apr 9, 2025
Maintainer

That probably increases malloc lock contention. Not a problem for me, as each connection has a separate heap.

While I don't wrap sqlite3_status64 to track memory usage, I do wrap sqlite3_soft/hard_heap_limit64 and need the define. Also, given how memory works with sandboxing, PRAGMA soft_heap_limit can actually be a good idea for this driver. I've even wondered if I should call it automatically (set a soft limit X MB below the maximum heap size).

Might be a good tip for modernc and mattn. None of them seem to expose these features directly, unless their users are using PRAGMA soft_heap_limit.

ncruces · 2025-04-02T18:17:19Z

ncruces
Apr 2, 2025
Maintainer

This HN thread lead to #257 which seems a nice improvement, though, overall it's not measurable on speedtest.

Can't believe this'll lead me to reimplement libc. 🤷
I'll test more.

10 replies

ncruces Apr 9, 2025
Maintainer

Yes, which is nice: that doesn't cause a regression, and improves “undefined” behavior.

For the most part, this means “use after close” of prepared statements, blobs, backups, should now more consistently return MISUSE without me added a check to every API.

For connections, I tear down wazero on close so it shouldn't matter much, but if mess up pointers that I can't use in a type safe manner, this should catch mistakes.

ncruces Apr 9, 2025
Maintainer

Most checks are just a if ptr != nil, return MISUSE on the native C side, so they shouldn't cause much pain, in the compiler, for the no error case.

The interpreter would probably gain from me doing the check on Go before the call, but I may forget. I'm sure the SQlite developers are more consistent.

ncruces Apr 13, 2025
Maintainer

Overall, happy that it's even measurable. I'm betting most of the win is from… strlen; NUL terminated strings are just stupid.

It seems the new strlen is not an improvement on ARM (tested on the Ubuntu and macOS runners).

The combination of wasm_i8x16_bitmask and wasm_i8x16_eq must be really slow, and much slower than wasm_i8x16_all_true.

ncruces Apr 14, 2025
Maintainer

OK, so #262 "fixes" strlen and strchr on ARM, and tetratelabs/wazero#2395 will improve memset, but only for n > 16KB, which is not that common.

It also warms my heart that I wasn't crazy, most allocs are 16 byte aligned after all.

I think this covers everything SQLite needs.

ncruces Apr 15, 2025
Maintainer

Nope, this never ends: #263. 200 LoC, dragons at every corner. 🙄

Benchmarking performance #247

michaellenaghan Mar 15, 2025

TL;DR

Baseline

Populate

ReadWrite

Query

Correlated Aggregation

CTE

CTE (Recursive)

Group By Aggregation

JSON

Replies: 12 comments · 81 replies

michaellenaghan Mar 15, 2025 Author

daenney Mar 15, 2025

michaellenaghan Mar 18, 2025 Author

michaellenaghan Mar 18, 2025 Author

ncruces Mar 15, 2025 Maintainer

michaellenaghan Mar 21, 2025 Author

michaellenaghan Mar 21, 2025 Author

michaellenaghan Mar 21, 2025 Author

ncruces Mar 21, 2025 Maintainer

michaellenaghan Mar 24, 2025 Author

danp Mar 15, 2025

ncruces Mar 15, 2025 Maintainer

michaellenaghan Mar 19, 2025 Author

michaellenaghan Mar 19, 2025 Author

michaellenaghan Mar 20, 2025 Author

Baseline

Populate

ReadWrite

Query/Correlated

Query/GroupBy

Query/JSON

Query/NonrecursiveCTE

Query/OrderBy

Query/RecursiveCTE

michaellenaghan Mar 20, 2025 Author

daenney Mar 20, 2025

michaellenaghan Mar 20, 2025 Author

ncruces Mar 20, 2025 Maintainer

michaellenaghan Mar 20, 2025 Author

ncruces Mar 26, 2025 Maintainer

ncruces Mar 26, 2025 Maintainer

ncruces Mar 27, 2025 Maintainer

michaellenaghan Mar 27, 2025 Author

ncruces Mar 27, 2025 Maintainer

ncruces Mar 27, 2025 Maintainer

ncruces Mar 29, 2025 Maintainer

daenney Mar 29, 2025

ncruces Mar 31, 2025 Maintainer

daenney Mar 31, 2025

ncruces Mar 31, 2025 Maintainer

daenney Mar 31, 2025

ncruces Mar 31, 2025 Maintainer

ncruces Mar 31, 2025 Maintainer

ncruces Apr 1, 2025 Maintainer

michaellenaghan Apr 3, 2025 Author

michaellenaghan Apr 3, 2025 Author

michaellenaghan Apr 5, 2025 Author

michaellenaghan Apr 8, 2025 Author

ncruces Apr 9, 2025 Maintainer

ncruces Apr 2, 2025 Maintainer

ncruces Apr 9, 2025 Maintainer

ncruces Apr 9, 2025 Maintainer

ncruces Apr 13, 2025 Maintainer

ncruces Apr 14, 2025 Maintainer

ncruces Apr 15, 2025 Maintainer

michaellenaghan
Mar 15, 2025

Replies: 12 comments 81 replies

michaellenaghan
Mar 15, 2025
Author

daenney
Mar 15, 2025

michaellenaghan Mar 18, 2025
Author

michaellenaghan Mar 18, 2025
Author

ncruces
Mar 15, 2025
Maintainer

michaellenaghan Mar 21, 2025
Author

michaellenaghan Mar 21, 2025
Author

michaellenaghan Mar 21, 2025
Author

ncruces Mar 21, 2025
Maintainer

michaellenaghan Mar 24, 2025
Author

danp
Mar 15, 2025

ncruces Mar 15, 2025
Maintainer

michaellenaghan
Mar 19, 2025
Author

michaellenaghan Mar 19, 2025
Author

michaellenaghan
Mar 20, 2025
Author

michaellenaghan Mar 20, 2025
Author

michaellenaghan Mar 20, 2025
Author

ncruces Mar 20, 2025
Maintainer

michaellenaghan Mar 20, 2025
Author

ncruces
Mar 26, 2025
Maintainer

ncruces Mar 26, 2025
Maintainer

ncruces Mar 27, 2025
Maintainer

michaellenaghan Mar 27, 2025
Author

ncruces Mar 27, 2025
Maintainer

ncruces Mar 27, 2025
Maintainer

ncruces
Mar 29, 2025
Maintainer

ncruces Mar 31, 2025
Maintainer

daenney
Mar 31, 2025

ncruces Mar 31, 2025
Maintainer

ncruces Mar 31, 2025
Maintainer

ncruces
Mar 31, 2025
Maintainer

ncruces
Apr 1, 2025
Maintainer

michaellenaghan Apr 3, 2025
Author

michaellenaghan Apr 3, 2025
Author

michaellenaghan Apr 5, 2025
Author

michaellenaghan Apr 8, 2025
Author

ncruces Apr 9, 2025
Maintainer

ncruces
Apr 2, 2025
Maintainer

ncruces Apr 9, 2025
Maintainer

ncruces Apr 9, 2025
Maintainer

ncruces Apr 13, 2025
Maintainer

ncruces Apr 14, 2025
Maintainer

ncruces Apr 15, 2025
Maintainer