Benchmarking performance #247
-
Btw, here's the SQL for the various queries:

```go
const SQLForQueryCorrelatedAggregation = `
SELECT
	id,
	title,
	(SELECT COUNT(*) FROM comments WHERE post_id = posts.id) AS comment_count,
	(SELECT AVG(LENGTH(content)) FROM comments WHERE post_id = posts.id) AS avg_comment_length,
	(SELECT MAX(LENGTH(content)) FROM comments WHERE post_id = posts.id) AS max_comment_length
FROM posts`

const SQLForQueryCTE = `
WITH day_totals AS (
	SELECT date(created) AS day, COUNT(*) AS day_total
	FROM posts
	GROUP BY day
)
SELECT day, day_total,
	SUM(day_total) OVER (ORDER BY day) AS running_total
FROM day_totals
ORDER BY day`

const SQLForQueryCTERecursive = `
WITH RECURSIVE dates(day) AS (
	SELECT date('now', '-30 days')
	UNION ALL
	SELECT date(day, '+1 day')
	FROM dates
	WHERE day < date('now')
)
SELECT day,
	(SELECT COUNT(*) FROM posts WHERE date(created) = day) AS day_total
FROM dates
ORDER BY day`

const SQLForQueryGroupByAggregation = `
SELECT
	strftime('%Y-%m', created) AS month,
	COUNT(*) AS month_total
FROM posts
GROUP BY month
ORDER BY month`

const SQLForQueryJSON = `
SELECT
	date(created) AS day,
	SUM(json_extract(stats, '$.lorem')) AS sum_lorem,
	AVG(json_extract(stats, '$.ipsum.dolor')) AS avg_dolor,
	MAX(json_extract(stats, '$.lorem.sit')) AS max_sit
FROM posts
GROUP BY day
ORDER BY day`
```

These results used a database with 1,000 posts and 25 comments per post. The posts and comments both had 100 "paragraphs" of content. (The correlated aggregation query is the only one that uses comments; the rest use only posts.)
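If anyone wants to poke at these outside the harness, here's a minimal sketch of running the recursive CTE query through `database/sql`, assuming the constants above are in the same package. The modernc driver and the `bench.db` file name are just assumptions; any of the benchmarked drivers would do:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "modernc.org/sqlite" // registers the "sqlite" driver name
)

func main() {
	db, err := sql.Open("sqlite", "bench.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Run the recursive CTE query: one row per day for the last 30 days.
	rows, err := db.Query(SQLForQueryCTERecursive)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var day string
		var dayTotal int
		if err := rows.Scan(&day, &dayTotal); err != nil {
			log.Fatal(err)
		}
		fmt.Println(day, dayTotal)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
```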
-
This is great! I've been meaning to do a benchmark along the same lines, but I'm glad someone beat me to the punch. One less thing to do. Also, thanks for doing a set of benchmarks that are more representative of interactive use of SQLite, instead of focusing on raw insert/query performance, which is more useful in the "using SQLite to exchange datasets" approach.

Looking at the numbers, especially memory, I suspect most of the difference is down to `database/sql`.

One other interesting thing here could be to add the https://github.com/zombiezen/go-sqlite driver. This one uses the ModernC transpile under the hood, but does not use the provided `database/sql` interface.
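For reference, here's a rough, untested sketch of what querying through zombiezen/go-sqlite looks like, based on my reading of its docs (the exact flag and option names are assumptions worth double-checking):

```go
package main

import (
	"fmt"
	"log"

	"zombiezen.com/go/sqlite"
	"zombiezen.com/go/sqlite/sqlitex"
)

func main() {
	// Open a connection directly; there's no database/sql pool in between.
	conn, err := sqlite.OpenConn("bench.db", sqlite.OpenReadOnly)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Results are delivered via a callback instead of a Rows iterator.
	err = sqlitex.Execute(conn, "SELECT id, title FROM posts", &sqlitex.ExecOptions{
		ResultFunc: func(stmt *sqlite.Stmt) error {
			fmt.Println(stmt.ColumnInt64(0), stmt.ColumnText(1))
			return nil
		},
	})
	if err != nil {
		log.Fatal(err)
	}
}
```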
-
A few notes.

It's expected that for pure CPU stuff (complex queries) Wasm is slower.

Once you push the repo I'll run profiling to figure out if there are any allocations I can avoid, but there are a few almost insurmountable challenges. It shouldn't be much worse than other drivers.

I know of a few cheats (not necessarily from these drivers), but I'd rather not go there.
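For the profiling pass, the standard `go test` recipe should be all that's needed. A sketch, with a hypothetical benchmark name and an assumed driver:

```go
// Run with:
//   go test -bench=BenchmarkBaseline -benchmem -cpuprofile=cpu.out -memprofile=mem.out
//   go tool pprof -top mem.out   # see where the allocations come from
package bench

import (
	"database/sql"
	"testing"

	_ "modernc.org/sqlite" // any of the benchmarked drivers works here
)

func BenchmarkBaseline(b *testing.B) {
	db, err := sql.Open("sqlite", "bench.db")
	if err != nil {
		b.Fatal(err)
	}
	defer db.Close()

	b.ReportAllocs() // report allocs/op even without -benchmem
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		var one int
		if err := db.QueryRow("SELECT 1").Scan(&one); err != nil {
			b.Fatal(err)
		}
	}
}
```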
-
I'd be curious to hear if golang/go#67546 would help at all. I think it'd be great for at least pgx to have that support, so I'm hoping the related CL makes it in for Go 1.25.
-
In case it's of interest, here's a table comparing the compile-time options of each of the implementations. (Table not reproduced here.)
-
There's now a repo at https://github.com/michaellenaghan/go-sqlite-bench. I haven't written any docs yet. I will; there's lots to explain in terms of goals, rules, etc.

As far as the benchmarks themselves go: I've made various changes. One important change is that I used to run many benchmarks in parallel. I think that made results hard to interpret. Now only the "ReadOrWrite" benchmarks run in parallel; everything else runs serially.

The results below were generated using:

```
make benchstat-by-category BENCH_BIG=1 BENCH_SLOW=1
```

To run a subset of implementations, use `TAGS`:

```
make benchstat-by-category BENCH_BIG=1 BENCH_SLOW=1 TAGS="ncruces_direct ncruces_driver"
```

The first tag listed becomes the baseline in the benchstats. For example, to make everything relative to tailscale, list `tailscale` first.

Benchmarks and tests tee their output to log files, and benchstat runs off of the benchmark logs. The tests log useful info like compile-time options and the values of selected pragmas.

I haven't had time to look into the "why" of any of the differences.

On to the results...
(Benchstat tables followed, one per category: Baseline, Populate, ReadWrite, Query/Correlated, Query/GroupBy, Query/JSON, Query/NonrecursiveCTE, Query/OrderBy, and Query/RecursiveCTE.)
-
The current state on head should address most issues. The only remaining one that I see could potentially be fixed is unnecessary allocations in the `database/sql` path; that's where my driver allocates more than the others.

It's not clear how to fix this without breaking some abstractions, or without giving up on automatically decoding timestamps, which IMO is a nice feature. This feature means that, if you bind a supported type (timestamps included) into a cell, you can always scan it back as the same type, regardless of the declared type of the column.
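To illustrate that round-trip behavior, here's a hedged sketch using the ncruces driver through `database/sql`. The import paths and registered driver name reflect my understanding of the package, so double-check them:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"
	"time"

	_ "github.com/ncruces/go-sqlite3/driver" // registers "sqlite3"
	_ "github.com/ncruces/go-sqlite3/embed"  // embeds the Wasm build of SQLite
)

func main() {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Note: no declared column type.
	if _, err := db.Exec(`CREATE TABLE t (v)`); err != nil {
		log.Fatal(err)
	}

	bound := time.Now()
	if _, err := db.Exec(`INSERT INTO t (v) VALUES (?)`, bound); err != nil {
		log.Fatal(err)
	}

	// The timestamp is stored as text, but scans back as time.Time.
	var scanned time.Time
	if err := db.QueryRow(`SELECT v FROM t`).Scan(&scanned); err != nil {
		log.Fatal(err)
	}
	fmt.Println(scanned.Equal(bound)) // should print true if the value round-trips
}
```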
-
So I guess #256 is what I'm proposing.
-
Apparently there's also https://pkg.go.dev/modernc.org/sqlite-bench2 for database/sql drivers and https://pkg.go.dev/modernc.org/sqlite-bench for non-database/sql drivers.
-
Released. In the future there might be a small regression: the next time I build SQLite, I'm considering enabling an additional build-time option. I've tried it and it didn't seem to be catastrophic. I'll pore over the code and check how useful it is for memory safety, which IMO beats a few percentage points.
-
@michaellenaghan many of the allocs in the `database/sql` benchmarks go away if you configure the pool along these lines:

```go
db.SetMaxIdleConns(runtime.GOMAXPROCS(0))
db.SetMaxOpenConns(maxOpen) // pick a deliberate maximum
```

The reason is that the queries you're benchmarking are so fast that the default for idle connections (2) is insufficient to deal with the churn. So the pool is opening and closing connections left and right, and that's what you're actually benchmarking (opening a connection for every e.g. 4 queries). Also, 256 connections is a lot.
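One way to see that churn directly is `database/sql`'s own pool statistics. A small sketch:

```go
package bench

import (
	"database/sql"
	"fmt"
)

// reportPoolChurn prints pool statistics after a run; a high MaxIdleClosed
// count means connections were being closed (and reopened) instead of reused.
func reportPoolChurn(db *sql.DB) {
	s := db.Stats()
	fmt.Printf("open=%d idle=%d waitCount=%d waitDur=%v maxIdleClosed=%d\n",
		s.OpenConnections, s.Idle, s.WaitCount, s.WaitDuration, s.MaxIdleClosed)
}
```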
-
This HN thread led to #257, which seems a nice improvement, though overall it's not measurable on speedtest. Can't believe this'll lead me to reimplement libc. 🤷
-
I've created a benchmark for Go SQLite drivers.
At the moment, it includes ncruces, modernc, mattn, and tailscale.
For ncruces it benchmarks both "direct" and "driver" performance.
(All of the "driver" implementations use exactly the same code for everything other than opening the database.)
I hope to push the repo next week, but in the meantime here are some results.
I've broken them into sections:

- "Baseline" measures the performance of a "SELECT 1" query.
- "Populate" measures populating a database of posts and comments: with no transactions; with one transaction for the entire population; and with a transaction per post + comments. (The last variant is sketched after this list.)
- "ReadWrite" measures various combinations of reading and writing posts and comments, with and without transactions.
- "Query" measures various complicated queries.
(A TL;DR and full results followed as benchstat tables, one per section: Baseline; Populate; ReadWrite; and Query, covering Correlated Aggregation, CTE, CTE (Recursive), Group By Aggregation, and JSON.)