Rewrite the "sync_local" query #78

rkistner · 2025-05-26T12:25:03Z

Supersedes #56. This is the same basic optimization as in that PR, but it now also supports bucket priorities and is updated with the latest changes.

This now tracks filesystem operations (pages written/read) as the main metric to optimize for this query. It is not the only metric that matters, but it is consistent, and a good indication of real-world performance with large amounts of data on mobile and web platforms.

For the base test case, 500k rows:

# Data setup stats (for reference only)
init stats: Reads: 1348044 + 0 | Writes: 1188543 + 0 # (main db page operations) + (temp storage page operations)
# Before
4283ms Reads: 918018 + 531168 | Writes: 68904 + 521079
# After:
3800ms Reads: 1289503 + 5518 | Writes: 69023 + 5518

Reads on the data db increase by about 40% (918018 to 1289503), but temporary storage operations drop by about 99% (reads 531168 to 5518, writes 521079 to 5518).
The above is with the default SQLite cache_size. If we set, say, PRAGMA cache_size=-50000 (about 50MB), we can do 500k rows without using any temporary storage with the changes here. With the previous query, temporary storage was already used after 30-40k rows.
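
As a minimal sketch of the cache setting mentioned above (using Python's built-in sqlite3 module and an in-memory database purely for illustration, not the PowerSync test setup): a negative cache_size is interpreted by SQLite as a size in KiB rather than a page count, and the setting applies only to the current connection.

```python
import sqlite3

# Illustration only: an in-memory database stands in for the real data db.
conn = sqlite3.connect(":memory:")

# Default value as compiled into this SQLite build (varies by build).
default_cache = conn.execute("PRAGMA cache_size").fetchone()[0]

# Negative N means "use about N KiB of page cache" instead of N pages.
conn.execute("PRAGMA cache_size = -50000")
cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]

# Convert the KiB figure to bytes to see what "50MB" means here.
cache_bytes = abs(cache_size) * 1024

print("default cache_size:", default_cache)
print("configured cache_size:", cache_size)
print("approx bytes:", cache_bytes)
```

The setting is not persisted in the database file, so it has to be reapplied each time a connection is opened.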

There are further optimizations to be made for the initial sync or when we're re-syncing most of the data, but the gains here are already significant.

Copilot

Pull Request Overview

This PR rewrites the "sync_local" query to optimize filesystem operations by reducing temporary storage usage and adds support for bucket priorities. Key changes include:

Refactoring the Virtual FileSystem tracking and test utilities to use unique VFS names for concurrency safety.
Updating SQL queries in the Rust implementation to replace bucket count checks with NULL checks and simplify query logic.
Upgrading the sqlite3 dependency version.

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

File	Description
dart/test/utils/tracking_vfs.dart	Implements a tracking VFS to count filesystem reads and writes.
dart/test/utils/native_test_utils.dart	Updates database open function to support unique file names.
dart/test/sync_test.dart	Adjusts test setup to avoid concurrency issues with unique VFS names.
dart/test/perf_test.dart	Adds performance tests for tracking filesystem operations.
dart/test/js_key_encoding_test.dart	Updates test configurations to use unique file system names.
dart/pubspec.yaml	Upgrades the sqlite3 dependency to a newer version.
crates/core/src/sync_local.rs	Rewrites the sync_local query and optimizes its SQL logic.

Comments suppressed due to low confidence (1)

crates/core/src/sync_local.rs:197

Confirm that the switch from DISTINCT to UNION ALL in the 'updated_rows' CTE is intentional and that the subsequent GROUP BY clause correctly handles any potential duplicate rows.

UNION ALL SELECT row_type, row_id FROM ps_updated_rows
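
The question above can be checked on toy data. The following is a sketch only, not the query from this PR: the table `changed_ops` and all row values are hypothetical, and only the `ps_updated_rows(row_type, row_id)` column names come from the snippet above. It shows that UNION ALL followed by GROUP BY on the same columns yields the same set of rows as a deduplicating UNION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the first branch of the CTE.
CREATE TABLE changed_ops(row_type TEXT, row_id TEXT);
CREATE TABLE ps_updated_rows(row_type TEXT, row_id TEXT);
-- Duplicates both within one table and across the two tables.
INSERT INTO changed_ops VALUES ('todos', 'a'), ('todos', 'a'), ('todos', 'b');
INSERT INTO ps_updated_rows VALUES ('todos', 'b'), ('lists', 'c');
""")

# UNION ALL keeps duplicates; the outer GROUP BY collapses them.
grouped = conn.execute("""
WITH updated_rows AS (
  SELECT row_type, row_id FROM changed_ops
  UNION ALL SELECT row_type, row_id FROM ps_updated_rows
)
SELECT row_type, row_id FROM updated_rows
GROUP BY row_type, row_id
ORDER BY row_type, row_id
""").fetchall()

# Plain UNION deduplicates inside the compound select itself.
distinct = conn.execute("""
SELECT row_type, row_id FROM changed_ops
UNION SELECT row_type, row_id FROM ps_updated_rows
ORDER BY row_type, row_id
""").fetchall()

print(grouped)
print(grouped == distinct)
```

Whether the real query's GROUP BY covers exactly the columns that DISTINCT previously covered is something only the diff in sync_local.rs can confirm.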

crates/core/src/sync_local.rs

simolus3

I really like the idea of using a VFS for instrumentation 👍 The query change also looks good to me from a quick look (apart from one question).

crates/core/src/sync_local.rs

simolus3

Looks good to me 👍

rkistner added 5 commits May 26, 2025 13:00

Add test setup for tracking VFS operations on sync_local.

887eb8e

Optimize queries.

207bc90

Fix for test concurrency issue.

ce944d1

Expand performance tests; add hard limits.

764d39c

Change to is_err.

43aa8f8

rkistner requested a review from Copilot May 26, 2025 12:25

rkistner mentioned this pull request May 26, 2025

[WIP] Rewrite the "sync_local" query #56

Closed

2 tasks

Copilot AI reviewed May 26, 2025

crates/core/src/sync_local.rs (resolved)

Expand on comments.

4a2f448

simolus3 reviewed May 26, 2025

crates/core/src/sync_local.rs (resolved)

rkistner added 3 commits May 27, 2025 09:57

Add various queries to the performance tests.

44e1cba

Rename test file.

cf83c7e

Move annotations to the test file; add a link to docs.

bf38faf

simolus3 approved these changes May 27, 2025

rkistner merged commit fee756b into main May 28, 2025
21 checks passed

rkistner deleted the optimize-sync-local branch May 28, 2025 14:41
