Commit 6672b66

Merge pull request #1 from powersync-ja/api-experiments
sqlite-js
2 parents de6eaa1 + ceccd12 commit 6672b66

File tree

101 files changed

+9039
-2861
lines changed

.env

Whitespace-only changes.

.envrc

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
layout node
2+
use node
3+
[ -f .env ] && dotenv

.github/workflows/test.yaml

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
1+
# Ensures packages test correctly
2+
name: Test Packages
3+
4+
on:
5+
  push:
6+
7+
jobs:
8+
  test:
9+
    name: Test Packages
10+
    runs-on: ubuntu-latest
11+
    steps:
12+
      - uses: actions/checkout@v4
13+
        with:
14+
          persist-credentials: false
15+
16+
      - uses: pnpm/action-setup@v4
17+
        name: Install pnpm
18+
19+
      - name: Setup NodeJS
20+
        uses: actions/setup-node@v4
21+
        with:
22+
          node-version-file: '.node-version'
23+
24+
      - name: Install dependencies
25+
        run: pnpm install
26+
27+
      - name: Build
28+
        run: pnpm build
29+
30+
      - name: Test
31+
        run: pnpm test

.gitignore

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,6 @@
11
node_modules/
22
test-db/
33
*.db
4+
lib/
5+
tsconfig.tsbuildinfo
6+
benchmarks/db

.node-version

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
v22.5.1

.prettierignore

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
node_modules/
2+
lib/
3+
pnpm-lock.yaml

.prettierrc

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
{
2+
"trailingComma": "none",
3+
"tabWidth": 2,
4+
"semi": true,
5+
"singleQuote": true
6+
}

DRIVER-API.md

Lines changed: 139 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,139 @@
1+
## Driver API
2+
3+
The driver API aims to have a small surface area, with little performance overhead. Ease of use is not important.
4+
5+
To support all potential implementations, the main APIs are asynchronous. This does add overhead, but this is unavoidable when our goal is to have a universal driver API. We do however aim to keep the performance overhead as low as possible.
6+
7+
The driver API primarily exposes:
8+
9+
1. Connection pooling. Even when using a single connection, that connection should be locked for exclusive use by one consumer at a time.
10+
2. Prepared statements. Even if the underlying implementation does not use actual prepared statements, the same APIs can be exposed.
11+
12+
In general, setting up a prepared statement (preparing the statement, binding parameters) uses synchronous APIs that don't throw on invalid queries. Executing the statement is asynchronous, and this is where errors are thrown.
13+
14+
The driver API does not include transaction management. This is easily implemented on top of connection pooling/locking + prepared statements for begin/commit/rollback.
15+
16+
### The API
17+
18+
This is a simplified version of the API. For full details, see:
19+
[packages/driver/src/driver-api.ts](packages/driver/src/driver-api.ts).
20+
21+
```ts
22+
export interface SqliteDriverConnectionPool {
23+
/**
24+
* Reserve a connection for exclusive use.
25+
*
26+
* If there is no available connection, this will wait until one is available.
27+
*/
28+
reserveConnection(
29+
options?: ReserveConnectionOptions
30+
): Promise<ReservedConnection>;
31+
32+
close(): Promise<void>;
33+
34+
[Symbol.asyncDispose](): Promise<void>;
35+
}
36+
37+
export interface ReservedConnection {
38+
/** Direct handle to the underlying connection. */
39+
connection: SqliteDriverConnection;
40+
41+
/** Proxied to the underlying connection */
42+
prepare(sql: string, options?: PrepareOptions): SqliteDriverStatement;
43+
44+
[Symbol.asyncDispose](): Promise<void>;
45+
}
46+
47+
export interface SqliteDriverConnection {
48+
/**
49+
* Prepare a statement.
50+
*
51+
* Does not return any errors.
52+
*/
53+
prepare(sql: string, options?: PrepareOptions): SqliteDriverStatement;
54+
}
55+
56+
/**
57+
* Represents a single prepared statement.
58+
* Loosely modeled on the SQLite API.
59+
*/
60+
export interface SqliteDriverStatement {
61+
bind(parameters: SqliteParameterBinding): void;
62+
63+
step(n?: number, options?: StepOptions): Promise<SqliteStepResult>;
64+
getColumns(): Promise<string[]>;
65+
finalize(): void;
66+
67+
reset(options?: ResetOptions): void;
68+
69+
[Symbol.dispose](): void;
70+
}
71+
```
72+
73+
## Design decisions
74+
75+
### Small surface area
76+
77+
We want the driver to have as small a surface area as possible. In rare cases we do allow exceptions for performance or simplicity reasons.
78+
79+
### Reusability
80+
81+
The same driver connection pool should be usable by multiple different consumers within the same process. For example, the same connection pool can be used directly, by an ORM, and/or by a sync library, without running into concurrency issues. This specifically affects connection pooling (see below).
82+
83+
### Synchronous vs asynchronous
84+
85+
Many implementations can only support asynchronous methods. However, having _every_ method asynchronous can add significant overhead, if you need to chain multiple methods to run a single query. We therefore aim to have a single asynchronous call per query for most use cases. This does mean that we defer errors until that asynchronous call, and do not throw errors in `prepare()` or `bind()`.
86+
87+
### Transactions
88+
89+
Full transaction support requires a large surface area, with many design possibilities. For example, do we support nested transactions (savepoints in SQLite)? Do we expose immediate/deferred/exclusive transactions? Do we use a wrapper function, explicit resource management, or manual commit/rollback calls to manage transactions?
90+
91+
Instead, the driver API just provides the building blocks for transactions - connection pooling and prepared statements.
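
As a rough illustration of how transactions could be layered on these building blocks, here is a minimal sketch. The `Pool`, `Reserved` and `Statement` types are trimmed-down stand-ins for the interfaces above, and `release()`, `exec()` and `withTransaction()` are hypothetical names used only for this example, not part of the driver API.

```ts
// Trimmed-down stand-ins for the driver interfaces (assumptions, not the real types).
interface Statement {
  step(): Promise<unknown>;
  finalize(): void;
}
interface Reserved {
  prepare(sql: string): Statement;
  release(): Promise<void>; // hypothetical; stands in for [Symbol.asyncDispose]
}
interface Pool {
  reserveConnection(): Promise<Reserved>;
}

// Run a single statement to completion, then clean it up.
async function exec(conn: Reserved, sql: string): Promise<void> {
  const stmt = conn.prepare(sql);
  try {
    await stmt.step();
  } finally {
    stmt.finalize();
  }
}

// Hypothetical helper: reserve a connection, wrap `fn` in BEGIN/COMMIT,
// and roll back if `fn` throws.
async function withTransaction<T>(
  pool: Pool,
  fn: (conn: Reserved) => Promise<T>
): Promise<T> {
  const conn = await pool.reserveConnection();
  try {
    await exec(conn, 'BEGIN');
    let result: T;
    try {
      result = await fn(conn);
    } catch (e) {
      await exec(conn, 'ROLLBACK');
      throw e;
    }
    await exec(conn, 'COMMIT');
    return result;
  } finally {
    // Always hand the connection back, whether we committed or rolled back.
    await conn.release();
  }
}
```

Because the connection stays reserved for the whole callback, no other consumer of the same pool can interleave statements between `BEGIN` and `COMMIT`.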
92+
93+
### Connection pooling
94+
95+
The driver API requires a connection pooling implementation, even if there is only a single underlying connection. Even in that case, it is important that a connection can be "reserved" for a single consumer at a time. This is needed for example to implement transactions, without requiring additional locking mechanisms (which would break the reusability requirement).
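
To make the idea of "reserving" a single connection concrete, here is a minimal sketch of a pool wrapped around one connection. `SingleConnectionPool` and its `release()` callback are hypothetical names for this illustration; the real API uses `ReservedConnection` with `Symbol.asyncDispose` instead.

```ts
// Hypothetical sketch: a "pool" around one connection that hands it out
// to one consumer at a time, in FIFO order.
class SingleConnectionPool<C> {
  private connection: C;
  private busy = false;
  private waiters: Array<() => void> = [];

  constructor(connection: C) {
    this.connection = connection;
  }

  // Resolves once the connection is free. The caller must call release().
  async reserveConnection(): Promise<{ connection: C; release: () => void }> {
    if (this.busy) {
      // Wait until the previous holder hands the connection over to us.
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
    this.busy = true;
    let released = false;
    return {
      connection: this.connection,
      release: () => {
        if (released) return; // make release() idempotent
        released = true;
        const next = this.waiters.shift();
        if (next) {
          next(); // hand over directly; `busy` stays true
        } else {
          this.busy = false;
        }
      }
    };
  }
}
```

Handing the connection directly to the next waiter, rather than clearing `busy` first, keeps the order fair: a consumer that calls `reserveConnection()` at that moment queues behind the existing waiters.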
96+
97+
Connection pooling also supports specifically requesting a read-only vs read-write connection. This is important for concurrency in SQLite, which can only support a single writer at a time, but any number of concurrent readers.
98+
99+
### Read vs write queries
100+
101+
There is no fundamental distinction between read and write queries in the driver prepared statement API. This is important for use cases such as `INSERT INTO ... RETURNING *` - a "write" query that also returns data. However, read vs write locks are taken into account with connection pooling.
102+
103+
### "run" with results
104+
105+
The `run` API that returns the last insert row id and number of changes is primarily for compatibility with current libraries/APIs. Many libraries in use return that automatically for any "run" statement, and splitting that out into a separate prepared statement could add significant performance overhead (requiring two prepared statements for every single "write" query).
106+
107+
### Row arrays vs objects
108+
109+
Returning an array of cells for each row, along with a separate "columns" array, is more flexible than just using an object per row. It is always possible to convert the array to an object, given the columns header.
110+
111+
However, many current SQLite bindings do not expose the raw array calls. Even if they do, this path may be slower than using objects from the start. Since using the results as an array is quite rare in practice, this is left as an optional configuration, rather than a requirement for all queries.
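
The conversion from array rows to objects is mechanical, as the following sketch shows. `SqliteValue` and `rowsToObjects()` are illustrative names, not part of the driver package.

```ts
// Illustrative value type; the real driver types may differ.
type SqliteValue = null | string | number | bigint | Uint8Array;

// Convert array-style rows plus a column header into one object per row.
function rowsToObjects(
  columns: string[],
  rows: SqliteValue[][]
): Record<string, SqliteValue>[] {
  return rows.map((row) => {
    const obj: Record<string, SqliteValue> = {};
    columns.forEach((name, i) => {
      obj[name] = row[i];
    });
    return obj;
  });
}

const columns = ['id', 'name'];
const rows: SqliteValue[][] = [
  [1, 'a'],
  [2, 'b']
];
const objects = rowsToObjects(columns, rows);
console.log(objects[1].name); // 'b'
```

The reverse direction is lossy: if two columns share a name (for example `SELECT a.id, b.id`), the object form in this sketch keeps only the last value, which is one reason the array form is the more flexible of the two.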
112+
113+
### Separate bind/step/reset
114+
115+
This allows a lot of flexibility, for example partial rebinding of parameters instead of specifying all parameters each time a prepared statement is used. However, those types of use cases are rare, and this is not important in the overall architecture. These could all be combined into a single "query with parameters" call, but would need to take into account optional streaming of results.
116+
117+
### bigint
118+
119+
SQLite supports up to 8-byte signed integers (up to 2^63-1), while JavaScript's `number` can only represent integers exactly up to 2^53-1. General approaches include:
120+
121+
1. Always use JS numbers. Larger integers then have to be passed as TEXT, although they can still be stored as INTEGER and cast when inserting or returning results.
122+
2. Automatically switch to `bigint` if the number is `>= 2^53`. This can easily introduce issues in the client, since `bigint` and `number` are not interoperable.
123+
3. Require an explicit option to get `bigint` results. This is the approach we went for here.
124+
4. Always use `number` for `REAL`, and `bigint` for `INTEGER`. You can use `cast(n as REAL)` to get a value back as a `number`. Since many users will just use small integers, this may not be ideal.
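
The trade-offs above come from two properties of JavaScript that a short sketch can demonstrate: large integers lose precision when converted to `number`, and `bigint` and `number` cannot be mixed in arithmetic. The helper name `addOne` is only for illustration.

```ts
// 2^53 + 1 fits comfortably in a SQLite INTEGER, but not in a JS number.
const big = BigInt('9007199254740993');
const asNumber = Number(big); // rounded to the nearest representable double

console.log(asNumber === Number.MAX_SAFE_INTEGER + 1); // true: the "+ 1" was lost
console.log(Number.isSafeInteger(asNumber)); // false
console.log(BigInt(asNumber) === big); // false: the round trip is lossy

// Client code written for numbers breaks when it unexpectedly gets a bigint.
function addOne(x: any): any {
  return x + 1;
}

let mixError: unknown;
try {
  addOne(big); // mixing bigint and number throws at runtime
} catch (e) {
  mixError = e;
}
console.log(mixError instanceof TypeError); // true
```

This is why silently switching types based on magnitude (approach 2) is risky: the same query can return values that behave differently depending on the data.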
125+
126+
### Pipelining
127+
128+
The APIs guarantee that statements on a connection will be ordered in the order that calls were made. This allows pipelining statements to improve performance - the client can issue many queries before waiting for the results. One place where this breaks down is within transactions: It is possible for one statement to trigger a transaction rollback, in which case the next pipelined statement will run outside the transaction.
129+
130+
The current API includes a flag to indicate a statement may only be run within a transaction to work around this issue, but other suggestions are welcome.
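
The ordering guarantee that pipelining relies on can be sketched with a promise chain. `OrderedConnection` below is a hypothetical stand-in that only simulates latency; it is not how any of the drivers here are implemented.

```ts
// Hypothetical connection that runs queued work strictly in call order.
class OrderedConnection {
  private tail: Promise<unknown> = Promise.resolve();
  readonly executed: string[] = [];

  // Returns immediately with a promise. Execution is chained after all
  // previously issued statements, so call order equals execution order.
  run(sql: string): Promise<string> {
    const result = this.tail.then(async () => {
      // Simulate variable latency (for example, a round trip to a worker).
      await new Promise((resolve) => setTimeout(resolve, Math.random() * 5));
      this.executed.push(sql);
      return sql;
    });
    // Keep the chain alive even if this statement fails.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

async function pipelined(conn: OrderedConnection): Promise<string[]> {
  // Issue all statements up front, then wait for the results together.
  const pending = [
    conn.run('INSERT INTO t VALUES (1)'),
    conn.run('INSERT INTO t VALUES (2)'),
    conn.run('SELECT count(*) FROM t')
  ];
  return Promise.all(pending);
}
```

Note that in this sketch a failed statement does not stop the ones queued behind it, which mirrors the transaction problem described above: later pipelined statements still run after an earlier one has failed.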
131+
132+
## Driver implementation helpers
133+
134+
The driver package also includes helpers to assist in implementing drivers. These are optional, and not part of the driver spec. They do however make it simple to support:
135+
136+
1. Connection pooling - the driver itself just needs to implement logic for a single connection, and the utilities will handle connection pooling.
137+
2. Worker threads - this can assist in spawning a separate worker thread per connection, to get true concurrency. The same approaches could work to support web workers in browsers in the future.
138+
139+
Some drivers may use different approaches for concurrency and connection pooling, without using these utilities.

README.md

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
# sqlite-js
2+
3+
Universal SQLite APIs for JavaScript.
4+
5+
The project provides two primary APIs:
6+
7+
1. The driver API. This aims to expose a minimal API for drivers to implement, while supporting a rich set of functionality. This should have as little performance overhead as possible, while still supporting asynchronous implementations.
8+
9+
2. The end-user API. This is a library built on top of the driver API that exposes higher-level functionality such as transactions, convenience methods, template strings (later), and pipelining.
10+
11+
## @sqlite-js/driver
12+
13+
This is a universal driver API and utilities for implementing drivers.
14+
15+
The APIs here are low-level. These are intended to be implemented by drivers, and used by higher-level libraries.
16+
17+
See [DRIVER-API.md](./DRIVER-API.md) for details on the design.
18+
19+
### @sqlite-js/driver/node
20+
21+
This is a driver implementation for NodeJS based on the experimental `node:sqlite` package.
22+
23+
## @sqlite-js/better-sqlite3-driver
24+
25+
This is a driver implementation for NodeJS based on `better-sqlite3`.
26+
27+
## @sqlite-js/api
28+
29+
This contains a higher-level API, with simple methods to execute queries, and supports transactions and pipelining.
30+
31+
This is largely a proof-of-concept to validate and test the underlying driver APIs, rather than having a fixed design.
32+
33+
The current iteration of the APIs is visible at [packages/api/src/api.ts](packages/api/src/api.ts).
34+
35+
# Why split the APIs?
36+
37+
A previous iteration used a single API for both the end-user API and the driver API. This had several disadvantages:
38+
39+
1. The implementation per driver requires a lot more effort.
40+
2. Iterating on the API becomes much more difficult.
41+
   1. Implementing minor quality-of-life improvements for the end user becomes a required change in every driver.
42+
3. Optimizing the end-user API for performance is difficult. To cover all the different use cases, it requires implementing many different features such as prepared statements, batching, and pipelining. This becomes a very large API for drivers to implement.
43+
4. The goals for the end-user API are different from those of the driver API:
44+
   1. End-users want a rich but simple-to-use API to access the database.
45+
   2. Drivers want a small surface area that doesn't change often.
46+
47+
Splitting out a separate driver API, and implementing the end-user API as a separate library, avoids all the above issues.

benchmarks/package.json

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
1+
{
2+
"name": "benchmarks",
3+
"type": "module",
4+
"scripts": {
5+
"build": "tsc -b",
6+
"start": "NODE_OPTIONS=\"--experimental-sqlite --disable-warning=ExperimentalWarning\" node lib/index.js"
7+
},
8+
"dependencies": {
9+
"better-sqlite3": "^11.0.0",
10+
"prando": "^6.0.1",
11+
"sqlite": "^5.1.1",
12+
"sqlite3": "^5.1.7",
13+
"@sqlite-js/driver": "workspace:^",
14+
"@sqlite-js/better-sqlite3-driver": "workspace:^",
15+
"@sqlite-js/api": "workspace:^"
16+
},
17+
"devDependencies": {
18+
"@types/node": "^20.14.2",
19+
"typescript": "^5.4.5"
20+
}
21+
}
