Commit d38aa2a

Add benchmark results
1 parent bbda8a6 commit d38aa2a

3 files changed

+114
-2
lines changed
Diff for: README.md

+8-2
@@ -49,7 +49,9 @@ CREATE EXTENSION ptrack;
4949

5050
## Configuration
5151

52-
The only configurable option is `ptrack.map_size` (in MB). The default is `-1`, which means `ptrack` is turned off. To completely avoid false positives, it is recommended to set `ptrack.map_size` to `1 / 1000` of the expected `PGDATA` size (i.e. `1000` for a 1 TB database), since a single 8-byte `ptrack` map record tracks changes in a standard 8 KB PostgreSQL page. To disable `ptrack` and clean up all remaining service files set `ptrack.map_size` to `0`.
52+
The only configurable option is `ptrack.map_size` (in MB). The default is `-1`, which means `ptrack` is turned off. To completely avoid false positives, it is recommended to set `ptrack.map_size` to `1 / 1000` of the expected `PGDATA` size (i.e. `1000` for a 1 TB database), since a single 8-byte `ptrack` map record tracks changes in a standard 8 KB PostgreSQL page.
53+
54+
To disable `ptrack` and clean up all remaining service files set `ptrack.map_size` to `0`.
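The sizing rule above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `ptrack_map_size_mb` is hypothetical and not part of `ptrack`, and it takes one 8-byte map record per 8 KB page literally, which gives a ratio of 1/1024 rather than the rounded 1/1000 quoted above.

```sh
# Hypothetical helper (not part of ptrack): estimate ptrack.map_size in MB
# from the expected PGDATA size in MB, assuming one 8-byte map record
# per 8 KB (8192-byte) page, i.e. a ratio of 8 / 8192 = 1 / 1024.
ptrack_map_size_mb() {
    pgdata_mb=$1
    # Integer ceiling division, so any non-empty cluster gets at least 1 MB.
    echo $(( ($pgdata_mb + 1023) / 1024 ))
}

# Example: a 1 TB (1048576 MB) cluster.
ptrack_map_size_mb 1048576
```

The result is in the same ballpark as the `1000` suggested for a 1 TB database; rounding up further only costs a little extra memory and disk space.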
5355

5456
## Public SQL API
5557

@@ -87,10 +89,14 @@ postgres=# SELECT ptrack_get_pagemapset('0/186F4C8');
8789

8890
2. The only production-ready backup utility that fully supports `ptrack` is [pg_probackup](https://github.com/postgrespro/pg_probackup).
8991

90-
3. Currently, you cannot resize the `ptrack` map at runtime, only on postmaster restart. Also, you will lose all tracked changes, so it is recommended to do so during a maintenance window and to accompany this operation with a full backup. See [TODO](#TODO) for details.
92+
3. Currently, you cannot resize the `ptrack` map at runtime, only on postmaster start. Also, you will lose all tracked changes, so it is recommended to do so during a maintenance window and to accompany this operation with a full backup. See [TODO](#TODO) for details.
9193

9294
4. You will need up to `ptrack.map_size * 3` of additional disk space, since `ptrack` uses two additional temporary files for durability. See [Architecture](#Architecture) for details.
9395

96+
## Benchmarks
97+
98+
In brief, the TPS overhead of using `ptrack` usually does not exceed a few percent (~1-3%) for a database of dozens to hundreds of gigabytes in size, while backup time scales linearly with backup size with a coefficient of ~1. This means that an incremental `ptrack` backup of a database in which only 20% of pages have changed will be about 5 times faster than a full backup. See [benchmarks](benchmarks) for more details.
99+
94100
## Architecture
95101

96102
We use a single shared hash table in `ptrack`, which is mapped in memory from the file on disk using `mmap`. Due to the fixed size of the map there may be false positives (when some block is marked as changed without actually being modified), but never false negatives. However, these false positives may be completely eliminated by setting a high enough `ptrack.map_size`.

Diff for: benchmarks/README.md

+97
@@ -0,0 +1,97 @@
1+
# Ptrack benchmarks
2+
3+
## Runtime overhead
4+
5+
The first goal was to measure the `ptrack` overhead on TPS caused by marking modified pages in the in-memory map. We used a PostgreSQL 12 cluster of approximately 1 GB in size, initialized with `pgbench` on a `tmpfs` partition:
6+
7+
```sh
pgbench -i -s 133
```
10+
11+
The default `pgbench` transaction script [was modified](pgb.sql) to exclude the `pgbench_tellers` and `pgbench_branches` updates in order to lower lock contention and make the `ptrack` overhead more visible. `pgbench` was then invoked as follows:
12+
13+
```sh
pgbench -s133 -c40 -j1 -n -P15 -T300 -f pgb.sql
```
16+
17+
Results:
18+
| ptrack.map_size, MB | 0 (turned off) | 32    | 64    | 256   | 512   | 1024  |
|---------------------|----------------|-------|-------|-------|-------|-------|
| TPS                 | 16900          | 16890 | 16855 | 16468 | 16490 | 16220 |
22+
23+
TPS fluctuates within a range of several percent around 16500 on the test machine, but on average the `ptrack` overhead does not exceed 1-3% for any reasonable `ptrack.map_size`. It only becomes noticeable as `ptrack.map_size` approaches 1 GB (~3-4%), which is enough to track changes in a database of up to 1 TB in size without false positives.
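The percentages quoted above can be rechecked from the results table with a short script. This is a sketch only: the helper name `tps_overhead_pct` is made up for illustration, and it assumes that the `0 (turned off)` column is the baseline and that `awk` is available.

```sh
# Relative TPS overhead in percent versus the baseline with ptrack turned off.
# Hypothetical helper for illustration; arguments are baseline TPS and measured TPS.
tps_overhead_pct() {
    awk -v base="$1" -v tps="$2" 'BEGIN { printf "%.1f\n", (base - tps) / base * 100 }'
}

# TPS figures copied from the results table; pairs are map_size_mb:tps.
baseline=16900
for pair in 32:16890 64:16855 256:16468 512:16490 1024:16220; do
    size=${pair%%:*}
    tps=${pair##*:}
    echo "map_size=${size}MB overhead=$(tps_overhead_pct "$baseline" "$tps")%"
done
```

Since a single run fluctuates by several percent, differences between neighbouring columns (for example 256 MB versus 512 MB) are within the noise.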
24+
25+
26+
<!-- ## Checkpoint overhead
27+
28+
Since the `ptrack` map is completely flushed to disk during checkpoints, the same test was performed on an HDD, but with a slightly different configuration:
29+
```conf
synchronous_commit = off
shared_buffers = 1GB
```
33+
and `pg_prewarm` run prior to the test. -->
34+
35+
## Backups speedup
36+
37+
To test incremental backup speed, a fresh cluster was initialized with the following DDL:
38+
39+
```sql
CREATE TABLE large_test (num1 bigint, num2 double precision, num3 double precision);
CREATE TABLE large_test2 (num1 bigint, num2 double precision, num3 double precision);
```
43+
44+
These relations were populated with approximately 2 GB of data as follows:
45+
46+
```sql
INSERT INTO large_test (num1, num2, num3)
SELECT s, random(), random()*142
FROM generate_series(1, 20000000) s;
```
51+
52+
Then part of one relation was touched with the following query:
53+
54+
```sql
UPDATE large_test2 SET num3 = num3 + 1 WHERE num1 < 20000000 / 5;
```
57+
58+
After that, incremental `ptrack` backups were taken with `pg_probackup`, followed by full backups. Tests show that `ptrack_backup_time / full_backup_time ~= ptrack_backup_size / full_backup_size`, i.e. if only 20% of the data was modified, then a `ptrack` backup will be about 5 times faster than a full backup. Thus, the overhead of building the `ptrack` map during backup is minimal. Example:
59+
60+
```log
21:02:43 postgres:~/dev/ptrack_test$ time pg_probackup backup -B $(pwd)/backup --instance=node -p5432 -b ptrack --no-sync --stream
INFO: Backup start, pg_probackup version: 2.3.1, instance: node, backup ID: QAA89O, backup mode: PTRACK, wal mode: STREAM, remote: false, compress-algorithm: none, compress-level: 1
INFO: Parent backup: QAA7FL
INFO: PGDATA size: 2619MB
INFO: Extracting pagemap of changed blocks
INFO: Pagemap successfully extracted, time elapsed: 0 sec
INFO: Start transferring data files
INFO: Data files are transferred, time elapsed: 3s
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
WARNING: Backup files are not synced to disk
INFO: Validating backup QAA89O
INFO: Backup QAA89O data files are valid
INFO: Backup QAA89O resident size: 632MB
INFO: Backup QAA89O completed

real 0m11.574s
user 0m1.924s
sys 0m1.100s

21:20:23 postgres:~/dev/ptrack_test$ time pg_probackup backup -B $(pwd)/backup --instance=node -p5432 -b full --no-sync --stream
INFO: Backup start, pg_probackup version: 2.3.1, instance: node, backup ID: QAA8A6, backup mode: FULL, wal mode: STREAM, remote: false, compress-algorithm: none, compress-level: 1
INFO: PGDATA size: 2619MB
INFO: Start transferring data files
INFO: Data files are transferred, time elapsed: 32s
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
WARNING: Backup files are not synced to disk
INFO: Validating backup QAA8A6
INFO: Backup QAA8A6 data files are valid
INFO: Backup QAA8A6 resident size: 2653MB
INFO: Backup QAA8A6 completed

real 0m42.629s
user 0m8.904s
sys 0m11.960s
```
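As a quick sanity check of the `time ~= size` relation, the two ratios can be computed from the figures in the log. This is a rough sketch, not part of the benchmark: the helper name `ratio` is made up for illustration, and it assumes `awk` is available.

```sh
# Hypothetical helper: ratio of two numbers, printed with two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Wall-clock times (seconds) and resident backup sizes (MB) taken from the log.
echo "time ratio (ptrack / full): $(ratio 11.574 42.629)"
echo "size ratio (ptrack / full): $(ratio 632 2653)"
```

The wall-clock figures also include steps other than data transfer, such as `pg_stop_backup()` and validation; the transfer step alone is reported as 3s versus 32s in the log.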

Diff for: benchmarks/pgb.sql

+9
@@ -0,0 +1,9 @@
1+
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
