Commit bbda8a6

Add more tech details to readme
1 parent 47cc23c commit bbda8a6

1 file changed

+21
-9
lines changed

Diff for: README.md

@@ -47,6 +47,10 @@ USE_PGXS=1 make -C /path/to/ptrack/ install
4747
CREATE EXTENSION ptrack;
4848
```
4949

50+
## Configuration
51+
52+
The only configurable option is `ptrack.map_size` (in MB). The default is `-1`, which means `ptrack` is turned off. To avoid false positives completely, set `ptrack.map_size` to `1 / 1000` of the expected `PGDATA` size (i.e. `1000` for a 1 TB database), since a single 8-byte `ptrack` map record tracks changes in a standard 8 KB PostgreSQL page. To disable `ptrack` and clean up all remaining service files, set `ptrack.map_size` to `0`.
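The sizing rule above is plain arithmetic, so it can be sketched in a few lines. This is only an illustration: the helper name `suggested_map_size_mb` is hypothetical and not part of `ptrack`, and it assumes the standard 8 KB page size and one 8-byte record per page as stated above.

```python
PAGE_SIZE = 8 * 1024   # standard PostgreSQL page size, in bytes
ENTRY_SIZE = 8         # one ptrack map record, in bytes
MB = 1024 * 1024


def suggested_map_size_mb(pgdata_bytes: int) -> int:
    """Hypothetical helper: map size (in MB) giving one record per 8 KB page."""
    pages = -(-pgdata_bytes // PAGE_SIZE)    # integer ceiling division
    return -(-(pages * ENTRY_SIZE) // MB)    # round up to whole megabytes


if __name__ == "__main__":
    one_tib = 2 ** 40
    print(suggested_map_size_mb(one_tib))
```

For a `PGDATA` of 2^40 bytes this gives 1024, which is where the `1 / 1000` rule of thumb comes from.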
53+
5054
## Public SQL API
5155

5256
* ptrack_version() — returns ptrack version string.
@@ -77,25 +81,32 @@ postgres=# SELECT ptrack_get_pagemapset('0/186F4C8');
7781
(3 rows)
7882
```
7983

80-
## Config options
84+
## Limitations
85+
86+
1. You can only use `ptrack` safely with `wal_level >= 'replica'`. Otherwise, you can lose tracking of some changes if crash recovery occurs, since [certain commands are designed not to write WAL at all if wal_level is minimal](https://www.postgresql.org/docs/12/populate.html#POPULATE-PITR), and we durably flush the `ptrack` map only at checkpoint time.
8187

82-
The only one configurable option is `ptrack.map_size` (in MB). Default is `-1`, which means `ptrack` is turned off. To completely avoid false positives it is recommended to set `ptrack.map_size` to `1 / 1000` of expected `PGDATA` size (i.e. `1000` for 1 TB database), since a single 8 byte `ptrack` map record tracks changes in a standard 8 KB PostgreSQL page. To disable `ptrack` and clean up all remaining service files set `ptrack.map_size` to `0`.
88+
2. The only production-ready backup utility that fully supports `ptrack` is [pg_probackup](https://github.com/postgrespro/pg_probackup).
89+
90+
3. Currently, you cannot resize the `ptrack` map at runtime, only on postmaster restart. Resizing also discards all tracked changes, so it is recommended to do it in a maintenance window and to follow it with a full backup. See [TODO](#TODO) for details.
91+
92+
4. You will need up to `ptrack.map_size * 3` of additional disk space, since `ptrack` uses two additional temporary files for durability. See [Architecture](#Architecture) for details.
8393

8494
## Architecture
8595

86-
TBA
96+
We use a single shared hash table in `ptrack`, which is mapped into memory from a file on disk using `mmap`. Because the map has a fixed size, there may be false positives (a block is marked as changed without actually being modified), but never false negatives. These false positives can be eliminated completely by setting a high enough `ptrack.map_size`.
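A toy model may help show why a fixed-size map can report false positives but not false negatives. The sketch below is not the `ptrack` implementation: the class name, the CRC32-based slot choice, and the idea of keeping the latest LSN per slot are all assumptions made for illustration.

```python
import zlib


class ToyChangeMap:
    """Fixed-size map: each slot keeps the highest LSN of any block hashed to it."""

    def __init__(self, n_slots: int):
        self.slots = [0] * n_slots

    def _slot(self, rel: str, blkno: int) -> int:
        # Assumed hash function; the real map may use a different one.
        return zlib.crc32(f"{rel}:{blkno}".encode()) % len(self.slots)

    def mark(self, rel: str, blkno: int, lsn: int) -> None:
        i = self._slot(rel, blkno)
        self.slots[i] = max(self.slots[i], lsn)

    def maybe_changed(self, rel: str, blkno: int, since_lsn: int) -> bool:
        return self.slots[self._slot(rel, blkno)] >= since_lsn


if __name__ == "__main__":
    tiny = ToyChangeMap(1)              # one slot: every block collides
    tiny.mark("rel_a", 0, lsn=100)
    print(tiny.maybe_changed("rel_a", 0, 50))   # the modified block is reported
    print(tiny.maybe_changed("rel_b", 7, 50))   # an untouched block is reported too
```

A modified block always finds its own slot updated, so it cannot be missed. An unmodified block is reported only when it shares a slot with a modified one, which becomes less likely as the map grows.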
8797

88-
## Limitations
98+
All reads/writes are made using atomic operations on `uint64` entries, so the map is completely lockless during normal PostgreSQL operation. Because we do not use locks for read/write access and cannot control when `mmap` pages are evicted back to disk, `ptrack` keeps the map as of the last checkpoint (`ptrack.map`) intact and uses up to 2 additional temporary files:
8999

90-
1. You can only use `ptrack` safely with `wal_level >= 'replica'`. Otherwise, you can lose tracking of some changes if crash-recovery occurs, since [certain commands are designed not to write WAL at all if wal_level is minimal](https://www.postgresql.org/docs/12/populate.html#POPULATE-PITR), but we only durably flush `ptrack` map at checkpoint time.
100+
* working copy `ptrack.map.mmap` for doing `mmap` on it (there is a [TODO](#TODO) item);
101+
* temporary file `ptrack.map.tmp` to durably replace `ptrack.map` during checkpoint.
91102

92-
2. The only one production-ready backup utility, that fully supports `ptrack` is [pg_probackup](https://github.com/postgrespro/pg_probackup).
103+
The map is written to disk atomically, block by block, at the end of each checkpoint, together with a CRC32 checksum that is verified the next time the whole map is re-read after crash recovery or restart.
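The general pattern described here (write a temporary file, flush it, atomically replace the old file, verify a checksum on re-read) can be sketched as follows. This is an assumption-laden illustration, not `ptrack` code: the function names and the layout with a 4-byte CRC32 trailer are hypothetical, and the real map is written block by block.

```python
import os
import struct
import zlib


def write_map_durably(path: str, payload: bytes) -> None:
    """Write payload plus a CRC32 trailer to path + '.tmp', then swap it in."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(payload)
        f.write(struct.pack("<I", zlib.crc32(payload)))
        f.flush()
        os.fsync(f.fileno())      # make sure the bytes reach the disk first
    os.replace(tmp, path)         # atomic rename over the previous map


def read_map_checked(path: str) -> bytes:
    """Re-read the whole map and reject it if the checksum does not match."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) < 4:
        raise ValueError("map file too short")
    payload, (stored,) = data[:-4], struct.unpack("<I", data[-4:])
    if zlib.crc32(payload) != stored:
        raise ValueError("map checksum mismatch")
    return payload
```

Because the replacement is a rename, a crash leaves either the previous complete map or the new complete map on disk, and the checksum catches a file that was damaged in some other way.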
93104

94-
3. Currently, you cannot resize `ptrack` map in runtime, only on postmaster restart. Also, you will loose all tracked changes, so it is recommended to do so in the maintainance window and accompany this operation with full backup. See [#TODO](TODO) for details.
105+
To gather the whole changeset of modified blocks in `ptrack_get_pagemapset()`, we walk the entire `PGDATA` (`base/**/*`, `global/*`, `pg_tblspc/**/*`) and check against the map whether each block of each relation was modified since the specified LSN.
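The walk itself can be sketched roughly as below. The function name `changed_blocks` and the `is_changed` callback are hypothetical stand-ins for the map lookup. Unlike this sketch, real code would also need to skip non-relation files and handle relation segments, so treat it as an outline only.

```python
import os

BLCKSZ = 8192  # standard PostgreSQL block size, in bytes


def changed_blocks(pgdata: str, is_changed) -> dict:
    """Return {relative file path: [block numbers]} for blocks flagged by is_changed."""
    result = {}
    for top in ("base", "global", "pg_tblspc"):
        # pg_tblspc holds symlinks to tablespace directories, so follow them.
        for dirpath, _dirs, files in os.walk(os.path.join(pgdata, top), followlinks=True):
            for name in sorted(files):
                path = os.path.join(dirpath, name)
                n_blocks = os.path.getsize(path) // BLCKSZ
                hits = [b for b in range(n_blocks) if is_changed(path, b)]
                if hits:
                    result[os.path.relpath(path, pgdata)] = hits
    return result
```

Here `is_changed(path, blkno)` is expected to answer "was this block possibly modified since the requested LSN?", which is the question the map lookup answers.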
95106

96107
## Contribution
97108

98-
Feel free to send pull requests, fill up issues, or just reach one of us directly (e.g. <[Alexey Kondratov](mailto:[email protected]?subject=[GitHub]%20Ptrack), @ololobus>) if you are interested in `ptrack`.
109+
Feel free to [send pull requests](https://github.com/postgrespro/ptrack/compare), [file issues](https://github.com/postgrespro/ptrack/issues/new), or just reach one of us directly (e.g. <[Alexey Kondratov](mailto:[email protected]?subject=[GitHub]%20Ptrack), [@ololobus](https://github.com/ololobus)>) if you are interested in `ptrack`.
99110

100111
### Tests
101112

@@ -115,7 +126,8 @@ Available test modes (`MODE`) are `basic` (default) and `paranoia` (per-block ch
115126

116127
### TODO
117128

129+
* Use POSIX `shm_open()` instead of `open()` to avoid creating an additional working copy of the `ptrack` map file.
118130
* Should we introduce `ptrack.map_path` to allow `ptrack` service files storage outside of `PGDATA`? Doing that we will avoid patching PostgreSQL binary utilities to ignore `ptrack.map.*` files.
119-
* Can we resize `ptrack` map on restart but keeping previously tracked changes?
131+
* Can we resize `ptrack` map on restart but keep the previously tracked changes?
120132
* Can we resize `ptrack` map dynamically?
121133
* Can we write a formal proof that we never lose any modified page with `ptrack`? With TLA+?