GitHub - ShreeChaturvedi/entropy: High-performance ACID relational database in C++ with MVCC, B+ tree indexes, and SQL query optimization

Entropy is a high-performance relational database engine built from scratch in modern C++20. It showcases core database internals: B+ tree storage, MVCC, ACID transactions, write-ahead logging, and a cost-based query optimizer.

Highlights

End-to-end SQL engine: parser, binder, optimizer, execution.
Storage engine with B+ trees, hash indexes, and buffer pool caching.
MVCC + WAL for snapshot isolation and crash recovery.
Unit and integration tests wired to CTest + GoogleTest.
Benchmarks with optional SQLite comparison and reproducible scripts.
Cross-platform CMake build with Linux/macOS/Windows CI.

Architecture

Architecture Details

Entropy is organized as focused C++ libraries that mirror the logical database pipeline. The system is split into layers with explicit boundaries.

SQL Frontend

Lexer and parser build an AST for SELECT, INSERT, UPDATE, DELETE, and CREATE TABLE.
Binder resolves names and types using catalog metadata and column indexes.

Planning and Optimization

Statistics track row counts and simple selectivity estimates.
Cost model compares index scans and sequential scans for predicates.
Index selector chooses point lookup and range scan paths when an index exists.

Execution

Iterator operators for seq scan, index scan, hash join, nested loop join, sort, aggregate, filter, and limit.
Projection and DML executors materialize rows and apply insert, update, and delete.

Transactions and Recovery

MVCC version chains with timestamp visibility checks.
Lock manager enforces isolation with deadlock detection.
WAL records are flushed on commit and replayed on recovery.

Storage and Buffering

Slotted pages back a table heap for variable length tuples.
B+ tree and extendible hash indexes support range and point access.
Buffer pool uses an LRU replacer with pin and dirty tracking.
Disk manager handles page and WAL IO.

Performance Snapshot (vs SQLite)

Run file: docs/benchmarks/runs/bench-20251226-214051.json

Machine: Apple M2 (arm64), macOS 15.5
Compiler: Apple clang 17.0.0 (clang-1700.0.13.5)
Build: Release (-O3) SQLite baselines are collected when ENTROPY_BENCH_COMPARE_SQLITE=ON.

Per-iteration ns/op (ratio = Entropy / SQLite, lower is better):

Case	Rows	Entropy (ns/op)	SQLite (ns/op)	Ratio
Insert batch (txn)	1k	`940,970`	`1,048,274`	`0.90x`
Insert batch (txn)	10k	`9,693,951`	`6,992,261`	`1.39x`
Point select	1k	`46,041`	`23,020`	`2.00x`
Point select	10k	`459,723`	`179,783`	`2.56x`

Rows are the batch size for inserts and the table cardinality for point selects.

Chart units are microseconds per op. The insert and point select panels use different y axis ranges.

Full results: docs/benchmarks/bench_summary.csv

Benchmarks

./scripts/bench/run.sh

Windows:

powershell -ExecutionPolicy Bypass -File scripts/bench/run.ps1

The script writes a timestamped JSON run file in docs/benchmarks/runs/ and updates docs/benchmarks/bench_summary.csv.

SQLite comparison is ON by default for the script. If SQLite3 headers are missing, the script fails with an explicit error. To run without SQLite, set ENTROPY_BENCH_COMPARE_SQLITE=OFF. Manual steps and methodology are in docs/benchmarks.md.

Quick Start

#include <entropy/entropy.hpp>
#include <iostream>

int main() {
    entropy::Database db("mydb.entropy");
    db.execute("CREATE TABLE users (id INT, name VARCHAR(100))");
    db.execute("INSERT INTO users VALUES (1, 'Alice')");

    auto result = db.execute("SELECT * FROM users");
    for (const auto &row : result) {
        std::cout << row["name"].as_string() << "\n";
    }
}

Build and Test

Prerequisites

C++20 compiler (GCC 10+, Clang 12+, MSVC 19.29+)
CMake 3.20+
Git

Configure + Build

Dependencies (spdlog, GoogleTest, Google Benchmark) are fetched with CMake FetchContent on first configure (requires network access). Optional comparisons use system SQLite3, while compression uses LZ4 when enabled.

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release

Run Tests

ctest --test-dir build --output-on-failure -C Release

CMake Presets

cmake --preset dev
cmake --build --preset dev
ctest --preset dev

Build Options

Option	Default	Description
`ENTROPY_BUILD_TESTS`	✓	Build unit + integration tests
`ENTROPY_BUILD_BENCHMARKS`	✗	Build benchmarks
`ENTROPY_BENCH_COMPARE_SQLITE`	✗	Build SQLite comparison benchmarks
`ENTROPY_BUILD_EXAMPLES`	✓	Build example programs
`ENTROPY_ENABLE_LZ4`	✗	Enable page compression (LZ4)

LZ4 compression tests are only built when ENTROPY_ENABLE_LZ4=ON.

CI/CD and Releases

CI builds and tests on Linux/macOS/Windows via GitHub Actions.
Releases are created automatically on v* tags with OS-specific binaries.

Documentation

DESIGN.md for architecture notes and component details
docs/benchmarks.md for benchmark methodology and reporting

License

MIT. See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 70 Commits
.cache/clangd/index		.cache/clangd/index
.github/workflows		.github/workflows
benchmarks		benchmarks
cmake		cmake
docs		docs
examples		examples
include/entropy		include/entropy
scripts/bench		scripts/bench
src		src
tests		tests
third_party		third_party
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CMakeLists.txt		CMakeLists.txt
CMakePresets.json		CMakePresets.json
DESIGN.md		DESIGN.md
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Highlights

Architecture

Architecture Details

SQL Frontend

Planning and Optimization

Execution

Transactions and Recovery

Storage and Buffering

Performance Snapshot (vs SQLite)

Benchmarks

Quick Start

Build and Test

Prerequisites

Configure + Build

Run Tests

CMake Presets

Build Options

CI/CD and Releases

Documentation

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

License

ShreeChaturvedi/entropy

Folders and files

Latest commit

History

Repository files navigation

Highlights

Architecture

Architecture Details

SQL Frontend

Planning and Optimization

Execution

Transactions and Recovery

Storage and Buffering

Performance Snapshot (vs SQLite)

Benchmarks

Quick Start

Build and Test

Prerequisites

Configure + Build

Run Tests

CMake Presets

Build Options

CI/CD and Releases

Documentation

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages