mongodb-rr-experiment

Over the years, we at MongoDB have developed tooling within our correctness testing infrastructure to make it easier to debug crashes (by collecting core dumps), hangs (by collecting thread stacks and lock requests), and data corruption (by collecting data files). However, we have yet to develop a comparable strategy for debugging race conditions. We still depend on an engineer either running the failed test many times with additional logging or thinking really hard about where in the code to add a sleep. Technologies such as rr may give us a better story for investigating race-related issues without requiring an engineer to manually reproduce the failure.

Setup

git clone https://github.com/visemet/mongodb-rr-experiment.git
cd mongodb-rr-experiment

Building `rr`

The following instructions were adapted from https://github.com/mozilla/rr/wiki/Building-And-Installing.

sudo apt update
sudo apt install     \
    capnproto        \
    ccache           \
    clang            \
    cmake            \
    coreutils        \
    g++-multilib     \
    gdb              \
    git              \
    libcapnp-dev     \
    make             \
    manpages-dev     \
    ninja-build      \
    pkg-config       \
    python-pexpect   \
    python3-pexpect

git clone https://github.com/mozilla/rr.git
cd rr
git checkout 5.2.0

# Configure into build/, then build and install from that same directory.
CC=clang CXX=clang++ cmake -B build/ -G Ninja -Ddisable32bit=ON .
cmake --build build/

sudo cmake --build build/ --target install
sudo sysctl kernel.perf_event_paranoid=1
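
rr refuses to record when kernel.perf_event_paranoid is greater than 1, and a value set with sysctl as above lasts only until the next reboot. A minimal sketch for checking the current value before recording follows; the function name check_perf_event_paranoid and its optional path argument are illustrative assumptions and not part of rr.

```shell
# Hypothetical helper (not part of rr): reports whether the kernel setting is
# low enough for rr to record. The optional argument overrides the file that
# is read, which is only there to make the check easy to try out.
check_perf_event_paranoid() {
    local path level
    path="${1:-/proc/sys/kernel/perf_event_paranoid}"
    level="$(cat "$path")" || return 2
    if [ "$level" -le 1 ]; then
        echo "ok: kernel.perf_event_paranoid=$level"
    else
        echo "too restrictive: kernel.perf_event_paranoid=$level (rr needs 1 or lower)"
        return 1
    fi
}
```

Calling check_perf_event_paranoid with no arguments reads the live kernel setting.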

Building MongoDB

The following instructions were adapted from https://github.com/mongodb/mongo/wiki/Build-Mongodb-From-Source.

sudo apt install libcurl4-openssl-dev python-pip

git clone https://github.com/mongodb/mongo.git
cd mongo

git remote add visemet https://github.com/visemet/mongo.git
git fetch visemet mongodb-rr-experiment
git checkout visemet/mongodb-rr-experiment

python2 -m pip install -r etc/pip/dev-requirements.txt
python2 -m pip install --user psutil==5.4.8

Results

Comparing the columns in the tables below shows that (1) no failure could be reproduced only with rr, and (2) several failures could be reproduced only manually. This shouldn't be interpreted as saying rr is ineffective: it is still very likely that rr would save an engineer both time and effort when investigating a build failure. The results simply demonstrate that rr alone cannot be relied on to investigate every race-related issue.
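
To illustrate how the manual "run it many times" approach and rr might be combined, the sketch below reruns a command until it fails. The helper name repeat_until_failure is hypothetical, and wrapping the test in rr record with chaos mode is an assumption about one reasonable workflow, not necessarily how the results in this README were produced.

```shell
# Hypothetical helper: run a command up to max_runs times and stop at the
# first failing run. Returns 0 if a failure was observed, 1 otherwise.
repeat_until_failure() {
    local max_runs run
    max_runs="$1"
    shift
    run=1
    while [ "$run" -le "$max_runs" ]; do
        if ! "$@"; then
            echo "run $run failed"
            return 0
        fi
        run=$((run + 1))
    done
    echo "no failure in $max_runs runs"
    return 1
}
```

For example, `repeat_until_failure 100 rr record --chaos <test command>` would stop at the first failing recording, assuming rr record exits with the status of the recorded program. Running `rr replay` afterwards opens the most recent trace in gdb.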

Single-process failures

Build failure    Reproduced using rr    Reproduced manually
BF-9810
BF-9958          ✓                      ✓
BF-10742         ✓                      ✓
BF-10932         ✓                      ✓

Single server process failures

Build failure    Reproduced using rr    Reproduced manually
BF-6346                                 ✓
BF-8424          ✓                      ✓
BF-9030

Multi server process failures

Build failure    Reproduced using rr    Reproduced manually
BF-7114                                 ✓
BF-7588          ✓                      ✓
BF-7888                                 ✓
BF-8258
BF-8642          ✓                      ✓
BF-9248                                 ✓
BF-9426
BF-9552          ✓                      ✓
BF-9864
BF-10729         ✓                      ✓
BF-11054         ✓                      ✓
