-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME
40 lines (30 loc) · 2.14 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
A proof-of-concept and simulator for container to container replication of local state.
Current modeling restrictions:
1. Models blocking commit only where commit is exclusive with process.
2. Assumes replication factor of 3 (including the original copy).
3. Blocks commit until both Replicators are fully replicated. No "min available replicas".
How it works:
First runs all 3 containers without host affinity. Randomly simulates one of the following scenarios at
approximately every 'min-runtime' interval:
1. Random Producer dies and moves to a replicator host. Replicator moves to a new host.
2. Random Replicator dies and moves to a new host.
3. Both Replicators for a producer die and move to new hosts.
4. A Replicator dies and moves to a new host, then the Producer dies and moves to the
remaining replicator's host and the remaining replicator moves to a new host.
Then runs all three containers with host affinity, where containers restart randomly with their
previous state intact. Each container runs for a random amount of time between 'min-runtime' and 'max-runtime'.
Finally verifies that task store contents match each replica store's contents up to the last commit for each task.
Current modeling restrictions:
1. Does not account for replicator/producer moving to a host with stale state when running without host affinity.
Potential issues and workarounds:
1. Too many open files errors
Cause: Can happen if Task message produce rate to commit interval ratio is too small since this creates of too many sst files.
Resolution: Tweak Constants.Task.COMMIT_INTERVAL and Constants.Task.TASK_SLEEP_MS
2. No lock file available
Cause: Can happen if a container was destroyed forcibly (kill -9) and restarted soon after.
Resolution: Increase --interval to allow OS to release file locks.
Execution:
gradle clean (optional, clears previous state)
gradle run -Pargs='--execution-id 0 --iterations 10 --total-runtime 600 --max-runtime 60 --min-runtime 30 --interval 5'
It is safe to kill the run at any point. Next run should continue from where the previous run left off.
Please report any [ERROR] level messages in output since they're unexpected failures.