Commit b4e2167

fix: readme typos (#227)
1 parent 554db58 commit b4e2167


README.md

Lines changed: 3 additions & 3 deletions
@@ -4,12 +4,12 @@ Java Library that implements and integrates concepts from TCP congestion control
 
 # Background
 
-When thinking of service availability operators traditionally think in terms of RPS (requests per second). Stress tests are normally performed to determine the RPS at which point the service tips over. RPS limits are then set somewhere below this tipping point (say 75% of this value) and enforced via a token bucket. However, in large distributed systems that auto-scale this value quickly goes out of date and the service falls over by becoming non-responsive as it is unable to gracefully shed excess load. Instead of thinking in terms of RPS, we should be thinking in terms of concurrent request where we apply queuing theory to determine the number of concurrent requests a service can handle before a queue starts to build up, latencies increase and the service eventually exhausts a hard limit such as CPU, memory, disk or network. This relationship is covered very nicely with Little's Law where `Limit = Average RPS * Average Latency`.
+When thinking of service availability operators traditionally think in terms of RPS (requests per second). Stress tests are normally performed to determine the RPS at which point the service tips over. RPS limits are then set somewhere below this tipping point (say 75% of this value) and enforced via a token bucket. However, in large distributed systems that auto-scale this value quickly goes out of date and the service falls over by becoming non-responsive as it is unable to gracefully shed excess load. Instead of thinking in terms of RPS, we should be thinking in terms of concurrent requests where we apply queuing theory to determine the number of concurrent requests a service can handle before a queue starts to build up, latencies increase and the service eventually exhausts a hard limit such as CPU, memory, disk or network. This relationship is covered very nicely with Little's Law where `Limit = Average RPS * Average Latency`.
 
 Concurrency limits are very easy to enforce but difficult to determine as they would require operators to fully understand the hardware services run on and coordinate how they scale. Instead we'd prefer to measure or estimate the concurrency limits at each point in the network. As systems scale and hit limits each node will adjust and enforce its local view of the limit. To estimate the limit we borrow from common TCP congestion control algorithms by equating a system's concurrency limit to a TCP congestion window.
 
 Before applying the algorithm we need to set some ground rules.
-* We accept that every system has an inherent concurrency limit that is determined by a hard resources, such as number of CPU cores.
+* We accept that every system has an inherent concurrency limit that is determined by a hard resource, such as number of CPU cores.
 * We accept that this limit can change as a system auto-scales.
 * For large and complex distributed systems it's impossible to know all the hard resources.
 * We can use latency measurements to determine when queuing happens.
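The Little's Law relationship in the paragraph changed above can be sanity-checked with a small Java sketch (the class and method names here are illustrative, not part of the library):

```java
public class LittlesLaw {
    // Little's Law as stated in the README:
    //   Limit = Average RPS * Average Latency
    // With latency expressed in seconds, the result is the number of
    // requests in flight concurrently at steady state.
    static double concurrencyLimit(double averageRps, double averageLatencySeconds) {
        return averageRps * averageLatencySeconds;
    }

    public static void main(String[] args) {
        // A service handling 1000 RPS at an average latency of 50 ms
        // sustains about 50 concurrent requests before queuing begins.
        System.out.println(concurrencyLimit(1000, 0.050));
    }
}
```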
@@ -27,7 +27,7 @@ At the end of each sampling window the limit is increased by 1 if the queue is l
 
 ## Gradient2
 
-This algorithm attempts to address bias and drift when using minimum latency measurements. To do this the algorithm tracks uses the measure of divergence between two exponential averages over a long and short time time window. Using averages the algorithm can smooth out the impact of outliers for bursty traffic. Divergence duration is used as a proxy to identify a queueing trend at which point the algorithm aggresively reduces the limit.
+This algorithm attempts to address bias and drift when using minimum latency measurements. To do this the algorithm tracks the measure of divergence between two exponential averages over a long and short time window. Using averages the algorithm can smooth out the impact of outliers for bursty traffic. Divergence duration is used as a proxy to identify a queueing trend at which point the algorithm aggressively reduces the limit.
 
 # Enforcement Strategies
 