Skip to content

Commit a6ce545

Browse files
author
Abram Sanderson
committed
docs and cleanup
1 parent 990cc74 commit a6ce545

File tree

2 files changed

+59
-58
lines changed

2 files changed

+59
-58
lines changed

packages/dds/merge-tree/src/partialLengths.ts

Lines changed: 59 additions & 55 deletions
Original file line numberDiff line numberDiff line change
@@ -136,14 +136,19 @@ interface UnsequencedPartialLengthInfo {
136136
* Like PerClientAdjustments, except we store one set of PartialSequenceLengthsSet for each refSeq. The "seq" keys in these sets
137137
* are all local seqs.
138138
*
139-
* TODO: Explain the lazy computation bit and how it works.
139+
* These entries are aggregated by {@link PartialSequenceLengths.computeOverallRefSeqAdjustment} when a local perspective for a
140+
* given refSeq is requested.
141+
*
142+
* In general, adjustments in this map are added to avoid double-counting an operation performed by both the local client and some
143+
* remote client, and an adjustment at (refSeq = A, clientSeq = B) takes effect for all perspectives (refSeq = C, clientSeq = D) where
144+
* A \<= C and B \<= D.
140145
*/
141146
perRefSeqAdjustments: Map<number, PartialSequenceLengthsSet>;
142147

143148
/**
144149
* Cache keyed on refSeq which stores length information for the total overlap of removed segments at
145150
* that refSeq.
146-
* This information is derivable from the entries of `overlappingRemoves`.
151+
* This information is derivable from the entries of `perRefSeqAdjustments`.
147152
*
148153
* Like the `partialLengths` field, `seq` on each entry is actually the local seq.
149154
* See `computeOverlappingLocalRemoves` for more information.
@@ -171,13 +176,13 @@ export interface PartialSequenceLengthsOptions {
171176
* "What is the length of `block` from the perspective of some particular seq and clientId?".
172177
*
173178
* It also supports incremental updating of state for newly-sequenced ops that don't affect the structure of the
174-
* MergeTree.
179+
* MergeTree (in most cases--see AB#31003 or comments on {@link PartialSequenceLengths.update}).
175180
*
176181
* To answer these queries, it pre-builds several lists which track the length of the block at a per-sequence-number
177182
* level. These lists are:
178183
*
179184
* 1. (`partialLengths`): Stores the total length of the block.
180-
* 2. (`clientSeqNumbers[clientId]`): Stores only the total lengths of segments submitted by `clientId`. [see footnote]
185+
* 2. (`perClientAdjustments[clientId]`): Stores adjustments to the base length which account for all changes submitted by `clientId`. [see footnote]
181186
*
182187
* The reason both lists are necessary is that resolving the length of the block from the perspective of
183188
* (clientId, refSeq) requires including both of the following types of segments:
@@ -200,13 +205,19 @@ export interface PartialSequenceLengthsOptions {
200205
* (length of the block at the minimum sequence number)
201206
* + (partialLengths total length at refSeq)
202207
* + (unsequenced edits' total length submitted before localSeq)
203-
* - (overlapping remove of the unsequenced edits' total length at refSeq)
208+
* + (adjustments for changes double-counted by happening at or before both localSeq and refSeq)
204209
*
205210
* This algorithm scales roughly linearly with number of editing clients and the size of the collab window.
206211
* (certain unlikely sequences of operations may introduce log factors on those variables)
207212
*
208-
* Note: there is some slight complication with clientSeqNumbers resulting from the possibility of different clients
209-
* concurrently removing the same segment. See the field's documentation for more details.
213+
* @privateRemarks
214+
* If you are looking to understand this class in more detail, a suggested order of internalization is:
215+
*
216+
* 1. The above description and how it relates to the implementation of `getPartialLength` (which implements the above high-level description
217+
* 2. `PartialSequenceLengthsSet`, which allows binary searching for overall length deltas at a given sequence number and handles updates.
218+
* 3. The `fromLeaves` method, which is the base case for the [potential] recursion in `combine`
219+
* 4. The logic in `combine` to aggregate smaller block entries into larger ones
220+
* 5. The incremental code path of `update`
210221
*/
211222
export class PartialSequenceLengths {
212223
public static options: PartialSequenceLengthsOptions = {
@@ -275,6 +286,18 @@ export class PartialSequenceLengths {
275286
* Contains information required to answer queries for the length of this segment from the perspective of
276287
* the local client but not including all local segments (i.e., `localSeq !== collabWindow.localSeq`).
277288
* This field is only computed if requested in the constructor (i.e. `computeLocalPartials === true`).
289+
*
290+
* Note that the usage pattern for this list is a bit different from perClientAdjustments: when dealing with perspectives of remote clients,
291+
* we generally want to know what their view of the block was accounting for all changes made by that client as well as all \<= some refSeq.
292+
*
293+
* However, when dealing with perspectives relevant to the local client, we are still interested in changes made \<= some refSeq, but instead
294+
* of caring about all changes made by the local client, we additionally want the subset of them that were made \<= some localSeq.
295+
*
296+
* The PartialSequenceLengthsSets stored in this field therefore track localSeqs rather than seqs (it's still named seq for ease of implementation).
297+
* Furthermore, when computing the length of the block at a given refSeq/localSeq perspective,
298+
* rather than add something like `perClientAdjustments[clientId].latestLeq(latestSeq) - perClientAdjustments[clientId].latestLeq(refSeq)` [to
299+
* get the tail end of adjustments necessary for a remote client client], we instead add `unsequencedRecords.partialLengths.latestLeq(localSeq)`
300+
* [to get the head end of adjustments necessary for the local client].
278301
*/
279302
private unsequencedRecords: UnsequencedPartialLengthInfo | undefined;
280303

@@ -290,7 +313,6 @@ export class PartialSequenceLengths {
290313
this.unsequencedRecords = {
291314
partialLengths: new PartialSequenceLengthsSet(),
292315
perRefSeqAdjustments: new Map(),
293-
// overlappingRemoves: [],
294316
cachedAdjustmentByRefSeq: new Map(),
295317
};
296318
}
@@ -407,7 +429,11 @@ export class PartialSequenceLengths {
407429
}
408430

409431
for (const partial of perClientAdjustments[clientId].items) {
410-
combinedPartialLengths.addClientSeqNumber(clientId, partial.seq, partial.seglen);
432+
combinedPartialLengths.addClientAdjustment(
433+
clientId,
434+
partial.seq,
435+
partial.seglen,
436+
);
411437
}
412438
}
413439
}
@@ -467,6 +493,11 @@ export class PartialSequenceLengths {
467493
return combinedPartialLengths;
468494
}
469495

496+
/**
497+
* Assuming this segment was moved on insertion, inserts length information about that operation
498+
* into the appropriate per-client adjustments (the overall view needs no such adjustment since
499+
* from an observing client's perspective, the segment never exists).
500+
*/
470501
private static accountForMoveOnInsert(
471502
combinedPartialLengths: PartialSequenceLengths,
472503
segment: ISegmentPrivate,
@@ -525,27 +556,14 @@ export class PartialSequenceLengths {
525556
if (!wasRemovedByInsertingClient && !wasMovedByInsertingClient) {
526557
const moveSeq = moveInfo?.movedSeq;
527558
assert(moveSeq !== undefined, "ObliterateOnInsertion implies moveSeq is defined");
528-
combinedPartialLengths.addClientSeqNumber(clientId, moveSeq, segment.cachedLength);
559+
combinedPartialLengths.addClientAdjustment(clientId, moveSeq, segment.cachedLength);
529560
}
530561
}
531562
}
532563

533564
/**
534565
* Inserts length information about the insertion of `segment` into
535-
* `combinedPartialLengths.partialLengths`.
536-
*
537-
* Does not update the clientSeqNumbers field to account for this segment.
538-
*
539-
* If `removalInfo` or `moveInfo` are defined, this operation updates the
540-
* bookkeeping to account for the (re)moval of this segment at the (re)movedSeq
541-
* instead.
542-
*
543-
* When the insertion or (re)moval of the segment is un-acked and
544-
* `combinedPartialLengths` is meant to compute such records, this does the
545-
* analogous addition to the bookkeeping for the local segment in
546-
* `combinedPartialLengths.unsequencedRecords`.
547-
*
548-
* TODO: Update this comment
566+
* `combinedPartialLengths.partialLengths` and the appropriate per-client adjustments.
549567
*/
550568
private static accountForInsertion(
551569
combinedPartialLengths: PartialSequenceLengths,
@@ -586,10 +604,14 @@ export class PartialSequenceLengths {
586604
len: 0,
587605
seglen: segmentLen,
588606
});
589-
combinedPartialLengths.addClientSeqNumber(clientId, seqOrLocalSeq, segmentLen);
607+
combinedPartialLengths.addClientAdjustment(clientId, seqOrLocalSeq, segmentLen);
590608
}
591609
}
592610

611+
/**
612+
* Inserts length information about the removal or obliteration of `segment` into
613+
* `combinedPartialLengths.partialLengths` and the appropriate per-client adjustments.
614+
*/
593615
private static accountForRemoval(
594616
combinedPartialLengths: PartialSequenceLengths,
595617
segment: ISegmentPrivate,
@@ -732,15 +754,15 @@ export class PartialSequenceLengths {
732754
} else {
733755
// Note that all clients that have a remove or obliterate operation on this segment
734756
// use the seq of the winning move/obliterate in their per-client adjustments!
735-
combinedPartialLengths.addClientSeqNumber(id, seqOrLocalSeq, lenDelta);
757+
combinedPartialLengths.addClientAdjustment(id, seqOrLocalSeq, lenDelta);
736758

737759
// Also ensure that all these clients have seen the segment as inserted before being removed
738760
// This is technically not necessary for removes (we never ask for the length of this block with
739761
// respect to a refSeq which this entry would affect), but it's simpler to just add it here.
740762
// We already add this entry as part of the accountForInsertion codepath for the client that
741763
// actually did insert the segment, hence not doing so [again] here.
742764
if (segment.seq > collabWindow.minSeq && id !== segment.clientId) {
743-
combinedPartialLengths.addClientSeqNumber(id, segment.seq, segment.cachedLength);
765+
combinedPartialLengths.addClientAdjustment(id, segment.seq, segment.cachedLength);
744766
}
745767
}
746768
}
@@ -801,11 +823,11 @@ export class PartialSequenceLengths {
801823
moveInfo.movedSeq < segment.seq &&
802824
moveInfo.wasMovedOnInsert
803825
) {
804-
this.addClientSeqNumber(clientId, moveInfo.movedSeq, segment.cachedLength);
826+
this.addClientAdjustment(clientId, moveInfo.movedSeq, segment.cachedLength);
805827
failIncrementalPropagation = true;
806828
} else {
807829
seqSeglen += segment.cachedLength;
808-
this.addClientSeqNumber(clientId, seq, segment.cachedLength);
830+
this.addClientAdjustment(clientId, seq, segment.cachedLength);
809831
}
810832
}
811833

@@ -816,9 +838,9 @@ export class PartialSequenceLengths {
816838
if (segment.seq !== UnassignedSequenceNumber && seq === earlierDeletion) {
817839
seqSeglen -= segment.cachedLength;
818840
if (clientId !== collabWindow.clientId) {
819-
this.addClientSeqNumber(clientId, seq, -segment.cachedLength);
841+
this.addClientAdjustment(clientId, seq, -segment.cachedLength);
820842
if (segment.seq > collabWindow.minSeq && segment.clientId !== clientId) {
821-
this.addClientSeqNumber(clientId, segment.seq, segment.cachedLength);
843+
this.addClientAdjustment(clientId, segment.seq, segment.cachedLength);
822844
failIncrementalPropagation = true;
823845
}
824846
}
@@ -849,7 +871,7 @@ export class PartialSequenceLengths {
849871
branchPartialLengths.perClientAdjustments.forEach((clientAdjustments, id) => {
850872
const leqBranchPartial = clientAdjustments.latestLeq(seq);
851873
if (leqBranchPartial && leqBranchPartial.seq === seq) {
852-
this.addClientSeqNumber(id, seq, leqBranchPartial.seglen);
874+
this.addClientAdjustment(id, seq, leqBranchPartial.seglen);
853875
}
854876
});
855877
}
@@ -914,24 +936,7 @@ export class PartialSequenceLengths {
914936
}
915937

916938
/**
917-
* TODO: Adjust this comment
918-
* Computes the seglen for the double-counted removed overlap at (refSeq, localSeq). This logic is equivalent
919-
* to the following:
920-
*
921-
* ```typescript
922-
* let total = 0;
923-
* for (const partialLength of this.unsequencedRecords!.overlappingRemoves) {
924-
* if (partialLength.seq > refSeq) {
925-
* break;
926-
* }
927-
*
928-
* if (partialLength.localSeq <= localSeq) {
929-
* total += partialLength.seglen;
930-
* }
931-
* }
932-
*
933-
* return total;
934-
* ```
939+
* Computes the seglen for the double-counted removed overlap at (refSeq, localSeq).
935940
*
936941
* Reconnect happens to only need to compute these lengths for two refSeq values: before and
937942
* after the rebase. Since these lists potentially scale with O(collab window * number of local edits)
@@ -952,17 +957,17 @@ export class PartialSequenceLengths {
952957
adjustments,
953958
] of this.unsequencedRecords.perRefSeqAdjustments.entries()) {
954959
if (seq > refSeq) {
955-
// TODO: Prior code path got away with an early exit here by sorting the entries by seq.
956-
// You could consider doing the same to restore some perf.
960+
// TODO: Prior code path got away with an early exit here by sorting the entries by refSeq.
961+
// We could do the same here if we wanted.
957962
// Old codepath basically flattened the 2d array into a 1d array with both dimensions listed.
958963
continue;
959964
}
960965

961966
for (const partial of adjustments.items) {
967+
// This coalesces entries with the same localSeq as well as computes overall lengths.
962968
partials.addOrUpdate(partial);
963969
}
964970
}
965-
// This coalesces entries with the same localSeq as well as computes overall lengths.
966971
cachedAdjustment = partials;
967972
this.unsequencedRecords.cachedAdjustmentByRefSeq.set(refSeq, cachedAdjustment);
968973
}
@@ -1007,8 +1012,7 @@ export class PartialSequenceLengths {
10071012
}
10081013
}
10091014

1010-
// TODO: Rename this
1011-
private addClientSeqNumber(clientId: number, seq: number, seglen: number): void {
1015+
private addClientAdjustment(clientId: number, seq: number, seglen: number): void {
10121016
this.perClientAdjustments[clientId] ??= new PartialSequenceLengthsSet();
10131017
const cli = this.perClientAdjustments[clientId];
10141018
cli.addOrUpdate({ seq, len: 0, seglen });

packages/dds/merge-tree/src/test/obliterateOperations.ts

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -41,9 +41,6 @@ const posInField = (
4141

4242
while (endPos < client.getLength() && client.getText(endPos, endPos + 1) !== "}") {
4343
const text = client.getText(endPos, endPos + 1);
44-
if (!(Number.isInteger(Number(text)) || text === "{")) {
45-
throw new Error("uh oh");
46-
}
4744
assert(
4845
Number.isInteger(Number(text)) || text === "{",
4946
"Non-integer character found within a field",

0 commit comments

Comments
 (0)