Skip to content

Conversation

@agarwal-navin
Copy link
Contributor

@agarwal-navin agarwal-navin commented Dec 8, 2025

Added new formats for the forest summarizer's data and summaries - ForestFormatVersion.v2 and ForestSummaryFormatVersion.v3 respectively. They both require minVersionForCollab to be 2.74.0.

  • ForestFormatVersion.v2 - Incremental encoding of the forest's data is supported from this version onwards.
  • ForestSummaryFormatVersion.v3 - Incremental summary in forest is supported from this version onwards. Also, the top-level forest content is stored under a summary blob with key "contents". This is the same key where incremental chunk contents are stored making the summary tree consistent.

Incremental summaries will require these two versions to be enabled, i.e., it is supported from minVersionForCollab 2.74.0 onwards. Enabling incremental summaries is a cross-client breaking change, so this will ensure that clients can read these new formats when it is enabled.
Added validation during forest summarizer load that the top-level content is present in the correct blob key as per the summary format version.

AB#41865

Copilot AI review requested due to automatic review settings December 8, 2025 23:12
@agarwal-navin agarwal-navin requested a review from a team as a code owner December 8, 2025 23:12
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces new format versions to enable incremental summary support in the forest data structure. It adds ForestFormatVersion.v2 (supporting incremental encoding) and ForestSummaryFormatVersion.v3 (supporting incremental summaries), both requiring minVersionForCollab 2.74.0 or higher.

Key Changes:

  • Added version mapping logic to enable new formats based on client version (2.74.0+)
  • Unified the blob key naming convention - top-level and chunk contents now both use "contents" key in v3
  • Extended loadInternal methods to receive version information for proper blob key resolution

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
summaryTypes.ts Adds v3 format version enum, new blob key constants (summaryContentBlobKeyV1, summaryContentBlobKeyV3), and helper function to resolve correct key based on version
format.ts Adds ForestFormatVersion.v2 enum value and corresponding type definitions with validation set
codec.ts Updates version mapping logic to return v2 for clients 2.74.0+, making encoder/decoder support multiple format versions
forestSummarizer.ts Determines format versions based on client version, enables incremental summary only when both forest and summary formats support it, uses version-aware blob key resolution during load
versionedSummarizer.ts Passes version parameter to loadInternal method for subclasses to handle version-specific loading logic
incrementalSummaryBuilder.ts Updates to use summaryContentBlobKeyV3 for all content blobs, removes outdated comment about key inconsistency, adds forestSummaryRootContentKey parameter to completeSummary
incrementalSummaryBuilder.spec.ts Updates test mocks to pass both content and content key parameters, uses v1 key for non-incremental and v3 key for incremental tests
forestSummarizer.spec.ts Updates tests to use correct blob keys (v1 for backward compat tests, v3 for incremental tests), adds minVersionForCollab to test configurations

import { brand } from "../../util/index.js";
import { FormatCommon, ForestFormatVersion } from "./formatCommon.js";

export const FormatV1 = FormatCommon(brand<ForestFormatVersion>(ForestFormatVersion.v1));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems odd. ForestFormatVersion is an enum, so it should not be used with out custom branding logic, it should already be sufficiently type safe.

Suggested change
export const FormatV1 = FormatCommon(brand<ForestFormatVersion>(ForestFormatVersion.v1));
export const FormatV1 = FormatCommon(ForestFormatVersion.v1);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ForestSummaryFormatVersion is an enum. ForestFormatVersion is a branded const. We should probably change it to an enum as well.

import { brand } from "../../util/index.js";
import { FormatCommon, ForestFormatVersion } from "./formatCommon.js";

export const FormatV2 = FormatCommon(brand<ForestFormatVersion>(ForestFormatVersion.v2));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
export const FormatV2 = FormatCommon(brand<ForestFormatVersion>(ForestFormatVersion.v2));
export const FormatV2 = FormatCommon(ForestFormatVersion.v2);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the above comment. This is still a branded type.

@agarwal-navin agarwal-navin merged commit 09f5901 into microsoft:main Dec 10, 2025
28 checks passed
@agarwal-navin agarwal-navin deleted the forestSummarizerVersion3 branch December 10, 2025 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants