Conversation

@njzjz (Member) commented Jan 16, 2026

Summary by CodeRabbit

  • New Features

    • Human-readable number formatting for clearer size displays.
    • Enhanced model serialization tree visualization with compact, size-aware, and more readable pretty-print output.
  • Tests

    • Added tests validating the improved serialization tree display and number-formatting output.


Copilot AI review requested due to automatic review settings January 16, 2026 17:50
@dosubot dosubot bot added the new feature label Jan 16, 2026

Copilot AI left a comment

Pull request overview

This PR introduces a Node class for deserializing and displaying serialization tree structures, along with a helper function to format large numbers with suffixes. The implementation enables hierarchical visualization of model structures stored in serialized dictionaries.

Changes:

  • Added Node class with deserialization methods (deserialize, from_dict, from_list) to convert dictionary/list structures into tree nodes (a usage sketch follows this list)
  • Implemented display functionality via __str__ method that shows hierarchical structure with size information
  • Added format_big_number helper function to format large numbers with K/M/B suffixes
  • Added test case to verify the Node display functionality
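
A hypothetical usage sketch of the API described above; the call pattern, the file name, and the "model" key are illustrative assumptions, not taken from the diff (only the names Node, deserialize, size, and format_big_number come from the PR):

from deepmd.dpmodel.utils.serialization import Node, load_dp_model

data = load_dp_model("model.dp")          # placeholder path
root = Node.deserialize(data["model"])    # dict- or list-based serialized tree

print(root.size)  # total variable element count aggregated over the tree
print(root)       # hierarchical view with human-readable sizes (e.g. "1.5K")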

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File: deepmd/dpmodel/utils/serialization.py
Implements the Node class for tree deserialization and display, plus the format_big_number utility.

File: source/tests/common/dpmodel/test_network.py
Adds a test for the Node display functionality and updates test data with an "@class" field.


coderabbitai bot commented Jan 16, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Adds a large-number formatter and a new Node class that builds, sizes, deserializes, and pretty-prints serialization trees; updates tests to exercise Node deserialization and display, and adds an "@class" field to the test fixtures.

Changes

Cohort / File(s): Serialization utilities: deepmd/dpmodel/utils/serialization.py
Summary: Added format_big_number(int) -> str. Introduced Node class with __init__, @cached_property size, deserialize(), from_dict(), from_list(), and __str__() to compute sizes (uses numpy) and produce pretty-printed, merged/compact tree views; handles dict- and list-based children, variables, and deserialization.

Cohort / File(s): Tests: source/tests/common/dpmodel/test_network.py
Summary: Added imports and test_node_display to validate Node string output. Extended test model fixtures to include a public "@class" field and updated expected display string.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 40.00%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check (✅ Passed): The title accurately summarizes the main changes: adding a Node class for serialization and implementing display functionality, which directly matches the new Node class, format_big_number function, and test_node_display method added in the changeset.




@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@deepmd/dpmodel/utils/serialization.py`:
- Around line 299-301: The loop handling data_dict currently deep-copies the
"@variables" payload (variables = deepcopy(vv)), which unnecessarily duplicates
large numpy arrays; change this to a shallow copy instead so we don't duplicate
underlying array buffers—e.g., assign a shallow copy appropriate to the
container type (for numpy arrays use vv.copy() or for mapping types use
dict(vv)) or simply vv if already an independent object; update the assignment
for the "@variables" branch (the variables variable assigned from
data_dict["@variables"]) to use a shallow copy strategy rather than deepcopy to
avoid doubling memory during deserialization.
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e9f2a8d and 58691c6.

📒 Files selected for processing (1)
  • deepmd/dpmodel/utils/serialization.py
🧰 Additional context used
🪛 Ruff (0.14.11)
deepmd/dpmodel/utils/serialization.py

271-271: Prefer TypeError exception for invalid type

(TRY004)


271-271: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (40)
  • GitHub Check: Build C++ (cuda120, cuda)
  • GitHub Check: Build C++ (cpu, cpu)
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Build wheels for cp310-manylinux_aarch64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Build wheels for cp311-macosx_arm64
  • GitHub Check: Test Python (12, 3.10)
  • GitHub Check: Test Python (10, 3.10)
  • GitHub Check: Test Python (7, 3.13)
  • GitHub Check: Test Python (11, 3.13)
  • GitHub Check: Test Python (9, 3.10)
  • GitHub Check: Test Python (12, 3.13)
  • GitHub Check: Test Python (3, 3.10)
  • GitHub Check: Test Python (6, 3.13)
  • GitHub Check: Test Python (9, 3.13)
  • GitHub Check: Test Python (8, 3.13)
  • GitHub Check: Test Python (10, 3.13)
  • GitHub Check: Test Python (7, 3.10)
  • GitHub Check: Test Python (8, 3.10)
  • GitHub Check: Test Python (11, 3.10)
  • GitHub Check: Test Python (4, 3.13)
  • GitHub Check: Test Python (2, 3.13)
  • GitHub Check: Test Python (4, 3.10)
  • GitHub Check: Test Python (2, 3.10)
  • GitHub Check: Test Python (3, 3.13)
  • GitHub Check: Test Python (1, 3.10)
  • GitHub Check: Test Python (5, 3.10)
  • GitHub Check: Test Python (5, 3.13)
  • GitHub Check: Test Python (6, 3.10)
  • GitHub Check: Test Python (1, 3.13)
  • GitHub Check: Analyze (c-cpp)
  • GitHub Check: Analyze (python)
  • GitHub Check: Test C++ (false, false, false, true)
  • GitHub Check: Test C++ (true, true, true, false)
  • GitHub Check: Test C++ (true, false, false, true)
  • GitHub Check: Test C++ (false, true, true, false)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
🔇 Additional comments (7)
deepmd/dpmodel/utils/serialization.py (7)

7-12: Import additions look consistent.

deepcopy and cached_property are both used later in the file.


181-201: Formatting helper is clear.

The thresholds and suffixes are straightforward and readable.
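
For reference, a few inputs and the outputs these thresholds produce (a quick check against the function body quoted later in this thread, assuming it is imported from deepmd/dpmodel/utils/serialization.py where the PR adds it):

from deepmd.dpmodel.utils.serialization import format_big_number

assert format_big_number(950) == "950"
assert format_big_number(1_500) == "1.5K"
assert format_big_number(2_500_000) == "2.5M"
assert format_big_number(3_000_000_000) == "3.0B"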


214-225: Constructor is straightforward.

The field initialization is clear and minimal.


226-250: Size aggregation logic is easy to follow.

Counting variables plus child sizes is clean and readable.
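
Not the PR's implementation, just a sketch of the aggregation idea described here, assuming a node's size is the element count of its own "@variables" arrays plus its children's (cached) sizes:

import numpy as np

def aggregate_size(variables: dict, child_sizes: list[int]) -> int:
    # element count of this node's arrays plus already-computed child sizes
    own = sum(int(np.asarray(v).size) for v in variables.values())
    return own + sum(child_sizes)

# two weight arrays (32 + 8 elements) plus two children of 1_000 parameters each
print(aggregate_size({"w": np.zeros((4, 8)), "b": np.zeros(8)}, [1_000, 1_000]))  # 2040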


252-271: Deserialize routing is clean.

The dict/list branching is concise and readable.


314-337: List deserialization is clear.

The handling of nested dict/list entries is straightforward.


339-364: Pretty‑print logic is readable.

The indentation and merging logic are easy to follow.


Comment on lines +299 to +301
for kk, vv in data_dict.items():
    if kk == "@variables":
        variables = deepcopy(vv)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd deepmd && wc -l dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 101


🏁 Script executed:

cd deepmd && sed -n '290,310p' dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 793


🏁 Script executed:

cd deepmd && sed -n '280,330p' dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 1524


🏁 Script executed:

cd deepmd && sed -n '260,320p' dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 1844


🏁 Script executed:

cd deepmd && grep -n "class Node" dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 82


🏁 Script executed:

cd deepmd && sed -n '1,150p' dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 3840


🏁 Script executed:

cd deepmd && sed -n '150,250p' dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 2698


🏁 Script executed:

cd deepmd && sed -n '204,364p' dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 4772


🏁 Script executed:

cd deepmd && rg "\.variables\s*=" dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 49


🏁 Script executed:

cd deepmd && rg "variables\[" dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 49


🏁 Script executed:

cd deepmd && rg "variables\.pop|variables\.clear|variables\.update" dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 49


🏁 Script executed:

cd deepmd && rg "Node\.from_dict|Node\.from_list|Node\.deserialize" dpmodel/ --type py -A 2 -B 2

Repository: deepmodeling/deepmd-kit

Length of output: 371


🏁 Script executed:

cd deepmd && rg "from_dict|load_dp_model" dpmodel/ --type py | head -20

Repository: deepmodeling/deepmd-kit

Length of output: 747


🏁 Script executed:

cd deepmd && grep -n "from_dict" dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 277


🏁 Script executed:

cd deepmd && python3 << 'EOF'
# Let's understand what vv.copy() vs deepcopy(vv) would do
# vv is the "@variables" dict which may contain numpy arrays

import numpy as np
from copy import deepcopy

# Example structure
vv = {
    "param1": np.array([[1, 2], [3, 4]]),
    "param2": np.array([5, 6, 7]),
}

shallow = vv.copy()
deep = deepcopy(vv)

# Test if numpy arrays are shared
print("Shallow copy - arrays are same object?")
print(f"  param1: {shallow['param1'] is vv['param1']}")
print(f"  param2: {shallow['param2'] is vv['param2']}")

print("\nDeep copy - arrays are same object?")
print(f"  param1: {deep['param1'] is vv['param1']}")
print(f"  param2: {deep['param2'] is vv['param2']}")

print("\nMemory sharing - shallow copy modifies original?")
shallow['param1'][0, 0] = 999
print(f"  Original param1[0,0]: {vv['param1'][0, 0]}")

# Restore
vv['param1'][0, 0] = 1

print("\nMemory sharing - deep copy modifies original?")
deep['param1'][0, 0] = 999
print(f"  Original param1[0,0]: {vv['param1'][0, 0]}")
EOF

Repository: deepmodeling/deepmd-kit

Length of output: 153


🏁 Script executed:

cd deepmd && sed -n '140,200p' dpmodel/utils/serialization.py

Repository: deepmodeling/deepmd-kit

Length of output: 1774


🏁 Script executed:

cd deepmd && rg "load_dp_model" --type py -A 5 -B 2

Repository: deepmodeling/deepmd-kit

Length of output: 4857


🏁 Script executed:

cd deepmd && sed -n '260,340p' dpmodel/utils/serialization.py | cat -n

Repository: deepmodeling/deepmd-kit

Length of output: 2956


🏁 Script executed:

cd deepmd && grep -A 30 "def __init__" dpmodel/utils/serialization.py | head -40

Repository: deepmodeling/deepmd-kit

Length of output: 1131


🏁 Script executed:

cd deepmd && rg "from_dict.*=" dpmodel/infer/deep_eval.py -B 3 -A 3

Repository: deepmodeling/deepmd-kit

Length of output: 49


🏁 Script executed:

cd deepmd && rg "model_data\[|node\.|Node\." dpmodel/infer/deep_eval.py -B 2 -A 2 | head -40

Repository: deepmodeling/deepmd-kit

Length of output: 300


🏁 Script executed:

cd deepmd && sed -n '1,50p' dpmodel/infer/deep_eval.py

Repository: deepmodeling/deepmd-kit

Length of output: 1051


🏁 Script executed:

cd deepmd && sed -n '50,100p' dpmodel/infer/deep_eval.py

Repository: deepmodeling/deepmd-kit

Length of output: 1761


Avoid deep‑copying large @variables payloads.

Line 301 unnecessarily duplicates numpy arrays via deepcopy(), which doubles memory usage during deserialization. Since Node is read-only and @variables arrays are already independent copies from load_dp_model(), use a shallow copy instead:

Suggested change (shallow copy)
-                variables = deepcopy(vv)
+                # Shallow copy to avoid duplicating large ndarray payloads
+                variables = vv.copy()
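
A standalone check of the semantics behind this suggestion (plain numpy/copy behavior, independent of the PR code): dict.copy() duplicates only the mapping, so the ndarray buffers stay shared, while deepcopy duplicates the arrays themselves.

import numpy as np
from copy import deepcopy

vv = {"w": np.zeros((1000, 1000))}   # ~8 MB of float64

shallow = vv.copy()                  # new dict, same array objects
deep = deepcopy(vv)                  # new dict AND new array buffers

assert shallow["w"] is vv["w"]       # buffer shared, no extra memory
assert deep["w"] is not vv["w"]      # buffer duplicated, memory doubled
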
🤖 Prompt for AI Agents
In `@deepmd/dpmodel/utils/serialization.py` around lines 299 - 301, The loop
handling data_dict currently deep-copies the "@variables" payload (variables =
deepcopy(vv)), which unnecessarily duplicates large numpy arrays; change this to
a shallow copy instead so we don't duplicate underlying array buffers—e.g.,
assign a shallow copy appropriate to the container type (for numpy arrays use
vv.copy() or for mapping types use dict(vv)) or simply vv if already an
independent object; update the assignment for the "@variables" branch (the
variables variable assigned from data_dict["@variables"]) to use a shallow copy
strategy rather than deepcopy to avoid doubling memory during deserialization.


codecov bot commented Jan 16, 2026

Codecov Report

❌ Patch coverage is 87.62887% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.93%. Comparing base (8b0c3b0) to head (9766074).

Files with missing lines:
  • deepmd/dpmodel/utils/serialization.py: patch 87.62%, 12 lines missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5158   +/-   ##
=======================================
  Coverage   81.92%   81.93%           
=======================================
  Files         714      714           
  Lines       73301    73399   +98     
  Branches     3616     3616           
=======================================
+ Hits        60055    60139   +84     
- Misses      12083    12096   +13     
- Partials     1163     1164    +1     

☔ View full report in Codecov by Sentry.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@deepmd/dpmodel/utils/serialization.py`:
- Around line 348-358: The current merge logic in the loop over self.children
(using children_buff, child_repr, last_repr) collapses distinct dict keys when
their stringified child subtrees match; restrict merging to only list-type
parents by checking the parent node type before merging (e.g., only perform the
"if child_repr == last_repr: ... merge ..." sequence when the current node
represents a list node), or add a guard that inspects the key type/format (the
kk variable) to ensure keys are positional indices before combining; update the
loop around children_buff, kk, vv and the merge branch so dict children are
never concatenated into "a, b" while preserving the existing merge behavior for
true list nodes.
- Around line 181-201: The format_big_number function can produce results like
"1000.0K" when rounding crosses a unit boundary; update format_big_number to
detect when the rounded quotient equals or exceeds 1000.0 and promote to the
next suffix (e.g., convert 1000.0K → 1.0M, 1000.0M → 1.0B). Concretely, compute
the quotient for each branch (q = x / 1_000, 1_000_000, 1_000_000_000),
round/format it to one decimal, and if q >= 1000.0, fall through to the next
unit branch so the final string uses the higher suffix; keep the same
one-decimal formatting and ensure the else branch returns str(x).
♻️ Duplicate comments (1)
deepmd/dpmodel/utils/serialization.py (1)

299-302: Avoid deep-copying large @variables payloads.

deepcopy(vv) duplicates ndarray buffers and can be very expensive for large models. A shallow copy should be sufficient if Node is read-only or if @variables already owns its arrays.

💡 Suggested change
-            if kk == "@variables":
-                variables = deepcopy(vv)
+            if kk == "@variables":
+                # Shallow copy to avoid duplicating large ndarray payloads
+                variables = vv.copy()

Comment on lines +181 to +201
def format_big_number(x: int) -> str:
    """Format a big number with suffixes.
    Parameters
    ----------
    x : int
        The number to format.
    Returns
    -------
    str
        The formatted string.
    """
    if x >= 1_000_000_000:
        return f"{x / 1_000_000_000:.1f}B"
    elif x >= 1_000_000:
        return f"{x / 1_000_000:.1f}M"
    elif x >= 1_000:
        return f"{x / 1_000:.1f}K"
    else:
        return str(x)

⚠️ Potential issue | 🟡 Minor

Avoid “1000.0K/M” rounding at unit boundaries.

Values near thresholds (e.g., 999,950) can format as 1000.0K instead of 1.0M. Consider promoting to the next unit when rounding crosses 1000.

💡 Proposed adjustment to boundary handling
-    if x >= 1_000_000_000:
-        return f"{x / 1_000_000_000:.1f}B"
-    elif x >= 1_000_000:
-        return f"{x / 1_000_000:.1f}M"
-    elif x >= 1_000:
-        return f"{x / 1_000:.1f}K"
+    if x >= 1_000_000_000:
+        return f"{x / 1_000_000_000:.1f}B"
+    elif x >= 1_000_000:
+        value = x / 1_000_000
+        if value >= 999.95:
+            return f"{x / 1_000_000_000:.1f}B"
+        return f"{value:.1f}M"
+    elif x >= 1_000:
+        value = x / 1_000
+        if value >= 999.95:
+            return f"{x / 1_000_000:.1f}M"
+        return f"{value:.1f}K"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def format_big_number(x: int) -> str:
    """Format a big number with suffixes.
    Parameters
    ----------
    x : int
        The number to format.
    Returns
    -------
    str
        The formatted string.
    """
    if x >= 1_000_000_000:
        return f"{x / 1_000_000_000:.1f}B"
    elif x >= 1_000_000:
        value = x / 1_000_000
        if value >= 999.95:
            return f"{x / 1_000_000_000:.1f}B"
        return f"{value:.1f}M"
    elif x >= 1_000:
        value = x / 1_000
        if value >= 999.95:
            return f"{x / 1_000_000:.1f}M"
        return f"{value:.1f}K"
    else:
        return str(x)
🤖 Prompt for AI Agents
In `@deepmd/dpmodel/utils/serialization.py` around lines 181 - 201, The
format_big_number function can produce results like "1000.0K" when rounding
crosses a unit boundary; update format_big_number to detect when the rounded
quotient equals or exceeds 1000.0 and promote to the next suffix (e.g., convert
1000.0K → 1.0M, 1000.0M → 1.0B). Concretely, compute the quotient for each
branch (q = x / 1_000, 1_000_000, 1_000_000_000), round/format it to one
decimal, and if q >= 1000.0, fall through to the next unit branch so the final
string uses the higher suffix; keep the same one-decimal formatting and ensure
the else branch returns str(x).
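
To make the boundary concrete, a small check (assuming format_big_number is imported from deepmd/dpmodel/utils/serialization.py as added in this PR): a value just under one million stays in the K branch of the current implementation but rounds up to 1000.0 at one decimal.

from deepmd.dpmodel.utils.serialization import format_big_number

# 999_960 is below 1_000_000, so the current code takes the K branch,
# and 999.96 formatted to one decimal becomes "1000.0".
assert format_big_number(999_960) == "1000.0K"
# The boundary-aware version suggested above would instead promote this
# input to the next unit and render "1.0M".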

Comment on lines +348 to +358
for ii, (kk, vv) in enumerate(self.children.items()):
    # add indentation
    child_repr = str(vv)
    if len(children_buff) > 0:
        # check if it is the same as the last one
        last_repr = children_buff[-1][1]
        if child_repr == last_repr:
            # merge
            last_kk, _ = children_buff[-1]
            children_buff[-1] = (f"{last_kk}, {kk}", last_repr)
            continue

⚠️ Potential issue | 🟡 Minor

Avoid collapsing distinct dict keys with identical subtrees.

The merge logic applies to all nodes, so two different dict keys with identical child representations get merged (e.g., "a, b"), which can hide structure. Consider limiting merges to list nodes.

💡 Restrict merging to list nodes
-            if len(children_buff) > 0:
+            if self.name == "ListNode" and len(children_buff) > 0:
                 # check if it is the same as the last one
                 last_repr = children_buff[-1][1]
                 if child_repr == last_repr:
🧰 Tools
🪛 Ruff (0.14.13)

348-348: Loop control variable ii not used within loop body

Rename unused ii to _ii

(B007)

🤖 Prompt for AI Agents
In `@deepmd/dpmodel/utils/serialization.py` around lines 348 - 358, The current
merge logic in the loop over self.children (using children_buff, child_repr,
last_repr) collapses distinct dict keys when their stringified child subtrees
match; restrict merging to only list-type parents by checking the parent node
type before merging (e.g., only perform the "if child_repr == last_repr: ...
merge ..." sequence when the current node represents a list node), or add a
guard that inspects the key type/format (the kk variable) to ensure keys are
positional indices before combining; update the loop around children_buff, kk,
vv and the merge branch so dict children are never concatenated into "a, b"
while preserving the existing merge behavior for true list nodes.
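
A standalone rerun of the quoted merge loop (extracted into a plain function over pre-rendered child strings; the "Linear (10)" / "Tanh (0)" reprs are made-up placeholders) shows how two distinct dict keys with identical subtrees collapse into one entry:

def merge_children(children: dict[str, str]) -> list[tuple[str, str]]:
    # same merging logic as the quoted loop, minus the Node objects
    children_buff: list[tuple[str, str]] = []
    for kk, child_repr in children.items():
        if children_buff and child_repr == children_buff[-1][1]:
            last_kk, last_repr = children_buff[-1]
            children_buff[-1] = (f"{last_kk}, {kk}", last_repr)
            continue
        children_buff.append((kk, child_repr))
    return children_buff

print(merge_children({"a": "Linear (10)", "b": "Linear (10)", "c": "Tanh (0)"}))
# [('a, b', 'Linear (10)'), ('c', 'Tanh (0)')]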


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@deepmd/dpmodel/utils/serialization.py`:
- Around line 348-378: The closure format_value currently captures the loop
variable ii from the outer for loop (line with self.children.items()), causing
wrong indentation; rename that outer ii to _ (unused) and change format_value to
accept ii as a parameter (e.g., def format_value(vv: str, ii: int) -> str) and
update the generator that builds buff to call format_value(vv, ii) using the ii
from enumerate(children_buff); keep format_list_keys and children_buff logic
unchanged (ListNode handling remains the same).
♻️ Duplicate comments (3)
deepmd/dpmodel/utils/serialization.py (3)

181-201: LGTM with minor caveat.

The function correctly formats large numbers with appropriate suffixes. The boundary rounding issue (where values near thresholds could display as "1000.0K" instead of "1.0M") was already noted in a previous review.


214-250: LGTM!

The constructor properly initializes all fields, and size is appropriately cached. The unused return value in count_variables (line 241) was already noted in a previous review.


273-311: LGTM!

The node construction logic correctly handles class naming, variable extraction, and recursive child building. The deepcopy concern for @variables (line 301) was already addressed in a previous review.

🧹 Nitpick comments (1)
deepmd/dpmodel/utils/serialization.py (1)

266-271: Consider using TypeError for invalid type.

When the input is neither a dict nor a list, this is a type mismatch rather than an invalid value. Per Python conventions, TypeError is more appropriate here.

♻️ Suggested fix
         else:
-            raise ValueError("Cannot deserialize Node from non-dict/list data.")
+            raise TypeError(f"Cannot deserialize Node from {type(data).__name__}; expected dict or list.")

Comment on lines +348 to +378
for ii, (kk, vv) in enumerate(self.children.items()):
    # add indentation
    child_repr = str(vv)
    if len(children_buff) > 0:
        # check if it is the same as the last one
        last_repr = children_buff[-1][1]
        if child_repr == last_repr:
            # merge
            last_kk, _ = children_buff[-1]
            children_buff[-1] = (f"{last_kk}, {kk}", last_repr)
            continue
    children_buff.append((kk, child_repr))

def format_list_keys(kk: str) -> str:
    if self.name == "ListNode":
        keys = kk.split(", ")
        if len(keys) > 2:
            return f"[{keys[0]}...{keys[-1]}]"
    return kk

def format_value(vv: str) -> str:
    return vv.replace(
        linebreak,
        linebreak + (pipe if ii < len(children_buff) - 1 else blank),
    )

buff.extend(
    f"{tee if ii < len(children_buff) - 1 else elbow}{format_list_keys(kk)} -> {format_value(vv)}"
    for ii, (kk, vv) in enumerate(children_buff)
)
return "\n".join(buff)

⚠️ Potential issue | 🔴 Critical

Bug: format_value closure captures wrong loop variable.

The format_value function at line 368 references ii, but it captures the ii from the outer loop at line 348 (which is unused inside that loop and, by the time format_value runs, holds its final value), not the ii bound by the generator at line 376. This causes incorrect indentation for all but possibly the last child.

🐛 Proposed fix

Pass ii as a parameter to format_value to ensure correct binding:

-        def format_value(vv: str) -> str:
+        def format_value(vv: str, idx: int) -> str:
             return vv.replace(
                 linebreak,
-                linebreak + (pipe if ii < len(children_buff) - 1 else blank),
+                linebreak + (pipe if idx < len(children_buff) - 1 else blank),
             )

         buff.extend(
-            f"{tee if ii < len(children_buff) - 1 else elbow}{format_list_keys(kk)} -> {format_value(vv)}"
+            f"{tee if ii < len(children_buff) - 1 else elbow}{format_list_keys(kk)} -> {format_value(vv, ii)}"
             for ii, (kk, vv) in enumerate(children_buff)
         )

Also rename the unused ii at line 348 to _ per static analysis hint B007:

-        for ii, (kk, vv) in enumerate(self.children.items()):
+        for _, (kk, vv) in enumerate(self.children.items()):
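
The underlying pitfall is standard Python scoping, unrelated to this PR's specifics: a nested function resolves free variables in the enclosing function's scope at call time, while the loop variable of a generator expression lives in the genexp's own scope. A minimal reproduction:

def broken() -> list[str]:
    for ii in range(3):  # enclosing-scope ii; holds 2 once the loop ends
        pass

    def tag(v: str) -> str:
        # the free variable ii resolves to the enclosing function's ii (== 2),
        # not the ii bound inside the generator expression below
        return f"{ii}:{v}"

    return list(tag(v) for ii, v in enumerate("abc"))

def fixed() -> list[str]:
    def tag(v: str, ii: int) -> str:
        return f"{ii}:{v}"

    return list(tag(v, ii) for ii, v in enumerate("abc"))

print(broken())  # ['2:a', '2:b', '2:c']
print(fixed())   # ['0:a', '1:b', '2:c']
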
🧰 Tools
🪛 Ruff (0.14.13)

348-348: Loop control variable ii not used within loop body

(B007)

🤖 Prompt for AI Agents
In `@deepmd/dpmodel/utils/serialization.py` around lines 348 - 378, The closure
format_value currently captures the loop variable ii from the outer for loop
(line with self.children.items()), causing wrong indentation; rename that outer
ii to _ (unused) and change format_value to accept ii as a parameter (e.g., def
format_value(vv: str, ii: int) -> str) and update the generator that builds buff
to call format_value(vv, ii) using the ii from enumerate(children_buff); keep
format_list_keys and children_buff logic unchanged (ListNode handling remains
the same).
