[wip] Pandas Dataframe in Dataclass #3116

Open
wants to merge 8 commits into
base: master
161 changes: 145 additions & 16 deletions flytekit/core/type_engine.py
@@ -1,5 +1,6 @@
from __future__ import annotations

import dataclasses
from dataclasses import dataclass, fields, make_dataclass, is_dataclass, MISSING
import asyncio
import collections
import copy
@@ -129,7 +130,6 @@
lit.scalar.structured_dataset.uri
)


class TypeTransformerFailedError(TypeError, AssertionError, ValueError): ...


@@ -683,6 +683,40 @@
Set `FLYTE_USE_OLD_DC_FORMAT=true` to use the old JSON-based format.
Note: This is deprecated and will be removed in the future.
"""
import pandas as pd
from flytekit.types.file import FlyteFile
from flytekit.types.directory import FlyteDirectory
from flytekit.types.structured.structured_dataset import StructuredDataset
from flytekit.types.schema import FlyteSchema


from typing import get_type_hints, Type, Dict


def transform_dataclass(cls, memo=None):
FLYTE_TYPES = [FlyteFile, FlyteDirectory, StructuredDataset, FlyteSchema]

if cls in FLYTE_TYPES:
return cls

if memo is None:
memo = {}


if cls in memo:
return memo[cls]


cls_hints = get_type_hints(cls)
new_field_defs = []

for field in fields(cls):
orig_type = cls_hints[field.name]

if orig_type == pd.DataFrame:
new_type = StructuredDataset

elif is_dataclass(orig_type):
new_type = transform_dataclass(orig_type, memo)

else:
new_type = orig_type
new_field_defs.append((field.name, new_type))


new_cls = make_dataclass("FlyteModified" + cls.__name__, new_field_defs)
memo[cls] = new_cls
return new_cls


if isinstance(python_val, dict):
json_str = json.dumps(python_val)
return Literal(scalar=Scalar(generic=_json_format.Parse(json_str, _struct.Struct())))
@@ -694,16 +728,17 @@
)

self._make_dataclass_serializable(python_val, python_type)
new_python_type = transform_dataclass(python_type)


# JSON serialization using mashumaro's DataClassJSONMixin
if isinstance(python_val, DataClassJSONMixin):
json_str = python_val.to_json()
else:
try:
encoder = self._json_encoder[python_type]
encoder = self._json_encoder[new_python_type]

except KeyError:
encoder = JSONEncoder(python_type)
self._json_encoder[python_type] = encoder
encoder = JSONEncoder(new_python_type)
self._json_encoder[new_python_type] = encoder


Potential encoder type mismatch issue

Consider using python_type instead of new_python_type when storing the encoder in self._json_encoder. The encoder is created using python_type but stored with new_python_type, which could lead to inconsistencies in encoder lookup.

Code suggestion
Check the AI-generated fix before applying
Suggested change
self._json_encoder[new_python_type] = encoder
self._json_encoder[python_type] = encoder

Code Review Run #c4bd83


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
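
For reference, the lookup-or-create pattern this thread discusses can be sketched without mashumaro. `FakeEncoder` and `EncoderCache` below are hypothetical stand-ins, not flytekit or mashumaro APIs; the sketch only illustrates that a cache hit depends on storing the encoder under the same type it was built for, which is what the added lines in the diff do with `new_python_type`.

```python
from dataclasses import dataclass, make_dataclass
from typing import Dict


class FakeEncoder:
    """Hypothetical stand-in for a per-type encoder such as a JSON encoder."""

    def __init__(self, target_type: type) -> None:
        self.target_type = target_type


class EncoderCache:
    """Look up an encoder by type, creating and storing it on first use."""

    def __init__(self) -> None:
        self._encoders: Dict[type, FakeEncoder] = {}
        self.created = 0  # counts cache misses, for illustration only

    def get(self, key_type: type) -> FakeEncoder:
        try:
            return self._encoders[key_type]
        except KeyError:
            encoder = FakeEncoder(key_type)
            # Store under the same type the encoder was built for, so the
            # next lookup with that type is a hit.
            self._encoders[key_type] = encoder
            self.created += 1
            return encoder


@dataclass
class Original:
    x: int


# A derived class standing in for the transformed dataclass in the diff.
Transformed = make_dataclass("FlyteModifiedOriginal", [("x", int)])

cache = EncoderCache()
first = cache.get(Transformed)
second = cache.get(Transformed)
```

If the encoder were built for one type and stored under another, a later lookup by the first type would miss and build a second encoder.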


try:
json_str = encoder.encode(python_val)
@@ -729,7 +764,43 @@
f"user defined datatypes in Flytekit"
)


import pandas as pd
from flytekit.types.file import FlyteFile
from flytekit.types.directory import FlyteDirectory
from flytekit.types.structured.structured_dataset import StructuredDataset
from flytekit.types.schema import FlyteSchema


from typing import get_type_hints, Type, Dict


def transform_dataclass(cls, memo=None):
FLYTE_TYPES = [FlyteFile, FlyteDirectory, StructuredDataset, FlyteSchema]

if cls in FLYTE_TYPES:
return cls

Comment on lines +777 to +779

Consider moving constant to module level

Consider moving the FLYTE_TYPES list to a module-level constant since it's used in multiple places within the same file. This would improve maintainability and reduce duplication.

Code suggestion
Check the AI-generated fix before applying
Suggested change
FLYTE_TYPES = [FlyteFile, FlyteDirectory, StructuredDataset, FlyteSchema]
if cls in FLYTE_TYPES:
return cls
FLYTE_TYPES = [FlyteFile, FlyteDirectory, StructuredDataset, FlyteSchema]
if cls in FLYTE_TYPES: return cls

Code Review Run #e17dbe


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Comment on lines +777 to +779

Consider using immutable type for constants

Consider using a tuple or frozenset instead of list for FLYTE_TYPES since it appears to be a constant collection that won't be modified. This provides better performance for lookups and makes the intent clearer.

Code suggestion
Check the AI-generated fix before applying
Suggested change
FLYTE_TYPES = [FlyteFile, FlyteDirectory, StructuredDataset, FlyteSchema]
if cls in FLYTE_TYPES:
return cls
FLYTE_TYPES = (FlyteFile, FlyteDirectory, StructuredDataset, FlyteSchema)
if cls in FLYTE_TYPES:
return cls

Code Review Run #e17dbe


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
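
The two suggestions above (a module-level constant, and an immutable collection) can be tried on a dependency-free version of the helper. This is a minimal sketch, not the PR's code: `Frame` and `Dataset` are hypothetical stand-ins for `pd.DataFrame` and `StructuredDataset`, and `PASSTHROUGH_TYPES` is an assumed name for the hoisted constant.

```python
from dataclasses import dataclass, fields, is_dataclass, make_dataclass
from typing import get_type_hints


class Frame:
    """Hypothetical stand-in for pandas.DataFrame."""


class Dataset:
    """Hypothetical stand-in for flytekit's StructuredDataset."""


# Module-level and immutable: built once, cannot be mutated by accident.
PASSTHROUGH_TYPES = frozenset({Dataset})


def transform_dataclass(cls, memo=None):
    """Return a copy of `cls` whose Frame-typed fields are retyped as Dataset."""
    if cls in PASSTHROUGH_TYPES:
        return cls
    if memo is None:
        memo = {}
    if cls in memo:
        return memo[cls]

    hints = get_type_hints(cls)
    new_fields = []
    for f in fields(cls):
        orig_type = hints[f.name]
        if orig_type is Frame:
            new_type = Dataset
        elif is_dataclass(orig_type):
            # Nested dataclasses are transformed recursively, sharing the memo.
            new_type = transform_dataclass(orig_type, memo)
        else:
            new_type = orig_type
        new_fields.append((f.name, new_type))

    new_cls = make_dataclass("FlyteModified" + cls.__name__, new_fields)
    memo[cls] = new_cls
    return new_cls


@dataclass
class Inner:
    table: Frame


@dataclass
class Outer:
    inner: Inner
    count: int


NewOuter = transform_dataclass(Outer)
new_types = {f.name: f.type for f in fields(NewOuter)}
```

Like the version in the diff, this sketch passes only `(name, type)` pairs to `make_dataclass`, so field defaults on the original class are not carried over to the generated one.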

if memo is None:
memo = {}


if cls in memo:
return memo[cls]


cls_hints = get_type_hints(cls)
new_field_defs = []

for field in fields(cls):
orig_type = cls_hints[field.name]

if orig_type == pd.DataFrame:
new_type = StructuredDataset

elif is_dataclass(orig_type):
new_type = transform_dataclass(orig_type, memo)

else:
new_type = orig_type
new_field_defs.append((field.name, new_type))


new_cls = make_dataclass("FlyteModified" + cls.__name__, new_field_defs)
memo[cls] = new_cls
return new_cls


self._make_dataclass_serializable(python_val, python_type)
new_python_type = transform_dataclass(python_type)


# The `to_json` integrated through mashumaro's `DataClassJSONMixin` allows for more
# functionality than JSONEncoder
@@ -742,10 +813,10 @@
# The function looks up or creates a MessagePackEncoder specifically designed for the object's type.
# This encoder is then used to convert a data class into MessagePack Bytes.
try:
encoder = self._msgpack_encoder[python_type]
encoder = self._msgpack_encoder[new_python_type]

except KeyError:
encoder = MessagePackEncoder(python_type)
self._msgpack_encoder[python_type] = encoder
encoder = MessagePackEncoder(new_python_type)
self._msgpack_encoder[new_python_type] = encoder


try:
msgpack_bytes = encoder.encode(python_val)
@@ -836,6 +907,9 @@
}

if not dataclasses.is_dataclass(python_type):
import pandas as pd

if isinstance(python_val, pd.DataFrame):
python_val = StructuredDataset(dataframe=python_val, file_format="parquet")

Comment on lines +910 to +912

Consider dedicated transformer for DataFrame conversion

Consider moving the pandas DataFrame conversion logic to a dedicated transformer class instead of handling it in the _make_dataclass_serializable method. This would improve code organization and maintainability.

Code suggestion
Check the AI-generated fix before applying
Suggested change
import pandas as pd
if isinstance(python_val, pd.DataFrame):
python_val = StructuredDataset(dataframe=python_val, file_format="parquet")

Code Review Run #fedbf7


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

return python_val

# Transform str to FlyteFile or FlyteDirectory so that mashumaro can serialize the path.
@@ -874,6 +948,10 @@
if t == int:
return int(val)

import pandas as pd

if t == pd.DataFrame:
return val().open(dataframe_type=pd.DataFrame).all()


if isinstance(val, list):
# Handle nested List. e.g. [[1, 2], [3, 4]]
return list(map(lambda x: self._fix_val_int(ListTransformer.get_sub_type(t), x), val))
@@ -918,7 +996,8 @@
self._msgpack_decoder[expected_python_type] = decoder
dc = decoder.decode(binary_idl_object.value)

return dc
# return dc
return self._fix_dataclass_int(expected_python_type, dc)

else:
raise TypeTransformerFailedError(f"Unsupported binary format: `{binary_idl_object.tag}`")

@@ -929,28 +1008,78 @@
"user defined datatypes in Flytekit"
)

import pandas as pd
from flytekit.types.structured.structured_dataset import StructuredDataset
from typing import get_type_hints, Type, Dict
import pandas as pd
from flytekit.types.file import FlyteFile
from flytekit.types.directory import FlyteDirectory
from flytekit.types.structured.structured_dataset import StructuredDataset
from flytekit.types.schema import FlyteSchema


from typing import get_type_hints, Type, Dict


def convert_dataclass(instance, target_cls):


Consider adding type hints to function

The convert_dataclass function could benefit from type hints for better code maintainability and IDE support. Consider adding type annotations for instance and target_cls parameters.

Code suggestion
Check the AI-generated fix before applying
Suggested change
def convert_dataclass(instance, target_cls):
def convert_dataclass(instance: Any, target_cls: Type[T]) -> T:

Code Review Run #b16268


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
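
A typed version of the helper, along the lines the comment suggests, might look like the sketch below. It mirrors the logic in the diff and uses only the standard library; the `Src*` and `Dst*` classes are hypothetical examples, and `T` is an assumed `TypeVar` that the PR does not define.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Type, TypeVar

T = TypeVar("T")


def convert_dataclass(instance: Any, target_cls: Type[T]) -> T:
    """Rebuild `instance` as `target_cls`, copying fields that share a name."""
    if not (is_dataclass(instance) and is_dataclass(target_cls)):
        return instance

    target_types = {f.name: f.type for f in fields(target_cls)}
    kwargs = {}
    for f in fields(instance):
        if f.name not in target_types:
            continue
        value = getattr(instance, f.name)
        # Recurse when both the value and the target field type are dataclasses.
        if is_dataclass(value) and is_dataclass(target_types[f.name]):
            value = convert_dataclass(value, target_types[f.name])
        kwargs[f.name] = value
    return target_cls(**kwargs)


@dataclass
class SrcInner:
    a: int


@dataclass
class SrcOuter:
    inner: SrcInner
    b: str


@dataclass
class DstInner:
    a: int


@dataclass
class DstOuter:
    inner: DstInner
    b: str


result = convert_dataclass(SrcOuter(inner=SrcInner(a=1), b="x"), DstOuter)
```

One caveat worth checking in the PR: `Field.type` holds the raw annotation, so when the target dataclass is defined under `from __future__ import annotations` it is a string, `is_dataclass` returns `False` for it, and the nested conversion is skipped.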

if not (is_dataclass(instance) and is_dataclass(target_cls)):
return instance


kwargs = {}
target_fields = {f.name: f.type for f in fields(target_cls)}

for field in fields(instance.__class__):
if field.name in target_fields:
value = getattr(instance, field.name)

if is_dataclass(value) and is_dataclass(target_fields[field.name]):
value = convert_dataclass(value, target_fields[field.name])
kwargs[field.name] = value
return target_cls(**kwargs)


def transform_dataclass(cls, memo=None):

if memo is None:
memo = {}


if cls in memo:
return memo[cls]


cls_hints = get_type_hints(cls)
new_field_defs = []

for field in fields(cls):
orig_type = cls_hints[field.name]

if orig_type == pd.DataFrame:
new_type = StructuredDataset

elif is_dataclass(orig_type):
new_type = transform_dataclass(orig_type, memo)

else:
new_type = orig_type
new_field_defs.append((field.name, new_type))


new_cls = make_dataclass("FlyteModified" + cls.__name__, new_field_defs)
memo[cls] = new_cls
return new_cls


new_expected_python_type = transform_dataclass(expected_python_type)


if lv.scalar and lv.scalar.binary:
return self.from_binary_idl(lv.scalar.binary, expected_python_type) # type: ignore
return convert_dataclass(self.from_binary_idl(lv.scalar.binary, new_expected_python_type), expected_python_type) # type: ignore


json_str = _json_format.MessageToJson(lv.scalar.generic)

# The `from_json` function is provided from mashumaro's `DataClassJSONMixin`.
# It deserializes a JSON string into a data class, and supports additional functionality over JSONDecoder
# We can't use hasattr(expected_python_type, "from_json") here because we rely on mashumaro's API to customize the deserialization behavior for Flyte types.
if issubclass(expected_python_type, DataClassJSONMixin):
dc = expected_python_type.from_json(json_str) # type: ignore
if issubclass(new_expected_python_type, DataClassJSONMixin):
dc = new_expected_python_type.from_json(json_str) # type: ignore

else:
# The function looks up or creates a JSONDecoder specifically designed for the object's type.
# This decoder is then used to convert a JSON string into a data class.
try:
decoder = self._json_decoder[expected_python_type]
decoder = self._json_decoder[new_expected_python_type]

except KeyError:
decoder = JSONDecoder(expected_python_type)
self._json_decoder[expected_python_type] = decoder
decoder = JSONDecoder(new_expected_python_type)
self._json_decoder[new_expected_python_type] = decoder

Comment on lines +1077 to +1078

Consider caching decoder with original type

Consider storing the decoder in the original expected_python_type key instead of new_expected_python_type to avoid potential memory leaks from storing multiple decoders for the same logical type.

Code suggestion
Check the AI-generated fix before applying
Suggested change
decoder = JSONDecoder(new_expected_python_type)
self._json_decoder[new_expected_python_type] = decoder
decoder = JSONDecoder(new_expected_python_type)
self._json_decoder[expected_python_type] = decoder

Code Review Run #fedbf7


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged


dc = decoder.decode(json_str)

return self._fix_dataclass_int(expected_python_type, dc)
return convert_dataclass(self._fix_dataclass_int(new_expected_python_type, dc), expected_python_type)


Consider validating dataclass field compatibility

The conversion from new_expected_python_type to expected_python_type may lose data if the types have different field structures. Consider validating field compatibility before conversion.

Code suggestion
Check the AI-generated fix before applying
Suggested change
return convert_dataclass(self._fix_dataclass_int(new_expected_python_type, dc), expected_python_type)
fixed_dc = self._fix_dataclass_int(new_expected_python_type, dc)
# Validate field compatibility
if not all(f.name in [ef.name for ef in fields(expected_python_type)] for f in fields(fixed_dc.__class__)):
raise ValueError(f"Incompatible field structure between {new_expected_python_type.__name__} and {expected_python_type.__name__}")
return convert_dataclass(fixed_dc, expected_python_type)

Code Review Run #b16268


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
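
The compatibility check proposed above can also be written with set arithmetic instead of a nested list scan. The sketch below is one possible shape, assuming a hypothetical helper name `check_field_compatibility`; it is not part of the PR and uses only `dataclasses.fields`.

```python
from dataclasses import dataclass, fields
from typing import Any


def check_field_compatibility(source: Any, target_cls: type) -> None:
    """Raise ValueError if `source` has fields that `target_cls` lacks."""
    source_names = {f.name for f in fields(source)}
    target_names = {f.name for f in fields(target_cls)}
    missing = sorted(source_names - target_names)
    if missing:
        raise ValueError(
            f"{type(source).__name__} has fields not present in "
            f"{target_cls.__name__}: {missing}"
        )


@dataclass
class Wide:
    x: int
    y: int


@dataclass
class SameShape:
    x: int
    y: int


@dataclass
class Narrow:
    x: int


# Passes silently: every field of Wide exists on SameShape.
check_field_compatibility(Wide(1, 2), SameShape)
```

Calling it with `Narrow` as the target raises `ValueError`, because `y` would be dropped during conversion.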


# This ensures that calls with the same literal type returns the same dataclass. For example, `pyflyte run``
# command needs to call guess_python_type to get the TypeEngine-derived dataclass. Without caching here, separate