Return StructuredDataset which is a field in a dataclass #3071

arbaobao · 2025-01-21T16:16:17Z

Tracking issue

Related to #6117

Why are the changes needed?

If a StructuredDataset is wrapped in a dataclass, returning that field from a task fails during the to_flyte_idl conversion.

What changes were proposed in this pull request?

Before returning the Literal, we check the type of python_val._literal_sd. If it is a Python-native StructuredDataset, we convert it into a literals.StructuredDataset.
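
For readers without flytekit at hand, here is a minimal sketch of the check described above. `NativeSD`, `LiteralSD`, and `to_literal_sd` are hypothetical stand-ins, not flytekit classes; the actual change operates on `python_val._literal_sd` in `structured_dataset.py`. The sketch only models the idea: convert a native object to its literal form, and pass an existing literal through unchanged.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class LiteralSD:
    """Stand-in for the IDL-level literal (hypothetical, not flytekit's class)."""

    uri: str
    file_format: str

    def to_flyte_idl(self) -> dict:
        # A real literal would build a protobuf message; a dict is enough here.
        return {"uri": self.uri, "format": self.file_format}


@dataclass
class NativeSD:
    """Stand-in for the Python-native dataset object (hypothetical)."""

    uri: str
    file_format: str = "parquet"


def to_literal_sd(value: Union[NativeSD, LiteralSD]) -> LiteralSD:
    """Return a literal, converting a native object first if needed."""
    if isinstance(value, NativeSD):
        return LiteralSD(uri=value.uri, file_format=value.file_format)
    return value


native = NativeSD(uri="s3://bucket/df.parquet")
print(to_literal_sd(native).to_flyte_idl())
```

Because the literal branch returns its argument untouched, values that were already literals behave exactly as before; only the native case gains a conversion step.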

How was this patch tested?

As described in #6117, an error occurs when the extract task is executed.

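from dataclasses import dataclass

import pandas as pd
from flytekit import task, workflow
from flytekit.types.structured.structured_dataset import StructuredDataset


@dataclass
class Data:
    f: StructuredDataset


@task
def create_data() -> Data:
    # Wrap a StructuredDataset in a dataclass field.
    return Data(f=StructuredDataset(dataframe=pd.DataFrame({"a": [5]})))


@task
def extract(d: Data) -> StructuredDataset:
    # Returning the field directly is what triggered the error.
    return d.f


@workflow
def example_wf() -> None:
    d = create_data()
    f = extract(d=d)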

Setup process

Screenshots

Check all the applicable boxes

I updated the documentation accordingly.
All new and existing tests passed.
All commits are signed-off.

Summary by Bito

This PR implements comprehensive improvements to Flytekit's core functionality, including StructuredDataset handling, caching system enhancements, and a new rate limiting system. The changes transition from in-memory DataFrame to file-based approach, integrate Kubernetes StatefulSet Data Service and Ray plugins, while improving dictionary and annotated type handling. Key improvements include enhanced error handling, better array node handling in remote execution, improved workflow execution tracking, and strengthened input validation across multiple components.

Unit tests added: True

Estimated effort to review (1-5, lower is better): 5

Signed-off-by: Nelson Chen <[email protected]>

flyte-bot · 2025-01-21T16:16:32Z

Code Review Agent Run #63793c

Actionable Suggestions - 2

flytekit/types/structured/structured_dataset.py - 2
- Private member access needs encapsulation · Line 738-742
- Consider extracting literal creation logic · Line 738-742

Review Details

Files reviewed - 2 · Commit Range: 51f6f73..a3df842
- flytekit/core/type_engine.py
- flytekit/types/structured/structured_dataset.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

flyte-bot · 2025-01-21T16:19:29Z

Changelist by Bito

This pull request implements the following key changes.

Key Change: Feature Improvement - Enhanced StructuredDataset Handling in Dataclass
Files Impacted:
- `structured_dataset.py` - Added support for handling StructuredDataset fields within dataclass
- `test_remote.py` - Added integration test for StructuredDataset attribute access in dataclass
- `attr_access_dc_sd.py` - Created test workflow for StructuredDataset dataclass functionality

flyte-bot · 2025-01-21T16:19:32Z

flytekit/types/structured/structured_dataset.py

+                if isinstance(python_val._literal_sd, StructuredDataset):
+                    sdt = StructuredDatasetType(format=python_val._literal_sd.file_format)
+                    metad = literals.StructuredDatasetMetadata(structured_dataset_type=sdt)
+                    sd_literal = literals.StructuredDataset(uri=python_val._literal_sd.uri, metadata=metad)
+                    return Literal(scalar=Scalar(structured_dataset=sd_literal))


Private member access needs encapsulation

Accessing private member '_literal_sd'. Consider using a public interface or property to access this data.

Code suggestion

Check the AI-generated fix before applying

- if literal_type.structured_dataset_type is not None and self._literal_sd is not None:
-     return self._literal_sd
- if literal_type.structured_dataset_type is not None and self._literal_sd is None:
+ if literal_type.structured_dataset_type is not None and self.literal_sd is not None:
+     return self.literal_sd
+ if literal_type.structured_dataset_type is not None and self.literal_sd is None:

Code Review Run #63793c

Is this a valid issue, or was it incorrectly flagged by the Agent?

it was incorrectly flagged

flyte-bot · 2025-01-21T16:19:33Z

flytekit/types/structured/structured_dataset.py

+                if isinstance(python_val._literal_sd, StructuredDataset):
+                    sdt = StructuredDatasetType(format=python_val._literal_sd.file_format)
+                    metad = literals.StructuredDatasetMetadata(structured_dataset_type=sdt)
+                    sd_literal = literals.StructuredDataset(uri=python_val._literal_sd.uri, metadata=metad)
+                    return Literal(scalar=Scalar(structured_dataset=sd_literal))


Consider extracting literal creation logic

The code block for handling StructuredDataset passed through dataclass could be simplified by extracting the literal creation logic into a helper method. This would improve code readability and maintainability.

Code suggestion

Check the AI-generated fix before applying

Suggested change

- if isinstance(python_val._literal_sd, StructuredDataset):
-     sdt = StructuredDatasetType(format=python_val._literal_sd.file_format)
-     metad = literals.StructuredDatasetMetadata(structured_dataset_type=sdt)
-     sd_literal = literals.StructuredDataset(uri=python_val._literal_sd.uri, metadata=metad)
-     return Literal(scalar=Scalar(structured_dataset=sd_literal))
+ if isinstance(python_val._literal_sd, StructuredDataset):
+     return self._create_structured_dataset_literal(python_val._literal_sd.uri, python_val._literal_sd.file_format)
+
+ def _create_structured_dataset_literal(self, uri: str, file_format: str) -> Literal:
+     sdt = StructuredDatasetType(format=file_format)
+     metad = literals.StructuredDatasetMetadata(structured_dataset_type=sdt)
+     sd_literal = literals.StructuredDataset(uri=uri, metadata=metad)
+     return Literal(scalar=Scalar(structured_dataset=sd_literal))

Code Review Run #63793c

Is this a valid issue, or was it incorrectly flagged by the Agent?

it was incorrectly flagged

Future-Outlier

It looks correct. Can you provide the following?

- A screenshot
- An example in the integration tests (test_remote.py) to test it properly

…ataset

Signed-off-by: Nelson Chen <[email protected]>

codecov · 2025-02-04T04:11:56Z

Codecov Report

Attention: Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.

Project coverage is 79.58%. Comparing base (9d34416) to head (60142d0).

Files with missing lines	Patch %	Lines
flytekit/types/structured/structured_dataset.py	0.00%	4 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3071      +/-   ##
==========================================
+ Coverage   78.20%   79.58%   +1.38%     
==========================================
  Files         292      203      -89     
  Lines       25401    21599    -3802     
  Branches     2779     2780       +1     
==========================================
- Hits        19864    17190    -2674     
+ Misses       4726     3633    -1093     
+ Partials      811      776      -35
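
The percentages in the table above follow from the hit and line counts. The short check below recomputes them with integer arithmetic; the helper name `coverage_bp` is made up for illustration, and truncating (rather than rounding) to two decimals is an assumption inferred from the reported figures, not something the report states.

```python
def coverage_bp(hits: int, lines: int) -> int:
    """Coverage in basis points (hundredths of a percent), truncated."""
    return hits * 10_000 // lines


# Counts copied from the coverage diff: base is master, head is the PR branch.
base = coverage_bp(19864, 25401)
head = coverage_bp(17190, 21599)

# Hits, misses, and partials should add up to the total line count.
assert 19864 + 4726 + 811 == 25401
assert 17190 + 3633 + 776 == 21599

print(f"base={base / 100:.2f}% head={head / 100:.2f}% delta={(head - base) / 100:+.2f}%")
```

Note that the overall percentage rises even though the patch itself is reported as 0% covered, because the base and head comparisons cover different file sets (292 files versus 203).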

arbaobao · 2025-02-04T04:14:24Z

When we return a StructuredDataset attribute from a dataclass instance, the following error occurs:
AttributeError: 'StructuredDataset' object has no attribute 'to_flyte_idl'

This issue has been addressed in this PR.
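
The failure mode can be reproduced in isolation. In the sketch below, `NativeSD` and `serialize` are hypothetical stand-ins, not flytekit code; they only illustrate that calling `to_flyte_idl()` on an object that lacks the method raises the same kind of `AttributeError` quoted above.

```python
class NativeSD:
    """Hypothetical stand-in for a Python-native dataset; it has no to_flyte_idl()."""

    def __init__(self, uri: str) -> None:
        self.uri = uri


def serialize(value) -> dict:
    # Mirrors the failing step: the serializer assumes an IDL-capable object.
    return value.to_flyte_idl()


try:
    serialize(NativeSD(uri="s3://bucket/df.parquet"))
except AttributeError as err:
    message = str(err)
    print(message)
```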

Signed-off-by: Nelson Chen <[email protected]>

flyte-bot · 2025-02-04T05:02:59Z

Code Review Agent Run #d93af6

Actionable Suggestions - 1

tests/flytekit/integration/remote/workflows/basic/attr_access_dc_sd.py - 1
- Consider adding error handling for dataset access · Line 31-31

Additional Suggestions - 10

flytekit/clis/sdk_in_container/serve.py - 2
- Consider adding parameter validation checks · Line 64-64
- Consider adding port number validation · Line 83-83
flytekit/models/security.py - 1
- Consider adding env_var validation check · Line 45-45
flytekit/core/data_persistence.py - 1
- Consider adding error handling for kwargs · Line 382-383
plugins/flytekit-omegaconf/flytekitplugins/omegaconf/dictconfig_transformer.py - 1
- Consider consolidating NoneType handling checks · Line 146-147
tests/flytekit/clis/sdk_in_container/test_serve.py - 1
- Consider consistent CLI argument naming convention · Line 19-20
tests/flytekit/unit/core/test_dataclass.py - 1
- Consider default mock path for non-existent files · Line 1196-1200
tests/flytekit/unit/core/image_spec/test_default_builder.py - 1
- Consider expanding python_exec validation tests · Line 347-360
plugins/flytekit-k8sdataservice/utils/resources.py - 1
- Consider optimizing zero value checks · Line 13-20
flytekit/core/type_engine.py - 1
- Consider more descriptive environment variable name · Line 63-63

Review Details

Files reviewed - 56 · Commit Range: a3df842..5329247
- .pre-commit-config.yaml
- Dockerfile.agent
- docs/source/plugins/k8sstatefuldataservice.rst
- flytekit/clis/sdk_in_container/serve.py
- flytekit/core/data_persistence.py
- flytekit/core/python_function_task.py
- flytekit/core/resources.py
- flytekit/core/type_engine.py
- flytekit/image_spec/default_builder.py
- flytekit/image_spec/image_spec.py
- flytekit/interaction/parse_stdin.py
- flytekit/models/security.py
- flytekit/models/task.py
- flytekit/remote/remote.py
- flytekit/remote/remote_fs.py
- flytekit/types/structured/structured_dataset.py
- plugins/flytekit-envd/flytekitplugins/envd/image_builder.py
- plugins/flytekit-envd/tests/test_image_spec.py
- plugins/flytekit-k8sdataservice/dev-requirements.txt
- plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/__init__.py
- plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/agent.py
- plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/k8s/kube_config.py
- plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/k8s/manager.py
- plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/sensor.py
- plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/task.py
- plugins/flytekit-k8sdataservice/setup.py
- plugins/flytekit-k8sdataservice/tests/k8sdataservice/k8s/test_kube_config.py
- plugins/flytekit-k8sdataservice/tests/k8sdataservice/k8s/test_manager.py
- plugins/flytekit-k8sdataservice/tests/k8sdataservice/test_agent.py
- plugins/flytekit-k8sdataservice/tests/k8sdataservice/test_sensor.py
- plugins/flytekit-k8sdataservice/tests/k8sdataservice/test_task.py
- plugins/flytekit-k8sdataservice/tests/k8sdataservice/utils/test_resources.py
- plugins/flytekit-k8sdataservice/utils/infra.py
- plugins/flytekit-k8sdataservice/utils/resources.py
- plugins/flytekit-omegaconf/flytekitplugins/omegaconf/dictconfig_transformer.py
- plugins/flytekit-omegaconf/tests/test_dictconfig_transformer.py
- plugins/flytekit-ray/flytekitplugins/ray/task.py
- plugins/flytekit-ray/setup.py
- plugins/flytekit-ray/tests/test_ray.py
- plugins/setup.py
- pydoclint-errors-baseline.txt
- pyproject.toml
- tests/flytekit/clis/sdk_in_container/test_serve.py
- tests/flytekit/integration/remote/test_remote.py
- tests/flytekit/integration/remote/workflows/basic/attr_access_dc_sd.py
- tests/flytekit/integration/remote/workflows/basic/get_secret.py
- tests/flytekit/integration/remote/workflows/basic/sd_attr.py
- tests/flytekit/unit/core/image_spec/test_default_builder.py
- tests/flytekit/unit/core/test_data_persistence.py
- tests/flytekit/unit/core/test_dataclass.py
- tests/flytekit/unit/core/test_generice_idl_type_engine.py
- tests/flytekit/unit/core/test_list.py
- tests/flytekit/unit/core/test_resources.py
- tests/flytekit/unit/core/test_type_engine.py
- tests/flytekit/unit/extras/pydantic_transformer/test_pydantic_basemodel_transformer.py
- tests/flytekit/unit/types/structured_dataset/test_structured_dataset.py
Files skipped - 2
- .github/workflows/pythonbuild.yml - Reason: Filter setting
- plugins/flytekit-k8sdataservice/README.md - Reason: Filter setting
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

flyte-bot · 2025-02-04T05:57:07Z

tests/flytekit/integration/remote/workflows/basic/attr_access_dc_sd.py

+@task
+def read_sd(dc: DC) -> StructuredDataset:
+    """Read input StructuredDataset."""
+    print("sd:", dc.sd.open(pd.DataFrame).all())


Consider adding error handling for dataset access

Consider adding error handling around open() and all() calls to handle potential exceptions when accessing the structured dataset.

Code suggestion

Check the AI-generated fix before applying

Suggested change

-    print("sd:", dc.sd.open(pd.DataFrame).all())
+    try:
+        df = dc.sd.open(pd.DataFrame).all()
+        print("sd:", df)
+    except Exception as e:
+        print(f"Error accessing structured dataset: {e}")
+        raise

Code Review Run #d93af6

Is this a valid issue, or was it incorrectly flagged by the Agent?

it was incorrectly flagged

…ataset

Signed-off-by: Nelson Chen <[email protected]>

flyte-bot · 2025-02-07T10:58:01Z

Code Review Agent Run #0d1278

Actionable Suggestions - 0

Additional Suggestions - 10

flytekit/tools/fast_registration.py - 1
- Consider optimizing progress bar refresh rate · Line 186-187
tests/flytekit/unit/core/test_node_creation.py - 1
- Consider extracting PodTemplate config for clarity · Line 474-499
flytekit/models/core/workflow.py - 1
- Consider adding pod template validation · Line 621-621
flytekit/loggers.py - 1
- Consider explicit boolean conversion for env var · Line 190-190
tests/flytekit/unit/core/test_map_task.py - 1
- Consider extracting pod template configuration · Line 365-383
flytekit/core/base_task.py - 3
- Consider adding validation for generates_deck flag · Line 141-141
- Consider using more detailed deprecation warning · Line 833-836
- Consider extracting deck logic to method · Line 726-729
flytekit/core/context_manager.py - 1
- Consider moving enable_deck init to constructor · Line 97-97
flytekit/bin/entrypoint.py - 1
- Consider consistent parameter naming in calls · Line 327-327

Review Details

Files reviewed - 26 · Commit Range: 5329247..3e390e8
- dev-requirements.txt
- flytekit/bin/entrypoint.py
- flytekit/clients/friendly.py
- flytekit/clients/raw.py
- flytekit/core/base_task.py
- flytekit/core/context_manager.py
- flytekit/core/node.py
- flytekit/core/type_engine.py
- flytekit/deck/deck.py
- flytekit/loggers.py
- flytekit/models/core/workflow.py
- flytekit/models/domain.py
- flytekit/models/task.py
- flytekit/remote/remote.py
- flytekit/tools/fast_registration.py
- flytekit/tools/translator.py
- flytekit/types/directory/types.py
- pydoclint-errors-baseline.txt
- pyproject.toml
- tests/flytekit/integration/remote/workflows/basic/attr_access_dc_sd.py
- tests/flytekit/unit/core/test_array_node_map_task.py
- tests/flytekit/unit/core/test_map_task.py
- tests/flytekit/unit/core/test_node_creation.py
- tests/flytekit/unit/core/test_unions.py
- tests/flytekit/unit/deck/test_deck.py
- tests/flytekit/unit/test_translator.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

Signed-off-by: Nelson Chen <[email protected]>

flyte-bot · 2025-02-07T12:00:25Z

Code Review Agent Run #7e2d92

Actionable Suggestions - 3

tests/flytekit/integration/remote/test_remote.py - 1
- Consider validating remote file path parameter · Line 896-896
tests/flytekit/integration/remote/workflows/basic/attr_access_dc_sd.py - 2
- Consider validating parquet file format support · Line 23-23
- Consider configurable parquet file path · Line 44-44

Review Details

Files reviewed - 2 · Commit Range: 3e390e8..60142d0
- tests/flytekit/integration/remote/test_remote.py
- tests/flytekit/integration/remote/workflows/basic/attr_access_dc_sd.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

flyte-bot · 2025-02-07T12:04:14Z

tests/flytekit/integration/remote/test_remote.py

+    file_transfer = SimpleFileTransfer()
+    remote_file_path = file_transfer.upload_file(file_type="parquet")
+
+    execution_id = run("attr_access_dc_sd.py", "wf", "--uri", remote_file_path)


Consider validating remote file path parameter

Consider adding validation for the remote_file_path parameter before passing it to the run() function. The URI parameter should be validated to ensure it's a valid S3/Minio path.

Code suggestion

Check the AI-generated fix before applying

@@ -893,4 +893,7 @@
     file_transfer = SimpleFileTransfer()
     remote_file_path = file_transfer.upload_file(file_type="parquet")
+    if not remote_file_path or not remote_file_path.startswith('s3://') or not remote_file_path.endswith('.parquet'):
+        raise ValueError(f'Invalid remote file path: {remote_file_path}')
+
     execution_id = run("attr_access_dc_sd.py", "wf", "--uri", remote_file_path)

Code Review Run #7e2d92

Is this a valid issue, or was it incorrectly flagged by the Agent?

it was incorrectly flagged

flyte-bot · 2025-02-07T12:04:15Z

tests/flytekit/integration/remote/workflows/basic/attr_access_dc_sd.py

+    Returns:
+        dc: A dataclass with a StructuredDataset attribute.
+    """
+    dc = DC(sd=StructuredDataset(uri=uri, file_format="parquet"))


Consider validating parquet file format support

Consider adding validation for the file_format parameter. The code assumes parquet format but it may be worth checking if this format is supported by checking against DEFAULT_FORMATS or ENCODERS dictionaries.

Code suggestion

Check the AI-generated fix before applying

-def create_dc(uri: str) -> DC:
-    dc = DC(sd=StructuredDataset(uri=uri, file_format="parquet"))
-    return dc
+def create_dc(uri: str) -> DC:
+    file_format = "parquet"
+    if file_format not in DEFAULT_FORMATS.values() and file_format not in ENCODERS:
+        raise ValueError(f"File format {file_format} is not supported")
+    dc = DC(sd=StructuredDataset(uri=uri, file_format=file_format))
+    return dc

Code Review Run #7e2d92

Is this a valid issue, or was it incorrectly flagged by the Agent?

it was incorrectly flagged

flyte-bot · 2025-02-07T12:04:16Z

tests/flytekit/integration/remote/workflows/basic/attr_access_dc_sd.py

+
+
+if __name__ == "__main__":
+    wf(uri="tests/flytekit/integration/remote/workflows/basic/data/df.parquet")


Consider configurable parquet file path

Consider making the parquet file path configurable through environment variables or configuration files instead of hardcoding it. This would make the code more flexible and easier to maintain across different environments.

Code suggestion

Check the AI-generated fix before applying

Suggested change

-    wf(uri="tests/flytekit/integration/remote/workflows/basic/data/df.parquet")
+    import os
+    default_path = "tests/flytekit/integration/remote/workflows/basic/data/df.parquet"
+    wf(uri=os.getenv("PARQUET_FILE_PATH", default_path))

Code Review Run #7e2d92

Is this a valid issue, or was it incorrectly flagged by the Agent?

it was incorrectly flagged

Signed-off-by: Nelson Chen <[email protected]>

flyte-bot · 2025-02-07T15:41:54Z

Code Review Agent Run #d69fbd

Actionable Suggestions - 1

tests/flytekit/integration/remote/workflows/basic/attr_access_dc_sd.py - 1
- Consider storing read_sd return value · Line 40-40

Review Details

Files reviewed - 2 · Commit Range: 60142d0..d26a6e9
- tests/flytekit/integration/remote/test_remote.py
- tests/flytekit/integration/remote/workflows/basic/attr_access_dc_sd.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

flyte-bot · 2025-02-07T15:44:25Z

tests/flytekit/integration/remote/workflows/basic/attr_access_dc_sd.py

+@workflow
+def wf(uri: str) -> None:
+    dc = create_dc(uri=uri)
+    read_sd(dc=dc)


Consider storing read_sd return value

Consider storing the return value of read_sd() since it returns a StructuredDataset. The returned value might be needed later in the workflow.

Code suggestion

Check the AI-generated fix before applying

Suggested change

-    read_sd(dc=dc)
+    sd = read_sd(dc=dc)

Code Review Run #d69fbd

Is this a valid issue, or was it incorrectly flagged by the Agent?

it was incorrectly flagged

…ataset

Signed-off-by: Nelson Chen <[email protected]>

flyte-bot · 2025-02-14T10:40:58Z

Code Review Agent Run #b95e33

Actionable Suggestions - 0

Additional Suggestions - 10

flytekit/interaction/string_literals.py - 1
- Consider adding type check for binary value · Line 47-48
flytekit/core/python_auto_container.py - 1
- Consider adding shared memory validation · Line 55-55
tests/flytekit/unit/clients/auth/test_keyring_store.py - 1
- Consider using enum instead of string · Line 64-64
flytekit/remote/remote.py - 3
- Consider adding missing entity type · Line 1634-1636
- Consider splitting version resolution logic · Line 811-825
- Consider aligning version default with error · Line 2105-2105
tests/flytekit/integration/remote/test_remote.py - 2
- Consider more descriptive variable naming · Line 581-590
- Consider making timeout value configurable · Line 628-628
tests/flytekit/unit/interaction/test_string_literals.py - 1
- Consider adding error handling test case · Line 89-90
flytekit/models/core/workflow.py - 1
- Consider reducing code duplication in returns · Line 667-680

Review Details

Files reviewed - 59 · Commit Range: d26a6e9..c6d9010
- flytekit/__init__.py
- flytekit/clients/auth_helper.py
- flytekit/clients/grpc_utils/auth_interceptor.py
- flytekit/clis/sdk_in_container/run.py
- flytekit/configuration/plugin.py
- flytekit/core/cache.py
- flytekit/core/constants.py
- flytekit/core/node.py
- flytekit/core/promise.py
- flytekit/core/python_auto_container.py
- flytekit/core/python_function_task.py
- flytekit/core/resources.py
- flytekit/core/task.py
- flytekit/core/type_engine.py
- flytekit/core/type_match_checking.py
- flytekit/core/worker_queue.py
- flytekit/core/workflow.py
- flytekit/image_spec/default_builder.py
- flytekit/image_spec/image_spec.py
- flytekit/interaction/string_literals.py
- flytekit/loggers.py
- flytekit/models/core/workflow.py
- flytekit/models/execution.py
- flytekit/models/security.py
- flytekit/models/task.py
- flytekit/remote/data.py
- flytekit/remote/remote.py
- plugins/flytekit-aws-sagemaker/tests/test_boto3_agent.py
- plugins/flytekit-onnx-pytorch/dev-requirements.txt
- plugins/flytekit-openai/tests/openai_batch/test_agent.py
- plugins/flytekit-pandera/flytekitplugins/pandera/pandas_transformer.py
- plugins/flytekit-pandera/setup.py
- pydoclint-errors-baseline.txt
- pyproject.toml
- tests/flytekit/integration/remote/test_remote.py
- tests/flytekit/unit/bin/test_python_entrypoint.py
- tests/flytekit/unit/cli/pyflyte/test_run.py
- tests/flytekit/unit/cli/pyflyte/test_run_lps.py
- tests/flytekit/unit/clients/auth/test_keyring_store.py
- tests/flytekit/unit/clients/test_auth_helper.py
- tests/flytekit/unit/clients/test_friendly.py
- tests/flytekit/unit/clients/test_raw.py
- tests/flytekit/unit/core/image_spec/test_default_builder.py
- tests/flytekit/unit/core/image_spec/test_image_spec.py
- tests/flytekit/unit/core/test_annotated_bindings.py
- tests/flytekit/unit/core/test_array_node_map_task.py
- tests/flytekit/unit/core/test_cache.py
- tests/flytekit/unit/core/test_generice_idl_type_engine.py
- tests/flytekit/unit/core/test_node_creation.py
- tests/flytekit/unit/core/test_resources.py
- tests/flytekit/unit/core/test_type_engine.py
- tests/flytekit/unit/core/test_type_match_checking.py
- tests/flytekit/unit/core/test_worker_queue.py
- tests/flytekit/unit/interaction/test_string_literals.py
- tests/flytekit/unit/models/core/test_security.py
- tests/flytekit/unit/models/core/test_workflow.py
- tests/flytekit/unit/models/test_execution.py
- tests/flytekit/unit/models/test_tasks.py
- tests/flytekit/unit/types/structured_dataset/test_structured_dataset.py
Files skipped - 1
- .gitignore - Reason: Filter setting
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

…ataset

arbaobao · 2025-02-18T13:26:18Z

@Future-Outlier
I think this PR is ready to be reviewed.

flyte-bot · 2025-02-18T14:00:58Z

Code Review Agent Run #425bc0

Actionable Suggestions - 0

Additional Suggestions - 6

flytekit/utils/rate_limiter.py - 1
- Consider improving delay value definition · Line 18-18
flytekit/core/cache.py - 1
- Consider adding error details to message · Line 93-93
flytekit/core/type_engine.py - 1
- Consider raising error for empty args · Line 2040-2041
tests/flytekit/unit/core/test_cache.py - 2
- Consider more descriptive exception message · Line 16-17
- Consider expanding cache policy error tests · Line 177-183
tests/flytekit/integration/remote/test_remote.py - 1
- Consider adding assertion message for clarity · Line 617-617

Review Details

Files reviewed - 14 · Commit Range: c6d9010..13d7e5a
- flytekit/__init__.py
- flytekit/clis/sdk_in_container/init.py
- flytekit/core/cache.py
- flytekit/core/type_engine.py
- flytekit/core/worker_queue.py
- flytekit/interaction/click_types.py
- flytekit/remote/remote.py
- flytekit/utils/rate_limiter.py
- tests/flytekit/integration/remote/test_remote.py
- tests/flytekit/integration/remote/workflows/basic/dataclass_wf.py
- tests/flytekit/unit/core/test_cache.py
- tests/flytekit/unit/core/test_type_engine.py
- tests/flytekit/unit/interaction/test_click_types.py
- tests/flytekit/unit/utils/test_rate_limiter.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

arbaobao added 3 commits January 20, 2025 17:35

test

51f6f73

Signed-off-by: Nelson Chen <[email protected]>

remove breakpoint

675a7fa

Signed-off-by: Nelson Chen <[email protected]>

add commit

a3df842

Signed-off-by: Nelson Chen <[email protected]>

arbaobao requested review from wild-endeavor, kumare3, eapolinario, pingsutw, cosmicBboy, samhita-alla, thomasjpfan and Future-Outlier as code owners January 21, 2025 16:16

flyte-bot reviewed Jan 21, 2025

Future-Outlier reviewed Jan 22, 2025

arbaobao added 2 commits February 3, 2025 14:04

Merge branch 'master' of github.com:flyteorg/flytekit into structured…

c7b8f5d

…ataset

add integration test

2b1ee07

Signed-off-by: Nelson Chen <[email protected]>

arbaobao added 2 commits February 4, 2025 12:30

fix error

1d8f396

Signed-off-by: Nelson Chen <[email protected]>

fix test err

5329247

Signed-off-by: Nelson Chen <[email protected]>

flyte-bot reviewed Feb 4, 2025

arbaobao added 2 commits February 7, 2025 17:53

Merge branch 'master' of github.com:flyteorg/flytekit into structured…

7a4e749

…ataset

test

3e390e8

Signed-off-by: Nelson Chen <[email protected]>

test

60142d0

Signed-off-by: Nelson Chen <[email protected]>

flyte-bot reviewed Feb 7, 2025

test

d26a6e9

Signed-off-by: Nelson Chen <[email protected]>

flyte-bot reviewed Feb 7, 2025

arbaobao added 2 commits February 14, 2025 17:31

Merge branch 'master' of github.com:flyteorg/flytekit into structured…

123895d

…ataset

increase timeout to see if it failed

c6d9010

Signed-off-by: Nelson Chen <[email protected]>

Merge branch 'master' of github.com:flyteorg/flytekit into structured…

13d7e5a

…ataset
