Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jan/dx 1324 add execution metrics endpoint in flyteremote #3126

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

fiedlerNr9
Copy link
Contributor

@fiedlerNr9 fiedlerNr9 commented Feb 11, 2025

Why are the changes needed?

  • Today there is no endpoint to query worklfow execution metrics
  • Especially these metrics are interesting for users to reason about task runtimes
    image

What changes were proposed in this pull request?

  • adding get_execution_metrics() endpoint for friendly client
  • adding model for Spans

How was this patch tested?

  • running this on a couple of workflows with different shape:
from flytekit.remote import FlyteRemote
from flytekit.configuration import Config

r = FlyteRemote(config=Config.for_endpoint(endpoint="demo.hosted.unionai.cloud"))

# execution_id = "altzn6pfw8ll6n4h8tdq"  # static
# execution_id = "a964rbl6l9q4thkwzgnj"  # static fail
execution_id = "a2hpw4nh9vjhxsbst8nc"  # static loop
# execution_id = "atkjvqqhdkxpbzphqm7t"  # dynamic loop
# execution_id = "agmqkz8jp5cprw62c9kc"  # dynamic starting
# execution_id = "a5lk5bcqvjxnkpxvp5h2"  # dynamic nested
# execution_id = "amlm77lsv4mmm2gqmgm2"  # Spark
# execution_id = "abkhbrztrmqdx4jvzxh5"  # static retries
# execution_id = "afhtl6l89wsvx4r2glvt"  # map over task fails currently, lets fix in a follow up PR

project = "jan-playground"
domain = "development"

exec = r.fetch_execution(project=project, domain=domain, name=execution_id)

a = r.client.get_execution_metrics(id=exec.id, depth=10)


def handle_node(node):
    node_id = node.node_id.node_id
    if node_id != "":  # Check for existing nodes in the execution
        for node_span in node.spans:
            task_id = node_span.task_id.task_id.name
            if node_span.node_id.node_id != "":
                handle_node(node=node_span)
            elif task_id != "":
                print(f"{node_id} - {task_id}")
                task_metrics = {}
                for operation_span in node_span.spans:
                    task_metrics[operation_span.operation_id] = (
                        operation_span.end_time.seconds
                        - operation_span.start_time.seconds
                    )
                print(task_metrics)


for span in a.spans:
    handle_node(node=span)

Example output:
image

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Summary by Bito

The implementation initially added a get_execution_metrics endpoint and Span model for workflow execution metrics, but was later refactored to remove the deprecated metrics module and Span model. The changes simplified the metrics retrieval logic through the friendly client while maintaining core capabilities for analyzing task runtimes and execution performance.

Unit tests added: False

Estimated effort to review (1-5, lower is better): 1

@flyte-bot
Copy link
Contributor

Code Review Agent Run Status

  • Limitations and other issues: ❌ Failure - The AI Code Review Agent skipped reviewing this change because it is configured to exclude certain pull requests based on the source/target branch or the pull request status. You can change the settings here, or contact the agent instance creator at [email protected].

Signed-off-by: Jan Fiedler <[email protected]>
@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 14, 2025

Code Review Agent Run #077496

Actionable Suggestions - 1
  • flytekit/models/metrics.py - 1
Review Details
  • Files reviewed - 2 · Commit Range: b12f383..8462d2b
    • flytekit/clients/friendly.py
    • flytekit/models/metrics.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 14, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
New Feature - Workflow Execution Metrics API Implementation

friendly.py - Added get_execution_metrics method to SynchronousFlyteClient for retrieving workflow execution metrics

Copy link
Member

@troychiu troychiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM but could use @wild-endeavor 's review

@@ -0,0 +1,124 @@
from flyteidl.core import metrics_pb2 as _metrics_pb2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should do away with model files. Could we just remove this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we talked about this! Done!

Signed-off-by: Jan Fiedler <[email protected]>
Signed-off-by: Jan Fiedler <[email protected]>
@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 18, 2025

Code Review Agent Run #39e849

Actionable Suggestions - 0
Additional Suggestions - 1
  • flytekit/clients/friendly.py - 1
    • Consider returning span directly from response · Line 675-675
Review Details
  • Files reviewed - 2 · Commit Range: 8462d2b..b76f3f8
    • flytekit/clients/friendly.py
    • flytekit/models/metrics.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants