[Core feature] Pyflyte - Workflow Version Comparisons #6196

Open
2 tasks done
UmerAhmad opened this issue Jan 27, 2025 · 7 comments
Assignees
Labels
enhancement New feature or request waiting for reporter Used for when we need input from the bug reporter

Comments

@UmerAhmad
UmerAhmad commented Jan 27, 2025

Motivation: Why do you think this is important?

Internally, we found a gap in native support for comparing workflow versions and understanding what changed between them. We built a comparison feature to fill that gap and would like to gauge interest on the OSS side.

This feature helps in several ways. Users can quickly identify differences (DAG, config, inputs, outputs, metadata, interface, etc.) between workflow versions and pinpoint changes that may have introduced performance issues or bugs, which significantly reduces troubleshooting time. It also helps track changes made by other developers when collaborating, and it supports reproducibility.

Goal: What should the final outcome look like, ideally?

Currently this is supported via command-line tooling that builds the full DAG and compares workflow versions node by node. It generates a side-by-side table with highlighted deltas, based on a unified diff between the workflow and task templates/specs, with the option to output to an HTML file. The attached screenshots show a simple workflow comparison to demonstrate how it looks at a very basic level: no differences, same DAG, only a separate version registered. Two versions can be supplied to the command, in which case it fetches both from remote; otherwise it uses the to-be-registered version and/or the latest version from remote, depending on how many versions were supplied. The DAG is created in memory for comparison by fully expanding the remote FlyteWorkflow until all subworkflows and their tasks are expanded into task literals.
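To make the expand-then-diff idea concrete, here is a minimal sketch using only the Python standard library. It is not the internal implementation and does not use the flytekit API: the dict-based node layout, the `nodes` key that marks a subworkflow, and the `flatten`, `render`, and `compare` helpers are all hypothetical stand-ins chosen for illustration.

```python
import difflib
import json


def flatten(nodes, prefix=""):
    """Expand subworkflows recursively until only task nodes remain."""
    flat = {}
    for node in nodes:
        path = prefix + node["id"]
        if "nodes" in node:  # hypothetical marker for a subworkflow node
            flat.update(flatten(node["nodes"], prefix=path + "."))
        else:
            flat[path] = node["spec"]
    return flat


def render(spec):
    """Serialize a task spec deterministically; a missing node renders as no lines."""
    if spec is None:
        return []
    return json.dumps(spec, indent=2, sort_keys=True).splitlines()


def compare(old_nodes, new_nodes):
    """Return {node path: unified diff lines} for every node that differs."""
    old, new = flatten(old_nodes), flatten(new_nodes)
    report = {}
    for path in sorted(old.keys() | new.keys()):
        diff = list(difflib.unified_diff(
            render(old.get(path)), render(new.get(path)),
            fromfile="old/" + path, tofile="new/" + path, lineterm=""))
        if diff:
            report[path] = diff
    return report


# Two made-up versions: only the nested task's image changes.
v1 = [
    {"id": "prep", "spec": {"image": "img:1"}},
    {"id": "train", "nodes": [{"id": "fit", "spec": {"image": "img:1"}}]},
]
v2 = [
    {"id": "prep", "spec": {"image": "img:1"}},
    {"id": "train", "nodes": [{"id": "fit", "spec": {"image": "img:2"}}]},
]

for path, diff in compare(v1, v2).items():
    print("\n".join(diff))
```

Keying the flattened nodes by their dotted path keeps the comparison node by node even when a node exists in only one of the two versions.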

This works well in the terminal, and can be extended/reworked for UI support, etc.
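For the HTML output option, one possible approach (an assumption, not necessarily what the internal tool does) is the standard library's `difflib.HtmlDiff`, which renders a side-by-side table with highlighted changes. The `side_by_side_html` helper and the sample specs below are hypothetical.

```python
import difflib
import json


def side_by_side_html(old_spec, new_spec, old_label, new_label):
    """Render two specs as a complete HTML page with a side-by-side diff table."""
    old_lines = json.dumps(old_spec, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new_spec, indent=2, sort_keys=True).splitlines()
    return difflib.HtmlDiff().make_file(
        old_lines, new_lines, fromdesc=old_label, todesc=new_label)


# Made-up task specs for illustration only.
html = side_by_side_html(
    {"cpu": "1", "image": "img:1"},
    {"cpu": "1", "image": "img:2"},
    "version-a",
    "version-b",
)
# The caller decides where the page goes, e.g. writing `html` to a file.
```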

[Two screenshots: example comparison of a simple workflow]

Describe alternatives you've considered

N/A, alternatives were mainly internal technical decisions.

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@UmerAhmad UmerAhmad added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Jan 27, 2025
@10sharmashivam
Contributor

Hi @davidmirror-ops @UmerAhmad, I would like to pick up this issue. #take

@davidmirror-ops
Contributor

davidmirror-ops commented Feb 3, 2025

@UmerAhmad so is this something that already works internally for your team and you plan to upstream it? It'd be a very useful feature

@UmerAhmad
Author

Hi @10sharmashivam, thank you for the interest! This is something we've already implemented internally, though, and we plan to iterate on that implementation for the upstream version.

@davidmirror-ops, thanks for the response! The feature is currently pending internal release, and we wanted to gauge interest from the OSS side in the meantime. After gathering internal feedback, we’d be excited to explore starting the upstream process.

@10sharmashivam
Contributor

@UmerAhmad @davidmirror-ops, Got it, thanks for the update! Looking forward to seeing how this evolves.

Regards

@davidmirror-ops
Contributor

@UmerAhmad yes, this feature would be very helpful. Once you're ready to upstream, let us know if you have questions

@eapolinario eapolinario removed the untriaged This issues has not yet been looked at by the Maintainers label Feb 6, 2025
@eapolinario eapolinario self-assigned this Feb 6, 2025
@eapolinario
Contributor

Just to echo the sentiment here, this looks awesome! I'd love to learn more about how it works and how we can make this part of the Flyte ecosystem as a whole, maybe, as the title suggests, as a pyflyte subcommand.

@eapolinario eapolinario assigned UmerAhmad and unassigned eapolinario Feb 7, 2025
@eapolinario eapolinario added the waiting for reporter Used for when we need input from the bug reporter label Feb 7, 2025
@UmerAhmad
Author

UmerAhmad commented Feb 8, 2025

@davidmirror-ops @eapolinario thank you, will do! Great, I'm happy to see the interest and am looking forward to the discussion. We should be able to dive deeper into the details soon.

Also, I wanted to link this issue and PR, which are a related action item for the feature (issue, PR). Essentially, we noticed that some fields were missing from the remote Task entity and, consequently, from the comparisons (k8s_pod being our main concern, since we need it to compare execution configurations).

Projects
Status: Backlog
Development

No branches or pull requests

4 participants