-
Notifications
You must be signed in to change notification settings - Fork 696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Simplify flytekit.Resources
#6245
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Grantham Taylor <[email protected]>
Thank you for opening this pull request! 🙌 These tips will help get your PR across the finish line:
|
Code Review Agent Run Status
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #6245 +/- ##
===========================================
- Coverage 50.51% 36.87% -13.64%
===========================================
Files 1162 1318 +156
Lines 91811 134703 +42892
===========================================
+ Hits 46375 49672 +3297
- Misses 41340 80703 +39363
- Partials 4096 4328 +232
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
This RFC would be a big loss for me. I love having control over reqs/lims directly. |
Hi! Thank you for sharing. Perhaps, going forward, we can allow power users to define the request and limits separately in the form of a tuple Is it just |
|
||
Some of the footguns around `requests` and `limits` are non-deterministic and not intuitive, thus can escape detection during development but have significant and detrimental impact in production. | ||
|
||
For example, the possibility of setting `requests < limits` for `mem` can lead to a pod becoming overloaded. For `cpu`, doing so can lead to CPU availability, for some multi-processing applications, to be non-deterministic, thus resulting in unpredictable performance degradation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the memory case in particular, when a requests != limits
the pod is considered burstable (quality of service designation) and more likely to be evicted when the node is under memory pressure
|
||
## 6 Alternatives | ||
|
||
Should any resource type (`CPU`) need both `limits` and `requests`, we should allow definition of them via a `tuple` (IE `(400m, 600m)`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to having an escape hatch but prioritizing the common use-case
@cjidboon94 that's really helpful feedback, do you mind expanding in which circumstances you prefer to set request != limits |
I agree that the UX of the task arguments
I also agree that for ML engineers/Data Scientists who don't know/care about K8s, the UX can feel a bit awkward. Especially the intricacies of requests vs. limits. We maintain a decorator wrapping the flytekit task decorator which derives the limits from the requests in a way that makes sense for us. Our users never use the limits arg. At the same time, I feel for platform engineers it absolutely makes sense to have full control over both requests and limits. To summarize:
From a platform engineer's perspective:
All of this being said, this is a very breaking change which needs to be considered carefullly. |
Defining resources for tasks is error prone and counter-intuitive due to the nuances around
requests
andlimits
.This RFC discusses an alternative solution to simplify the definition of resources, remove some footguns, and smoothen flytekit altogether.