Offloading of objects during registration is a difficult to debug trap for inexperienced users #6282
Labels
good first issue
Good for newcomers
Improve error handling
improve-error-message
UX
Issues that require UX-Design
Discussed in #4743
Originally posted by fg91 January 18, 2024
Let us consider an example where 1) the type engine transports an object by offloading it to blob storage (in this case as a pickle file) and 2) where this object is instantiated not in a task but when calling a task in a workflow:
This workflow fails with:
The reason is that during registration, `pyflyte` does not realize that the object needs to be uploaded to blob storage. The user would have to proactively configure the raw data prefix.
I would argue that this failure is very difficult to understand and debug for users who don't have a clear understanding of Flyte's data model, and that the example is too "simple" for us to let users fall into this trap.
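To make the failure mode concrete, here is a minimal standard-library sketch of what goes wrong. It does not use flytekit; `offload_literal`, `reachable_from_cluster`, the bucket name, and the file name `input.pkl` are all hypothetical stand-ins chosen for illustration. The point is only that an object pickled under a local prefix at registration time yields a URI that a remote task pod cannot read.

```python
import os
import pickle
import tempfile

# Assumption: URIs with these schemes point at blob storage the cluster can read.
REMOTE_SCHEMES = ("s3://", "gs://", "abfs://")


def offload_literal(obj, raw_data_prefix: str) -> str:
    """Toy stand-in for offloading: pickle `obj` under `raw_data_prefix`
    and return the URI that would be recorded in the registered workflow."""
    payload = pickle.dumps(obj)
    if raw_data_prefix.startswith(REMOTE_SCHEMES):
        # A real implementation would upload `payload` here; this sketch
        # only builds the URI.
        return raw_data_prefix.rstrip("/") + "/input.pkl"
    # Local prefix: the file ends up on the registering machine's disk.
    os.makedirs(raw_data_prefix, exist_ok=True)
    path = os.path.join(raw_data_prefix, "input.pkl")
    with open(path, "wb") as f:
        f.write(payload)
    return path


def reachable_from_cluster(uri: str) -> bool:
    """A task pod can only fetch inputs that live in blob storage."""
    return uri.startswith(REMOTE_SCHEMES)


# No raw data prefix configured: the object is written to a local directory.
local_uri = offload_literal({"threshold": 0.5}, tempfile.mkdtemp())
# Raw data prefix configured explicitly (hypothetical bucket).
remote_uri = offload_literal({"threshold": 0.5}, "s3://my-bucket/raw-data")

print(reachable_from_cluster(local_uri))   # False
print(reachable_from_cluster(remote_uri))  # True
```

In this toy model, registration itself succeeds in both cases, which is what makes the trap hard to spot: the problem only surfaces later, when the task tries to read its input.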
As a user, I would want 1) flytekit to realize that files need to be offloaded to blob storage during registration and 2) the backend to specify a default raw data prefix during registration unless I configured it explicitly in my flyte config file.
How could this be fixed?
- In the `FileAccessProvider`, we need to prevent `put_raw_data` from storing offloaded objects locally during registration.
- Could `pyflyte` request the `raw_data_prefix` from `flyteadmin` if the user didn't set it explicitly and "make the file access provider aware of it"?
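The two ideas above could be combined roughly as follows. This is only a sketch under stated assumptions, not flytekit's actual code: `resolve_raw_data_prefix`, `fetch_default_from_admin`, and `LocalOffloadDuringRegistrationError` are hypothetical names, and whether `flyteadmin` can serve a default prefix at registration time is exactly the open question raised here.

```python
from typing import Callable, Optional

# Assumption: URIs with these schemes point at blob storage the cluster can read.
REMOTE_SCHEMES = ("s3://", "gs://", "abfs://")


class LocalOffloadDuringRegistrationError(RuntimeError):
    """Raised instead of silently writing offloaded objects to local disk."""


def resolve_raw_data_prefix(
    configured: Optional[str],
    fetch_default_from_admin: Callable[[], Optional[str]],
) -> str:
    """Pick the prefix to use for offloading during registration.

    An explicitly configured prefix wins; otherwise the (hypothetical)
    admin lookup is consulted. A missing or local prefix fails fast with
    an actionable message.
    """
    prefix = configured or fetch_default_from_admin()
    if prefix is None or not prefix.startswith(REMOTE_SCHEMES):
        raise LocalOffloadDuringRegistrationError(
            "An object must be offloaded to blob storage during registration, "
            f"but the raw data prefix is {prefix!r}. "
            "Configure a remote raw data prefix explicitly."
        )
    return prefix


# The user did not configure anything; the backend supplies a default.
print(resolve_raw_data_prefix(None, lambda: "s3://my-bucket/default-raw-data"))

# Neither the user nor the backend provides a prefix: fail with a clear error.
try:
    resolve_raw_data_prefix(None, lambda: None)
except LocalOffloadDuringRegistrationError as err:
    print(f"registration aborted: {err}")
```

Failing fast at registration trades a silent local write for an error that names the raw data prefix directly, which is the part that users unfamiliar with Flyte's data model would otherwise have to discover on their own.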