-
Notifications
You must be signed in to change notification settings - Fork 50
Open
Description
Problem
Currently icechunk storages only allow non-async functions to be passed in as credential refresh functions.
It would be nice to allow async functions.
Proposed Solution
If an async function is passed in, icechunk should run the credential refresh function on an async event loop.
In a web backend, i'd like you to run the refresh function on the "main" loop, as that's likely what my database access is tied to.
I've written some examples of cases that should be handled, there's probably more ways users be weird.
import asyncio
import icechunk
async def refresh():
return {"creds": "creds"}
async def get_repo_async():
storage = icechunk.s3_storage(get_credentials=refresh)
repo = await icechunk.Repository.open_async(storage=storage)
return repo
def get_repo():
storage = icechunk.s3_storage(get_credentials=refresh)
repo = icechunk.Repository.open(storage=storage)
return repo
# Nice example where "binding" the repo to the event loop it was created on
# would make sense, and running the refresh function on that same loop would work
async def normal():
repo = await get_repo_async()
await repo.list_branches_async() # <- refresh should be called on the same loop as the repo was created on?
repo.list_branches() # <- trying to put refresh() on the same loop would deadlock things i think?
# Less nice where the repo is put on different loops
def new_loop():
repo = asyncio.run(get_repo_async()) # create repo on loop 1
asyncio.run(repo.list_branches_async()) # <- what to do here? explode?
# Creating a repo sync but then trying to use it on a running loop
def sync_repo_async_usage():
# create repo sync, should it create its own loop for refresh to run on?
# Should it reject async refresh when created?
repo = get_repo()
asyncio.run(repo.list_branches_async()) # <- what to do here? explode?Metadata
Metadata
Assignees
Labels
No labels
Type
Projects
Status
No status