Task handling is incomplete #774
Comments
This is true; it's the reason for #270, but ideally we should make the operation reliable.
This issue is also mentioned in #1179. I can help patch this by:
But in the long term, we might want to move this purge execution logic to the Table Maintenance Service (#538) for better scalability.
I think it's also very important that no two instances run the same task.
Yup, we prevent this from happening by storing a LAST_ATTEMPT_START_TIME for each task entity. When loadTasks is called, each task is selected based on whether it has timed out (LAST_ATTEMPT_START_TIME < now - TASK_TIMEOUT_MILLIS), and then updated transactionally to set a new LAST_ATTEMPT_START_TIME and assign it to the executor. So in an HA/LB setup, once one executor picks and updates a task, no other executor can pick the same one until it times out again.
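A rough sketch of that leasing pattern, for illustration only. The Task, TaskStore, and TaskLoader types below are hypothetical and do not match the actual Polaris entities or metastore API; the point is just the "claim by conditional timestamp update" idea.

```java
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical types for illustration; the real Polaris task entities differ.
class Task {
    String id;
    long lastAttemptStartTime; // epoch millis of the last claim
    String executorId;         // executor currently holding the lease
}

interface TaskStore {
    List<Task> findTimedOutTasks(long cutoffMillis);

    // Compare-and-swap style update: succeeds only if the stored
    // lastAttemptStartTime still matches the value we read, so two
    // executors cannot both claim the same task.
    boolean tryClaim(String taskId, long expectedLastAttempt, long newLastAttempt, String executorId);
}

class TaskLoader {
    static final long TASK_TIMEOUT_MILLIS = 5 * 60 * 1000L; // assumed value

    private final TaskStore store;
    private final String executorId;

    TaskLoader(TaskStore store, String executorId) {
        this.store = store;
        this.executorId = executorId;
    }

    /** Select tasks whose lease has expired and claim them transactionally. */
    List<Task> loadTasks() {
        long now = Instant.now().toEpochMilli();
        long cutoff = now - TASK_TIMEOUT_MILLIS;
        return store.findTimedOutTasks(cutoff).stream()
                // Only the executor whose conditional update succeeds owns the task;
                // others skip it, and it stays unclaimable until the lease expires.
                .filter(t -> store.tryClaim(t.id, t.lastAttemptStartTime, now, executorId))
                .collect(Collectors.toList());
    }
}
```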
Describe the bug
Polaris uses some asynchronously executed tasks to run operations for table and manifest file cleanup. Those tasks are potentially executed in a separate thread in the same JVM. There is, however, no guarantee that those tasks will eventually run, for multiple reasons: among other things, task execution (e.g. from org.apache.polaris.service.catalog.BasePolarisCatalog#dropTable) is only triggered after the fact. Overall this means that, for example, a "drop table with purge" may return a successful result to the user while the actual purge never happens.
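To illustrate the failure mode (this is not the actual Polaris code; class and method names are hypothetical): if the purge work is only handed to an in-JVM executor after the drop has been committed and acknowledged, the client sees success even when the JVM dies before the task runs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of the fire-and-forget pattern described above.
class DropTableExample {
    private final ExecutorService taskExecutor = Executors.newSingleThreadExecutor();

    void dropTableWithPurge(String tableId) {
        // 1. The metadata change (dropping the table) is committed and the
        //    client immediately receives a successful response.
        commitDrop(tableId);

        // 2. The purge is only submitted to an in-process executor afterwards.
        //    If the JVM crashes or restarts before this task completes, the
        //    data and manifest files are never cleaned up.
        taskExecutor.submit(() -> purgeFiles(tableId));
    }

    private void commitDrop(String tableId) { /* persist the drop */ }

    private void purgeFiles(String tableId) { /* delete data and manifest files */ }
}
```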
To Reproduce
No response
Actual Behavior
No response
Expected Behavior
No response
Additional context
No response
System information
No response