[16.0][IMP] queue_job: run specific hook method after max_retries #674

Open · wants to merge 2 commits into base: 16.0
queue_job/job.py (15 additions, 0 deletions)

@@ -527,6 +527,21 @@
    elif not self.max_retries:  # infinite retries
        raise
    elif self.retry >= self.max_retries:
        hook = f"{self.method_name}_on_max_retries_reached"
Contributor:

I am generally not a fan of interpolating method names. Pass on_exception as an additional argument to delayable/with_delay instead?

Perhaps the scope could be slightly broader as well? Give the developer a chance to handle all types of exception, not just FailedJobError?
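As a rough illustration of that suggestion (the on_exception argument is hypothetical, it does not exist in queue_job, and the model and method names are made up):

    from odoo import models


    class SaleOrder(models.Model):
        _inherit = "sale.order"

        def action_import_lines(self):
            ...

        def _notify_import_failed(self):
            # would be called once the job gives up for good
            self.message_post(body="Import job failed permanently")


    # somewhere else, when enqueuing (order is some sale.order recordset);
    # on_exception is the hypothetical argument suggested above
    order.with_delay(
        max_retries=5,
        on_exception="_notify_import_failed",
    ).action_import_lines()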

Contributor:

  • interpolating method names is quite a common pattern in Odoo code: see the many getattr calls in the codebase :)
  • imho it's quite elegant to be able to define method_name and method_name_on_max_retries_reached next to each other (see the sketch below), but of course that's a bit subjective
  • regarding your last point, that's an interesting idea, but it feels quite natural to handle exceptions in the job code itself, e.g. in the EDI framework here
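As a minimal sketch of that pattern, assuming made-up model and method names (only the _on_max_retries_reached suffix and the job_uuid/exc_info context keys come from this diff):

    from odoo import models


    class SaleOrder(models.Model):
        _inherit = "sale.order"

        def action_import_lines(self):
            # enqueued elsewhere with:
            #   self.with_delay(max_retries=5).action_import_lines()
            ...

        def action_import_lines_on_max_retries_reached(self):
            # found by name and called once max_retries is exceeded;
            # job_uuid and exc_info are available in the context
            job_uuid = self.env.context.get("job_uuid")
            self.message_post(body=f"Import job {job_uuid} gave up after the last retry")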

Contributor:

A more declarative approach could be to use a decorator, but it would likely add complexity.
@QuocDuong1306 could you please update the docs?
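Purely to illustrate that idea (nothing like this exists in queue_job; the decorator and attribute names are hypothetical):

    from odoo import models


    def on_max_retries_reached(handler_name):
        # hypothetical decorator: it only records which method to call
        # when the job gives up, instead of deriving the name by convention
        def decorator(func):
            func._on_max_retries_reached = handler_name
            return func
        return decorator


    class SaleOrder(models.Model):
        _inherit = "sale.order"

        @on_max_retries_reached("_import_gave_up")
        def action_import_lines(self):
            ...

        def _import_gave_up(self):
            self.message_post(body="Import gave up after the last retry")

The hook lookup in job.py would then read the attribute from the job's method instead of interpolating the hook name.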

Contributor Author:

Hi @simahawk, I updated the docs.


I would say that whenever a job reaches the failed state it would be useful to have a hook to do something, not only when it failed after max retries but when it failed for any reason.

For example, the issue described here: #618

Contributor:

That's a good point. Yet I think you can subscribe to that particular event easily (the job switching to failed). In fact we could subscribe to it in this case too and check the max retry counter; a rough sketch follows below.
@guewen did you have something in mind regarding handling failures?
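A sketch of that kind of subscription, assuming the state change goes through the ORM write() of queue.job (the _on_job_failed helper is made up, not something queue_job provides):

    from odoo import models


    class QueueJob(models.Model):
        _inherit = "queue.job"

        def write(self, vals):
            # jobs that are about to switch to the "failed" state
            switching = (
                self.filtered(lambda job: job.state != "failed")
                if vals.get("state") == "failed"
                else self.browse()
            )
            res = super().write(vals)
            for job in switching:
                # made-up hook; job.retry and job.max_retries are available if needed
                job._on_job_failed()
            return res

        def _on_job_failed(self):
            # e.g. notify a channel, create an activity, open a ticket, ...
            pass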

Member:

Previously this is the kind of thing we would have added to the @job decorator; things that were configured on that decorator now live on queue.job.function. This is akin to the "related actions", where we store the method to execute there. Different jobs can point to the same error handler, and we would be able to use a handler on "no-code jobs" easily (e.g. I call an existing method with with_delay in a script and want to notify Slack when the max failure count is reached, using a handler that already exists in the code: I can create a queue job function and set this handler from the UI).

I agree with your point about triggering when the job switches to failed, regardless of retries; in that case it would be worth providing the max retry and current retry count to the handler as well.

Something to pay close attention to in the implementation is the transaction handling: I think in the current form, if the job failed with any error that causes a rollback (such as a serialization error), the transaction is unusable and the handler will probably fail as well! We should probably execute it in a new transaction, but then be aware that it will not be up to date with whatever happened in the current transaction, and it could be subject to deadlocks depending on what the failed job did and what the failure handler does (see the sketch further below).

Considering that, I'd also be more comfortable if the handling happened somewhere in

    def _try_perform_job(self, env, job):
        """Try to perform the job."""
        job.set_started()
        job.store()
        env.cr.commit()
        _logger.debug("%s started", job)

        job.perform()
        # Triggers any stored computed fields before calling 'set_done'
        # so that will be part of the 'exec_time'
        env.flush_all()
        job.set_done()
        job.store()
        env.flush_all()
        env.cr.commit()
        _logger.debug("%s done", job)

That way the transactional flow stays more straightforward.
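As a very rough sketch of that last point, and assuming a hypothetical _on_job_failed handler (only the registry/cursor/Environment plumbing is standard Odoo, the rest is made up):

    import odoo
    from odoo import api


    def _run_failure_handler(env, job):
        # Sketch: run the (hypothetical) failure handler in its own transaction,
        # so that a rollback of the job's transaction cannot break it.
        with odoo.registry(env.cr.dbname).cursor() as new_cr:
            new_env = api.Environment(new_cr, env.uid, env.context)
            records = job.recordset.with_env(new_env)
            handler = getattr(records, "_on_job_failed", None)
            if handler is not None:
                handler(retry=job.retry, max_retries=job.max_retries)
            # the cursor context manager commits on success and rolls back on error

This only illustrates the transaction isolation; where exactly it would be called from _try_perform_job, and how the handler would be configured (e.g. on queue.job.function), is the open design question.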


Hello @guewen, I started on this in #734, would you mind taking a look, please?

        if hasattr(self.recordset, hook):
            recordset = self.recordset.with_context(
                job_uuid=self.uuid, exc_info=self.exc_info
            )
            try:
                getattr(recordset, hook)()
            except Exception as ex:
                _logger.debug(
                    "Exception on %s:%s() for Job UUID %s: %s",
                    self.recordset,
                    hook,
                    self.uuid,
                    ex,
                )
        type_, value, traceback = sys.exc_info()
        # change the exception type but keep the original
        # traceback and message: