
Scheduled tasks sometimes fail to run; can a retry mechanism be added? #83

Open
QingGo opened this issue Oct 11, 2018 · 9 comments

Comments


QingGo commented Oct 11, 2018

A while back I noticed that scheduled tasks sometimes didn't run as planned. Looking carefully through the program's log output, I found the following error:

2018-10-11 16:18:50,165 - SpiderKeeper.app - DEBUG - [sync_job_execution_status]
Job "sync_job_execution_status_job (trigger: interval[0:00:05], next run at: 2018-10-11 08:18:59 UTC)" raised an exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/apscheduler/executors/base.py", line 125, in run_job
    retval = job.func(*job.args, **job.kwargs)
  File "/usr/local/lib/python3.5/site-packages/SpiderKeeper/app/schedulers/common.py", line 13, in sync_job_execution_status_job
    for project in Project.query.all():
  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2703, in all
    return list(self)
  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2855, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2876, in _execute_and_instances
    close_with_result=True)
  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2885, in _get_bind_args
    **kw
  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2867, in _connection_from_session
    conn = self.session.connection(**kw)
  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/session.py", line 966, in connection
    execution_options=execution_options)
  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/session.py", line 971, in _connection_for_bind
    engine, execution_options)
  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/session.py", line 382, in _connection_for_bind
    self._assert_active()
  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/session.py", line 276, in _assert_active
    % self._rollback_exception
sqlalchemy.exc.InvalidRequestError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (sqlite3.IntegrityError) NOT NULL constraint failed: sk_job_execution.service_job_execution_id [SQL: 'INSERT INTO sk_job_execution (date_created, date_modified, project_id, service_job_execution_id, job_instance_id, create_time, start_time, end_time, running_status, running_on) VALUES (CURRENT_TIMESTAMP, CURRENT_TIMESTAMP, ?, ?, ?, ?, ?, ?, ?, ?)'] [parameters: (11, None, 12, '2018-10-11 16:12:00.028118', None, None, 0, 'http://192.168.1.245:6800')]

At first I thought Project.query.all() itself was failing, but on closer inspection the real culprit was an earlier INSERT statement in the same transaction. Since the error output doesn't point to the offending code, my guess is that this SQL statement corresponds to a session.add call in SQLAlchemy. I eventually traced the likely cause to /SpiderKeeper/app/proxy/contrib/scrapy.py: the start_spider method that launches a crawl can return None, while the service_job_execution_id column of the sk_job_execution table in SQLite has a NOT NULL constraint.

def start_spider(self, project_name, spider_name, arguments):
    post_data = dict(project=project_name, spider=spider_name)
    post_data.update(arguments)
    data = request("post", self._scrapyd_url() + "/schedule.json", data=post_data, return_type="json")
    # Returns None when the request fails or Scrapyd reports an error, which
    # later violates the NOT NULL constraint on service_job_execution_id.
    return data['jobid'] if data and data['status'] == 'ok' else None
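
Incidentally, the error message itself names the recovery step: once a flush fails, the session has to be rolled back before it can be reused. A rough sketch of such a guard in the sync job (hypothetical code, not SpiderKeeper's actual implementation; db is assumed to be the Flask-SQLAlchemy handle):

from sqlalchemy.exc import SQLAlchemyError

def sync_job_execution_status_job():
    try:
        for project in Project.query.all():
            ...  # sync execution status as before
    except SQLAlchemyError:
        # Recover the poisoned session so one failed INSERT does not
        # break every subsequent scheduler run.
        db.session.rollback()
        raise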

My SpiderKeeper and Scrapyd run in separate Docker containers on the same machine, with a dozen or so spiders deployed on Scrapyd. Scrapyd has no record of the scheduled runs either, so the failure appears to be in the step where SpiderKeeper asks Scrapyd to start the job. It might be a network problem, or a problem with Scrapyd itself; has anyone else run into the same thing? Also, to make scheduled tasks more fault-tolerant, could start_spider be changed to something like this:

def start_spider(self, project_name, spider_name, arguments):
    post_data = dict(project=project_name, spider=spider_name)
    post_data.update(arguments)
    data = request("post", self._scrapyd_url() + "/schedule.json", data=post_data, return_type="json")
    if data and data['status'] == 'ok':
        return data['jobid']
    else:
        import time
        time.sleep(3)
        # Retry until the request succeeds (note: unbounded recursion as written).
        return self.start_spider(project_name, spider_name, arguments)

Perhaps a configurable maximum number of retries could be added as well?
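
For reference, a bounded variant might look like this (a sketch only; the max_retries and retry_delay parameters are hypothetical names, not existing SpiderKeeper options):

import time

def start_spider(self, project_name, spider_name, arguments,
                 max_retries=3, retry_delay=3):
    post_data = dict(project=project_name, spider=spider_name)
    post_data.update(arguments)
    # Retry the schedule.json call a limited number of times so a transient
    # network or Scrapyd hiccup does not immediately yield a None jobid.
    for attempt in range(max_retries + 1):
        data = request("post", self._scrapyd_url() + "/schedule.json", data=post_data, return_type="json")
        if data and data['status'] == 'ok':
            return data['jobid']
        if attempt < max_retries:
            time.sleep(retry_delay)
    return None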


QingGo commented Oct 11, 2018

What a coincidence: a similar new issue showed up right away. #84 (comment)


QingGo commented Oct 12, 2018

I've opened a PR; could it be merged into this branch? #85

@luzihang123

I've run into the same problem.

@wymen2020

> I've opened a PR; could it be merged into this branch? #85

Did the retry actually help??


QingGo commented Jan 9, 2019

> Did the retry actually help??

It did help, but afterwards I ran into other problems with status syncing going wrong. My understanding is this: any of SpiderKeeper's calls to Scrapyd's APIs can fail for all sorts of reasons (a network glitch, or Scrapyd itself getting congested from being hit too often), leaving the two out of sync. I think failures should be handled explicitly, for example by showing the failure in the UI or retrying automatically. That said, I no longer use SpiderKeeper; I've switched to celery + celery-beat to manage scheduled tasks.
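
A minimal sketch of that setup (the broker URL, module name, and spider/project names below are placeholders, not taken from this thread):

import requests
from celery import Celery
from celery.schedules import crontab

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task(bind=True, max_retries=3, default_retry_delay=3)
def start_spider(self, project_name, spider_name):
    try:
        resp = requests.post("http://localhost:6800/schedule.json",
                             data={"project": project_name, "spider": spider_name},
                             timeout=10)
        data = resp.json()
        if data.get("status") != "ok":
            raise RuntimeError(data)
        return data["jobid"]
    except Exception as exc:
        # celery handles the bounded retry instead of recursing by hand.
        raise self.retry(exc=exc)

app.conf.beat_schedule = {
    "start-my-spider-hourly": {
        "task": "tasks.start_spider",
        "schedule": crontab(minute=0),
        "args": ("my_project", "my_spider"),
    },
}

Run a worker together with the beat scheduler (for example, celery -A tasks worker -B) and celery-beat takes over the cron-style triggering.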

@cstackess

Ran into the same problem: scheduled tasks sometimes fail to start.

@luzihang123

> Why not try ScrapydWeb

I'll give it a try; I see v1.2 supports scheduled jobs now (Timer Tasks).

@yanqiang123456789

> Ran into the same problem: scheduled tasks sometimes fail to start.

Did you solve it? How did you solve it?

@luzihang123

> Ran into the same problem: scheduled tasks sometimes fail to start.
>
> Did you solve it? How did you solve it?

I stopped using SpiderKeeper and switched to ScrapydWeb. The SpiderKeeper author no longer maintains the project, and with a lot of tasks SQLite locks up.
