[UNMAINTAINED] A middleware that provides continuous site login facility

TeamHG-Memex/scrapy-login

Scrapy-login

A middleware that helps implement login for Scrapy spiders.

Usage

  1. Load the middleware in settings.py:

    SPIDER_MIDDLEWARES = { [...], 'scrapy_login.LoginMiddleware': 200, }

  2. Implement do_login(response, username, password) in your spider class. The response argument is the response to the first start request. The method can return a Request, or a Deferred that resolves to a Request.

  3. Implement check_login(response) in your spider. It must inspect the response received after login for login indicators (e.g. a logout button, or elements that are not available without login) and return True if the login succeeded; otherwise it returns False or a str containing an error message.

  4. Run your spider with arguments username and password, for example:

    scrapy crawl -a username=johndoe -a password=mysecret dmoz.com
