Scrapy price monitor update #14
Open: further-reading wants to merge 8 commits into scrapinghub:master from further-reading:scrapy_price_monitor_update
Changes from 2 commits
Commits (8):
4ec419d Making new version with modern code (further-reading)
8c24048 Making new version with modern code (further-reading)
72b906d Removing old code and seperating alert code (further-reading)
25fe0e1 Removing placeholder project id (further-reading)
065b77f PR feedback (further-reading)
36c4251 PR feedback (further-reading)
af32707 PR feedback (further-reading)
793aa32 Adding missing fields (further-reading)
@@ -2,7 +2,8 @@ Scrapy Price Monitor
 ====================
 
 This is a simple price monitor built with [Scrapy](https://github.com/scrapy/scrapy)
-and [Scrapy Cloud](https://scrapinghub.com/scrapy-cloud).
+and [Scrapy Cloud](https://www.zyte.com/scrapy-cloud/). It is an updated version of
+[this sample](https://github.com/scrapinghub/sample-projects/tree/master/scrapy_price_monitor/_scrapy_price_monitor_OLD).
 
 It is basically a Scrapy project with one spider for each online retailer that
 we want to monitor prices from. In addition to the spiders, there's a Python
@@ -19,11 +20,6 @@ the already supported retailers, just add a new key for that product and add
 the URL list as its value, such as:
 
 {
-"headsetlogitech": [
-"https://www.amazon.com/.../B005GTO07O/",
-"http://www.bestbuy.com/.../3436118.p",
-"http://www.ebay.com/.../110985874014"
-],
 "NewProduct": [
 "http://url.for.retailer.x",
 "http://url.for.retailer.y",
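The README text in this hunk describes a mapping from a product key to a list of retailer URLs, with one spider per retailer. As a rough illustration only, the sketch below shows one way such a mapping could be split up by retailer host. The function name `urls_by_retailer`, the inline JSON string, and the idea of keying on the host name are assumptions made for this example; the project's actual loading code is not shown in this diff.

```python
import json
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical product map in the same shape as the README example.
PRODUCTS_JSON = """
{
    "NewProduct": [
        "http://url.for.retailer.x",
        "http://url.for.retailer.y"
    ]
}
"""


def urls_by_retailer(products):
    """Group (product, url) pairs by the host name of each URL."""
    grouped = defaultdict(list)
    for product, urls in products.items():
        for url in urls:
            host = urlparse(url).netloc.lower()
            # Drop a leading "www." so the key resembles a spider name
            # such as "amazon.com" (an assumption about naming).
            if host.startswith("www."):
                host = host[4:]
            grouped[host].append((product, url))
    return dict(grouped)


grouped = urls_by_retailer(json.loads(PRODUCTS_JSON))
for retailer, entries in sorted(grouped.items()):
    print(retailer, entries)
```

Keying on the host keeps the product file retailer-agnostic: adding a product only means adding URLs, and each spider would pick up the entries whose host matches its own name.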
@@ -34,16 +30,8 @@ the URL list as its value, such as:
 
 ## Supporting Further Retailers
 
-This project currently only works with 3 online retailers, and you can list them
-running:
-
-$ scrapy list
-amazon.com
-bestbuy.com
-ebay.com
-
-If the retailer that you want to monitor is not yet supported, just create a spider
-to handle the product pages from it. To include a spider for samsclub.com, you
+To add a retailer, just create a spider to handle the product pages from it.
+To include a spider for samsclub.com, you
 could run:
 
 $ scrapy genspider samsclub.com samsclub.com
@@ -74,7 +62,7 @@ later when showing how to schedule the project on Scrapy Cloud.
 
 1. Clone this repo:
 
-$ git clone git@github.com:stummjr/scrapy_price_monitor.git
+$ git clone git@github.com:further-reading/price-monitoring-sample.git
 
 2. Enter the folder and install the project dependencies:
 
@@ -141,9 +129,9 @@ To do that, first add your Scrapy Cloud project id to [settings.py `SHUB_PROJ_ID
 
 Then run the spiders via command line:
 
-$ scrapy crawl bestbuy.com
+$ scrapy crawl books.toscrape.com
 
-This will run the spider named as `bestbuy.com` and store the scraped data into
+This will run the spider named as `books.toscrape.com` and store the scraped data into
 a Scrapy Cloud collection, under the project you set in the last step.
 
 You can also run the price monitor via command line:
@@ -0,0 +1,92 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+env/
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*,cover
+.hypothesis/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+target/
+
+# IPython Notebook
+.ipynb_checkpoints
+
+# pyenv
+.python-version
+
+# celery beat schedule file
+celerybeat-schedule
+
+# dotenv
+.env
+
+# virtualenv
+.venv/
+venv/
+ENV/
+
+# Spyder project settings
+.spyderproject
+
+# Rope project settings
+.ropeproject
+
+.scrapy