Reporting blanks #779
Comments
Maybe it is a cache problem? There are two things confusing me:
Thanks. |
urlwatch stores its history in directories related to cache (see e.g. #416). If you have a program on your machine which regularly cleans cache directories, this currently will clean out the history. "NEW" in that report means that there is no previous entry for it. Otherwise the report would mention "CHANGED". However, it's weird that After you see the "NEW" report, does |
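The "NEW" versus "CHANGED" distinction described above can be sketched as follows. This is a minimal illustration, not urlwatch's actual code: the function name `classify` and the "UNCHANGED" label are hypothetical, and the only assumption taken from the thread is that a missing history entry yields "NEW".

```python
from typing import Optional


def classify(previous: Optional[str], current: str) -> str:
    """Return a report verb from the stored and freshly fetched data.

    Hypothetical helper for illustration; not part of urlwatch.
    """
    if previous is None:
        # No history entry was found (wiped cache, failed read, first run).
        return "NEW"
    if previous != current:
        # A history entry exists and differs from the fresh data.
        return "CHANGED"
    return "UNCHANGED"


print(classify(None, "page text"))        # no history entry
print(classify("old text", "page text"))  # history entry differs
```

Under this reading, a job that has run before but still reports "NEW" suggests the history lookup came back empty, rather than the page itself having changed.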
Thanks @thp ! I don't have anything running that i know of that is cleaning the cache (no cleanmymac). FWIW it seems to be only with these https://lobaughsdahlias.com/ ecommerce urls. I've been using urlwatch successfully on a few other urls for a month or so and it works as expected |
@jkmaxwell how did you install urlwatch? i've been having sporadic alarms about I've installed urlwatch using rye. at first i thought it was urlwatch being 2.28, but i've downgraded to 2.25 and the issue persists. so it must be rye's fault, as 2.25 works fine on other machines. |
Between 2.27 and 2.28, the backend for "navigate" jobs changed from Pyppeteer to Playwright. In the logs you posted above, it says:
It seems like you have multiple reporters configured (stdout + ifttt appear in addition to e-mail and pushover?). Also, according to the logs, this specific run didn't cause any e-mail or pushover notification to be sent? Can you post (with private information replaced with "XXX"):
|
Not sure if rye can cause run-time issues like this, maybe some direct or indirect dependencies of urlwatch are installed in the wrong version? @jkmaxwell did you use |
I'm encountering the same issue, where the reporter frequently reports blanks. It started several days ago, when urlwatch was somehow updated from 2.28 to 2.28_4 (2.28_4 was installed through Homebrew while other dependencies were updated). Similar to above, the Inspection with --verbose looks similar, where the filter is applied, reports are made with Uninstalling and reinstalling urlwatch doesn't seem to have made any difference, as the blank reports are still surfacing. |
installed via brew here as well, not rye |
So far, have run into a single ERROR report - aside from the blank "NEW" reports. The error seems to be identical to the ERROR that @gaia encountered in the
The blank "NEW" reports only occur on a subset of the jobs. After up to a week of receiving them, it seems they only occur when the computer is actively in use. If the computer is asleep while urlwatch still runs from a scheduled cron job, the blank "NEW" reports do not appear to occur. Edit (March 4, 2024): This error is still occurring. For example, my reporter fired approximately 40 times with "CHANGES" from July 2023 to January 2024 (around 8 months). From January to March, the reporter has fired approximately 585 times with blank "NEW" reports, and only 5 of these reports actually contained changes. |
Giving my sad face emoji because this also happens to me. Or I get this Python error or a NEW entry appears. |
@mhalano can you help me reproduce this? If you can help me trigger the issue I can try and work out why it's happening. Are you also installing with rye? |
No. I generated a wheel file from source and installed it with pip. I think the problem is related to Python 3.12; version 3.11 didn't have any problems. I can install using pip or the Ubuntu package manager. The catch is that I'm using the development version, which uses Python 3.12. But sure, I will help you. |
Can you share the commands you used to build and install the wheel, and the exact python version? |
A |
I built from master but the most recent commit was
python3 setup.py bdist_wheel
pip3 install --force-reinstall --user dist/urlwatch-2.28-py3-none-any.whl
My Python version is 3.12.2 And
|
Thanks. I managed to make it fail and have been having a play. I expect this is a race in the cache. If I switch MAX_WORKERS in worker.py to 1, I don't get the errors any more. It would be good to hear if the same applies for you, but I'm assuming there's a DB lock somewhere that's causing this. |
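As a rough illustration of why dropping the worker count to 1 is a useful race-condition probe, here is a minimal sketch that measures how many jobs overlap in time for a given pool size. It assumes a thread pool similar in spirit to the one the comment refers to; `peak_concurrency` and the simulated job are hypothetical and do not come from urlwatch's worker.py.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(num_jobs: int, max_workers: int) -> int:
    """Run dummy jobs in a thread pool and return the highest overlap seen."""
    active = 0
    peak = 0
    lock = threading.Lock()

    def job(_index: int) -> None:
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # simulated work (e.g. a fetch or a cache read)
        with lock:
            active -= 1

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Consume the iterator so every job finishes before we return.
        list(executor.map(job, range(num_jobs)))
    return peak


print(peak_concurrency(20, max_workers=1))   # jobs never overlap
print(peak_concurrency(20, max_workers=10))  # jobs may overlap
```

With a single worker no two jobs can touch shared state at the same moment, so a bug that disappears in that configuration is very likely timing-dependent.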
@thp would be interested in your thoughts on this one. I'm not sure where to start looking, especially as minidb seems to be using locks on both read and write already. I might potentially try moving the job_state load call out of the parallelism. I'm fairly sure it's the minidb cache, because I also tried switching to the old file-based caching mechanism, and that didn't exhibit the same problems. That one does also have a different relationship with history though. |
Yes, moving the load to not attempt to be parallel fixes the bug for me: https://github.com/thp/urlwatch/compare/master...Jamstah:threading?expand=1 It's a workaround though; minidb should be doing the locking already, unless RLock is doing something odd with the threading around python vs system threads. My local system python is currently 3.12.1 and it doesn't exhibit the symptoms. To show the bug I ran in a container with 3.12.2. I'll update my system to 3.12.2 and see if that's the underlying change that exposes the bug too. |
Yes, I upgraded my system to python 3.12.2, and the bug appeared, so it's definitely a change in 3.12.2 that exposes the bug. Don't see anything obvious in the changelog though: https://docs.python.org/3/whatsnew/changelog.html#python-3-12-2-final |
Added a test that runs static command jobs 100 times which flags up the error reliably on python 3.12.2, so we can add that along with whatever solution we decide to go with: https://github.com/thp/urlwatch/compare/master...Jamstah:threading?expand=1 No PR yet because I'm not happy with the solution. |
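The general shape of such a repeat-until-it-breaks test can be sketched as below. This is not the test from the linked branch and does not use minidb: `LockedCache` is a hypothetical toy store (one SQLite connection guarded by an `RLock` on both read and write), and `count_missing_loads` is an assumed helper name. With correct locking the count stays at zero; a non-zero count would correspond to the spurious "NEW" symptom.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor


class LockedCache:
    """Toy history store: a shared SQLite connection guarded by an RLock."""

    def __init__(self) -> None:
        # check_same_thread=False lets worker threads share the connection;
        # all access is serialized through self._lock.
        self._db = sqlite3.connect(":memory:", check_same_thread=False)
        self._lock = threading.RLock()
        with self._lock:
            self._db.execute(
                "CREATE TABLE history (guid TEXT PRIMARY KEY, data TEXT)"
            )

    def save(self, guid: str, data: str) -> None:
        with self._lock:
            self._db.execute(
                "INSERT OR REPLACE INTO history VALUES (?, ?)", (guid, data)
            )
            self._db.commit()

    def load(self, guid: str):
        with self._lock:
            row = self._db.execute(
                "SELECT data FROM history WHERE guid = ?", (guid,)
            ).fetchone()
        return None if row is None else row[0]


def count_missing_loads(runs: int = 100, jobs: int = 10, max_workers: int = 10) -> int:
    """Repeatedly load every job's history in parallel; count empty results."""
    cache = LockedCache()
    guids = [f"job-{i}" for i in range(jobs)]
    for guid in guids:
        cache.save(guid, "seed")  # every job has history before the test starts
    missing = 0
    for _ in range(runs):
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            results = list(executor.map(cache.load, guids))
        missing += sum(1 for result in results if result is None)
    return missing


print(count_missing_loads())
```

Repeating the run many times matters because a race of this kind only shows up on some executions, so a single pass can easily come back clean.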
24 hours after switching MAX_WORKERS in worker.py from 10 to 1, there has not been a "blank NEW" report, nor have there been any error messages. |
Couldn't this approach negatively affect execution speed? I think 10 workers working in parallel is better than serializing everything onto a single worker, but this is just an educated guess. I don't know if this is how |
Yes, that will slow down execution speed. That test was really just to check whether it was a race condition; it isn't how I would solve the problem. One option is here, which just moves loading the cache to be serialized (which it effectively already was, because the database is serialized), then checks the URLs in parallel: https://github.com/thp/urlwatch/compare/master...Jamstah:threading?expand=1 I'd like to wait for @thp to take a look though, because he wrote both urlwatch and the database layer, so he may have some more thoughts. |
Yes, loading the cache doesn't need to be parallel, and loading it upfront seems like a great idea. |
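The "load upfront, then check in parallel" idea can be sketched roughly like this. It is an illustration of the pattern only, not the code from the linked branch or the eventual PR: `run_jobs`, `load_state`, and `check` are hypothetical names, and the cache is a plain dict standing in for the real history store.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_jobs(jobs, load_state, check, max_workers=10):
    """Load every job's previous state serially, then run checks in parallel."""
    # Phase 1: serial. Only the calling thread touches the history store.
    loaded = [(job, load_state(job)) for job in jobs]

    # Phase 2: parallel. Workers receive already-loaded state and never
    # read from the store themselves.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(lambda pair: check(*pair), loaded))


# Usage with stand-in data (all names below are illustrative).
cache = {"job-a": "old text", "job-b": None}
load_threads = []


def load_state(job):
    load_threads.append(threading.get_ident())  # record which thread loaded
    return cache[job]


def check(job, previous):
    return (job, "NEW" if previous is None else "CHANGED")


print(run_jobs(["job-a", "job-b"], load_state, check))
```

Because the store is only read before the pool starts, the slow part (fetching and filtering) keeps its parallelism while the shared database sees one thread at a time.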
Thanks, have opened the PR. |
Also, @Jamstah thanks for the test case for Python 3.12, I think this fixes it in minidb: thp/minidb@433ae34 |
minidb 2.0.8 should contain the fix. |
Threading was the cause of the issue in thp#779, so adding a test that will exercise multi-threading.
Threading was the cause of the issue in #779, so adding a test that will exercise multi-threading. Co-authored-by: Thomas Perl <[email protected]>
I believe there is an error happening but I cannot confirm it and am therefore considering it a possible bug.
Summary:
I am getting empty notifications. I believe I should not be, given my parameters.
urlwatch.yaml
Relevant sections
Reporting via email and pushover
urls.yaml
(i already tried with url instead of playwright, but i'm trying to outrun the dynamic content and it seems impossible)
Logs
test-diff-filter
test-filter
--verbose
Problem
However, I continually get positive hits and these generate reporter actions. Why is it reporting them as new?
Thank you!