Add sjl-static domains for thumbnails #72
Comments
Does the CDX API support wildcards in the hostname?
Looks like not directly, but it does support regex. I wonder if that can be used.
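One way the regex idea could look, as an untested sketch: query the whole `sjl.youtube.com` domain with `matchType=domain` and narrow the results with a `filter=original:<regex>` parameter. Both parameters appear in the CDX server documentation, but whether this combination returns the expected rows, and whether the server applies the regex as a full or partial match, are assumptions here. The function name and the sample URL in the usage note are made up for illustration.

```python
import re
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

# Matches any sjl-static<number> host under sjl.youtube.com.
# The optional port group is there because archived originals sometimes include ":80".
HOST_PATTERN = r"https?://sjl-static[0-9]+\.sjl\.youtube\.com(:[0-9]+)?/.*"


def regex_filtered_cdx_url():
    """Build a single CDX query covering every sjl-static host (hypothetical helper)."""
    params = {
        "url": "sjl.youtube.com",
        "matchType": "domain",  # include all subdomains of sjl.youtube.com
        "filter": f"original:{HOST_PATTERN}",  # keep only sjl-static<number> hosts
        "output": "json",
        "fl": "original",
        "collapse": "urlkey",
    }
    # urlencode percent-encodes the regex so it survives as one query parameter.
    return f"{CDX_ENDPOINT}?{urlencode(params)}"
```

The pattern can be checked locally before sending anything to the server, e.g. `re.fullmatch(HOST_PATTERN, "http://sjl-static7.sjl.youtube.com/example.jpg")` (a made-up URL) matches, while a URL on another `sjl.youtube.com` subdomain does not.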
I think a good approach would be to grab all the links once and run searches against that list, since YouTube doesn't use these domains anymore.
I don't think that's a good idea, as WARCs may always be added to the Wayback Machine. We'd be missing those.
True....
What do you think about filtering for all subdomains of sjl.youtube.com, i.e. https://web.archive.org/cdx/search/cdx?url=*.sjl.youtube.com/*&output=json&fl=original&collapse=urlkey ? Edit: Ah, I see, you can't filter for all subdomains and a specific prefix simultaneously. :/
In 2006-2007, YouTube used sjl-static{number}.sjl.youtube.com to host thumbnails.
https://web.archive.org/cdx/search/cdx?url=sjl-static1.sjl.youtube.com/*&output=json&fl=original&collapse=urlkey
The only issue is that the number ranges from 1 to 16, so 16 domains would need to be checked, which is pretty unrealistic to do for every query.
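For reference, the brute-force version amounts to the sketch below: it only builds the 16 query URLs, using the same parameters as the example query above, and makes no requests. The function name is hypothetical, and the 1 to 16 range is taken from this issue rather than verified independently.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def sjl_static_cdx_urls(first=1, last=16):
    """Return one CDX query URL per sjl-static host (hypothetical helper)."""
    urls = []
    for n in range(first, last + 1):
        params = {
            "url": f"sjl-static{n}.sjl.youtube.com/*",
            "output": "json",
            "fl": "original",
            "collapse": "urlkey",
        }
        # Keep "*" and "/" unescaped so the URL reads like the hand-written query.
        urls.append(f"{CDX_ENDPOINT}?{urlencode(params, safe='*/')}")
    return urls


if __name__ == "__main__":
    for url in sjl_static_cdx_urls():
        print(url)
```

Each lookup would then cost 16 requests instead of one, which is the overhead described above.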