
Add sjl-static domains for thumbnails #72

Open
actuallyasmartname opened this issue Feb 8, 2024 · 6 comments


@actuallyasmartname

In 2006-2007, YouTube used sjl-static{number}.sjl.youtube.com to host thumbnails.
https://web.archive.org/cdx/search/cdx?url=sjl-static1.sjl.youtube.com/*&output=json&fl=original&collapse=urlkey
The only issue is that the number ranges from 1 to 16, so 16 domains would need to be checked, and that's pretty unrealistic for every query.
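To make the cost concrete, here is a minimal sketch of what checking every host would look like: it only builds the 16 CDX query URLs, one per numbered host, in the same form as the URL above. The function name and the `path_pattern` parameter are hypothetical, not part of this project.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def sjl_static_cdx_urls(path_pattern="*", first=1, last=16):
    """Return one CDX query URL per sjl-static{n}.sjl.youtube.com host.

    path_pattern is an assumption: "*" mirrors the query in this issue,
    but a real lookup would likely use a narrower path.
    """
    urls = []
    for n in range(first, last + 1):
        params = [
            ("url", f"sjl-static{n}.sjl.youtube.com/{path_pattern}"),
            ("output", "json"),
            ("fl", "original"),
            ("collapse", "urlkey"),
        ]
        # Keep "/" and "*" unescaped so the URL matches the form used above.
        urls.append(f"{CDX_ENDPOINT}?{urlencode(params, safe='/*')}")
    return urls
```

Each returned URL is a separate request to the Wayback Machine, which is why doing this for every query gets expensive.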

@TheTechRobo
Owner

Does the CDX API support wildcards in the hostname?

@TheTechRobo
Owner

TheTechRobo commented Feb 8, 2024

Looks like not directly, but it does support regex. I wonder if that can be used.
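An untested sketch of how the regex idea might look: it combines the CDX server's `matchType=domain` (host plus all subhosts) with a `filter=original:<regex>` parameter restricted to hosts 1 through 16. Whether the Wayback Machine accepts this combination efficiently is an assumption, as is the need for the leading and trailing `.*` (added in case the regex has to match the whole field). The function name is hypothetical.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

# Matches sjl-static1 .. sjl-static16 and rejects e.g. sjl-static0 or sjl-static17.
HOST_REGEX = r".*sjl-static(1[0-6]|[1-9])\.sjl\.youtube\.com.*"


def build_regex_query(domain="sjl.youtube.com"):
    """Build a single CDX query that filters captures by hostname regex."""
    params = [
        ("url", domain),
        ("matchType", "domain"),  # include all subdomains of `domain`
        ("filter", f"original:{HOST_REGEX}"),
        ("output", "json"),
        ("fl", "original"),
        ("collapse", "urlkey"),
    ]
    # urlencode percent-escapes the regex metacharacters in the filter value.
    return f"{CDX_ENDPOINT}?{urlencode(params)}"
```

If this works, it would replace 16 requests with one, at the cost of the server scanning every capture under the domain.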

@actuallyasmartname
Author

I think a good approach would be to grab all the links once and search that saved list, since YouTube doesn't use these domains anymore.

@TheTechRobo
Owner

I don't think that's a good idea, as new WARCs may be added to the Wayback Machine at any time. We'd be missing those.

@actuallyasmartname
Author

True....

@TheTechRobo
Owner

TheTechRobo commented Feb 14, 2024

What do you think about filtering for all subdomains of sjl.youtube.com, i.e. https://web.archive.org/cdx/search/cdx?url=*.sjl.youtube.com/*&output=json&fl=original&collapse=urlkey ?

Edit: Ah, I see, you can't filter for all subdomains and a specific prefix simultaneously. :/
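One possible workaround, sketched here under assumptions: run the all-subdomains query as-is and narrow the results to the `sjl-static` hosts on the client. The code assumes the `output=json` response is a JSON array whose first row is a header (here `["original"]`), and it allows an optional port in the host; the function name is hypothetical.

```python
import json
import re

# Hosts sjl-static1 .. sjl-static16, with an optional ":port" suffix.
SJL_STATIC_HOST = re.compile(
    r"^https?://sjl-static(1[0-6]|[1-9])\.sjl\.youtube\.com(:\d+)?/",
    re.IGNORECASE,
)


def filter_sjl_static(cdx_json_text):
    """Return only the original URLs hosted on sjl-static{1..16}."""
    rows = json.loads(cdx_json_text)
    if not rows:  # no captures at all
        return []
    header, *records = rows
    idx = header.index("original")
    return [r[idx] for r in records if SJL_STATIC_HOST.match(r[idx])]
```

This keeps the request count at one per query, but the response may contain many unrelated subdomain captures that are downloaded only to be discarded.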
