The last character of a YouTube video ID can only be one of 16 characters. Until sometime in 2020, YouTube accepted video IDs where this last character was close enough, so -sHIYAaJ7CK is invalid but may previously have been accepted as -sHIYAaJ7CI, which is valid.
The YouTube Video Finder should perform this canonicalization before searching. It may be worth trying both the canonicalized and the uncanonicalized ID, because a damaged URL may have been archived in that form before YouTube stopped accepting such IDs.
It's as easy as decoding the base64 and reencoding it, because most base64 decoders will drop the last two bits (which are the ones that can get mangled). In Python:
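A minimal sketch of that round trip, assuming an 11-character ID in the URL-safe base64 alphabet and relying on the default (non-strict) behavior of Python's `base64` module, which ignores the excess trailing bits. The function names `canonicalize_video_id` and `candidate_video_ids` are hypothetical and not part of the YouTube Video Finder:

```python
import base64
import re

# Assumption: a video ID is exactly 11 characters of URL-safe base64.
_VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def canonicalize_video_id(video_id: str) -> str:
    """Decode and re-encode the ID so the two unused trailing bits become zero."""
    if not _VIDEO_ID_RE.fullmatch(video_id):
        raise ValueError(f"not an 11-character URL-safe base64 ID: {video_id!r}")
    # One '=' pads the 11 characters to a multiple of 4; the lenient decoder
    # keeps 8 bytes and drops the 2 leftover bits of the last character.
    raw = base64.urlsafe_b64decode(video_id + "=")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def candidate_video_ids(video_id: str) -> list[str]:
    """Hypothetical helper: canonical ID first, then the original if it differs."""
    canonical = canonicalize_video_id(video_id)
    return [canonical] if canonical == video_id else [canonical, video_id]


print(canonicalize_video_id("-sHIYAaJ7CK"))  # -sHIYAaJ7CI
print(candidate_video_ids("-sHIYAaJ7CK"))    # ['-sHIYAaJ7CI', '-sHIYAaJ7CK']
```

Returning both candidates covers the case where a damaged ID was archived as-is; a searcher could query the canonical form first and fall back to the original.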
Cf. https://wiki.archiveteam.org/index.php/YouTube/Technical_details#Videos