Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Admin commands for gloss videos #1398

Open
wants to merge 41 commits into
base: master
Choose a base branch
from
Open

Conversation

susanodd
Copy link
Collaborator

@susanodd susanodd commented Nov 26, 2024

COMMANDS AND FILTERS FOR GLOSS VIDEO ADMIN

FILTERS

- By NME Video

All
True
False

  • By Perspective Video

All
True
False

  • By Filename Correct

All
True
False

  • By Backup Video

All
True
False

COMMANDS (on selection)

- Rename video files to match type

-- The code checks selected backup and normal video files and renames them accordingly
-- This takes care of old "bak bak" sequences and special 7-char Django characters appearing between the ID and extension.
-- It also makes sure the video extension matches the type of the video file. Previously, backup files with bak bak sequences did not have any video extension.

  • Set incorrect NME/Perspective filenames to empty string
    -- The code effectively erases the selected objects filename from the video object
    -- The empty filename shows up in the admin table
    -- Once set to empty, it is possible to delete the incorrect objects without deleting the normal video file
    -- The correct normal gloss video has not been altered, because the command merely erased the string
    -- other code has been modified to handle the possibility that a videofile object may have no path [str(videofile) may be empty]

  • Move selected backups to DELETED_FILES_FOLDER location

  • Renumber selected backups

Original setup NOVEMBER 2024 is below. The code has evolved from what it originally was.

Goal: from a selected set of GlossVideo objects in the admin, delete the video file and the object if the version is a backup.
Reorder the remaining backup videos for the glosses in the set.

@vanlummelhuizen if you're working today on Signbank, the new "video admin" command merely prints what it is doing, it does not actually change anything.
I added a rename to it to get the correct video file extension on the backup file.

Perhaps the command is a bit weird now. Or perhaps it should be split into multiple commands.

The command needs to operate on a selection of GlossVideo objects.
So it does what it's supposed to (although commented out and doesn't), and for the "remaining" objects that were not deleted, it renames them and fixes the version number.

Since the command currently "doesn't do anything other than print stuff" it is harmless to try on the real server.

after selection was deleted, version numbers adjusted for new set. Change extension of backup file if it has the wrong video extension for its format.
@vanlummelhuizen

This comment was marked as resolved.

@susanodd

This comment was marked as resolved.

@vanlummelhuizen
Copy link
Collaborator

The glosses need to be retrieved from the selected GlossVideo objects before anything is deleted.

You are right.

@susanodd susanodd changed the title #1397: Command to delete excess backup videos and reorder versions #1397: Commands to delete excess backup videos and reorder versions Nov 29, 2024
@susanodd

This comment was marked as outdated.

@susanodd

This comment was marked as outdated.

change file extension command, match file type
@susanodd susanodd changed the title #1397: Commands to delete excess backup videos and reorder versions Commands to delete excess backup videos and reorder versions Nov 30, 2024
@susanodd susanodd changed the title Commands to delete excess backup videos and reorder versions Admin commands to delete excess backup videos and reorder versions Dec 10, 2024
@susanodd

This comment was marked as outdated.

@susanodd susanodd requested a review from Woseseltops January 7, 2025 13:04
@susanodd susanodd changed the title Admin commands to delete excess backup videos and reorder versions Admin commands for gloss videos Jan 8, 2025
@susanodd
Copy link
Collaborator Author

susanodd commented Feb 3, 2025

I added high priority to this issue because there are still lots of NME and Perspective videos that point to the normal video, even after they were re-uploaded.

This issue has a filter on Perspective video, NME video, Backup video, as well as "wrong filename".

It's not possible to "witness" whether the videos are fixed or not without filters such as given here.

It seems the "package" is being used to obtain the video paths to see if the path is correct.

But the API and also the Signbank interface cannot delete the wrong things because that deletes the normal videos that are pointed to. (#1502)

In the Admin, some of the normal video objects (on the production server) indeed do not point to any file (no timestamp in the admin means no file exists). Most likely that file was deleted by deleting a perspective or NME video that pointed to it.

Copy link
Collaborator

@Woseseltops Woseseltops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this initiative, although it's not 100% clear to me how the commands should be used. Perhaps you can create a page on the wiki for it, along with some examples of what will happen? Let me know if you have something, I will try to organize it for you :)

About code style, I have quite a few smaller stylistic things this time. Don't get discouraged by this! It's all small stuff.

Comment on lines 224 to 226
if file_extension not in ACCEPTABLE_VIDEO_EXTENSIONS:
# some other extension is present in the filename
file_extension = '.mp4'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't returning None and handling that outside this function be cleaner? I feel it's a bit weird that you cannot trust that an mp4 really is an mp4 . Or perhaps even throwing an error?

Copy link
Collaborator Author

@susanodd susanodd Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THE FUNCTION HAS BEEN REWRITTEN WITH COMMENTS

For this, no it's not true. Because the original "bak bak" format of backup videos did not include any video type file extension, it was not maintained whether this was an mp4 file. We assumed it was mp4 because the code used "ensure_mp4". But that was turned off at the request of UvA. The newer "bakNNN" includes the video file type as an extension before the backup extension, like .mp4.bakNNN.

Context of video file extension maybe being wrong

The "video type" filter was added because some are not actually mp4, but instead something else.
The CNGT script that converts the videos was broken. Jetske discovered that the frame rate of the original video and the converted video did not match. For a while it was hard-coded as mp4 in the filename, because it was supposedly being coverted.

The older m4v files are already mp4 format, so lots were merely just renamed as mp4.

Comment on lines 232 to 235
if not video_file_full_path:
return ''
if 'glossvideo' not in video_file_full_path:
return ''
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be cleaner to return None or throwing an error?

Copy link
Collaborator Author

@susanodd susanodd Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this issue, this function is being called inside either a queryset loop, or as a filter in the Admin.

The caller can test whether it's empty, instead catching an error.

except (OSError, PermissionError):
if settings.DEBUG_VIDEOS:
print('delete_files exception GlossVideo OSError, PermissionError: ', str(self.videofile))
print('delete_files exception GlossVideoNME OSError, PermissionError: ', file_system_path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this function can return a boolean to indicate success? Seems weird to me that it just returns silently or puts something in the log if something goes wrong, but the code where you call this function wouldn't know.

@@ -1036,7 +1051,7 @@ def delete_files(self):
return
try:
os.unlink(file_system_path)
except OSError:
except (PermissionError, OSError):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this function can return a boolean to indicate success? Seems weird to me that it just returns silently or puts something in the log if something goes wrong, but the code where you call this function wouldn't know.

Copy link
Collaborator Author

@susanodd susanodd Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THIS HAS BEEN MADE TO RETURN A BOOLEAN

try:
if m := re.search(r".+-(\d+)_(nme_\d+|nme_\d+_left|nme_\d+_right)$", filename_without_extension):
return True
return False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return m := re.search(r".+-(\d+)_(nme_\d+|nme_\d+_left|nme_\d+_right)$", filename_without_extension):

Same for the functions below this one.

Copy link
Collaborator Author

@susanodd susanodd Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments about code style and Django implementation

I'll go ahead and change it for readability. But I think that violates type checking and implementation efficiently (see below)

Nope, m := is a syntax error.

It returns None if there isn't a match, or a match object if there is.

But that's not of type Boolean.

Isn't what you suggest violating "type checking" guidelines?
Also, do we want to use memory for storing an object in the return value when it's not necessary?

I understand the comment, but is this wise regarding the compiler?

It can just throw away the search result instead of send it to the caller. (Original code)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI: if you want to fully match a string, you could use re.fullmatch (see also https://docs.python.org/3/library/re.html#search-vs-match). And if you want True if there is a match and False there is none, cast it to a boolean with bool(). Then it becomes:

return bool(re.fullmatch(r".+-(\d+)_(nme_\d+|nme_\d+_left|nme_\d+_right)$", filename_without_extension))

Again just FYI, the changes @susanodd made per review are clean enough.

destination = os.path.join(WRITABLE_FOLDER, GLOSS_VIDEO_DIRECTORY,
dataset_dir, two_letter_dir, desired_filename)
desired_relative_path = os.path.join(GLOSS_VIDEO_DIRECTORY,
dataset_dir, two_letter_dir, desired_filename)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Wouldn't it make more sense to first determine the relative path, and then use that as the latter part of the absolute path?
  • I feel like we have similar code in other places... perhaps better to reuse other code that determines where videos should be stored?



@admin.action(description="Remove selected backups")
def remove_backups(modeladmin, request, queryset):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative suggestion: instead of deleting, move them to an archive location that we can manually empty later.

relative_path = str(obj.videofile)
video_file_full_path = os.path.join(WRITABLE_FOLDER, str(obj.videofile))
if os.path.exists(video_file_full_path):
# remove the video file so the GlossVideo object can be deleted
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer this as guard clause

Comment on lines 363 to 364
# allow to erase the filename from an object if it has the wrong format and it is for a subclass video
# this is a patch for repairing doubly linked files
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this a doscstring.

Comment on lines 466 to 476
def video_type(self, obj=None):
# if the file exists, this will display its timestamp in the list view
if obj is None or not str(obj.videofile):
return ""
import os
video_file_full_path = os.path.join(WRITABLE_FOLDER, str(obj.videofile))
if os.path.exists(video_file_full_path):
return video_file_type_extension(video_file_full_path)
else:
return ""

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too I feel it would make more sense to return None or raise an error

Copy link
Collaborator Author

@susanodd susanodd Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is a column "VIDEO TYPE" in the Admin. It makes no sense to return an error.

@susanodd
Copy link
Collaborator Author

susanodd commented Feb 18, 2025

Bullets

All of the code modifications have been made, major changes summarised below

  • Backup files moved to a garbage location DELETED_FILES_FOLDER; filenames are flattened to keep the location flat (this also works for non-Latin languages)

  • I revised as requested the "delete_files" function for NME videos and for Perspective videos so it returns a Boolean (this function wasn't from this issue, but suggest for in review.)
    -- Because it now returns True or False, I made it so when the name of the file is not correct, it merely erases the name (by the by, that resolves issue Domino effect on video deletion #1502) It returns True in this case, but merely erased the file name from the object. The caller of this function, "reversion" or "pre_delete signal, removes the object itself.

  • Text for Video Admin Wiki

Changed typo in name of function

Made delete_files return Boolean, print OS exception text in log

Moved functions from admin.py to models.py so they can be shared by models

Check on correct filename before deleting doubly linked file in normal delete
for symmetry. Also takes care of #1502 by checking if filename is correct before deleting
Renamed, added patterns, added comments, mp4 still last resort value
…trash location

Files renamed to include dataset and 2char folder at front of filename to keep trash flat.
Stored as a text file so it gets maintained in GitHub.
…n wiki.

Added text describing filenames and storage structure. This provides context for the commands and filters.
@susanodd susanodd requested a review from Woseseltops February 27, 2025 07:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants