FilmEditDetection

About

This is a CLI tool for detecting cuts in films, especially old films with noisy or damaged frames. It takes an input video and stores the detected cuts in one of several formats: a CSV file with the frame index and type of each cut, a CSV file with the start and end timestamps (in seconds) of each shot, a JSON file with shot timestamps in the MEP format, and a format supported by Cinemetrics for further analysis. The tool was produced for Google Summer of Code 2021 with RedHenLabs and the Media Ecology Project.

The final submission blog describes in detail how this tool works.
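
The relationship between the "cuts" output (frame indices) and the "shots" output (start and end timestamps) can be illustrated with a minimal sketch. This is not the tool's own code: the function name cuts_to_shots, its parameters, and the convention that a cut index marks the first frame of a new shot are all assumptions made for illustration.

```python
from typing import List, Tuple


def cuts_to_shots(
    cut_frames: List[int], total_frames: int, fps: float
) -> List[Tuple[float, float]]:
    """Turn cut frame indices into (start_sec, end_sec) shot intervals.

    Assumption: each cut index is the first frame of a new shot.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    # Keep only unique cut positions strictly inside the film, in order.
    boundaries = sorted({c for c in cut_frames if 0 < c < total_frames})
    starts = [0] + boundaries
    ends = boundaries + [total_frames]
    # Convert frame indices to seconds using the frame rate.
    return [(s / fps, e / fps) for s, e in zip(starts, ends)]


if __name__ == "__main__":
    for start, end in cuts_to_shots([48, 120], total_frames=240, fps=24.0):
        print(f"{start:.2f}s -> {end:.2f}s")
```

With two cuts in a 240-frame film at 24 fps, the sketch yields three shots. The actual column layout of the tool's CSV files is not shown here and may differ.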

Getting Started

Prerequisites

You need Python 3.x and the Conda package manager to run this tool.

Installation

To install this tool with the pretrained model, follow the steps below:

Clone this repository: git clone https://github.com/tre3x/FilmEditDetection.git
Switch to the cutdetector-cli branch: git checkout cutdetector-cli
Install the necessary dependencies: conda env create -f environment.yml

Dataset Description

Two datasets were used to train the validation/softcut-detection module.

The first dataset is the TRECVID IACC.3 dataset. It consists of annotated videos, which are used to create synthetically generated snippets containing a hard cut, a soft cut, or no cut. One way to generate the training data: go to data/synthetic_data and run run.sh with the necessary parameters. All parameters can be tweaked in the shell file. This step downloads videos from the TRECVID IACC.3 dataset and processes them into small snippets of N frames, each containing a cut or no cut.
The second dataset is the Media Ecology Project's B&W video data, which contains fully annotated films that closely resemble the kind of data this work actually deals with. One way to generate the training data: go to data/MEP_data and run run.sh with the necessary parameters. This step is valid only for the Media Ecology Project's video data; it is designed to produce small snippets with cuts from the specific annotation format that dataset uses.

Now we have the data that can be used for training and testing.
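
To make the idea of synthetic hard-cut, soft-cut and no-cut snippets concrete, here is a minimal, dependency-free sketch. It is an assumption about how such snippets could be built, not a description of what run.sh does: the function names are hypothetical, frames are simplified to flat lists of grayscale values, and the soft cut is modelled as a plain linear cross-dissolve.

```python
from typing import List, Sequence

Frame = List[float]  # flattened grayscale frame; real code would use image arrays


def make_no_cut(clip: Sequence[Frame], n_frames: int) -> List[Frame]:
    """Take n_frames consecutive frames from a single clip."""
    if len(clip) < n_frames:
        raise ValueError("clip is shorter than n_frames")
    return [list(f) for f in clip[:n_frames]]


def make_hard_cut(
    clip_a: Sequence[Frame], clip_b: Sequence[Frame], n_frames: int, cut_at: int
) -> List[Frame]:
    """Use clip_a for the first cut_at frames, then switch abruptly to clip_b."""
    if not 0 < cut_at < n_frames:
        raise ValueError("cut_at must fall strictly inside the snippet")
    if len(clip_a) < cut_at or len(clip_b) < n_frames - cut_at:
        raise ValueError("clips are too short for the requested snippet")
    head = [list(f) for f in clip_a[:cut_at]]
    tail = [list(f) for f in clip_b[: n_frames - cut_at]]
    return head + tail


def make_soft_cut(
    clip_a: Sequence[Frame], clip_b: Sequence[Frame], n_frames: int
) -> List[Frame]:
    """Blend clip_a into clip_b with a linear cross-dissolve over n_frames."""
    if n_frames < 2 or len(clip_a) < n_frames or len(clip_b) < n_frames:
        raise ValueError("need at least 2 frames and clips of length n_frames")
    snippet = []
    for i in range(n_frames):
        alpha = i / (n_frames - 1)  # 0.0 on the first frame, 1.0 on the last
        blended = [(1 - alpha) * a + alpha * b for a, b in zip(clip_a[i], clip_b[i])]
        snippet.append(blended)
    return snippet


if __name__ == "__main__":
    dark = [[0.0, 0.0] for _ in range(5)]
    bright = [[100.0, 100.0] for _ in range(5)]
    print(make_hard_cut(dark, bright, n_frames=4, cut_at=2))
    print(make_soft_cut(dark, bright, n_frames=5))
```

Each generated snippet would be paired with its label (hard cut, soft cut, or no cut) to form a training example.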

Pre-Trained Model

One pre-trained model with a specific configuration is available, which can be found here.

Training

With the data prepared, the model can be trained by running train.sh from the model directory. The training parameters can be tweaked in that file. To train the model:

Run cd model
Tweak the training parameters in the file train.sh
Run ./train.sh

Usage

To run the tool on a local machine, first follow the steps in the Installation section. After setting up the environment, run:

python main.py --vidpath <path/to/video> --modpath <path/to/model> --operation <result_output_format> --oudir <path/to/output/directory> --config <path/to/config>

<path/to/video> - Path to the target video.
<path/to/model> - Path to the model trained or downloaded previously. Default path: .\trained-models\cutdetection-model
<result_output_format> - Output format of the result. Available formats: cuts - CSV file containing the frame index of each cut; shots - CSV file containing the timestamps of shots; mepformat - JSON file containing the timestamps of shots in the Media Ecology Project annotation format; cinemetrics - a '.cns' file that can be uploaded to cinemetrics; read-only - prints the timestamps of cut frames to the terminal without writing any file. Default mode: read-only
<path/to/output/directory> - Path to the directory where output files are saved, depending on the operation performed on the film.
<path/to/config> - Path to the network configuration file. This file (in JSON format) should contain all information about the networks used in the tool. A few default configs are stored in the configs folder. Default path: .\configs\vgg16.json

For help with the command syntax, run: python main.py --help
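
When the tool is called from another Python script, it can help to assemble the command programmatically. The sketch below is a hypothetical wrapper, not part of the repository: build_command and run_detector are illustrative names, the flags are copied from the usage line above (including the --oudir spelling), and it assumes that omitted options fall back to the defaults listed above.

```python
import subprocess
import sys
from typing import List, Optional

# Output formats listed in the Usage section.
OPERATIONS = ("cuts", "shots", "mepformat", "cinemetrics", "read-only")


def build_command(
    vidpath: str,
    modpath: Optional[str] = None,
    operation: str = "read-only",
    outdir: Optional[str] = None,
    config: Optional[str] = None,
) -> List[str]:
    """Build the argument list for main.py without running it."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation {operation!r}; expected one of {OPERATIONS}")
    cmd = [sys.executable, "main.py", "--vidpath", vidpath, "--operation", operation]
    if modpath:
        cmd += ["--modpath", modpath]
    if outdir:
        cmd += ["--oudir", outdir]  # flag spelled as in the usage line
    if config:
        cmd += ["--config", config]
    return cmd


def run_detector(cmd: List[str], repo_root: str = ".") -> None:
    """Run the command from the repository root; raises if the tool exits with an error."""
    subprocess.run(cmd, cwd=repo_root, check=True)


if __name__ == "__main__":
    # "film.mp4" and "results" are placeholder paths.
    print(" ".join(build_command("film.mp4", operation="cuts", outdir="results")))
```

Passing the arguments as a list avoids shell quoting problems with video paths that contain spaces.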

MEP Dataset

The Media Ecology Project's manually annotated black-and-white video dataset closely resembles the old archival films whose shot boundaries other computer-based cut detection tools struggle to predict. The annotated dataset is therefore very helpful for training the deep learning models used in this tool and for predicting shot boundaries in other archival films. The dataset was made possible by the efforts of Mark J. Williams, John P. Bell, and students from Dartmouth College and the University of Chicago:

Yangqiao Lu, University of Chicago
Emily Hester, Dartmouth College
Elijah Czysz, Dartmouth College
Noah Hensley, Dartmouth College
Ileana Sung, Dartmouth College
Adithi Jayaraman, Dartmouth College
Momina Naveed, Dartmouth College
Maria Graziano, Dartmouth College
Kevin Chen, Dartmouth College
Frandy Rodriguez, Dartmouth College

Cinemetrics

Besides generating a '.cns' file that works with Cinemetrics, this tool can also upload the detected cuts directly to the Cinemetrics server for further computation. To use this feature, extend the original command as follows:

python main.py --vidpath <path/to/video> --modpath <path/to/model> --operation <result_output_format> --config <path/to/config> --cinemetrics_submit --yname <submitter name> --mtitle <movie_title> --myear <movie_year> --email <submitter_email>

Singularity Usage

To access the Singularity image of this tool in the CWRU HPC environment:

Connect to the CWRU VPN
SSH into the HPC

ssh [email protected]

Navigate to the project directory

cd /mnt/rds/redhen/gallina/home/sxg1139/GSOC_SINGULARITY

Request a GPU node for computation

srun -p gpu -C gpu2080 --gres=gpu:1 --pty bash

Load Singularity into the HPC environment

module load singularity/3.7.1

Run the image

singularity run FilmEditDetection.img --vidpath <path/to/video>

<path/to/video> is the absolute path to the input video.
