diff --git a/.dockerignore b/.dockerignore index 619722f..e6b8aa8 100644 --- a/.dockerignore +++ b/.dockerignore @@ -1,4 +1,3 @@ -static/tmp* *~ __pycache__ .git diff --git a/.gitignore b/.gitignore index e0542fd..0a9b11b 100644 --- a/.gitignore +++ b/.gitignore @@ -71,9 +71,6 @@ gdrive_shared*/ tags .tags -# static archival files -static/tmp* - # VSCode .devcontainer devcontainer.json \ No newline at end of file diff --git a/README.md b/README.md index 07f5fb9..e2f703f 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,6 @@ This application creates an HTML server that visualizes annotation components in - Interactive, searchable MMIF tree view with [JSTree](https://www.jstree.com/). - Embedded [Universal Viewer](https://universalviewer.io/) (assuming file refers to video and/or image document). - The application also includes tailored visualizations depending on the annotations present in the input MMIF: | Visualization | Supported CLAMS apps | |---|---| @@ -16,9 +15,7 @@ The application also includes tailored visualizations depending on the annotatio | Named entity annotations with [displaCy.](https://explosion.ai/demos/displacy-ent) | [SPACY](https://github.com/clamsproject/app-spacy-wrapper) | | | Screenshots & HTML5 video navigation of TimeFrames | [Chyron text recognition](https://github.com/clamsproject/app-chyron-text-recognition), [Slate detection](https://github.com/clamsproject/app-slatedetection), [Bars detection](https://github.com/clamsproject/app-barsdetection) | - - -Requirements: +## Requirements: - A command line interface. - Git (to get the code). @@ -31,7 +28,9 @@ To get this code if you don't already have it: $ git clone https://github.com/clamsproject/mmif-visualizer ``` -## Quick start +## Startup + +### Quick start If you just want to get the server up and running quickly, the repository contains a shell script `start_visualizer.sh` to immediately launch the visualizer in a container. 
You can invoke it with the following command: @@ -42,72 +41,78 @@ If you just want to get the server up and running quickly, the repository contai * The **required** `data_directory` argument should be the absolute or relative path of the media files on your machine which the MMIF files reference. * The **optional** `mount_directory` argument should be specified if your MMIF files point to a different directory than where your media files are stored on the host machine. For example, if your video, audio, and text data is stored locally at `/home/archive` but your MMIF files refer to `/data/...`, you should set this variable to `/data`. (If this variable is not set, the mount directory will default to the data directory) -For example, if your media files are stored at `/llc_data` and your MMIF files specify the document location as `"location": "file:///data/...`, you can start the visualizer with the following command: +For example, if your media files are stored at `/my_data` and your MMIF files specify the document location as `"location": "file:///data/...`, you can start the visualizer with the following command: ``` -./start_visualizer.sh /llc_data /data +./start_visualizer.sh /my_data /data ``` -The server can then be accessed at `http://localhost:5000/upload` - -## Running the server in a container - -Download or clone this repository and build an image using the `Dockerfile` (you may use another name for the -t parameter, for this example we use `clams-mmif-visualizer` throughout). **NOTE**: if using podman, just substitute `docker` for `podman` in the following commands. +The server can then be accessed at `http://localhost:5001/upload` -```bash -$ docker build . 
-f Containerfile -t clams-mmif-visualizer -``` +The following is breakdown of the script's functionality: -In these notes we assume that the data are in a local directory named `/Users/Shared/archive` with sub directories `audio`, `image`, `text` and `video` (those subdirectories are standard in CLAMS, but the parent directory could be any directory depending on your local set up). We can now run a Docker container with - -```bash -$ docker run --rm -d -p 5000:5000 -v /Users/Shared/archive:/data clams-mmif-visualizer -``` +### Running the server natively -See the *Data source repository and input MMIF file* section below for a description of the MMIF file. Assuming you have not made any changes to the directory structure you can use the example MMIF files in the `input` folder. - -**Some background** - -With the docker command above we do two things of note: +First install the python dependencies listed in `requirements.txt`: -1. The container port 5000 (the default for a Flask server) is exposed to the same port on your Docker host (your local computer) with the `-p` option. -2. The local data repository `/Users/Shared/archive` is mounted to `/data` on the container with the `-v` option. +````bash +$ pip install -r requirements.txt +```` -Another useful piece of information is that the Flask server on the Docker container has no direct access to `/data` since it can only see data in the `static` directory of this repository. Therefore we have created a symbolic link `static/data` that links to `/data`: +You will also need to install opencv-python if you are not running within a container (`pip install opencv-python`). +Then, to run the server do: ```bash -$ ln -s /data static/data +$ python app.py ``` -With this, the mounted directory `/data` in the container is accessable from inside the `/app/static` directory of the container. You do not need to use this command unless you change your set up because the symbolic link is part of this repository. 
- +Running the server natively means that the source media file paths in the target MMIF file are all accessible in the local file system, under the same directory paths. +If that's not the case, and the paths in the MMIF is beyond your FS permission, using container is recommended. See the next section for an example. +#### Data source repository and example MMIF file +This repository contains an example MMIF file in `example/whisper-spacy.json`. This file refers to three media files: -## Running the server locally +1. service-mbrs-ntscrm-01181182.mp4 +2. service-mbrs-ntscrm-01181182.wav +3. service-mbrs-ntscrm-01181182.txt + +> [!NOTE] +> Note on source/copyright: these documents are sourced from [the National Screening Room collection in the Library of Congress Online Catalog](https://hdl.loc.gov/loc.mbrsmi/ntscrm.01181182). The collection provides the following copyright information: +> > The Library of Congress is not aware of any U.S. copyright or other restrictions in the vast majority of motion pictures in these collections. Absent any such restrictions, these materials are free to use and reuse. -First install the python dependencies listed in `requirements.txt`: +These files can be found in the directory `example/example-documents`. But according to the `whisper-spacy.json` MMIF file, those three files should be found in their respective subdirectories in `/data`. +Easy way to align these paths is probably to create a symbolic link to the `example-documents` directory in the `/data` directory. +However, since `/data` is located at the root directory, you might not have permission to write a new symlink to the FS root. +In this case you can more easily re-map the `examples/example-documents` directory to `/data` by using the `-v` option in the docker-run command. See below. 
-````bash -$ pip install -r requirements.txt -```` +### Running the server in a container -You will also need to install opencv-python if you are not running within a container (`pip install opencv-python`). +Download or clone this repository and build an image using the `Containerfile` (you may use another name for the -t parameter, +for this example we use `clams-mmif-visualizer` throughout). -Let's again assume that the data are in a local directory `/Users/Shared/archive` with sub directories `audio`, `image`, `text` and`video`. You need to copy, symlink, or mount that local directory into the `static` directory. Note that the `static/data` symbolic link that is in the repository is set up to work with the docker containers, if you keep it in that form your data need to be in `/data`, otherwise you need to change the link to fit your needs, for example, you could remove the symbolic link and replace it with one that uses your local directory: +> [!NOTE] +> if using podman, just substitute `docker` for `podman` in the following commands. ```bash -$ rm static/data -$ ln -s /Users/Shared/archive static/data +$ docker build . -f Containerfile -t clams-mmif-visualizer ``` -To run the server do: +In these notes we assume that the data are in a local directory named `/home/myuser/public` with subdirectories `audio`, `image`, `text` and `video`. We can now run a container with ```bash -$ python app.py +$ docker run --rm -d -p 5001:5000 -v /home/myuser/public:/data clams-mmif-visualizer ``` +> [!NOTE] +> With the docker command above we do two things of note: +> 1. The container port 5000 (the default for a Flask server) is exposed to the same port on your host (your local computer) with the `-p` option. +> 2. The local data repository `/home/myuser/public` is mounted to `/data` on the container with the `-v` option. 
+ +Now, when you use the `example/example-documents` directory as the data source to visualize `examples/whisper-spacy.json` MMIF file, you need to triple-mount the example directory to the container, as `audio`, `video`, and `text` respectively. -## Uploading Files -MMIF files can be uploaded to the visualization server one of two ways: +$ docker run --rm -d -p 5001:5000 -v $(pwd)/example/example-documents:/data/audio -v $(pwd)/example/example-documents:/data/video -v $(pwd)/example/example-documents:/data/text clams-mmif-visualizer + +## Usage +Use the visualizer by uploading files. MMIF files can be uploaded to the visualization server one of two ways: * Point your browser to http://0.0.0.0:5000/upload, click "Choose File" and then click "Visualize". This will generate a static URL containing the visualization of the input file (e.g. `http://localhost:5000/display/HaTxbhDfwakewakmzdXu5e`). Once the file is uploaded, the page will automatically redirect to the file's visualization. * Using a command line, enter: ``` @@ -117,31 +122,3 @@ MMIF files can be uploaded to the visualization server one of two ways: The server will maintain a cache of up to 50MB for these temporary files, so the visualizations can be repeatedly accessed without needing to re-upload any files. Once this limit is reached, the server will delete stored visualizations until enough space is reclaimed, drawing from oldest/least recently accessed pages first. If you attempt to access the /display URL of a deleted file, you will be redirected back to the upload page instead. - -## Data source repository and input MMIF file -The data source includes video, audio, and text (transcript) files that are subjects for the CLAMS analysis tools. As mentioned above, to make this visualizer work with those files and be able to display the contents on the web browser, those source files need to be accessible from inside the `static` directory. 
- -This repository contains an example MMIF file in `input/whisper-spacy.json`. This file refers to three media files: - -1. service-mbrs-ntscrm-01181182.mp4 -2. service-mbrs-ntscrm-01181182.wav -3. service-mbrs-ntscrm-01181182.txt - -These files can be found in the directory `input/example-documents`. They can be moved anywhere on the host machine, as long as they are placed in the subdirectories `video`, `audio`, and `text` respectively. (e.g. `/Users/Shared/archive/video`, etc.) - -According to the MMIF file, those three files should be found in their respective subdirectories in `/data`. The Flask server will look for these files in `static/data/video`, `static/data/audio` and `static/data/text`, amd those directories should point at the appropriate location: - -- If you run the visualizer in a Docker container, then the `-v` option in the docker-run command is used to mount the local data directory `/Users/shared/archive` to the `/data` directory on the container and the `static/data` symlink already points to that. -- If you run the visualizer on your local machine without using a container, then you have a couple of options (where you may need to remove the current link first): - - Make sure that the `static/data` symlink points at the local data directory - `$ ln -s /Users/Shared/archive/ static/data` - - Copy the contents of `/Users/Shared/archive` into `static/data`. - - You could choose to copy the data to any spot in the `static` folder but then you would have to edit the MMIF input file. - - ---- -Note on source/copyright: these documents are sourced from [the National Screening Room collection in the Library of Congress Online Catalog](https://hdl.loc.gov/loc.mbrsmi/ntscrm.01181182). The collection provides the following copyright information: - -> The Library of Congress is not aware of any U.S. copyright or other restrictions in the vast majority of motion pictures in these collections. 
Absent any such restrictions, these materials are free to use and reuse. - ---- diff --git a/app.py b/app.py index 7294b59..bdfecaa 100644 --- a/app.py +++ b/app.py @@ -21,7 +21,7 @@ def index(): def ocr(): try: data = dict(request.json) - mmif_str = open(cache.get_cache_path() / data["mmif_id"] / "file.mmif").read() + mmif_str = open(cache.get_cache_root() / data["mmif_id"] / "file.mmif").read() mmif = Mmif(mmif_str) ocr_view = mmif.get_view_by_id(data["view_id"]) return prepare_ocr_visualization(mmif, ocr_view, data["mmif_id"]) @@ -67,23 +67,29 @@ def upload(): def invalidate_cache(): app.logger.debug(f"Request to invalidate cache on {request.args}") if not request.args.get('viz_id'): + app.logger.debug("Invalidating entire cache.") cache.invalidate_cache() return redirect("/upload") viz_id = request.args.get('viz_id') - in_mmif = open(cache.get_cache_path() / viz_id / 'file.mmif', 'rb').read() + in_mmif = open(cache.get_cache_root() / viz_id / 'file.mmif', 'rb').read() + app.logger.debug(f"Invalidating {viz_id} from cache.") cache.invalidate_cache([viz_id]) return upload_file(in_mmif) @app.route('/display/') def display(viz_id): - try: - path = cache.get_cache_path() / viz_id + path = cache.get_cache_root() / viz_id + app.logger.debug(f"Displaying visualization {viz_id} from {path}") + if os.path.exists(path / "index.html"): + app.logger.debug(f"Visualization {viz_id} found in cache.") set_last_access(path) with open(os.path.join(path, "index.html")) as f: html_file = f.read() return html_file - except FileNotFoundError: + else: + app.logger.debug(f"Visualization {viz_id} not found in cache.") + os.remove(path) flash("File not found -- please upload again (it may have been deleted to clear up cache space).") return redirect("/upload") @@ -95,12 +101,12 @@ def send_js(path): def render_mmif(mmif_str, viz_id): mmif = Mmif(mmif_str) - media = documents_to_htmls(mmif, viz_id) - app.logger.debug(f"Prepared Media: {[m[0] for m in media]}") + htmlized_docs = 
documents_to_htmls(mmif, viz_id) + app.logger.debug(f"Prepared document: {[d[0] for d in htmlized_docs]}") annotations = prep_annotations(mmif, viz_id) app.logger.debug(f"Prepared Annotations: {[annotation[0] for annotation in annotations]}") return render_template('player.html', - media=media, viz_id=viz_id, annotations=annotations) + docs=htmlized_docs, viz_id=viz_id, annotations=annotations) def upload_file(in_mmif): @@ -109,7 +115,7 @@ def upload_file(in_mmif): in_mmif_str = in_mmif_bytes.decode('utf-8') viz_id = hashlib.sha1(in_mmif_bytes).hexdigest() app.logger.debug(f"Visualization ID: {viz_id}") - path = cache.get_cache_path() / viz_id + path = cache.get_cache_root() / viz_id app.logger.debug(f"Visualization Directory: {path}") try: os.makedirs(path) @@ -136,9 +142,14 @@ def upload_file(in_mmif): if __name__ == '__main__': # Make path for temp files - cache_path = cache.get_cache_path() - if not os.path.exists(cache_path): - os.makedirs(cache_path) + cache_path = cache.get_cache_root() + cache_symlink_path = os.path.join(app.static_folder, cache._CACHE_DIR_SUFFIX) + if os.path.islink(cache_symlink_path): + os.unlink(cache_symlink_path) + elif os.path.exists(cache_symlink_path): + raise RuntimeError(f"Expected {cache_symlink_path} to be a symlink (for re-linking to a new cache dir, " + f"but it is a real path.") + os.symlink(cache_path, cache_symlink_path) # to avoid runtime errors for missing keys when using flash() alphabet = 'abcdefghijklmnopqrstuvwxyz1234567890' @@ -148,4 +159,4 @@ def upload_file(in_mmif): if len(sys.argv) > 2 and sys.argv[1] == '-p': port = int(sys.argv[2]) - app.run(port=port, host='0.0.0.0', debug=True, use_reloader=False) + app.run(port=port, host='0.0.0.0', debug=True, use_reloader=True) diff --git a/cache.py b/cache.py index 7c9660b..c9b38c1 100644 --- a/cache.py +++ b/cache.py @@ -1,31 +1,28 @@ import os -import time +import pathlib import shutil +import tempfile import threading -import pathlib - -from utils import app +import 
time lock = threading.Lock() - -def get_cache_path(): - return pathlib.Path(app.static_folder) / "tmp" +# module constants are unchanged throughout multiple "imports" +_CACHE_DIR_SUFFIX = "mmif-viz-cache" +_CACHE_DIR_ROOT = tempfile.TemporaryDirectory(suffix=_CACHE_DIR_SUFFIX) -def get_cache_relpath(full_path): - return str(full_path)[len(app.static_folder):] +def get_cache_root(): + return pathlib.Path(_CACHE_DIR_ROOT.name) def invalidate_cache(viz_ids): if not viz_ids: - app.logger.debug("Invalidating entire cache.") - shutil.rmtree(get_cache_path()) - os.makedirs(get_cache_path()) + shutil.rmtree(get_cache_root()) + os.makedirs(get_cache_root()) else: for v in viz_ids: - app.logger.debug(f"Invalidating {v} from cache.") - shutil.rmtree(get_cache_path() / v) + shutil.rmtree(get_cache_root() / v) def set_last_access(path): @@ -35,9 +32,9 @@ def set_last_access(path): def scan_tmp_directory(): oldest_accessed_dir = {"dir": None, "access_time": None} - total_size = sum(f.stat().st_size for f in get_cache_path().glob('**/*') if f.is_file()) + total_size = sum(f.stat().st_size for f in get_cache_root().glob('**/*') if f.is_file()) # this will be some visualization IDs - for p in get_cache_path().glob('*'): + for p in get_cache_root().glob('*'): if not (p / 'last_access.txt').exists(): oldest_accessed_dir = {"dir": p, "access_time": 0} elif oldest_accessed_dir["dir"] is None: diff --git a/displacy/__init__.py b/displacy/__init__.py index e73a301..e1e0dff 100644 --- a/displacy/__init__.py +++ b/displacy/__init__.py @@ -1,15 +1,8 @@ import os -from spacy import displacy - -from mmif.serialize import Mmif, View, Annotation -from mmif.vocabulary import AnnotationTypes -from mmif.vocabulary import DocumentTypes from lapps.discriminators import Uri - - -def get_displacy(mmif: Mmif): - return displacy_dict_to_ent_html(mmif_to_displacy_dict(mmif)) +from mmif.serialize import Mmif, View, Annotation +from spacy import displacy def visualize_ner(mmif: Mmif, view: View, 
document_id: str, app_root: str) -> str:
diff --git a/input/example-documents/service-mbrs-ntscrm-01181182.mp4 b/examples/example-documents/service-mbrs-ntscrm-01181182.mp4
similarity index 100%
rename from input/example-documents/service-mbrs-ntscrm-01181182.mp4
rename to examples/example-documents/service-mbrs-ntscrm-01181182.mp4
diff --git a/input/example-documents/service-mbrs-ntscrm-01181182.txt b/examples/example-documents/service-mbrs-ntscrm-01181182.txt
similarity index 100%
rename from input/example-documents/service-mbrs-ntscrm-01181182.txt
rename to examples/example-documents/service-mbrs-ntscrm-01181182.txt
diff --git a/input/example-documents/service-mbrs-ntscrm-01181182.wav b/examples/example-documents/service-mbrs-ntscrm-01181182.wav
similarity index 100%
rename from input/example-documents/service-mbrs-ntscrm-01181182.wav
rename to examples/example-documents/service-mbrs-ntscrm-01181182.wav
diff --git a/input/image-example.json b/examples/image-example.json
similarity index 100%
rename from input/image-example.json
rename to examples/image-example.json
diff --git a/input/ocr-test-files/test-ocr-1.mmif b/examples/ocr-test-files/test-ocr-1.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-1.mmif
rename to examples/ocr-test-files/test-ocr-1.mmif
diff --git a/input/ocr-test-files/test-ocr-10.mmif b/examples/ocr-test-files/test-ocr-10.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-10.mmif
rename to examples/ocr-test-files/test-ocr-10.mmif
diff --git a/input/ocr-test-files/test-ocr-11.mmif b/examples/ocr-test-files/test-ocr-11.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-11.mmif
rename to examples/ocr-test-files/test-ocr-11.mmif
diff --git a/input/ocr-test-files/test-ocr-12.mmif b/examples/ocr-test-files/test-ocr-12.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-12.mmif
rename to examples/ocr-test-files/test-ocr-12.mmif
diff --git a/input/ocr-test-files/test-ocr-13.mmif b/examples/ocr-test-files/test-ocr-13.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-13.mmif
rename to examples/ocr-test-files/test-ocr-13.mmif
diff --git a/input/ocr-test-files/test-ocr-14.mmif b/examples/ocr-test-files/test-ocr-14.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-14.mmif
rename to examples/ocr-test-files/test-ocr-14.mmif
diff --git a/input/ocr-test-files/test-ocr-15.mmif b/examples/ocr-test-files/test-ocr-15.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-15.mmif
rename to examples/ocr-test-files/test-ocr-15.mmif
diff --git a/input/ocr-test-files/test-ocr-2.mmif b/examples/ocr-test-files/test-ocr-2.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-2.mmif
rename to examples/ocr-test-files/test-ocr-2.mmif
diff --git a/input/ocr-test-files/test-ocr-3.mmif b/examples/ocr-test-files/test-ocr-3.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-3.mmif
rename to examples/ocr-test-files/test-ocr-3.mmif
diff --git a/input/ocr-test-files/test-ocr-4.mmif b/examples/ocr-test-files/test-ocr-4.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-4.mmif
rename to examples/ocr-test-files/test-ocr-4.mmif
diff --git a/input/ocr-test-files/test-ocr-5.mmif b/examples/ocr-test-files/test-ocr-5.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-5.mmif
rename to examples/ocr-test-files/test-ocr-5.mmif
diff --git a/input/ocr-test-files/test-ocr-6.mmif b/examples/ocr-test-files/test-ocr-6.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-6.mmif
rename to examples/ocr-test-files/test-ocr-6.mmif
diff --git a/input/ocr-test-files/test-ocr-7.mmif b/examples/ocr-test-files/test-ocr-7.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-7.mmif
rename to examples/ocr-test-files/test-ocr-7.mmif
diff --git a/input/ocr-test-files/test-ocr-8.mmif b/examples/ocr-test-files/test-ocr-8.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-8.mmif
rename to examples/ocr-test-files/test-ocr-8.mmif
diff --git a/input/ocr-test-files/test-ocr-9.mmif b/examples/ocr-test-files/test-ocr-9.mmif
similarity index 100%
rename from input/ocr-test-files/test-ocr-9.mmif
rename to examples/ocr-test-files/test-ocr-9.mmif
diff --git a/input/whisper-spacy.json b/examples/whisper-spacy.json
similarity index 100%
rename from input/whisper-spacy.json
rename to examples/whisper-spacy.json
diff --git a/iiif_utils.py b/iiif_utils.py
index 7be9eb8..7281751 100644
--- a/iiif_utils.py
+++ b/iiif_utils.py
@@ -84,13 +84,14 @@ def add_canvas_from_documents(in_mmif, iiif_json):
         iiif_json["sequences"][0]["canvases"].append(canvas)
         break  # todo currently only supports single document, needs more work to align canvas values
 
+
 def build_document_url(document):
-    '''
+    """
     This trims off all of the path to the document except the filename then prepends data/video/. This is so mmif's from running locally can still be found if the viewe r is run in docker, assuming the volume mount or symlink is correctly set.
- ''' + """ location = document.location if location.startswith("file://"): location = document.location[7:] @@ -109,7 +110,7 @@ def add_structure_from_timeframe(in_mmif: Mmif, iiif_json: Dict): def save_manifest(iiif_json: Dict, viz_id) -> str: # generate a iiif manifest and save output file manifest = tempfile.NamedTemporaryFile( - 'w', dir=str(cache.get_cache_path() / viz_id), suffix='.json', delete=False) + 'w', dir=str(cache.get_cache_root() / viz_id), suffix='.json', delete=False) json.dump(iiif_json, manifest, indent=4) return manifest.name diff --git a/input/kaldi-spacy.json b/input/kaldi-spacy.json deleted file mode 100644 index e69de29..0000000 diff --git a/ocr.py b/ocr.py index 41aa30e..00af970 100644 --- a/ocr.py +++ b/ocr.py @@ -14,7 +14,9 @@ class OCRFrame(): - """Class representing an (aligned or otherwise) set of OCR annotations for a single frame""" + """ + Class representing an (aligned or otherwise) set of OCR annotations for a single frame + """ def __init__(self, anno, mmif): self.text = [] @@ -110,7 +112,9 @@ def get_ocr_frames(view, mmif, fps): def paginate(frames_list): - """Generate pages from a list of frames""" + """ + Generate pages from a list of frames + """ pages = [[]] n_frames_on_page = 0 for frame_num, frame in frames_list: @@ -127,10 +131,12 @@ def paginate(frames_list): def render_ocr(mmif_id, vid_path, view_id, page_number): - """Iterate through frames and display the contents/alignments.""" + """ + Iterate through frames and display the contents/alignments. 
+ """ # Path for storing temporary images generated by cv2 cv2_vid = cv2.VideoCapture(vid_path) - tn_data_fname = cache.get_cache_path() / mmif_id / f"{view_id}-pages.json" + tn_data_fname = cache.get_cache_root() / mmif_id / f"{view_id}-pages.json" thumbnail_pages = json.load(open(tn_data_fname)) page = thumbnail_pages[str(page_number)] prev_frame_cap = None @@ -162,7 +168,7 @@ def render_ocr(mmif_id, vid_path, view_id, page_number): def make_image_directory(mmif_id): # Make path for temp OCR image files or clear image files if it exists - path = cache.get_cache_path() / mmif_id / "img" + path = cache.get_cache_root() / mmif_id / "img" if os.path.exists(path): shutil.rmtree(path) os.makedirs(path) @@ -218,7 +224,9 @@ def is_duplicate_image(prev_frame, frame, cv2_vid): def round_boxes(boxes): - # To account for jittery bounding boxes in OCR annotations + """ + To account for jittery bounding boxes in OCR annotations + """ rounded_boxes = [] for box in boxes: rounded_box = [] @@ -247,6 +255,6 @@ def get_ocr_views(mmif): def save_json(data, view_id, mmif_id): - path = cache.get_cache_path() / mmif_id / f"{view_id}-pages.json" + path = cache.get_cache_root() / mmif_id / f"{view_id}-pages.json" with open(path, 'w') as f: json.dump(data, f) diff --git a/start_visualizer.sh b/start_visualizer.sh index a23ccc4..6802896 100755 --- a/start_visualizer.sh +++ b/start_visualizer.sh @@ -27,5 +27,5 @@ else fi # Start visualizer $container_engine build . -f Containerfile -t clams-mmif-visualizer -$container_engine run -d --name clams-mmif-visualizer --rm -p 5000:5000 -e PYTHONUNBUFFERED=1 -v $datadir:$mountdir -v $datadir:/app/static/$mountdir clams-mmif-visualizer -echo "MMIF Visualizer is running in the background and can be accessed at http://localhost:5000/. 
To shut it down, run '$container_engine kill clams-mmif-visualizer'"
\ No newline at end of file
+$container_engine run -d --name clams-mmif-visualizer --rm -p 5001:5000 -e PYTHONUNBUFFERED=1 -v $datadir:$mountdir -v $datadir:/app/static/$mountdir clams-mmif-visualizer
+echo "MMIF Visualizer is running in the background and can be accessed at http://localhost:5001/. To shut it down, run '$container_engine kill clams-mmif-visualizer'"
\ No newline at end of file
diff --git a/static/data b/static/data
deleted file mode 120000
index 249cda9..0000000
--- a/static/data
+++ /dev/null
@@ -1 +0,0 @@
-/data
\ No newline at end of file
diff --git a/static/tmp/.gitignore b/static/tmp/.gitignore
deleted file mode 100644
index 86d0cb2..0000000
--- a/static/tmp/.gitignore
+++ /dev/null
@@ -1,4 +0,0 @@
-# Ignore everything in this directory
-*
-# Except this file
-!.gitignore
\ No newline at end of file
diff --git a/temp/do_not_delete_this_dir b/temp/do_not_delete_this_dir
deleted file mode 100644
index e69de29..0000000
diff --git a/templates/ocr.html b/templates/ocr.html
index ff584eb..7daea2b 100644
--- a/templates/ocr.html
+++ b/templates/ocr.html
@@ -1,7 +1,7 @@
 {% for frame_num, frame in page %}
-    {% set filename = "/tmp/" + mmif_id + "/img/" + frame["id"] %}
+    {% set filename = "/mmif-viz-cache/" + mmif_id + "/img/" + frame["id"] %}
     {% set id = frame["id"] %}
     {% set boxes = frame["boxes"] %}
     {% set secs = frame["secs"] %}
diff --git a/templates/player.html b/templates/player.html
index 5da61c0..3f56ea8 100644
--- a/templates/player.html
+++ b/templates/player.html
@@ -117,9 +117,9 @@

Visualizing MMIF