# Cuneiform-Sign-Detection-Code

Author: Tobias Dencker - <[email protected]>

This is the code repository for the article submission "Deep learning of cuneiform sign detection with weak supervision using transliteration alignment".

This repository contains code to execute the proposed iterative training procedure as well as code to evaluate and visualize results.
Moreover, we provide pre-trained models of the cuneiform sign detector for Neo-Assyrian script after iterative training on the [Cuneiform Sign Detection Dataset](https://compvis.github.io/cuneiform-sign-detection-dataset/).
Finally, we provide a web application for the analysis of tablet images with the help of a pre-trained cuneiform sign detector.

<img src="http://cunei.iwr.uni-heidelberg.de/cuneiformbrowser/functions/images_decent.jpg" alt="sign detections on tablet images: yellow boxes indicate TP and blue boxes FP detections" width="700"/>
<!--- <img src="http://cunei.iwr.uni-heidelberg.de/cuneiformbrowser/functions/images_difficult.jpg" alt="Web interface detection" width="500"/> -->

## Repository description

- General structure:
  - `data`: tablet images, annotations, transliterations, metadata
  - `experiments`: training, testing, evaluation and visualization
  - `lib`: project library code
  - `results`: generated detections (placed, raw and aligned), network weights, logs
  - `scripts`: scripts to run the alignment and placement step of iterative training


### Use cases

- Pre-processing of training data
  - line detection
- Iterative training
  - generate sign annotations (aligned and placed detections)
  - sign detector training
- Evaluation (on test set)
  - raw detections
  - placed detections
  - aligned detections
- Test & visualize
  - line segmentation and post-processing
  - line-level and sign-level alignments
  - TP/FP for raw, aligned and placed detections (full tablet and crop level)



### Pre-processing
As a pre-processing step, line detections are obtained for all tablet images of the training data before iterative training starts.
- use the Jupyter notebooks in `experiments/line_segmentation/` to train and evaluate the line segmentation network and to perform line detection on all tablet images of the train set (see the sketch below)
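
A minimal sketch of the line detection inference step. The file paths, the single-channel input format, and whole-model serialization are assumptions, not the repository's actual interface; check the notebooks and `lib/` for the real one:

```python
import torch
from skimage import io
from skimage.color import rgb2gray

# Assumption: the .pth file stores a full serialized model. If it stores only a
# state_dict, instantiate the network class from lib/ and use load_state_dict.
net = torch.load('lineNet_basic_vpub.pth', map_location='cpu')
net.eval()

# Hypothetical input image; the network is assumed to take a grayscale tensor.
image = rgb2gray(io.imread('data/images/example_tablet.jpg'))  # H x W in [0, 1]
tensor = torch.from_numpy(image).float()[None, None]           # 1 x 1 x H x W

with torch.no_grad():
    line_map = net(tensor)  # per-pixel line scores for post-processing
```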


### Training
*Iterative training* alternates between generating aligned and placed detections and training a new sign detector:
1. use the command-line scripts in `scripts/generate/` to run the alignment and placement step of iterative training
2. use the Jupyter notebooks in `experiments/sign_detector/` for the sign detector training step of iterative training

To keep track of the sign detector and the generated sign annotations of each iteration of iterative training (stored in `results/`),
we follow the convention of labelling the sign detector with a *model version* (e.g. v002),
which is also used to label the raw, aligned and placed detections based on this detector.
Besides providing a model version, a user also selects which subsets of the training data to use for the generation of new annotations.
In particular, *subsets of the SAAo collections* (e.g. saa01, saa05, saa08) are selected when running the scripts under `scripts/generate/`.
To enable evaluation on the test set, it is necessary to include the collections (test, saa06).
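
For illustration, a hedged sketch of driving one alignment/placement step from Python; the script name and flags below are placeholders, not the repository's actual CLI (see `scripts/generate/` for the real entry points):

```python
import subprocess

model_version = 'v002'                     # labels this iteration's detector
collections = ['saa01', 'saa05', 'saa08',  # training subsets to annotate
               'test', 'saa06']            # required for test-set evaluation

# Hypothetical invocation -- substitute the actual script and argument names.
subprocess.check_call(['python', 'scripts/generate/run_alignment_step.py',
                       '--model-version', model_version,
                       '--collections', ','.join(collections)])
# Afterwards, the next sign detector is trained with the notebooks
# in experiments/sign_detector/.
```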


### Evaluation
Use the [*test sign detector notebook*](./experiments/sign_detector/test_sign_detector.ipynb) to measure the performance (mAP) of the trained sign detector on the test set or on other subsets of the dataset.
In `experiments/alignment_evaluation/` you will find further notebooks for evaluating and visualizing line-level and sign-level alignments and TP/FP for raw, aligned and placed detections (full tablet and crop level).
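
Since the evaluation routine is adapted from *py-faster-rcnn* (see References), mAP follows the PASCAL VOC protocol. A minimal sketch of the all-point interpolated average precision for a single sign class, assuming the precision/recall arrays have already been computed from the ranked detections:

```python
import numpy as np

def voc_ap(recall, precision):
    """All-point interpolated average precision (PASCAL VOC style)."""
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    # Make the precision envelope monotonically decreasing.
    for i in range(mpre.size - 1, 0, -1):
        mpre[i - 1] = max(mpre[i - 1], mpre[i])
    # Sum the rectangle areas where recall changes.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1])

# mAP is the mean of voc_ap over all sign classes.
```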


### Pre-trained models

We provide pre-trained models in the form of [PyTorch model files](https://pytorch.org/tutorials/beginner/saving_loading_models.html) for the line segmentation network as well as the sign detector.

| Model name | Model type | Train annotations |
|----------------|-------------------|------------------------|
| [lineNet_basic_vpub.pth](http://cunei.iwr.uni-heidelberg.de/cuneiformbrowser/model_weights/lineNet_basic_vpub.pth) | line segmentation | 410 lines |

For the sign detector, we provide the best weakly supervised model (fpn_net_vA) and the best semi-supervised model (fpn_net_vF).

| Model name | Model type | Weak supervision in training | Annotations in training | mAP on test_full |
|----------------|-------------------|-------------------|------------------------|------------------------|
| [fpn_net_vA.pth](http://cunei.iwr.uni-heidelberg.de/cuneiformbrowser/model_weights/fpn_net_vA.pth) | sign detector | saa01, saa05, saa08, saa10, saa13, saa16 | None | 45.3 |
| [fpn_net_vF.pth](http://cunei.iwr.uni-heidelberg.de/cuneiformbrowser/model_weights/fpn_net_vF.pth) | sign detector | saa01, saa05, saa08, saa10, saa13, saa16 | train_full (4663 bboxes) | 65.6 |
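
A minimal sketch of loading one of these files, assuming (as in the linked PyTorch tutorial) that each `.pth` file stores a `state_dict`; the detector class name below is a placeholder for the actual class in `lib/`:

```python
import torch

# Placeholder: substitute the real detector class and constructor arguments
# from lib/ -- the name FPNDetector is an assumption for illustration.
# from lib.models import FPNDetector
# detector = FPNDetector(num_classes=...)

state = torch.load('fpn_net_vF.pth', map_location='cpu')  # works without a GPU
# detector.load_state_dict(state)
# detector.eval()  # inference mode for evaluation or the web back-end
```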


### Web application

We also provide a demo web application that enables a user to apply a trained cuneiform sign detector to a large collection of tablet images.
The code of the web front-end is available in the [webapp repo](https://github.com/compvis/cuneiform-sign-detection-webapp/).
The back-end code is part of this repository and is located in [lib/webapp/](./lib/webapp/).
Below you will find a short animation of how the sign detector is used with this web interface.

<img src="http://cunei.iwr.uni-heidelberg.de/cuneiformbrowser/functions/demo_cuneiform_sign_detection.gif" alt="Web interface detection" width="700"/>


For demonstration purposes, we also host an instance of the web application: [Demo Web Application](http://cunei.iwr.uni-heidelberg.de/cuneiformbrowser/).
If you would like to test the web application, please contact us for user credentials to log in.
Please note that this web application is a prototype for demonstration purposes only and not a production system.
If the website is not reachable or other technical issues occur, please contact us.


### Cuneiform font

For visualization of the cuneiform characters, we recommend installing the [Unicode Cuneiform Fonts](https://www.hethport.uni-wuerzburg.de/cuneifont/) by Sylvie Vanseveren.


## Installation

### Software
Install the general dependencies:

- **OpenGM** with Python wrapper - a library for discrete graphical models: http://hciweb2.iwr.uni-heidelberg.de/opengm/
  This library is needed for the alignment step during training. Testing is not affected. An installation guide for Ubuntu 14.04 can be found [here](./install_opengm.md).

- Python 2.7.X

- Python packages:
  - torch 1.0
  - torchvision
  - scikit-image 0.14.0
  - pandas, scipy, sklearn, jupyter
  - pillow, tqdm, tensorboardX, nltk, Levenshtein, editdistance, easydict


Clone this repository and place the [*cuneiform-sign-detection-dataset*](https://github.com/compvis/cuneiform-sign-detection-dataset) in the [./data sub-folder](./data/).

### Hardware

Training and evaluation can be performed on a machine with a single GPU (we used a GeForce GTX 1080).
The demo web application can run on a web server without GPU support,
since detection inference with a lightweight MobileNetV2 backbone is fast even in CPU-only mode
(less than 1s for an image in HD resolution, less than 10s for 4K resolution).

## References
This repository also includes external code. In particular, we want to mention:
> - kuangliu's *torchcv* and *pytorch-cifar* repositories, from which we adapted the SSD and FPN detector code:
>   https://github.com/kuangliu/pytorch-cifar and
>   https://github.com/kuangliu/torchcv
> - Ross Girshick's *py-faster-rcnn* repository, from which we adapted part of our evaluation routine:
>   https://github.com/rbgirshick/py-faster-rcnn
> - Rico Sennrich's *Bleualign* repository, from which we adapted part of the Bleualign implementation:
>   https://github.com/rsennrich/Bleualign