update readme and script

qiaoliang6 · qiaoliang6 · commit 8ebaf8acf9df · 2021-08-24T17:45:52.000+08:00
diff --git a/davarocr/davarocr/__init__.py b/davarocr/davarocr/__init__.py
@@ -13,6 +13,7 @@
 from .davar_rcg import *
 from .davar_spotting import *
 from .davar_ie import *
+from .davar_videotext import *
 from .mmcv import *
 from .version import __version__
 
diff --git a/davarocr/davarocr/davar_rcg/models/recognizors/general.py b/davarocr/davarocr/davar_rcg/models/recognizors/general.py
@@ -58,8 +58,8 @@ def __init__(self,
             sequence_head (dict): sequence_head parameter
             neck (dict): neck parameter
             transformation (dict): transformation parameter
-            train_cfg (dict): model training cfg parameter
-            test_cfg (dict): model test cfg parameter
+            train_cfg (mmcv.Config): model training cfg parameter
+            test_cfg (mmcv.Config): model test cfg parameter
             pretrained (str): model path of the pre_trained model
         """
         super(GeneralRecognizor, self).__init__()
diff --git a/davarocr/davarocr/davar_rcg/models/recognizors/rf_learning.py b/davarocr/davarocr/davar_rcg/models/recognizors/rf_learning.py
@@ -43,8 +43,8 @@ def __init__(self,
             neck_s2v (dict): recognition to visual feature strengthened neck parameter
             transformation (dict): transformation parameter
             sequence_module (dict): sequence_module parameter
-            train_cfg (dict): model training cfg parameter
-            test_cfg (dict): model test cfg parameter
+            train_cfg (mmcv.Config): model training cfg parameter
+            test_cfg (mmcv.Config): model test cfg parameter
             pretrained (str): model path of the pre_trained model
             train_type (str): training type：
                                       1、"visual" - training visual counting branch
diff --git a/davarocr/tools/train.py b/davarocr/tools/train.py
@@ -36,6 +36,8 @@
 
 from davarocr.davar_spotting.models.builder import build_spotter
 
+from davarocr.davar_ner.models.builder import build_ner
+
 
 def parse_args():
     parser = argparse.ArgumentParser(description='Train a detector.')
@@ -217,6 +219,8 @@ def main():
             cfg.model,
             train_cfg=cfg.get('train_cfg', None),
             test_cfg=cfg.get('test_cfg', None))
+    elif model_type == "NER":
+        model = build_ner(cfg.model, train_cfg=cfg.get('train_cfg', None), test_cfg=cfg.get('test_cfg', None))
     else:
         raise NotImplementedError
 
diff --git a/demo/text_detection/east/readme.md b/demo/text_detection/east/readme.md
@@ -11,17 +11,17 @@ The formatted training datalist and test datalist can be found in `demo/text_det
 Modified the paths ("imgs"/ "pretrained_model"/ "work_space", etc.) in the config files `demo/text_detection/east/config/east_r50_rbox.py`.
 
 Run the following bash command in the command line,
-```shell
-cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/east/
-bash dist_train.sh
+``` bash
+>>> cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/east/
+>>> bash dist_train.sh
 ```
 
 > We provide the implementation of online validation. If you want to close it to save training time, you may modify the startup script to add `--no-validate` command.
 
 ## Offline Inference and Evaluation
 We provide a demo of forward inference and visualization. You can modify the paths (`test_dataset`, `image_prefix`, etc.) in the testing script, and start testing:
-```shell
-python test.py 
+``` bash
+>>> python test.py 
 ```
 Some visualization of detection results are shown:
 
diff --git a/demo/text_detection/evaluation/readme.md b/demo/text_detection/evaluation/readme.md
@@ -1 +1 @@
-## Evaluation toolThis evaluation tools is from the repository of [SCUT-CTW1500](https://github.com/Yuliang-Liu/TIoU-metric/tree/master/curved-tiou). The code is slightly modified to be compatibled with python3.We provide some of the popular benchmarks, including [ICDAR2013](https://rrc.cvc.uab.es/?ch=2), [ICDAR2015](https://rrc.cvc.uab.es/?ch=4), [Total-Text](https://github.com/cs-chan/Total-Text-Dataset) and [SCUT-CTW1500](https://github.com/Yuliang-Liu/Curve-Text-Detector), and all of the ground-truthes are transformed into the requried format.The default evaluation metric sets IoU constraint as 0.5.#### Do evaluationDirectly run```shell python script.py -g=gt/total-text-gt.zip -s=pred/pred_tp_det_r50_tt_e25-45b1f5cf.zip```		will produce	num_gt, num_det: 2214 2366	Origin:	recall: 0.8234 precision: 0.8632 hmean: 0.8428Go into the directory of each algorithm for detailed evaluation results.
+## Evaluation toolThis evaluation tools is from the repository of [SCUT-CTW1500](https://github.com/Yuliang-Liu/TIoU-metric/tree/master/curved-tiou). The code is slightly modified to be compatibled with python3.We provide some of the popular benchmarks, including [ICDAR2013](https://rrc.cvc.uab.es/?ch=2), [ICDAR2015](https://rrc.cvc.uab.es/?ch=4), [Total-Text](https://github.com/cs-chan/Total-Text-Dataset) and [SCUT-CTW1500](https://github.com/Yuliang-Liu/Curve-Text-Detector), and all of the ground-truthes are transformed into the requried format.The default evaluation metric sets IoU constraint as 0.5.#### Do evaluationDirectly run	python script.py -g=gt/total-text-gt.zip -s=pred/pred_tp_det_r50_tt_e25-45b1f5cf.zip	will produce	num_gt, num_det: 2214 2366	Origin:	recall: 0.8234 precision: 0.8632 hmean: 0.8428Go into the directory of each algorithm for detailed evaluation results.
diff --git a/demo/text_detection/mask_rcnn_det/readme.md b/demo/text_detection/mask_rcnn_det/readme.md
@@ -11,17 +11,17 @@ The formatted training datalist and test datalist can be found in `demo/text_det
 Modified the paths ("imgs"/ "pretrained_model"/ "work_space", etc.) in the config files `demo/text_detection/mask_rcnn_det/config/mask_rcnn_r50_fpn.py`.
 
 Run the following bash command in the command line,
-```shell
-cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/mask_rcnn_det/
-bash dist_train.sh
+``` bash
+>>> cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/mask_rcnn_det/
+>>> bash dist_train.sh
 ```
 
 > We provide the implementation of online validation. If you want to close it to save training time, you may modify the startup script to add `--no-validate` command.
 
 ## Offline Inference and Evaluation
 We provide a demo of forward inference and visualization. You can modify the paths (`test_dataset`, `image_prefix`, etc.) in the testing script, and start testing:
-```shell
-python test.py 
+``` bash
+>>> python test.py 
 ```
 Some visualization of detection results are shown:
 
diff --git a/demo/text_detection/text_perceptron_det/readme.md b/demo/text_detection/text_perceptron_det/readme.md
@@ -14,17 +14,17 @@ The formatted training datalist and test datalist can be found in `demo/text_det
 Modified the paths of "imgs"/ "pretrained_model"/ "work_space" in the config files `demo/text_detection/text_perceptron_det/config/tp_r50_3stages_enlarge.py`.
 
 Run the following bash command in the command line,
-```shell
-cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/text_perceptron_det/
-bash dist_train.sh
+``` bash
+>>> cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/text_perceptron_det/
+>>> bash dist_train.sh
 ```
 
 > We provide the implementation of online validation. If you want to close it to save training time, you may modify the startup script to add `--no-validate` command.
 
 ## Offline Inference and Evaluation
 We provide a demo of forward inference and visualization. You can modify the paths (`test_dataset`, `image_prefix`, etc.) in the testing script, and start testing:
-```shell
-python test.py 
+``` bash
+>>> python test.py 
 ```
 Some visualization of detection results are shown:
 
diff --git a/demo/text_recognition/__base__/res32_bilstm_attn.py b/demo/text_recognition/__base__/res32_bilstm_attn.py
@@ -243,7 +243,7 @@
             type="DavarRCGDataset",
             data_type="LMDB_Standard",
             ann_file='mixture',
-            img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/validation/',
+            img_prefix='/path/to/validation/',
             batch_max_length=25,
             used_ratio=1,
             test_mode=True,
@@ -257,7 +257,7 @@
             type="DavarRCGDataset",
             data_type='LMDB_Standard',
             ann_file='IIIT5k_3000',
-            img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+            img_prefix='/path/to/evaluation/',
             batch_max_length=25,
             used_ratio=1,
             test_mode=True,
@@ -410,7 +410,7 @@
 log_level = 'INFO'
 
 # The path where the model is saved
-work_dir = '/data1/workdir/davar_opensource/att_base/'
+work_dir = '/path/to/davar_opensource/att_base/'
 
 # Load from Pre-trained model path
 load_from = None
diff --git a/demo/text_recognition/__base__/res32_bilstm_ctc.py b/demo/text_recognition/__base__/res32_bilstm_ctc.py
@@ -239,7 +239,7 @@
             type="DavarRCGDataset",
             data_type="LMDB_Standard",
             ann_file='mixture',
-            img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/validation/',
+            img_prefix='/path/to/validation/',
             batch_max_length=25,
             used_ratio=1,
             test_mode=True,
@@ -249,7 +249,7 @@
         type=dataset_type,
         data_type='LMDB_Standard',
         ann_file='IIIT5k_3000',
-        img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+        img_prefix='/path/to/evaluation/',
         batch_ratios=1,
         batch_max_length=25,
         used_ratio=1,
@@ -402,7 +402,7 @@
 log_level = 'INFO'
 
 # The path where the model is saved
-work_dir = '/data1/workdir/davar_opensource/ctc_base/'
+work_dir = '/path/to/davar_opensource/ctc_base/'
 
 # Load from Pre-trained model path
 load_from = None
diff --git a/demo/text_recognition/__base__/test_base_setting.py b/demo/text_recognition/__base__/test_base_setting.py
@@ -12,7 +12,7 @@
 # encoding=utf-8
 
 # recognition dictionary
-character = "/data1/open-source/demo/text_recognition/__dictionary__/Scene_text_68.txt"
+character = "/path/to/demo/text_recognition/__dictionary__/Scene_text_68.txt"
 
 # dataset settings
 dataset_type = 'DavarMultiDataset'
@@ -50,70 +50,70 @@
 testsets = [
     {
         'Name': 'IIIT5k',
-        'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+        'FilePre': '/path/to/evaluation/',
         'AnnFile': 'IIIT5k_3000/',
         'Type': 'LMDB_Standard',
         'PipeLine': test_pipeline,
     },
     {
         'Name': 'SVT',
-        'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+        'FilePre': '/path/to/evaluation/',
         'AnnFile': 'SVT/',
         'Type': 'LMDB_Standard',
         'PipeLine': test_pipeline,
     },
     {
         'Name': 'IC03_860',
-        'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+        'FilePre': '/path/to/evaluation/',
         'AnnFile': 'IC03_860/',
         'Type': 'LMDB_Standard',
         'PipeLine': test_pipeline,
     },
     {
         'Name': 'IC03_867',
-        'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+        'FilePre': '/path/to/evaluation/',
         'AnnFile': 'IC03_867/',
         'Type': 'LMDB_Standard',
         'PipeLine': test_pipeline,
     },
     {
         'Name': 'IC13_857',
-        'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+        'FilePre': '/path/to/evaluation/',
         'AnnFile': 'IC13_857/',
         'Type': 'LMDB_Standard',
         'PipeLine': test_pipeline,
     },
     {
         'Name': 'IC13_1015',
-        'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+        'FilePre': '/path/to/evaluation/',
         'AnnFile': 'IC13_1015/',
         'Type': 'LMDB_Standard',
         'PipeLine': test_pipeline,
     },
     {
         'Name': 'IC15_1811',
-        'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+        'FilePre': '/path/to/evaluation/',
         'AnnFile': 'IC15_1811/',
         'Type': 'LMDB_Standard',
         'PipeLine': test_pipeline,
     },
     {
         'Name': 'IC15_2077',
-        'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+        'FilePre': '/path/to/evaluation/',
         'AnnFile': 'IC15_2077/',
         'Type': 'LMDB_Standard',
         'PipeLine': test_pipeline,
     },
     {
         'Name': 'SVTP',
-        'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+        'FilePre': '/path/to/evaluation/',
         'AnnFile': 'SVTP/',
         'Type': 'LMDB_Standard',
         'PipeLine': test_pipeline,
     },
     {
         'Name': 'CUTE80',
-        'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+        'FilePre': '/path/to/evaluation/',
         'AnnFile': 'CUTE80/',
         'Type': 'LMDB_Standard',
         'PipeLine': test_pipeline,
diff --git a/demo/text_recognition/rflearning/configs/rfl_res32_attn.py b/demo/text_recognition/rflearning/configs/rfl_res32_attn.py
@@ -125,8 +125,8 @@
 
 # File prefix path of the traning dataset
 img_prefixes = [
-    '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/train/',
-    '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/train/',
+    '*****/TextRecognition/LMDB/BenchEn/train/',  # path to the training dataset
+    '*****/TextRecognition/LMDB/BenchEn/train/',  # path to the training dataset
 ]
 
 
@@ -229,12 +229,13 @@
     val=dict(
         type=dataset_type,
         batch_ratios=1,
+        samples_per_gpu=400,
         test_mode=True,
         dataset=dict(
             type="DavarRCGDataset",
             data_type="LMDB_Standard",
             ann_file='mixture',
-            img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/validation/',
+            img_prefix='/path/to/validation/',
             batch_max_length=25,
             used_ratio=1,
             test_mode=True,
@@ -248,7 +249,7 @@
             type="DavarRCGDataset",
             data_type='LMDB_Standard',
             ann_file='IIIT5k_3000',
-            img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+            img_prefix='/path/to/evaluation/',
             batch_max_length=25,
             used_ratio=1,
             test_mode=True,
@@ -312,10 +313,10 @@
 find_unused_parameters = True
 
 # Load from Pre-trained model path
-load_from = '/data1/workdir/davar_opensource/rflearning_visual/RFL_visual_pretrained-2654bc6b.pth'
+load_from = '/path/to/davar_opensource/rflearning_visual/RFL_visual_pretrained-2654bc6b.pth'
 
 # work directory
-work_dir = '/data1/workdir/davar_opensource/rflearning_total/'
+work_dir = '/path/to/davar_opensource/rflearning_total/'
 
 # distributed training setting
 dist_params = dict(backend='nccl')
diff --git a/demo/text_recognition/rflearning/configs/rfl_res32_visual.py b/demo/text_recognition/rflearning/configs/rfl_res32_visual.py
@@ -13,7 +13,7 @@
     './baseline.py'
 ]
 
-character = "/data1/open-source/demo/text_recognition/__dictionary__/Scene_text_36.txt"
+character = "/path/to/demo/text_recognition/__dictionary__/Scene_text_36.txt"
 
 """
 1. Model Settings
@@ -123,8 +123,8 @@
 
 # File prefix path of the traning dataset
 img_prefixes = [
-    '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/train/',
-    '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/train/',
+    '*****/TextRecognition/LMDB/BenchEn/train/',  # path to the training dataset
+    '*****/TextRecognition/LMDB/BenchEn/train/',  # path to the training dataset
 ]
 
 
@@ -227,12 +227,13 @@
     val=dict(
         type=dataset_type,
         batch_ratios=1,
+        samples_per_gpu=400,
         test_mode=True,
         dataset=dict(
             type="DavarRCGDataset",
             data_type="LMDB_Standard",
             ann_file='mixture',
-            img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/validation/',
+            img_prefix='/path/to/validation/',
             batch_max_length=25,
             used_ratio=1,
             test_mode=True,
@@ -246,7 +247,7 @@
             type="DavarRCGDataset",
             data_type='LMDB_Standard',
             ann_file='IIIT5k_3000',
-            img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+            img_prefix='/path/to/evaluation/',
             batch_max_length=25,
             used_ratio=1,
             test_mode=True,
@@ -299,7 +300,7 @@
 load_from = None
 
 # work directory
-work_dir = '/data1/workdir/davar_opensource/rflearning_visual/'
+work_dir = '/path/to/davar_opensource/rflearning_visual/'
 
 # distributed training setting
 dist_params = dict(backend='nccl')
diff --git a/demo/text_recognition/rflearning/readme.md b/demo/text_recognition/rflearning/readme.md
@@ -70,7 +70,7 @@ bash ./train.sh
 
 ### Evaluation
 
-1.Visual Character Counting Stage
+1.Visual Stage
 ```shell
   cd .
   bash ./test_scripts/test_rfl_visual.sh
@@ -107,7 +107,7 @@ bash ./train.sh
         <td><center>Model</center></td>
 	<tr>
     <tr>
-        <td><center> RF-Learning visual character counting(Report)</center></td>
+        <td><center> RF-Learning visual(Report)</center></td>
         <td><center> 95.7 </center></td>
         <td><center> 94.0 </center></td>
         <td><center> 96.0 </center></td>
@@ -119,7 +119,7 @@ bash ./train.sh
         <td><center><p>-</p></center></td>
 	<tr>
     <tr>
-        <td><center> RF-Learning visual character counting</center></td>
+        <td><center> RF-Learning visual</center></td>
         <td><center> 96.0 </center></td>
         <td><center> 94.7 </center></td>
         <td><center> 96.2 </center></td>
diff --git a/demo/text_spotting/evaluation/readme.md b/demo/text_spotting/evaluation/readme.md
@@ -1 +1 @@
-## Evaluation toolThis evaluation tools is modified from the official [ICDAR2015 competition](https://rrc.cvc.uab.es/?ch=4). The code is slightly modified to be compatible with python3 and curved text instances.We provide some of the popular benchmarks, including [ICDAR2013](https://rrc.cvc.uab.es/?ch=2), [ICDAR2015](https://rrc.cvc.uab.es/?ch=4), [Total-Text](https://github.com/cs-chan/Total-Text-Dataset), and all of the ground-truthes are transformed into the requried format.The default evaluation metric sets IoU constraint as 0.5.For MANGO which is without accurate text detection branch, The IoU constraint is set as 0.1.#### Do evaluationDirectly run```shell	python script.py -g=gts/gt-icdar2013.zip -s=preds/mango_r50_ic13_none.zip -word_spotting=false -iou=0.1```	will produce	num_gt, num_det: 917 1038	Origin:	recall: 0.795 precision: 0.8265 hmean: 0.81Go into the directory of each algorithm for detailed evaluation results.
+## Evaluation toolThis evaluation tools is modified from the official [ICDAR2015 competition](https://rrc.cvc.uab.es/?ch=4). The code is slightly modified to be compatible with python3 and curved text instances.We provide some of the popular benchmarks, including [ICDAR2013](https://rrc.cvc.uab.es/?ch=2), [ICDAR2015](https://rrc.cvc.uab.es/?ch=4), [Total-Text](https://github.com/cs-chan/Total-Text-Dataset), and all of the ground-truthes are transformed into the requried format.The default evaluation metric sets IoU constraint as 0.5.For MANGO which is without accurate text detection branch, The IoU constraint is set as 0.1.#### Do evaluationDirectly run	python script.py -g=gts/gt-icdar2013.zip -s=preds/mango_r50_ic13_none.zip -word_spotting=false -iou=0.1	will produce	num_gt, num_det: 917 1038	Origin:	recall: 0.795 precision: 0.8265 hmean: 0.81Go into the directory of each algorithm for detailed evaluation results.
diff --git a/demo/text_spotting/mango/readme.md b/demo/text_spotting/mango/readme.md
diff --git a/demo/text_spotting/mask_rcnn_spot/readme.md b/demo/text_spotting/mask_rcnn_spot/readme.md
diff --git a/readme.md b/readme.md
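
Review note (a sketch, not part of this commit): the `tools/train.py` hunk above adds an `elif model_type == "NER"` branch to an if/elif chain that ends in `raise NotImplementedError`. One possible alternative is a lookup table from model type to builder. In the sketch below the builders are hypothetical stubs standing in for `build_ner` and `build_spotter`, the `"SPOTTER"` key is an assumed name (the diff only shows `"NER"`), and a plain dict stands in for the config object.

```python
from typing import Any, Callable, Dict


def _stub_builder(kind: str) -> Callable[..., Dict[str, Any]]:
    """Return a hypothetical stand-in for a real builder such as build_ner."""
    def build(model_cfg, train_cfg=None, test_cfg=None):
        # A real builder would construct a model; the stub only records its inputs.
        return {"kind": kind, "model": model_cfg,
                "train_cfg": train_cfg, "test_cfg": test_cfg}
    return build


# Assumed keys: only "NER" appears in the diff; "SPOTTER" is illustrative.
BUILDERS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "SPOTTER": _stub_builder("SPOTTER"),
    "NER": _stub_builder("NER"),
}


def build_model(model_type: str, cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Look up the builder for model_type and call it with the same arguments as train.py."""
    try:
        builder = BUILDERS[model_type]
    except KeyError:
        # Same failure mode as the final else branch in the if/elif chain.
        raise NotImplementedError(f"unsupported model type: {model_type!r}") from None
    return builder(cfg["model"],
                   train_cfg=cfg.get("train_cfg", None),
                   test_cfg=cfg.get("test_cfg", None))
```

With a table like this, supporting a further model type would mean adding one entry rather than another branch; whether that suits `train.py` depends on how the other builders are invoked, which this diff does not show.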
