Commit 8ebaf8a

update readme and script
1 parent 002611e commit 8ebaf8a


18 files changed: +79 -73 lines changed

davarocr/davarocr/__init__.py

+1
@@ -13,6 +13,7 @@
 from .davar_rcg import *
 from .davar_spotting import *
 from .davar_ie import *
+from .davar_videotext import *
 from .mmcv import *
 from .version import __version__

davarocr/davarocr/davar_rcg/models/recognizors/general.py

+2 -2
@@ -58,8 +58,8 @@ def __init__(self,
 sequence_head (dict): sequence_head parameter
 neck (dict): neck parameter
 transformation (dict): transformation parameter
-train_cfg (dict): model training cfg parameter
-test_cfg (dict): model test cfg parameter
+train_cfg (mmcv.config): model training cfg parameter
+test_cfg (mmcv.config): model test cfg parameter
 pretrained (str): model path of the pre_trained model
 """
 super(GeneralRecognizor, self).__init__()

davarocr/davarocr/davar_rcg/models/recognizors/rf_learning.py

+2 -2
@@ -43,8 +43,8 @@ def __init__(self,
 neck_s2v (dict): recognition to visual feature strengthened neck parameter
 transformation (dict): transformation parameter
 sequence_module (dict): sequence_module parameter
-train_cfg (dict): model training cfg parameter
-test_cfg (dict): model test cfg parameter
+train_cfg (mmcv.config): model training cfg parameter
+test_cfg (mmcv.config): model test cfg parameter
 pretrained (str): model path of the pre_trained model
 train_type (str): training type:
 1、"visual" - training visual counting branch

davarocr/tools/train.py

+4
@@ -36,6 +36,8 @@

 from davarocr.davar_spotting.models.builder import build_spotter

+from davarocr.davar_ner.models.builder import build_ner
+

 def parse_args():
 parser = argparse.ArgumentParser(description='Train a detector.')
@@ -217,6 +219,8 @@ def main():
 cfg.model,
 train_cfg=cfg.get('train_cfg', None),
 test_cfg=cfg.get('test_cfg', None))
+elif model_type == "NER":
+model = build_ner(cfg.model,train_cfg=cfg.get('train_cfg', None),test_cfg=cfg.get('test_cfg', None))
 else:
 raise NotImplementedError
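
Reviewer sketch (not part of the commit): the new `NER` branch extends an if/elif chain that picks a builder by `model_type`. Under the assumption that every builder takes `(model_cfg, train_cfg=..., test_cfg=...)`, the same dispatch could be written as a lookup table. The `BUILDERS` table, the `build_model` helper, the `"SPOTTER"` key and the stand-in builders are all hypothetical; only the `"NER"` key and the keyword names come from the diff.

```python
from typing import Any, Callable, Dict


def _stand_in_builder(kind: str) -> Callable[..., Dict[str, Any]]:
    """Return a fake builder; the real ones (build_spotter, build_ner) live in davarocr."""
    def build(model_cfg, train_cfg=None, test_cfg=None):
        return {"kind": kind, "model": model_cfg,
                "train_cfg": train_cfg, "test_cfg": test_cfg}
    return build


# Hypothetical lookup table; only the "NER" key is taken from the diff.
BUILDERS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "SPOTTER": _stand_in_builder("spotter"),
    "NER": _stand_in_builder("ner"),
}


def build_model(model_type: str, cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Pick a builder by model type, mirroring the if/elif chain in train.py."""
    try:
        builder = BUILDERS[model_type]
    except KeyError:
        # Same failure mode as the final `else` branch in the diff.
        raise NotImplementedError(f"unsupported model type: {model_type!r}")
    return builder(cfg["model"],
                   train_cfg=cfg.get("train_cfg"),
                   test_cfg=cfg.get("test_cfg"))
```

With a table like this, supporting another model type would mean adding one entry instead of another `elif` branch.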

demo/text_detection/east/readme.md

+5 -5
@@ -11,17 +11,17 @@ The formatted training datalist and test datalist can be found in `demo/text_det
 Modified the paths ("imgs"/ "pretrained_model"/ "work_space", etc.) in the config files `demo/text_detection/east/config/east_r50_rbox.py`.

 Run the following bash command in the command line,
-```shell
-cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/east/
-bash dist_train.sh
+``` bash
+>>> cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/east/
+>>> bash dist_train.sh
 ```

 > We provide the implementation of online validation. If you want to close it to save training time, you may modify the startup script to add `--no-validate` command.

 ## Offline Inference and Evaluation
 We provide a demo of forward inference and visualization. You can modify the paths (`test_dataset`, `image_prefix`, etc.) in the testing script, and start testing:
-```shell
-python test.py
+``` bash
+>>> python test.py
 ```
 Some visualization of detection results are shown:

demo/text_detection/evaluation/readme.md

+1 -1
@@ -1 +1 @@
-## Evaluation toolThis evaluation tools is from the repository of [SCUT-CTW1500](https://github.com/Yuliang-Liu/TIoU-metric/tree/master/curved-tiou). The code is slightly modified to be compatibled with python3.We provide some of the popular benchmarks, including [ICDAR2013](https://rrc.cvc.uab.es/?ch=2), [ICDAR2015](https://rrc.cvc.uab.es/?ch=4), [Total-Text](https://github.com/cs-chan/Total-Text-Dataset) and [SCUT-CTW1500](https://github.com/Yuliang-Liu/Curve-Text-Detector), and all of the ground-truthes are transformed into the requried format.The default evaluation metric sets IoU constraint as 0.5.#### Do evaluationDirectly run```shell python script.py -g=gt/total-text-gt.zip -s=pred/pred_tp_det_r50_tt_e25-45b1f5cf.zip``` will produce num_gt, num_det: 2214 2366 Origin: recall: 0.8234 precision: 0.8632 hmean: 0.8428Go into the directory of each algorithm for detailed evaluation results.
+## Evaluation toolThis evaluation tools is from the repository of [SCUT-CTW1500](https://github.com/Yuliang-Liu/TIoU-metric/tree/master/curved-tiou). The code is slightly modified to be compatibled with python3.We provide some of the popular benchmarks, including [ICDAR2013](https://rrc.cvc.uab.es/?ch=2), [ICDAR2015](https://rrc.cvc.uab.es/?ch=4), [Total-Text](https://github.com/cs-chan/Total-Text-Dataset) and [SCUT-CTW1500](https://github.com/Yuliang-Liu/Curve-Text-Detector), and all of the ground-truthes are transformed into the requried format.The default evaluation metric sets IoU constraint as 0.5.#### Do evaluationDirectly run python script.py -g=gt/total-text-gt.zip -s=pred/pred_tp_det_r50_tt_e25-45b1f5cf.zip will produce num_gt, num_det: 2214 2366 Origin: recall: 0.8234 precision: 0.8632 hmean: 0.8428Go into the directory of each algorithm for detailed evaluation results.
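
Reviewer sketch (not part of the commit): the `hmean` figure quoted in this readme is consistent with the harmonic mean of the reported recall and precision. The snippet below only checks that arithmetic; the function name `hmean` is an illustrative choice, and nothing here is taken from `script.py` itself.

```python
def hmean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (0.0 when both are zero)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Values quoted in the readme for the Total-Text example.
print(round(hmean(0.8234, 0.8632), 4))  # 0.8428
```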

demo/text_detection/mask_rcnn_det/readme.md

+5 -5
@@ -11,17 +11,17 @@ The formatted training datalist and test datalist can be found in `demo/text_det
 Modified the paths ("imgs"/ "pretrained_model"/ "work_space", etc.) in the config files `demo/text_detection/mask_rcnn_det/config/mask_rcnn_r50_fpn.py`.

 Run the following bash command in the command line,
-```shell
-cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/mask_rcnn_det/
-bash dist_train.sh
+``` bash
+>>> cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/mask_rcnn_det/
+>>> bash dist_train.sh
 ```

 > We provide the implementation of online validation. If you want to close it to save training time, you may modify the startup script to add `--no-validate` command.

 ## Offline Inference and Evaluation
 We provide a demo of forward inference and visualization. You can modify the paths (`test_dataset`, `image_prefix`, etc.) in the testing script, and start testing:
-```shell
-python test.py
+``` bash
+>>> python test.py
 ```
 Some visualization of detection results are shown:

demo/text_detection/text_perceptron_det/readme.md

+5 -5
@@ -14,17 +14,17 @@ The formatted training datalist and test datalist can be found in `demo/text_det
 Modified the paths of "imgs"/ "pretrained_model"/ "work_space" in the config files `demo/text_detection/text_perceptron_det/config/tp_r50_3stages_enlarge.py`.

 Run the following bash command in the command line,
-```shell
-cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/text_perceptron_det/
-bash dist_train.sh
+``` bash
+>>> cd $DAVAR_LAB_OCR_ROOT$/demo/text_detection/text_perceptron_det/
+>>> bash dist_train.sh
 ```

 > We provide the implementation of online validation. If you want to close it to save training time, you may modify the startup script to add `--no-validate` command.

 ## Offline Inference and Evaluation
 We provide a demo of forward inference and visualization. You can modify the paths (`test_dataset`, `image_prefix`, etc.) in the testing script, and start testing:
-```shell
-python test.py
+``` bash
+>>> python test.py
 ```
 Some visualization of detection results are shown:

demo/text_recognition/__base__/res32_bilstm_attn.py

+3 -3
@@ -243,7 +243,7 @@
 type="DavarRCGDataset",
 data_type="LMDB_Standard",
 ann_file='mixture',
-img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/validation/',
+img_prefix='/path/to/validation/',
 batch_max_length=25,
 used_ratio=1,
 test_mode=True,
@@ -257,7 +257,7 @@
 type="DavarRCGDataset",
 data_type='LMDB_Standard',
 ann_file='IIIT5k_3000',
-img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+img_prefix='/path/to/evaluation/',
 batch_max_length=25,
 used_ratio=1,
 test_mode=True,
@@ -410,7 +410,7 @@
 log_level = 'INFO'

 # The path where the model is saved
-work_dir = '/data1/workdir/davar_opensource/att_base/'
+work_dir = '//path/to/davar_opensource/att_base/'

 # Load from Pre-trained model path
 load_from = None

demo/text_recognition/__base__/res32_bilstm_ctc.py

+3 -3
@@ -239,7 +239,7 @@
 type="DavarRCGDataset",
 data_type="LMDB_Standard",
 ann_file='mixture',
-img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/validation/',
+img_prefix='/path/to/validation',
 batch_max_length=25,
 used_ratio=1,
 test_mode=True,
@@ -249,7 +249,7 @@
 type=dataset_type,
 data_type='LMDB_Standard',
 ann_file='IIIT5k_3000',
-img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+img_prefix='/path/to/evaluation/',
 batch_ratios=1,
 batch_max_length=25,
 used_ratio=1,
@@ -402,7 +402,7 @@
 log_level = 'INFO'

 # The path where the model is saved
-work_dir = '/data1/workdir/davar_opensource/ctc_base/'
+work_dir = '/path/to/davar_opensource/ctc_base/'

 # Load from Pre-trained model path
 load_from = None

demo/text_recognition/__base__/test_base_setting.py

+11 -11
@@ -12,7 +12,7 @@
 # encoding=utf-8

 # recognition dictionary
-character = "/data1/open-source/demo/text_recognition/__dictionary__/Scene_text_68.txt"
+character = "/path/to/demo/text_recognition/__dictionary__/Scene_text_68.txt"

 # dataset settings
 dataset_type = 'DavarMultiDataset'
@@ -50,70 +50,70 @@
 testsets = [
 {
 'Name': 'IIIT5k',
-'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+'FilePre': '/path/to/evaluation/',
 'AnnFile': 'IIIT5k_3000/',
 'Type': 'LMDB_Standard',
 'PipeLine': test_pipeline,
 },
 {
 'Name': 'SVT',
-'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+'FilePre': '/path/to/evaluation/',
 'AnnFile': 'SVT/',
 'Type': 'LMDB_Standard',
 'PipeLine': test_pipeline,
 },
 {
 'Name': 'IC03_860',
-'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+'FilePre': '/path/to/evaluation/',
 'AnnFile': 'IC03_860/',
 'Type': 'LMDB_Standard',
 'PipeLine': test_pipeline,
 },
 {
 'Name': 'IC03_867',
-'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+'FilePre': '/path/to/evaluation/',
 'AnnFile': 'IC03_867/',
 'Type': 'LMDB_Standard',
 'PipeLine': test_pipeline,
 },
 {
 'Name': 'IC13_857',
-'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+'FilePre': '/path/to/evaluation/',
 'AnnFile': 'IC13_857/',
 'Type': 'LMDB_Standard',
 'PipeLine': test_pipeline,
 },
 {
 'Name': 'IC13_1015',
-'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+'FilePre': '/path/to/evaluation/',
 'AnnFile': 'IC13_1015/',
 'Type': 'LMDB_Standard',
 'PipeLine': test_pipeline,
 },
 {
 'Name': 'IC15_1811',
-'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+'FilePre': '/path/to/evaluation/',
 'AnnFile': 'IC15_1811/',
 'Type': 'LMDB_Standard',
 'PipeLine': test_pipeline,
 },
 {
 'Name': 'IC15_2077',
-'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+'FilePre': '/path/to/evaluation/',
 'AnnFile': 'IC15_2077/',
 'Type': 'LMDB_Standard',
 'PipeLine': test_pipeline,
 },
 {
 'Name': 'SVTP',
-'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+'FilePre': '/path/to/evaluation/',
 'AnnFile': 'SVTP/',
 'Type': 'LMDB_Standard',
 'PipeLine': test_pipeline,
 },
 {
 'Name': 'CUTE80',
-'FilePre': '/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+'FilePre': '/path/to/evaluation/',
 'AnnFile': 'CUTE80/',
 'Type': 'LMDB_Standard',
 'PipeLine': test_pipeline,
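
Reviewer sketch (not part of the commit): the ten `testsets` entries in this hunk differ only in `Name` and `AnnFile`, and all share the same `FilePre` placeholder. Assuming the config loader accepts ordinary Python expressions, the list could be generated from one root path so that a user edits a single line. `EVAL_ROOT` and `_BENCHMARKS` are illustrative names, and `test_pipeline` is stubbed here because the real pipeline is defined elsewhere in the config.

```python
# Placeholder root taken from the diff; replace it with the real dataset location.
EVAL_ROOT = '/path/to/evaluation/'

# Stand-in only: the real test_pipeline is defined earlier in the config file.
test_pipeline = []

# (Name, AnnFile) pairs exactly as they appear in the hunk.
_BENCHMARKS = [
    ('IIIT5k', 'IIIT5k_3000/'),
    ('SVT', 'SVT/'),
    ('IC03_860', 'IC03_860/'),
    ('IC03_867', 'IC03_867/'),
    ('IC13_857', 'IC13_857/'),
    ('IC13_1015', 'IC13_1015/'),
    ('IC15_1811', 'IC15_1811/'),
    ('IC15_2077', 'IC15_2077/'),
    ('SVTP', 'SVTP/'),
    ('CUTE80', 'CUTE80/'),
]

# Build one dict per benchmark with the shared fields filled in once.
testsets = [
    {
        'Name': name,
        'FilePre': EVAL_ROOT,
        'AnnFile': ann_file,
        'Type': 'LMDB_Standard',
        'PipeLine': test_pipeline,
    }
    for name, ann_file in _BENCHMARKS
]
```

Whether this form suits the project's config conventions is untested; it is meant only to show that the eleven path edits in this file reduce to one.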

demo/text_recognition/rflearning/configs/rfl_res32_attn.py

+7 -6
@@ -125,8 +125,8 @@

 # File prefix path of the traning dataset
 img_prefixes = [
-'/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/train/',
-'/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/train/',
+'*****/TextRecognition/LMDB/BenchEn/train/', # path to the training dataset
+'*****/TextRecognition/LMDB/BenchEn/train/', # path to the training dataset
 ]


@@ -229,12 +229,13 @@
 val=dict(
 type=dataset_type,
 batch_ratios=1,
+samples_per_gpu=400,
 test_mode=True,
 dataset=dict(
 type="DavarRCGDataset",
 data_type="LMDB_Standard",
 ann_file='mixture',
-img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/validation/',
+img_prefix='/path/to/validation/',
 batch_max_length=25,
 used_ratio=1,
 test_mode=True,
@@ -248,7 +249,7 @@
 type="DavarRCGDataset",
 data_type='LMDB_Standard',
 ann_file='IIIT5k_3000',
-img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+img_prefix='/path/to/evaluation/',
 batch_max_length=25,
 used_ratio=1,
 test_mode=True,
@@ -312,10 +313,10 @@
 find_unused_parameters = True

 # Load from Pre-trained model path
-load_from = '/data1/workdir/davar_opensource/rflearning_visual/RFL_visual_pretrained-2654bc6b.pth'
+load_from = '/path/to/davar_opensource/rflearning_visual/RFL_visual_pretrained-2654bc6b.pth'

 # work directory
-work_dir = '/data1/workdir/davar_opensource/rflearning_total/'
+work_dir = '/path/to/davar_opensource/rflearning_total/'

 # distributed training setting
 dist_params = dict(backend='nccl')

demo/text_recognition/rflearning/configs/rfl_res32_visual.py

+7 -6
@@ -13,7 +13,7 @@
 './baseline.py'
 ]

-character = "/data1/open-source/demo/text_recognition/__dictionary__/Scene_text_36.txt"
+character = "/path/to/demo/text_recognition/__dictionary__/Scene_text_36.txt"

 """
 1. Model Settings
@@ -123,8 +123,8 @@

 # File prefix path of the traning dataset
 img_prefixes = [
-'/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/train/',
-'/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/train/',
+'*****/TextRecognition/LMDB/BenchEn/train/', # path to the training dataset
+'*****/TextRecognition/LMDB/BenchEn/train/', # path to the training dataset
 ]


@@ -227,12 +227,13 @@
 val=dict(
 type=dataset_type,
 batch_ratios=1,
+samples_per_gpu=400,
 test_mode=True,
 dataset=dict(
 type="DavarRCGDataset",
 data_type="LMDB_Standard",
 ann_file='mixture',
-img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/validation/',
+img_prefix='/path/to/validation/',
 batch_max_length=25,
 used_ratio=1,
 test_mode=True,
@@ -246,7 +247,7 @@
 type="DavarRCGDataset",
 data_type='LMDB_Standard',
 ann_file='IIIT5k_3000',
-img_prefix='/dataset/chengzhanzhan/TextRecognition/LMDB/BenchEn/evaluation/',
+img_prefix='/path/to/evaluation/',
 batch_max_length=25,
 used_ratio=1,
 test_mode=True,
@@ -299,7 +300,7 @@
 load_from = None

 # work directory
-work_dir = '/data1/workdir/davar_opensource/rflearning_visual/'
+work_dir = '/path/to/davar_opensource/rflearning_visual/'

 # distributed training setting
 dist_params = dict(backend='nccl')

demo/text_recognition/rflearning/readme.md

+3 -3
@@ -70,7 +70,7 @@ bash ./train.sh

 ### Evaluation

-1.Visual Character Counting Stage
+1.Visual Stage
 ```shell
 cd .
 bash ./test_scripts/test_rfl_visual.sh
@@ -107,7 +107,7 @@ bash ./train.sh
 <td><center>Model</center></td>
 <tr>
 <tr>
-<td><center> RF-Learning visual character counting(Report)</center></td>
+<td><center> RF-Learning visual(Report)</center></td>
 <td><center> 95.7 </center></td>
 <td><center> 94.0 </center></td>
 <td><center> 96.0 </center></td>
@@ -119,7 +119,7 @@ bash ./train.sh
 <td><center><p>-</p></center></td>
 <tr>
 <tr>
-<td><center> RF-Learning visual character counting</center></td>
+<td><center> RF-Learning visual</center></td>
 <td><center> 96.0 </center></td>
 <td><center> 94.7 </center></td>
 <td><center> 96.2 </center></td>

demo/text_spotting/evaluation/readme.md

+1 -1
@@ -1 +1 @@
-## Evaluation toolThis evaluation tools is modified from the official [ICDAR2015 competition](https://rrc.cvc.uab.es/?ch=4). The code is slightly modified to be compatible with python3 and curved text instances.We provide some of the popular benchmarks, including [ICDAR2013](https://rrc.cvc.uab.es/?ch=2), [ICDAR2015](https://rrc.cvc.uab.es/?ch=4), [Total-Text](https://github.com/cs-chan/Total-Text-Dataset), and all of the ground-truthes are transformed into the requried format.The default evaluation metric sets IoU constraint as 0.5.For MANGO which is without accurate text detection branch, The IoU constraint is set as 0.1.#### Do evaluationDirectly run```shell python script.py -g=gts/gt-icdar2013.zip -s=preds/mango_r50_ic13_none.zip -word_spotting=false -iou=0.1``` will produce num_gt, num_det: 917 1038 Origin: recall: 0.795 precision: 0.8265 hmean: 0.81Go into the directory of each algorithm for detailed evaluation results.
+## Evaluation toolThis evaluation tools is modified from the official [ICDAR2015 competition](https://rrc.cvc.uab.es/?ch=4). The code is slightly modified to be compatible with python3 and curved text instances.We provide some of the popular benchmarks, including [ICDAR2013](https://rrc.cvc.uab.es/?ch=2), [ICDAR2015](https://rrc.cvc.uab.es/?ch=4), [Total-Text](https://github.com/cs-chan/Total-Text-Dataset), and all of the ground-truthes are transformed into the requried format.The default evaluation metric sets IoU constraint as 0.5.For MANGO which is without accurate text detection branch, The IoU constraint is set as 0.1.#### Do evaluationDirectly run python script.py -g=gts/gt-icdar2013.zip -s=preds/mango_r50_ic13_none.zip -word_spotting=false -iou=0.1 will produce num_gt, num_det: 917 1038 Origin: recall: 0.795 precision: 0.8265 hmean: 0.81Go into the directory of each algorithm for detailed evaluation results.
