Given a test set of source-reference sentence pairs, the corresponding MT hypothesis sentences, and a MultiTermXML export file, the workflow consists of the following steps:
- Termbase pre-processing
  - converting the termbase (MultiTermXML export file) to a Python data structure for reference in terminology evaluation (xml2dict.py)
- Dataset pre-processing
  - pre-processing the dataset and creating final testset file (create-testset.py)
- Terminology evaluation
  - fine-grained automatic evaluation of legal terminology in MT output (LexTermEval.py)
- Evaluating LexTermEval precision
  - creating a tab-separated file for manual evaluation of LexTermEval precision (precision_evaluation.py)
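
To illustrate the termbase pre-processing step, here is a minimal sketch of turning a MultiTermXML export into a nested Python dictionary. It is not the code in xml2dict.py. It assumes the common export layout of `conceptGrp` / `languageGrp` / `termGrp` elements, and the function name `termbase_to_dict`, the sample entry, and the resulting dictionary shape are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical miniature export; the element layout is an assumption
# about the MultiTermXML file, not taken from the actual termbase.
SAMPLE = """<mtf>
  <conceptGrp>
    <concept>1</concept>
    <languageGrp>
      <language lang="DE" type="German"/>
      <termGrp><term>Vertrag</term></termGrp>
    </languageGrp>
    <languageGrp>
      <language lang="IT" type="Italian"/>
      <termGrp><term>contratto</term></termGrp>
      <termGrp><term>accordo</term></termGrp>
    </languageGrp>
  </conceptGrp>
</mtf>"""


def termbase_to_dict(xml_text):
    """Map concept id -> language code -> list of terms."""
    root = ET.fromstring(xml_text)
    termbase = {}
    for concept_grp in root.iter("conceptGrp"):
        concept_id = concept_grp.findtext("concept", default="").strip()
        by_lang = defaultdict(list)
        for lang_grp in concept_grp.findall("languageGrp"):
            lang_el = lang_grp.find("language")
            if lang_el is None:
                continue  # skip language groups without a language tag
            lang = lang_el.get("lang", "")
            for term_el in lang_grp.findall("termGrp/term"):
                if term_el.text:
                    by_lang[lang].append(term_el.text.strip())
        termbase[concept_id] = dict(by_lang)
    return termbase


termbase = termbase_to_dict(SAMPLE)
print(termbase)
```

Keying by concept id keeps all term variants of one concept together, so an evaluation step could look up every accepted target-language variant for a source term found in a sentence.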