This software implements Neural Machine Translation based on Hierarchical Character-to-Word Level Representations and Hierarchical Character-based Decoding.
To activate the character-level encoder, which composes source word representations from a character trigram vocabulary, select
-src_data_type text-trigram
in the settings of preprocess.py and translate.py, and
-encoder_type trigramrnn
in train.py.
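To make the idea of a character trigram vocabulary concrete, the following is a minimal sketch of how a source word might be broken into trigrams before the encoder composes them into a word representation. It is not taken from this repository: the function names, the `<` and `>` boundary markers, and the overlapping sliding window are all assumptions made for illustration, and the actual segmentation performed by `-src_data_type text-trigram` may differ.

```python
def char_trigrams(word, n=3, bos="<", eos=">"):
    """Split one word into overlapping character n-grams (hypothetical helper).

    The boundary markers `bos` and `eos` are illustrative assumptions,
    not symbols defined by this software.
    """
    padded = f"{bos}{word}{eos}"
    # Words too short to fill more than one window are kept as a single unit.
    if len(padded) <= n:
        return [padded]
    # Slide a window of width n over the padded word, one character at a time.
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def sentence_to_trigrams(sentence):
    """Map a whitespace-tokenized sentence to one trigram list per word."""
    return [char_trigrams(word) for word in sentence.split()]


if __name__ == "__main__":
    for word, grams in zip("the cats".split(), sentence_to_trigrams("the cats")):
        print(word, grams)
```

Keeping one trigram list per word preserves the word boundaries, so that each list can be composed into a single word-level vector.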
To activate the character-level decoder, select
-tgt_data_type characters
in the settings of preprocess.py and translate.py, and
-decoder_type charrnn
in train.py.
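As a rough illustration of what a character-level target looks like, the sketch below splits a target sentence into one character sequence per word, so that a hierarchical decoder could generate each word character by character. The helper name and the `</w>` end-of-word marker are hypothetical; the format actually produced by `-tgt_data_type characters` is not documented here and may differ.

```python
def words_to_characters(sentence, eow="</w>"):
    """Split a sentence into per-word character sequences (hypothetical helper).

    `eow` is an assumed end-of-word symbol that would tell a character-level
    decoder when the current word is complete.
    """
    return [list(word) + [eow] for word in sentence.split()]


def characters_to_words(char_sequences, eow="</w>"):
    """Invert words_to_characters: rebuild the sentence from character lists."""
    words = ["".join(c for c in chars if c != eow) for chars in char_sequences]
    return " ".join(words)


if __name__ == "__main__":
    segmented = words_to_characters("das Haus")
    print(segmented)
    print(characters_to_words(segmented))
```

The round trip back to words shows that word boundaries are preserved explicitly in this representation.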
For information about how to install and use OpenNMT-py, see the OpenNMT-py Full Documentation.
For further questions you can contact [email protected]
If you use this software, please cite:
Ataman, D., Firat, O., Di Gangi, M., Federico, M. and Birch, A. (2019) On the Importance of Word Boundaries in Character-level Neural Machine Translation. (To appear at the WNGT Workshop at EMNLP).