* parameter | default | description |
* mode | search | mode of Kuromoji (normal|search|extended) |
- * kanji.length_threshold | 2 | TODO |
- * kanji.penalty | 3000 | TODO |
- * other.length_threshold | 7 | TODO |
- * other.penalty | 1700 | TODO |
- * nakaguro_split | false | TODO |
+ * kanji.length_threshold | 2 | length threshold for kanji tokens; tokens longer than this are penalized during the Viterbi search (expert feature) |
+ * kanji.penalty | 3000 | additional cost for kanji tokens that are longer than kanji.length_threshold (expert feature) |
+ * other.length_threshold | 7 | length threshold for non-kanji tokens; tokens longer than this are penalized during the Viterbi search (expert feature) |
+ * other.penalty | 1700 | additional cost for non-kanji tokens that are longer than other.length_threshold (expert feature) |
+ * nakaguro_split | false | whether to split unknown words on the middle dot character (U+30FB KATAKANA MIDDLE DOT) |
* user_dict | - | path of user dictionary |
* tokenlist_name | default | target specialtokens name |
* all_language | false | apply the Kuromoji tokenizer to all languages |
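
To illustrate how the two threshold/penalty pairs might interact, here is a minimal Python sketch. It assumes the extra cost grows linearly with the number of characters beyond the threshold, which is how Lucene's Kuromoji search-mode heuristic behaves; the table above only says "additional cost", so that scaling is an assumption here. The function names `is_kanji` and `length_penalty` are hypothetical and not part of this project's API, and the kanji check is simplified to the basic CJK Unified Ideographs block.

```python
def is_kanji(ch: str) -> bool:
    """Simplified check: CJK Unified Ideographs block only (U+4E00..U+9FFF)."""
    return "\u4e00" <= ch <= "\u9fff"


def length_penalty(
    token: str,
    kanji_length_threshold: int = 2,   # kanji.length_threshold
    kanji_penalty: int = 3000,         # kanji.penalty
    other_length_threshold: int = 7,   # other.length_threshold
    other_penalty: int = 1700,         # other.penalty
) -> int:
    """Extra Viterbi cost for an overly long token (assumed linear in excess length)."""
    n = len(token)
    # All-kanji tokens use the kanji threshold and penalty.
    if n > kanji_length_threshold and all(is_kanji(c) for c in token):
        return (n - kanji_length_threshold) * kanji_penalty
    # Every other token uses the non-kanji threshold and penalty.
    if n > other_length_threshold:
        return (n - other_length_threshold) * other_penalty
    return 0
```

Under these assumptions and the default values, a six-character all-kanji compound receives a large extra cost, which nudges the search toward a segmentation into shorter tokens, while a two-character kanji word is not penalized at all.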