Releases: neurlang/goruut

v0.6.3

02 Aug 05:42
500aa39

🔧 Build & Infrastructure

  • Golang version bumped in:

    • go-ossf-slsa3-publish.yml
    • go.yml
  • Integrated upstream num2words and classifier libraries.


🌐 Language Support & Models

  • Hebrew3:

    • New phonemizer model, homograph model, and packed dictionary.
    • Added learned patterns.
  • English:

    • Introduced homograph model.
  • German:

    • Added new dictionary.
  • Minnan:

    • Added new 2-series models.
  • Added further new languages (these notes do not name them).


🧠 Core Features

  • Integrated a sentencizer.

  • Enabled loading of abbr.tsv into dictionaries.

  • Finalized model for homograph result marking.

  • Parallelized homograph test.

  • Homograph-related improvements and documentation updates:

    • HOMOGRAPH.md, ROADMAP.md, README.md, etc.

🛠️ Other Changes

  • Added zip model loader.
  • Improved logging: now logs errors explicitly.
  • Updated quaternary dependency to v0.2.0.
  • Added and updated dev contribution docs: CONTRIBUTING.md, DEVELOPING.md.

Full Changelog: v0.6.2...v0.6.3

v0.6.2

07 Apr 05:58

  • repacked all dictionaries for forward / reverse phonemization (new patch version)
  • fix bug: words whose IPA was identical to their text were vanishing
  • New numToWords languages:
    • Czech
    • German
    • Spanish
    • French
    • Hungarian
    • Polish
    • Russian
    • Slovak
    • Ukrainian

Full Changelog: v0.6.1...v0.6.2

v0.6.1

04 Apr 16:08

  • fix compilation on 386

Full Changelog: v0.6.0...v0.6.1

v0.6.0

04 Apr 10:33

  • retrained all models for forward / reverse phonemization (breaking change, new minor version)

  • new transformer-based models for homograph word inference (English)

  • study can now be started from a null or empty {} map in language.json

  • Handle numerics in Arabic and English

  • Add groups of same diacritics to hebrew2

  • analysis2 normalization/diacritics sorting

  • Fixed -overwrite in homograph scripts

  • Split on hanzi/nonhanzi boundary

  • don't crash if language dict not found in phonemization steps

  • User interface

    • toucan TTS integration
    • word editor
  • New Languages:

    • cantonese
    • minnan/taiwanese
    • minnan/hokkien
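
The "Split on hanzi/nonhanzi boundary" item can be pictured with a short Go sketch that cuts a string wherever a Han character meets a non-Han one. It relies on the standard `unicode.Han` range table; the function name `splitHanBoundary` is hypothetical, and goruut's own splitting logic may treat punctuation or whitespace differently.

```go
package main

import "unicode"

// splitHanBoundary cuts s at every point where a Han character is
// followed by a non-Han character or the other way round.
// Hypothetical helper, not goruut's implementation.
func splitHanBoundary(s string) []string {
	var parts []string
	start := 0
	prevHan := false
	for i, r := range s {
		isHan := unicode.Is(unicode.Han, r)
		// i is a byte offset, so slicing s[start:i] stays on rune boundaries.
		if i > 0 && isHan != prevHan {
			parts = append(parts, s[start:i])
			start = i
		}
		prevHan = isHan
	}
	if start < len(s) {
		parts = append(parts, s[start:])
	}
	return parts
}
```

In this sketch, full-width punctuation counts as non-Han, so it ends up in its own segment or attached to neighbouring Latin text.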

Full Changelog: v0.5.1...v0.6.0

v0.5.1

21 Mar 17:23
a2a39de

  • retrained 49 new models for forward / reverse phonemization (non-breaking change, new patch version)
  • enable normalization for Vietnamese
  • fix normalization not working
  • split hyphen words
  • remove old analysis aligner
  • fix cleaning bug and do final clean
  • backend: add explain word feature
  • Starting with empty/null json.Map using short keywords
  • fix: resolve infinite loop and improve alignment logic with efficient memoization for padspace languages
  • Add new rule SrcDuplicate to language.json
  • Clear combobox on first partial search click
  • Don't overwrite model when training
  • Finetune italian/tamil using --rowlossimportance 6

Full Changelog: v0.5.0...v0.5.1

v0.5.0

14 Mar 17:43

  • retrained all 87 models for forward / reverse phonemization (breaking change, new minor version)
  • new transformer-based models for out-of-dictionary word inference
  • feature: punctuation preservation: toggle preserve/hide punctuation mode
  • feature: multiple language dictionaries for use in the same sentence based on user preference

User interface

  • out-of-dictionary words are underlined
  • preserved punctuation is shown in bold

New Languages:

  • english/american
  • english/british

Thanks to kokoro for generously offering their data.

Full Changelog: v0.4.0...v0.5.0

v0.4.0

01 Mar 11:14

  • retrained all 85 models for forward / reverse phonemization (breaking change, new minor version)
  • new quaternary models (should be faster)
  • thread race condition fix
  • dictionary processing error fix

Full Changelog: v0.3.0...v0.4.0

v0.3.0

09 Feb 16:06

  • retrained all 85 models for forward / reverse phonemization (breaking change, new minor version)
  • fixes regarding thread safety of goruut

Full Changelog: v0.2.4...v0.3.0

v0.2.4

01 Feb 21:11

  • fix compilation on architectures without avx512
  • reverse phonemization models for all provided languages (from IPA to the language)
  • fix forward direction phonemization to work (load the model)

Full Changelog: v0.2.3...v0.2.4

v0.2.3

01 Feb 19:34

  • fix compilation on architectures without avx512
  • reverse phonemization models for all provided languages (from IPA to the language)

Full Changelog: v0.2.2...v0.2.3