Releases · neurlang/goruut · GitHub

02 Aug 05:42

neurlang

v0.6.3 Latest

🔧 Build & Infrastructure

Golang version bumped in:
- go-ossf-slsa3-publish.yml
- go.yml
Integrated upstream num2words and classifier libraries.

🌐 Language Support & Models

Hebrew3:
- New phonemizer model, homograph model, and packed dictionary.
- Added learned patterns.
English:
- Introduced homograph model.
German:
- Added new dictionary.
Minnan:
- Added new 2-series models.
Added new languages; the notes do not specify which, beyond those listed above.

🧠 Core Features

Integrated a sentencizer.
Enabled loading of abbr.tsv into dictionaries.
Finalized model for homograph result marking.
Parallelized homograph test.
Homograph-related improvements and documentation updates:
- HOMOGRAPH.md, ROADMAP.md, README.md, etc.
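
The abbr.tsv loading mentioned above can be pictured with a minimal sketch. The column layout (abbreviation, then expansion), the function name `parseAbbrTSV`, and the sample rows are assumptions for illustration only; the notes do not describe the actual file format or goruut's loader.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseAbbrTSV reads tab-separated text into a lookup map.
// Assumed layout (not taken from goruut): column 1 is the abbreviation,
// everything after the first tab is its expansion. Blank lines and
// lines without a tab are skipped.
func parseAbbrTSV(text string) map[string]string {
	entries := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		abbr, expansion, ok := strings.Cut(sc.Text(), "\t")
		if !ok || abbr == "" {
			continue
		}
		entries[abbr] = expansion
	}
	return entries
}

func main() {
	// Hypothetical sample rows, including one blank and one malformed line.
	sample := "Dr.\tdoctor\nSt.\tstreet\n\nmalformed line\n"
	dict := parseAbbrTSV(sample)
	fmt.Println(len(dict), dict["Dr."], dict["St."])
}
```

Skipping malformed rows instead of failing keeps a single bad line from blocking the whole dictionary; a stricter loader could return an error with the line number instead.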

🛠️ Other Changes

Added zip model loader.
Improved logging: now logs errors explicitly.
Updated quaternary dependency to v0.2.0.
Added and updated dev contribution docs: CONTRIBUTING.md, DEVELOPING.md.
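
As a rough illustration of what a zip model loader involves, the sketch below reads one named entry out of an in-memory zip archive with Go's standard `archive/zip` package. The helper names `readZipEntry` and `buildZip` and the entry name `weights.bin` are hypothetical; this is not goruut's loader, only the general technique.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// readZipEntry returns the uncompressed bytes of the entry called name.
func readZipEntry(archive []byte, name string) ([]byte, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	f, err := zr.Open(name) // fails if the entry does not exist
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return io.ReadAll(f)
}

// buildZip packs the given name -> content pairs into a zip archive.
// It exists only so the example is self-contained.
func buildZip(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, content := range files {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte(content)); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	archive, err := buildZip(map[string]string{"weights.bin": "demo-bytes"})
	if err != nil {
		panic(err)
	}
	data, err := readZipEntry(archive, "weights.bin")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", data)
}
```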

Full Changelog: v0.6.2...v0.6.3

Assets 34

- goruut-android-arm64 (259 MB, 2025-08-02T05:45:26Z)
  sha256:dd5331d4aafbd9c736035972c3e99fc82116f28cbf6f76d701edf3794c532167
- goruut-android-arm64.intoto.jsonl (17.1 KB, 2025-08-02T05:45:26Z)
  sha256:9e34558d82b868588b4b3dc3cba86a7c14fdf57a5304559ebb49fea926fff378
- goruut-darwin-amd64 (259 MB, 2025-08-02T05:45:32Z)
  sha256:c7b82901192a7143949d519b06e7898faf34a32e166e6cda83ac0565f1ed4bba
- goruut-darwin-amd64.intoto.jsonl (17 KB, 2025-08-02T05:45:32Z)
  sha256:539789de84571066007c38df91790375f41590e6ec66fc466434bb9cf7e3b923
- goruut-darwin-arm64 (260 MB, 2025-08-02T05:45:41Z)
  sha256:a040bb7a4f397ddff9139f2c7354b471a0ffecb66662020d62423622048e7384
- goruut-darwin-arm64.intoto.jsonl (17 KB, 2025-08-02T05:45:41Z)
  sha256:3279bd77bcdc7ec38d3998dc9d31a234904915066c153a4bc7ddbee1b7eb9f71
- goruut-freebsd-386 (258 MB, 2025-08-02T05:45:26Z)
  sha256:00cf299f56c54c19bfe5117ec3c56a8a26ffea195beffde0dac9c301ac185ba6
- goruut-freebsd-386.intoto.jsonl (17 KB, 2025-08-02T05:45:26Z)
  sha256:a1d3c62b151e7bda02c93290689bb667d51feb26339b808159a6474791c2e856
- goruut-freebsd-amd64 (258 MB, 2025-08-02T05:45:47Z)
  sha256:81aedd65e89418d083ad4eea8259d1014f78e740441f54fb5e2d966f3701eae5
- goruut-freebsd-amd64.intoto.jsonl (17.1 KB, 2025-08-02T05:45:47Z)
  sha256:185f88f706f71a2372b5344fc081225fb648e285074372172dcf8adb57d71899
- Source code (zip) (2025-08-02T05:35:47Z)
- Source code (tar.gz) (2025-08-02T05:35:47Z)
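
The sha256 values listed with the assets can be checked after download. The following is a minimal sketch using only Go's standard library; the helper names (`hashReader`, `hashBytes`, `matchesDigest`) and the command-line shape are assumptions made for this example and are not part of goruut.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"strings"
)

// hashReader streams r through SHA-256 and returns the lowercase hex digest.
// Streaming avoids holding a multi-hundred-megabyte binary in memory.
func hashReader(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// hashBytes is the in-memory equivalent of hashReader.
func hashBytes(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// matchesDigest compares a computed hex digest with a listed value,
// accepting either bare hex or the "sha256:<hex>" form.
func matchesDigest(gotHex, listed string) bool {
	want := strings.TrimPrefix(strings.TrimSpace(listed), "sha256:")
	return strings.EqualFold(gotHex, want)
}

func main() {
	// Usage (hypothetical): verify <downloaded-file> sha256:<hex>
	if len(os.Args) == 3 {
		f, err := os.Open(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		defer f.Close()
		got, err := hashReader(f)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Println("match:", matchesDigest(got, os.Args[2]))
		return
	}
	// Without arguments, hash a tiny in-memory sample instead.
	fmt.Println(hashBytes([]byte("abc")))
}
```

A digest match only shows that the file equals what the release page lists; the accompanying .intoto.jsonl provenance files are a separate mechanism and are not checked by this sketch.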

07 Apr 05:58

neurlang

v0.6.2

repacked all dictionaries for forward / reverse phonemization (new patch version)
fix bug: words whose IPA was identical to their text were vanishing
New numToWords languages:
- Czech
- German
- Spanish
- French
- Hungarian
- Polish
- Russian
- Slovak
- Ukrainian

Full Changelog: v0.6.1...v0.6.2

Assets 34

04 Apr 16:08

neurlang

v0.6.1

fix compilation on 386

Full Changelog: v0.6.0...v0.6.1

Assets 34

04 Apr 10:33

neurlang

v0.6.0

retrained all models for forward / reverse phonemization (breaking change, new minor version)
new transformer-based models for homograph word inference (English)
study initiation now works from a null or empty ({}) map in language.json
Handle numerics in Arabic and English
Add groups of same diacritics to hebrew2
analysis2 normalization/diacritics sorting
Fixed -overwrite in homograph scripts
Split on hanzi/nonhanzi boundary
don't crash if language dict not found in phonemization steps
User interface
- toucan TTS integration
- word editor
New Languages:
- cantonese
- minnan/taiwanese
- minnan/hokkien

What's Changed

add build.sh to cmd by @thewh1teagle in #27
rename dirty to lexicon by @thewh1teagle in #26
format all go files by @thewh1teagle in #29
standard format and dev recommendation in docs by @thewh1teagle in #30
add batchsize flag and remove append flag from learn.tsv by @thewh1teagle in #36

Full Changelog: v0.5.1...v0.6.0

Contributors

thewh1teagle

Assets 2

21 Mar 17:23

neurlang

v0.5.1

retrained 49 new models for forward / reverse phonemization (non-breaking change, new patch version)
enable normalization for Vietnamese
fix normalization not working
split hyphen words
remove old analysis aligner
fix cleaning bug and do final clean
backend: add explain word feature
Starting with empty/null json.Map using short keywords
fix: resolve infinite loop and improve alignment logic with Efficient Memoization for padspace languages
Add new rule SrcDuplicate to language.json
Clear combobox on first partial search click
Don't overwrite model when training
Finetune italian/tamil using --rowlossimportance 6

What's Changed

update gitignore by @thewh1teagle in #17
add hebrew2 folder by @thewh1teagle in #13

New Contributors

@thewh1teagle made their first contribution in #17

Full Changelog: v0.5.0...v0.5.1

Contributors

thewh1teagle

Assets 34

14 Mar 17:43

neurlang

v0.5.0

retrained all 87 models for forward / reverse phonemization (breaking change, new minor version)
new transformer-based models for out-of-dictionary word inference
feature: punctuation preservation: toggle preserve/hide punctuation mode
feature: multiple language dictionaries for use in the same sentence based on user preference
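
The preserve/hide punctuation toggle can be pictured as a single flag on the output step. The sketch below is only an assumption about the general idea, with a hypothetical function name `renderText`; it uses `unicode.IsPunct` to decide what counts as punctuation, which may differ from goruut's own rules.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// renderText returns s unchanged when preserve is true and drops every
// Unicode punctuation rune when it is false.
func renderText(s string, preserve bool) string {
	if preserve {
		return s
	}
	return strings.Map(func(r rune) rune {
		if unicode.IsPunct(r) {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, s)
}

func main() {
	fmt.Println(renderText("Hello, world!", true))  // Hello, world!
	fmt.Println(renderText("Hello, world!", false)) // Hello world
}
```

Note that this naive version also removes apostrophes and hyphens inside words, which a phonemizer would probably want to keep.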

User interface

out-of-dictionary words are underlined
preserved punctuation is shown in bold

New Languages:

english/american
english/british

Thanks to kokoro for generously offering their data.

Full Changelog: v0.4.0...v0.5.0

Assets 34

01 Mar 11:14

neurlang

v0.4.0

retrained all 85 models for forward / reverse phonemization (breaking change, new minor version)
new quaternary models (should be faster)
thread race condition fix
dictionary processing error fix

Full Changelog: v0.3.0...v0.4.0

Assets 34

09 Feb 16:06

neurlang

v0.3.0

retrained all 85 models for forward / reverse phonemization (breaking change, new minor version)
fixes regarding thread safety of goruut

Full Changelog: v0.2.4...v0.3.0

Assets 34

01 Feb 21:11

neurlang

v0.2.4

fix compilation on architectures without avx512
reverse phonemization models for all provided languages (from IPA to the language)
fix forward-direction phonemization so that it works (the model is now loaded)

Full Changelog: v0.2.3...v0.2.4

Assets 34

01 Feb 19:34

neurlang

v0.2.3

fix compilation on architectures without avx512
reverse phonemization models for all provided languages (from IPA to the language)

Full Changelog: v0.2.2...v0.2.3

Assets 2