Maise - Mobile Artificial Intelligence Speech Engine

Maise is an open-source Android speech engine that provides high-quality, on-device text-to-speech synthesis and automatic speech recognition. The TTS component is implemented as an Android system TTS service, meaning it works out of the box with any app that uses the standard Android TextToSpeech API — no special integration required. The ASR component is implemented as an Android RecognitionService, compatible with any app using the standard SpeechRecognizer API.

How It Works

All processing runs fully on-device using ONNX Runtime.

Text-to-Speech

Text normalization — raw input text is cleaned and normalized (numbers, abbreviations, punctuation, etc.)
Phonemization — Open Phonemizer converts normalized text into phoneme sequences
Synthesis — phonemes are fed into Kokoro, a high-quality multi-lingual neural TTS model, to produce a raw PCM audio waveform
Streaming playback — sentences are synthesized and played concurrently using a producer-consumer pipeline so audio starts playing before the full text has been synthesized
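The streaming step above can be illustrated with a minimal, language-neutral sketch in Python. It is not Maise's actual code: the sentence splitter, `fake_synthesize`, and the `max_buffered` parameter are hypothetical stand-ins, and the real engine runs on Android with Kokoro doing the synthesis. The sketch only shows the producer-consumer shape, where one thread synthesizes sentences while another plays chunks that are already finished.

```python
import queue
import re
import threading


def split_sentences(text):
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def fake_synthesize(sentence):
    """Hypothetical stand-in for the synthesis step; returns placeholder 'audio'."""
    return sentence.encode("utf-8")


def stream_speech(text, synthesize=fake_synthesize, play=None, max_buffered=2):
    """Synthesize sentence by sentence while a consumer thread plays finished chunks.

    If no `play` callback is given, chunks are collected and returned in order.
    """
    played = []
    play = play or played.append
    chunks = queue.Queue(maxsize=max_buffered)  # bounded so synthesis cannot run far ahead
    done = object()  # sentinel marking the end of the stream

    def producer():
        for sentence in split_sentences(text):
            chunks.put(synthesize(sentence))  # blocks while the buffer is full
        chunks.put(done)

    def consumer():
        while True:
            chunk = chunks.get()
            if chunk is done:
                break
            play(chunk)  # playback of chunk N overlaps synthesis of chunk N+1

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return played
```

Because the queue is first-in, first-out with a single producer and a single consumer, chunks are played in the order the sentences appear in the text, and the first sentence can start playing while later ones are still being synthesized.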

Audio output is 24 kHz mono 16-bit PCM.
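As a rough illustration of what this format implies, the sketch below works out the data rate and converts floating-point samples to 16-bit PCM. It makes two assumptions that the text does not confirm: that the model emits floats in the range [-1.0, 1.0], and that samples are stored little-endian. The function names are hypothetical.

```python
import struct

SAMPLE_RATE_HZ = 24_000  # 24 kHz, as stated above
CHANNELS = 1             # mono
BYTES_PER_SAMPLE = 2     # 16-bit


def pcm_bytes_per_second():
    """Raw data rate of the output stream in bytes per second."""
    return SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE


def duration_seconds(pcm):
    """Playback length of a raw PCM buffer in this format."""
    return len(pcm) / pcm_bytes_per_second()


def float_to_pcm16(samples):
    """Convert floats (assumed in [-1.0, 1.0]) to little-endian signed 16-bit PCM.

    Out-of-range values are clipped rather than allowed to wrap around.
    """
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)
```

At this format one second of speech occupies 48,000 bytes, so a ten-second utterance needs roughly 480 KB of buffer if held in memory at once.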

Automatic Speech Recognition

Recording — 16 kHz mono 16-bit PCM audio is captured from the microphone
Log-mel spectrogram — a Whisper-compatible 80-band log-mel spectrogram is computed on-device
Transcription — the spectrogram is fed through distil-whisper/distil-small.en, an encoder-decoder Transformer model, using greedy decoding to produce the transcribed text
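Greedy decoding, mentioned in the transcription step, can be sketched in a few lines of Python. This is a toy illustration, not the Maise or Whisper implementation: `next_token_scores`, `TOY_TABLE`, and the token strings are hypothetical, and a real decoder would also condition on the encoder's output for the spectrogram. The point is only the selection rule: at each step take the single highest-scoring token and stop at the end-of-text token.

```python
def greedy_decode(next_token_scores, start_token, end_token, max_tokens=32):
    """Append the highest-scoring next token until end_token or max_tokens is reached."""
    tokens = [start_token]
    for _ in range(max_tokens):
        scores = next_token_scores(tokens)   # mapping: candidate token -> score
        best = max(scores, key=scores.get)   # greedy: take the argmax, no beam search
        if best == end_token:
            break
        tokens.append(best)
    return tokens[1:]  # drop the start token


# Hypothetical stand-in for a decoder: scores depend only on the previous token.
TOY_TABLE = {
    "<start>": {"hello": 0.9, "world": 0.1, "<end>": 0.0},
    "hello": {"hello": 0.1, "world": 0.8, "<end>": 0.1},
    "world": {"hello": 0.05, "world": 0.05, "<end>": 0.9},
}


def toy_scores(tokens):
    return TOY_TABLE[tokens[-1]]


transcript = greedy_decode(toy_scores, "<start>", "<end>")
```

Greedy decoding needs only one decoder pass per output token, which keeps latency and memory use low on a phone; the trade-off is that it cannot recover from an early wrong choice the way beam search sometimes can.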

Voices

Maise ships with 54 Kokoro voices across nine languages and regional variants.

Language	Voices
English (US)	alloy, aoede, bella, heart, jessica, kore, nicole, nova, river, sarah, sky, adam, echo, eric, fenrir, liam, michael, onyx, puck, santa
English (UK)	alice, emma, isabella, lily, daniel, fable, george, lewis
German	dora, alex, santa
French	siwis
Greek	alpha-f, beta-f, omega-m, psi-m
Italian	sara, nicola
Japanese	alpha-f, gongitsune, nezumi, tebukuro, kumo
Portuguese (BR)	dora, alex, santa
Chinese (Simplified)	xiaobei, xiaoni, xiaoxiao, xiaoyi, yunjian, yunxi, yunxia, yunyang

The default voice is en-US-heart-kokoro.
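The default identifier suggests a naming pattern of language, region, voice name, and a trailing `kokoro` marker. That pattern is inferred from this single example and is an assumption, not a documented format. Under that assumption, a hypothetical helper for splitting an identifier into its parts might look like this:

```python
def parse_voice_id(voice_id):
    """Split an identifier such as 'en-US-heart-kokoro' into its parts.

    Assumes the layout <language>-<REGION>-<voice>-kokoro, which is inferred
    from the default voice only and may not hold for every identifier.
    """
    parts = voice_id.split("-")
    if len(parts) < 4 or parts[-1] != "kokoro":
        raise ValueError(f"unexpected voice identifier: {voice_id!r}")
    return {
        "locale": "-".join(parts[:2]),
        "voice": "-".join(parts[2:-1]),  # keeps hyphenated names such as 'alpha-f' intact
        "engine": parts[-1],
    }


default_voice = parse_voice_id("en-US-heart-kokoro")
```

Only the first two segments and the last one are peeled off, so voice names that themselves contain a hyphen stay in one piece.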

App

The Maise app provides a simple interface for:

Selecting a voice from the full list
Entering text and previewing speech synthesis directly in-app
Opening Android TTS settings to configure Maise as the system default

The selected voice is persisted and shared with the background TTS service so your preference is respected system-wide.

Setup

Text-to-Speech

To use Maise as your system TTS engine, set it as the default in your device settings:

Settings > Accessibility > Text-to-Speech Output

Select Maise as the preferred engine. After that, any app using the Android TextToSpeech API will use Maise automatically.

Automatic Speech Recognition

To use Maise as your system speech recognizer, set it as the default in your device settings:

Settings > Apps > Default Apps > Assist & voice input

Select Maise as the preferred recognizer. After that, any app using the Android SpeechRecognizer API will use Maise automatically. The RECORD_AUDIO permission must be granted to the app.

Cloning

git clone https://github.com/Mobile-Artificial-Intelligence/maise.git

Building

./gradlew :app:assembleRelease

The output APK will be at:

Release: app/build/outputs/apk/release/app-release.apk
Debug: app/build/outputs/apk/debug/app-debug.apk (produced by ./gradlew :app:assembleDebug rather than the release command above)
