OpenReader WebUI is an open-source text-to-speech document reader web app built with Next.js. It offers a TTS read-along experience with narration for EPUB, PDF, TXT, MD, and DOCX documents, and supports multiple TTS providers, including OpenAI, Deepinfra, and custom OpenAI-compatible endpoints such as Kokoro-FastAPI and Orpheus-FastAPI.
- 🎯 Multi-Provider TTS Support
  - Local TTS providers:
    - Kokoro-FastAPI: supports multi-voice combinations (like `af_heart+af_bella`)
    - Orpheus-FastAPI
    - Custom OpenAI-compatible: any TTS API with `/v1/audio/voices` and `/v1/audio/speech` endpoints
  - Cloud TTS providers (require API keys):
    - Deepinfra: Kokoro-82M and other models, with support for cloned voices and more
    - OpenAI API ($$): tts-1, tts-1-hd, and gpt-4o-mini-tts with instructions
- 🔄 (Updated) Server-side Sync and Storage
  - (New) External Library Import: import documents into the browser's storage from a folder mounted on the server
  - (Updated) Sync documents between the browser and server to access them from other browsers or devices
- 🎧 Server-side Audiobook Export to m4b/mp3, with resumable, chapter-based export and regeneration
- 📖 Read Along Experience with real-time text highlighting during playback (PDF/EPUB)
  - (New) Word-by-word highlighting using word-level timestamps generated server-side with whisper.cpp (optional)
- 🧠 Smart Sentence-Aware Narration that merges sentences across pages/chapters for smoother TTS
- 🚀 Optimized Next.js TTS Proxy with audio caching and optimized repeat playback
- 🎨 Customizable Experience
  - Multiple app theme options
  - ⚙️ Various TTS and document-handling settings
  - And more ...
Prerequisites:

- Recent version of Docker installed on your machine
- A TTS API server (Kokoro-FastAPI, Orpheus-FastAPI, Deepinfra, OpenAI, etc.) running and accessible
Note: If you have good hardware, you can run Kokoro-FastAPI with Docker locally (see below).
```bash
docker run --name openreader-webui \
  --restart unless-stopped \
  -p 3003:3003 \
  ghcr.io/richardr1126/openreader-webui:latest
```

Optionally, set the TTS `API_BASE` URL and/or `API_KEY` as defaults for all devices:
```bash
docker run --name openreader-webui \
  --restart unless-stopped \
  -e API_KEY=none \
  -e API_BASE=http://host.docker.internal:8880/v1 \
  -p 3003:3003 \
  ghcr.io/richardr1126/openreader-webui:latest
```

Visit http://localhost:3003 to run the app and adjust your settings.
Note: Requests to the TTS API are made by the Next.js server, not the client, so the TTS API base URL must be reachable from the Next.js server. If the app runs in Docker, you may need to use `host.docker.internal` instead of `localhost` to reach the host machine.
- Set the TTS Provider and Model in the Settings modal
- Set the TTS API Base URL and API Key if needed (setting them via environment variables is more secure)
- Select your model's voice from the dropdown (voices are fetched from the TTS provider's API when available)
```bash
docker stop openreader-webui && \
docker rm openreader-webui && \
docker pull ghcr.io/richardr1126/openreader-webui:latest
```

By default (no volume mounts), OpenReader stores its server-side files inside the container filesystem, which is lost if you remove the container.
Persist server-side storage (`/app/docstore`)

Run the container with the volume mounted:
```bash
docker run --name openreader-webui \
  --restart unless-stopped \
  -p 3003:3003 \
  -v openreader_docstore:/app/docstore \
  ghcr.io/richardr1126/openreader-webui:latest
```

This creates a Docker named volume `openreader_docstore` that persists all server-side files, including:
- Documents: stored under `/app/docstore/documents_v1`
- Audiobook exports: stored under `/app/docstore/audiobooks_v1`
  - Per-audiobook settings: `/app/docstore/audiobooks_v1/<bookId>-audiobook/audiobook.meta.json`
  - Chapters: `0001__<title>.m4b` or `0001__<title>.mp3` (no per-chapter `.meta.json` files)
- Settings
This ensures that your documents, exported audiobooks, and server-side settings are retained even if the container is removed or recreated.
Mount an external library folder (read-only recommended)
```bash
docker run --name openreader-webui \
  --restart unless-stopped \
  -p 3003:3003 \
  -v openreader_docstore:/app/docstore \
  -v /path/to/your/library:/app/docstore/library:ro \
  ghcr.io/richardr1126/openreader-webui:latest
```

Separate from the main docstore volume, this mounts an external folder into the container at `/app/docstore/library` (read-only recommended) so OpenReader can use an existing library of documents.
To import from the mounted library: Settings → Documents → Server Library Import
Note: Every file in the mounted volume is imported into the client browser's storage, so make sure the mounted library is not too large, to avoid performance issues.
You can run the Kokoro TTS API server directly with Docker. We are not responsible for issues with Kokoro-FastAPI. For best performance, use an NVIDIA GPU (for GPU version) or Apple Silicon (for CPU version).
Kokoro-FastAPI (CPU)
```bash
docker run -d \
  --name kokoro-tts \
  --restart unless-stopped \
  -p 8880:8880 \
  -e ONNX_NUM_THREADS=8 \
  -e ONNX_INTER_OP_THREADS=4 \
  -e ONNX_EXECUTION_MODE=parallel \
  -e ONNX_OPTIMIZATION_LEVEL=all \
  -e ONNX_MEMORY_PATTERN=true \
  -e ONNX_ARENA_EXTEND_STRATEGY=kNextPowerOfTwo \
  -e API_LOG_LEVEL=DEBUG \
  ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.4
```

Adjust environment variables as needed for your hardware and use case.
Kokoro-FastAPI (GPU)
```bash
docker run -d \
  --name kokoro-tts \
  --gpus all \
  --user 1001:1001 \
  --restart unless-stopped \
  -p 8880:8880 \
  -e USE_GPU=true \
  -e PYTHONUNBUFFERED=1 \
  -e API_LOG_LEVEL=DEBUG \
  ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.4
```

Adjust environment variables as needed for your hardware and use case.
⚠️ Important Notes:

- For best results, set `-e API_BASE=http://kokoro-tts:8880/v1` on OpenReader's Docker container (both containers must share a Docker network for the `kokoro-tts` hostname to resolve)
- For issues or support, see the Kokoro-FastAPI repository.
- The GPU version requires NVIDIA Docker support and works best with NVIDIA GPUs. The CPU version works best on Apple Silicon or modern x86 CPUs.
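One way to wire the two containers together is a Compose file, which puts both services on a shared default network so the `kokoro-tts` hostname resolves. This is a sketch: the service names and values simply mirror the `docker run` commands above.

```yaml
services:
  kokoro-tts:
    image: ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.4
    restart: unless-stopped
    ports:
      - "8880:8880"

  openreader-webui:
    image: ghcr.io/richardr1126/openreader-webui:latest
    restart: unless-stopped
    environment:
      API_KEY: none
      API_BASE: http://kokoro-tts:8880/v1
    ports:
      - "3003:3003"
    volumes:
      - openreader_docstore:/app/docstore
    depends_on:
      - kokoro-tts

volumes:
  openreader_docstore:
```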
- Node.js (recommended: use nvm)
- pnpm (recommended) or npm

  ```bash
  npm install -g pnpm
  ```

- A TTS API server (Kokoro-FastAPI, Orpheus-FastAPI, Deepinfra, OpenAI, etc.) running and accessible

Optionally required for different features:

- FFmpeg (required for audiobook m4b creation only)

  ```bash
  brew install ffmpeg
  ```

- LibreOffice (required for DOCX files)

  ```bash
  brew install libreoffice
  ```

- whisper.cpp (optional; required for word-by-word highlighting)

  ```bash
  # clone and build whisper.cpp (no model download needed; OpenReader handles that)
  git clone https://github.com/ggml-org/whisper.cpp.git
  cd whisper.cpp
  cmake -B build
  cmake --build build -j --config Release
  # print the line to add to your .env, pointing OpenReader at the compiled whisper-cli binary
  echo WHISPER_CPP_BIN="$(pwd)/build/bin/whisper-cli"
  ```
Note: The `WHISPER_CPP_BIN` path must be set in your `.env` file for OpenReader's word-by-word highlighting to work.
1. Clone the repository:

   ```bash
   git clone https://github.com/richardr1126/OpenReader-WebUI.git
   cd OpenReader-WebUI
   ```

2. Install dependencies (pnpm recommended):

   ```bash
   pnpm i # or: npm i
   ```

3. Configure the environment:

   ```bash
   cp template.env .env
   # Edit .env with your configuration settings
   ```

   Note: The base URL for the TTS API must be reachable from the Next.js server.

4. Start the development server:

   ```bash
   pnpm dev # or: npm run dev
   ```

   or build and run the production server:

   ```bash
   pnpm build # or: npm run build
   pnpm start # or: npm start
   ```

Visit http://localhost:3003 to run the app.
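The `.env` created in step 3 might end up with entries like these (the variable names come from this README; the values are purely illustrative):

```bash
# Default TTS endpoint and key for all devices
API_BASE=http://localhost:8880/v1
API_KEY=none

# Optional: enables word-by-word highlighting via whisper.cpp
WHISPER_CPP_BIN=/path/to/whisper.cpp/build/bin/whisper-cli
```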
For feature requests or ideas you have for the project, please use the Discussions tab.
If you encounter issues, please open an issue on GitHub following the template (which is very light).
Contributions are welcome! Fork the repository and submit a pull request with your changes.
This project would not be possible without standing on the shoulders of these giants:
- Kokoro-82M model
- Kokoro-FastAPI
- whisper.cpp
- ffmpeg
- react-pdf npm package
- react-reader npm package
Supported Docker platforms:

- linux/amd64 (x86_64)
- linux/arm64 (Apple Silicon, Raspberry Pi, SBCs, etc.)
- Framework: Next.js (React)
- Containerization: Docker
- Storage:
- Dexie.js IndexedDB wrapper for client-side storage
- PDF:
- EPUB:
- Markdown/Text:
- UI:
- TTS (tested on):
- Deepinfra API (Kokoro-82M, Orpheus-3B, Sesame-1B)
- Kokoro FastAPI TTS
- Orpheus FastAPI TTS
- NLP:
- compromise NLP library for sentence splitting
- cmpstr String comparison library
- whisper.cpp for TTS timestamps (word-by-word highlighting)
This project is licensed under the MIT License.