The Audiobook Generator is a Python application that converts a PDF book into an audiobook. It uses the Anthropic API for text optimization and the ElevenLabs API for text-to-speech conversion.
- Extracts text from a PDF book and splits it into chapters
- Translates the text from English to Czech using multiple translation services
- Optimizes the translated text for speech synthesis using Anthropic's API
- Generates audio files with word-level timing information using ElevenLabs API
- Stores the original, translated, and optimized text, as well as the generated audio files
- Provides a URL to access the generated audiobook
- Clone the repository:
    git clone https://github.com/sparesparrow/eleven-audiobooks.git
    cd eleven-audiobooks
- Create a virtual environment and install the dependencies:
    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    pip install -e .
- Set up the necessary environment variables:
    export ANTHROPIC_API_KEY=your_anthropic_api_key
    export ELEVENLABS_API_KEY=your_elevenlabs_api_key
    export DEEPL_API_KEY=your_deepl_api_key
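Before starting a long run, it can help to confirm that all three keys are actually set. The following is a minimal sketch, not part of the project: the helper name `missing_api_keys` is hypothetical, and only the three variable names come from the setup step above.

```python
import os

# Variable names taken from the setup step; the helper itself is hypothetical.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "ELEVENLABS_API_KEY", "DEEPL_API_KEY")


def missing_api_keys(environ=None):
    """Return the names of required keys that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All API keys are set.")
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to test without touching the real environment.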
To generate an audiobook from a PDF file:
    python main.py data/LidskeJednani.pdf
The application will:
- Process the PDF and extract text
- Split into chapters using configurable markers
- Translate content if needed
- Optimize text for speech synthesis
- Generate audio with timing information
- Provide a URL to access the audiobook
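To illustrate what splitting "using configurable markers" could look like, here is a rough sketch that cuts text at lines matching a regular expression. It is an assumption about the general approach, not the code in `pdf_processor.py`: the function name `split_into_chapters` and the default pattern are hypothetical.

```python
import re


def split_into_chapters(text, marker_pattern=r"^(?:Chapter|Kapitola)\s+\d+.*$"):
    """Split text at lines matching marker_pattern.

    Returns a list of (title, body) tuples. The default pattern is an
    illustrative assumption, not the project's actual marker.
    """
    marker = re.compile(marker_pattern, flags=re.MULTILINE | re.IGNORECASE)
    matches = list(marker.finditer(text))
    if not matches:
        # No markers found: treat the whole text as one untitled chapter.
        return [("", text.strip())]

    chapters = []
    # Text before the first marker is kept as untitled front matter.
    preface = text[: matches[0].start()].strip()
    if preface:
        chapters.append(("", preface))

    for i, match in enumerate(matches):
        # Each chapter body runs until the next marker or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chapters.append((match.group(0).strip(), text[match.end():end].strip()))
    return chapters
```

Because the marker is an ordinary regular expression passed as a parameter, a book with differently styled headings only needs a different pattern, not different code.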
To run the unit tests:
    python -m pytest tests/
- pdf_processor.py: Handles PDF text extraction and chapter splitting
- translation_pipeline.py: Manages text translation through multiple services
- batch_text_optimizer.py: Optimizes text for speech synthesis
- audio_generator.py: Generates audio with timing information
- storage_engine.py: Handles data persistence
- main.py: Main application entry point
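The modules above suggest a pipeline in which each chapter's text passes through one stage after another. The sketch below shows only that general shape. Everything in it is hypothetical: `run_stages` and the placeholder stages are stand-ins, and the project's real classes and function signatures may differ.

```python
from typing import Callable, Iterable

# A stage takes chapter text and returns transformed text (hypothetical type).
Stage = Callable[[str], str]


def run_stages(chapter_text: str, stages: Iterable[Stage]) -> str:
    """Apply each stage to the text in order and return the final result."""
    for stage in stages:
        chapter_text = stage(chapter_text)
    return chapter_text


def placeholder_translate(text: str) -> str:
    # Stand-in for a translation step; returns the text unchanged.
    return text


def placeholder_optimize(text: str) -> str:
    # Stand-in for speech optimization; here it only collapses whitespace.
    return " ".join(text.split())


if __name__ == "__main__":
    print(run_stages("  Some   chapter text. ", [placeholder_translate, placeholder_optimize]))
```

Keeping each stage a plain text-to-text function would make it simple to skip translation when it is not needed, by leaving that stage out of the list.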
Contributions are welcome! If you find any issues or have suggestions for improvement, please open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for more information.