Audiobook Generator

The Audiobook Generator is a Python application that converts a PDF book into an audiobook. It uses the Anthropic API for text optimization and the ElevenLabs API for text-to-speech conversion.

Features

Extracts text from a PDF book and splits it into chapters
Translates the text from English to Czech using multiple translation services
Optimizes the translated text for speech synthesis using Anthropic's API
Generates audio files with word-level timing information using ElevenLabs API
Stores the original, translated, and optimized text, as well as the generated audio files
Provides a URL to access the generated audiobook
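
As an illustration of the word-level timing feature, here is a minimal sketch that groups per-character timings into per-word timings. It assumes the timing data arrives as three parallel lists (characters, start times, end times); that input shape, the WordTiming class, and the words_from_characters function are assumptions made for this example, not this project's actual code.

```python
from dataclasses import dataclass


@dataclass
class WordTiming:
    word: str
    start: float
    end: float


def words_from_characters(chars, starts, ends):
    """Group per-character timings into per-word timings, splitting on whitespace."""
    words = []
    current = []  # characters of the word currently being built
    word_start = 0.0
    word_end = 0.0
    for ch, start, end in zip(chars, starts, ends):
        if ch.isspace():
            # Whitespace closes the current word, if there is one.
            if current:
                words.append(WordTiming("".join(current), word_start, word_end))
                current = []
            continue
        if not current:
            word_start = start  # first character fixes the word's start time
        current.append(ch)
        word_end = end  # last character seen so far fixes the end time
    if current:
        words.append(WordTiming("".join(current), word_start, word_end))
    return words
```

Each word takes its start time from its first character and its end time from its last, so pauses between words are left out of every word's span.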

Installation

Clone the repository:

git clone https://github.com/sparesparrow/eleven-audiobooks.git
cd eleven-audiobooks

Create a virtual environment and install the dependencies:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e .

Set up the necessary environment variables:

export ANTHROPIC_API_KEY=your_anthropic_api_key
export ELEVENLABS_API_KEY=your_elevenlabs_api_key
export DEEPL_API_KEY=your_deepl_api_key
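
Before starting a long run, it can help to confirm that the keys are actually visible to Python. The following is a small sketch, not part of the project: the missing_keys helper is hypothetical, and treating all three variables as mandatory is an assumption based on the list above.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "ELEVENLABS_API_KEY", "DEEPL_API_KEY")


def missing_keys(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_KEYS if not environ.get(name)]
```

Calling missing_keys() with no arguments checks the current process environment; a non-empty result lists the variables that still need to be exported.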

Usage

To generate an audiobook from a PDF file:

python main.py data/LidskeJednani.pdf

The application will:

Process the PDF and extract text
Split into chapters using configurable markers
Translate content if needed
Optimize text for speech synthesis
Generate audio with timing information
Provide a URL to access the audiobook
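
The chapter-splitting step can be pictured with a short sketch. It assumes a chapter marker is a regular expression that matches a heading line; the split_chapters function and the default pattern are hypothetical and may differ from what pdf_processor.py actually does.

```python
import re


def split_chapters(text, marker=r"^(?:Chapter|Kapitola)\s+\d+\b.*$"):
    """Split text into (title, body) pairs at lines matching the marker regex.

    Text that appears before the first marker is dropped. If no marker
    matches, the whole text is returned as one untitled chapter.
    """
    pattern = re.compile(marker, flags=re.MULTILINE | re.IGNORECASE)
    matches = list(pattern.finditer(text))
    if not matches:
        return [("", text.strip())]
    chapters = []
    for i, match in enumerate(matches):
        # A chapter body runs from the end of its heading to the next heading.
        body_end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        title = match.group(0).strip()
        body = text[match.end():body_end].strip()
        chapters.append((title, body))
    return chapters
```

Passing a different marker argument adapts the split to books whose headings follow another convention.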

Testing

To run the unit tests:

python -m pytest tests/

Project Structure

pdf_processor.py: Handles PDF text extraction and chapter splitting
translation_pipeline.py: Manages text translation through multiple services
batch_text_optimizer.py: Optimizes text for speech synthesis
audio_generator.py: Generates audio with timing information
storage_engine.py: Handles data persistence
main.py: Main application entry point
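
The module list suggests a linear pipeline in which each chapter passes through one stage after another. The sketch below shows that shape only; run_pipeline and the two placeholder stages are hypothetical names, and the real modules call external APIs rather than these stand-ins.

```python
from typing import Callable, Iterable, List

# A stage takes chapter text and returns transformed chapter text.
Stage = Callable[[str], str]


def run_pipeline(chapters: Iterable[str], stages: List[Stage]) -> List[str]:
    """Pass each chapter through every stage in order and collect the results."""
    results = []
    for chapter in chapters:
        for stage in stages:
            chapter = stage(chapter)
        results.append(chapter)
    return results


# Placeholder stages used only to make the sketch runnable.
def fake_translate(text: str) -> str:
    return f"[cs] {text}"


def fake_optimize(text: str) -> str:
    # Collapse runs of whitespace, a stand-in for speech-oriented cleanup.
    return " ".join(text.split())
```

Keeping each stage a plain function from text to text makes it easy to skip a stage, for example translation, by leaving it out of the list.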

Contributing

Contributions are welcome! If you find any issues or have suggestions for improvement, please open an issue or submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for more information.

