This project provides a real-time speech-to-speech translation system using Deepgram for speech recognition, LangChain for language translation, and ElevenLabs for voice synthesis, all without using OpenAI's Realtime model.

sumit03guha/speech-to-speech-realtime-translation

Speech-to-Speech Realtime Translation Without OpenAI Realtime

This project provides a real-time speech-to-speech translation system using Deepgram for speech recognition, LangChain for language translation, and ElevenLabs for voice synthesis.

Features

  • Real-time speech recognition using Deepgram
  • Language translation using LangChain and an OpenAI GPT model, rather than OpenAI's Realtime model, which keeps costs lower
  • Voice synthesis using ElevenLabs
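
The three features above form a chain: recognized speech becomes text, the text is translated, and the translation is spoken. The following is a minimal sketch of that flow, not the project's actual code. The class name `TranslationPipeline` and the three stand-in stage functions are hypothetical and make no network calls; in a real setup each stage would wrap the corresponding Deepgram, LangChain, or ElevenLabs client.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationPipeline:
    """Hypothetical container chaining the three stages listed above."""

    transcribe: Callable[[bytes], str]         # speech recognition stage
    translate: Callable[[str, str, str], str]  # text, source, target -> text
    synthesize: Callable[[str], bytes]         # voice synthesis stage

    def run(self, audio_chunk: bytes, source_lang: str, target_lang: str) -> bytes:
        text = self.transcribe(audio_chunk)
        if not text.strip():
            # Nothing was recognized, so skip translation and synthesis.
            return b""
        translated = self.translate(text, source_lang, target_lang)
        return self.synthesize(translated)


# Stand-in stages for illustration only (assumptions, not real service calls).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, source_lang: str, target_lang: str) -> str:
    return f"[{source_lang}->{target_lang}] {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = TranslationPipeline(fake_transcribe, fake_translate, fake_synthesize)
result = pipeline.run(b"hello", "en", "es")
```

Keeping each stage behind a plain callable means any one provider can be swapped or mocked without touching the other two.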

Requirements

  • Python 3.12.0
  • Deepgram API key
  • OpenAI API key
  • ElevenLabs API key

Installation

  1. Clone the repository:

    git clone https://github.com/sumit03guha/speech-to-speech-realtime-translation.git
    cd speech-to-speech-realtime-translation
  2. Install dependencies using Poetry or pip:

    poetry install

    or

    pip install -r requirements.txt
  3. Create a .env file based on the .env.example file and add your API keys:

    cp .env.example .env

    Fill in the .env file with your API keys:

    OPENAI_API_KEY=your_openai_api_key
    DEEPGRAM_API_KEY=your_deepgram_api_key
    ELEVEN_API_KEY=your_elevenlabs_api_key
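
A missing or empty key usually surfaces only when the first API call fails, so checking all three names up front can save time. The sketch below uses only the standard library and assumes nothing about how the project itself loads the .env file. The helper names `parse_env_text` and `missing_keys` are hypothetical.

```python
from collections.abc import Mapping

# The three variable names shown in the .env example above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "DEEPGRAM_API_KEY", "ELEVEN_API_KEY")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Mapping[str, str]) -> list[str]:
    """Return the required keys that are absent or empty in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


sample = "OPENAI_API_KEY=abc\n# a comment\nDEEPGRAM_API_KEY='xyz'\n"
parsed = parse_env_text(sample)
absent = missing_keys(parsed)
```

Passing `os.environ` to `missing_keys` performs the same check against the live process environment.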

Usage

  1. Run the main script:

    poetry run python main.py
  2. Follow the prompts to enter the input and output languages.
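
The exact prompt wording and the accepted language formats are not documented here. As an illustration only, a prompt step of this kind could be sketched as follows. The function name `ask_languages`, the prompt strings, and the validation rules are all assumptions rather than the behavior of main.py.

```python
from typing import Callable


def ask_languages(prompt: Callable[[str], str] = input) -> tuple[str, str]:
    """Ask for the input and output languages and apply basic validation."""
    source = prompt("Input language: ").strip()
    target = prompt("Output language: ").strip()
    if not source or not target:
        raise ValueError("Both languages are required.")
    if source.lower() == target.lower():
        raise ValueError("Input and output languages must differ.")
    return source, target


# Non-interactive usage: feed scripted answers instead of reading the keyboard.
answers = iter([" English ", "Spanish"])
languages = ask_languages(lambda _message: next(answers))
```

Accepting the prompt function as a parameter keeps the step testable without a terminal.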

Project Structure

License

This project is licensed under the MIT License.
