Repo sponsors: Recall.ai - API for desktop recording
If you’re looking for a hosted desktop recording API, consider checking out Recall.ai, an API that records Zoom, Google Meet, Microsoft Teams, in-person meetings, and more.
LocalVocal lets you transcribe speech to text locally on your machine and simultaneously translate it to any language. ✅ No GPU required, ✅ no cloud costs, ✅ no network and ✅ no downtime! Privacy first - all data stays on your machine.
The plugin runs OpenAI's Whisper to process real-time speech and predict a transcription, utilizing Whisper.cpp from ggerganov to run the model efficiently on CPUs and GPUs. Translation is done with CTranslate2.
https://youtu.be/ns4cP9HFTxQ
https://youtu.be/4llyfNi9FGs
https://youtu.be/R04w02qG26o
Do more with LocalVocal:
- Real-time Translation
- Translate & Caption any Application
- Real-time Translation with DeepL
- Real-time Translation with OpenAI
- ChatGPT + Text-to-speech
- POST Captions to YouTube
- Local LLM Real-time Translation
- Usage Tutorial
Current Features:
- Transcribe audio to text in real time in 100 languages
- Display captions on screen using text sources
- Send captions to a .txt or .srt file (to be read by external sources or for video playback), with or without aggregation
- Sync'ed captions with OBS recording timestamps
- Send captions on an RTMP stream to e.g. YouTube or Twitch
- Bring your own Whisper model (any GGML)
- Translate captions in real time to major languages (with cloud providers, Whisper's built-in translation, as well as NMT models)
- CUDA, hipBLAS (AMD ROCm), Apple Arm64, AVX & SSE acceleration support
- Filter out or replace any part of the produced captions
- Partial transcriptions for a streaming-captions experience
- 100s of fine-tuned Whisper models for dozens of languages from HuggingFace
Check out the latest releases for downloads and install instructions.
LocalVocal is available in multiple versions to cater to different hardware configurations and operating systems. Below is a brief explanation of the different versions you can download:
- Windows (please ensure you have the latest MSVC runtime installed)
- generic: This version runs on all systems. See Generic variants for more details
- NVidia: This version is optimized for systems with NVIDIA GPUs. See NVidia optimized variants for more details
- AMD: This version is optimized for systems with AMD GPUs. See AMD optimized variants for more details
- MacOS
- Intel (x86_64): This version is for Mac computers with Intel processors. See MacOS variants
- Apple Silicon (arm64): This version is optimized for Mac computers with Apple Silicon (M1, M2, etc.) processors. See MacOS variants
- Linux x86_64: This version is for Linux systems with x86_64 architecture.
- generic: This version runs on all systems. See Generic variants for more details
- NVidia: This version is optimized for systems with NVIDIA GPUs. See NVidia optimized variants for more details
- AMD: This version is optimized for systems with AMD GPUs. See AMD optimized variants for more details
Make sure to download the version that matches your system's hardware and operating system for the best performance.
Whisper backends are now loaded dynamically when the plugin starts, which has 2 major benefits:
- Better CPU performance and compatibility - Whisper can automatically select the best CPU backend that works on your system out of all the ones available. This means the plugin can make full use of newer CPUs with more features, while also remaining usable on even older hardware than before (prior to v0.5.0 it was assumed that users would have at least AVX2-capable CPUs)
- More stability - If a backend is present that cannot be used on your system, either due to unavailable CPU features, missing dependencies, or something else, it will simply not be loaded instead of causing a crash
To ensure the plugin works "out-of-the-box", it is configured by default to use the CPU only (this is also the case for users upgrading from versions older than v0.5.0). This avoids immediate crashes on startup if for any reason your GPU cannot be used by one of the Whisper backends (e.g. the Metal backend on Apple simply crashes if it is unable to allocate a buffer to load a model into).
If you want to use GPU acceleration, please go into the plugin settings and select your desired GPU acceleration backend.
These variants should run well on any system regardless of hardware configuration. They contain the following Whispercpp backends:
- CPU
- Generic x86_64
- Generic x86_64 with SSE4.2
- Sandy Bridge (CPU with SSE4.2, AVX)
- Haswell (CPU with SSE4.2, AVX, F16C, AVX2, BMI2, FMA)
- Sky Lake (CPU with SSE4.2, AVX, F16C, AVX2, BMI2, FMA, AVX512)
- Ice Lake (CPU with SSE4.2, AVX, F16C, AVX2, BMI2, FMA, AVX512, AVX512_VBMI, AVX512_VNNI)
- Alder Lake (CPU with SSE4.2, AVX, F16C, AVX2, BMI2, FMA, AVX_VNNI)
- Sapphire Rapids (CPU with SSE4.2, AVX, F16C, AVX2, BMI2, FMA, AVX512, AVX512_VBMI, AVX512_VNNI, AVX512_BF16, AMX_TILE, AMX_INT8)
- OpenBLAS - Used in conjunction with a CPU backend to accelerate processing speed
- Vulkan - Standard cross-platform graphics library allowing for GPU accelerated processing on GPUs that aren't supported by CUDA or ROCm. Can also work with integrated GPUs
- May need the Vulkan runtime on Windows which can be downloaded at https://sdk.lunarg.com/sdk/download/1.4.328.1/windows/VulkanRT-X64-1.4.328.1-Installer.exe
- OpenCL (currently Linux only) - Industry standard parallel compute library that may be faster than Vulkan on supported GPUs
These variants contain all the backends from the generic variant, plus a CUDA backend that provides accelerated performance on supported NVidia GPUs. If the OpenCL backend is available on your platform, it also uses the CUDA OpenCL library instead of the generic one.
Make sure you have the latest NVidia GPU drivers installed and you will likely also need the CUDA toolkit v12.8.0 or newer.
If installing on Linux, to avoid installing the entire CUDA toolkit when you don't need it, you can install either the cuda-runtime-12-8 package to get all the runtime libraries and drivers, or the cuda-libraries-12-8 package to get just the runtime libraries.
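As an illustration (assuming NVIDIA's CUDA apt repository is already configured on a Debian/Ubuntu-based system; exact package names may differ for your distribution):

```sh
# Assumes NVIDIA's CUDA apt repository is already set up for your distribution
sudo apt-get update
# Runtime libraries plus drivers:
sudo apt-get install -y cuda-runtime-12-8
# Or just the runtime libraries:
# sudo apt-get install -y cuda-libraries-12-8
```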
These variants contain all the backends from the generic variant, plus a hipBLAS backend using AMD's ROCm framework that accelerates computation on supported AMD GPUs.
Please ensure you have a compatible AMD GPU driver installed.
These variants come with the following backends available:
- CPU
- The same x86_64 variants as listed in Generic variants for Intel CPUs
- m1, m2/m3, and m4 variants for ARM CPUs
- Accelerate - Used in conjunction with a CPU backend to accelerate processing speed
- Metal - Uses the system's GPU for accelerated processing
- CoreML - Special backend that uses Apple's CoreML instead of Whisper's normal model processing, running on either the Metal or CPU backends
The plugin ships with the Tiny.en model, and will automatically download other Whisper models selected from a dropdown. There's also an option to select an external GGML Whisper model file if you have it on disk.
If using CoreML on Apple, it will also automatically download the appropriate CoreML encoder model for your selected model.
Get more models from https://ggml.ggerganov.com/ and HuggingFace, or follow the instructions in whisper.cpp to create your own models or download others such as distilled models.
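As a hedged example, fetching the Small model straight from the ggerganov/whisper.cpp HuggingFace repository might look like this (the exact file name and path are assumptions based on that repository's current layout):

```sh
# Download a GGML Whisper model; point the plugin's external model file option at the result
curl -L -o ggml-small.bin \
  "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin"
```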
The plugin was built and tested on Mac OSX (Intel & Apple silicon), Windows (with and without Nvidia CUDA) and Linux.
Start by cloning this repo to a directory of your choice.
Using the CI pipeline scripts, locally you would just call the zsh script, which builds for the architecture specified in $MACOS_ARCH (either x86_64 or arm64).
$ MACOS_ARCH="x86_64" ./.github/scripts/build-macos -c Release

The above script should succeed and the plugin files (e.g. obs-localvocal.plugin) will reside in the ./release/Release folder off of the root. Copy the .plugin file to the OBS directory e.g. ~/Library/Application Support/obs-studio/plugins.
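A sketch of that copy step, using the paths mentioned above (adjust if your OBS installation differs):

```sh
# Install the built plugin bundle into the per-user OBS plugins folder
mkdir -p ~/Library/Application\ Support/obs-studio/plugins
cp -R ./release/Release/obs-localvocal.plugin ~/Library/Application\ Support/obs-studio/plugins/
```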
To get a .pkg installer file, run for example:
$ ./.github/scripts/package-macos -c Release

(Note that the outputs may end up in the Release folder rather than the install folder that package-macos expects, so you will need to rename the folder from build_x86_64/Release to build_x86_64/install.)
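That rename might look like:

```sh
# Move the build output to the folder name package-macos expects
mv build_x86_64/Release build_x86_64/install
```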
- Clone the repository and, if not using Ubuntu, install the development versions of these dependencies using your distribution's package manager (an illustrative example follows this list):
- libcurl
- libsimde
- libssl
- icu
- openblas (preferably the OpenMP variant rather than the pthreads variant)
- OpenCL
- Vulkan
Installing ccache is also recommended if you are likely to be building the plugin multiple times
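As an illustration, on a Debian-based distribution the equivalent development packages might be installed like this (package names are assumptions and vary between distributions and releases):

```sh
# Illustrative only - adjust package names for your distribution
sudo apt-get install -y \
  libcurl4-openssl-dev libsimde-dev libssl-dev libicu-dev \
  libopenblas-openmp-dev ocl-icd-opencl-dev opencl-headers \
  libvulkan-dev ccache
```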
- Install rust via rustup (recommended), or your distribution's package manager
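If you go the rustup route, the installer one-liner documented at https://rustup.rs is:

```sh
# Installs rustup and a default Rust toolchain
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```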
- Set the ACCELERATION environment variable to one of generic, nvidia, or amd (defaults to generic if unset):
export ACCELERATION="nvidia"
- Then from the repo directory build the plugin by running:
./.github/scripts/build-linux
If you can't use the CI build script for some reason, you can build the plugin as follows
cmake -B build_x86_64 --preset linux-x86_64 -DCMAKE_INSTALL_PREFIX=./release
cmake --build build_x86_64 --target install
- Installing:
If using Ubuntu and the plugin was previously installed using a .deb package, copy the results to the standard OBS folders on Ubuntu
sudo cp -R release/RelWithDebInfo/lib/* /usr/lib/
sudo cp -R release/RelWithDebInfo/share/* /usr/share/
Otherwise, follow the official OBS plugins guide and copy the results to your user plugins folder
mkdir -p ~/.config/obs-studio/plugins/obs-localvocal/bin/64bit
cp -R release/RelWithDebInfo/lib/x86_64-linux-gnu/obs-plugins/* ~/.config/obs-studio/plugins/obs-localvocal/bin/64bit/
mkdir -p ~/.config/obs-studio/plugins/obs-localvocal/data
cp -R release/RelWithDebInfo/share/obs/obs-plugins/obs-localvocal/* ~/.config/obs-studio/plugins/obs-localvocal/data/
Note: The lib path in the release folder varies depending on your Linux distribution (e.g. on Gentoo the plugin libraries are found in release/RelWithDebInfo/lib64/obs-plugins), but the destination directory to copy them into will always be the same.
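On such a distribution the library copy step above might instead look like this (the lib64 path comes from the Gentoo example; check your own build output):

```sh
# Adjust the source path to wherever your build placed the plugin libraries
cp -R release/RelWithDebInfo/lib64/obs-plugins/* ~/.config/obs-studio/plugins/obs-localvocal/bin/64bit/
```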
If you can't use the CI build script for some reason, or simply prefer to build the Whispercpp dependency from source along with the plugin, follow the steps above but build the plugin using the following commands:
cmake -B build_x86_64 --preset linux-x86_64 -DLINUX_SOURCE_BUILD=ON -DCMAKE_INSTALL_PREFIX=./release
cmake --build build_x86_64 --target install

When building from source, the Vulkan and OpenCL development libraries are optional and will only be used in the build if they are installed. Similarly, if the CUDA or ROCm toolkits are found, they will also be used and the relevant Whisper backends will be enabled.
The default for a full source build is to build both Whisper and the plugin optimized for the host system. To change this behaviour add one or both of the following options to the CMake configure command (the first of the two):
- to build all CPU backends add -DWHISPER_DYNAMIC_BACKENDS=ON
- to build all CUDA kernels add -DWHISPER_BUILD_ALL_CUDA_ARCHITECTURES=ON
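For example, a configure command enabling both options (building on the source-build command above) might look like:

```sh
# Full source build with all CPU backends and all CUDA architectures enabled
cmake -B build_x86_64 --preset linux-x86_64 -DLINUX_SOURCE_BUILD=ON \
  -DWHISPER_DYNAMIC_BACKENDS=ON -DWHISPER_BUILD_ALL_CUDA_ARCHITECTURES=ON \
  -DCMAKE_INSTALL_PREFIX=./release
cmake --build build_x86_64 --target install
```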
Use the CI scripts again, for example:
> .github/scripts/Build-Windows.ps1 -Configuration Release

The build should exist in the ./release folder off the root. You can manually install the files in the OBS directory.
> Copy-Item -Recurse -Force "release\Release\*" -Destination "C:\Program Files\obs-studio\"

LocalVocal will now build with CUDA support automatically through a prebuilt binary of Whisper.cpp from https://github.com/locaal-ai/locaal-ai-dep-whispercpp. The CMake scripts will download all necessary files.
To build with CUDA, set the ACCELERATION environment variable (to cpu, hipblas, or cuda) and build as usual:
> $env:ACCELERATION="cuda"
> .github/scripts/Build-Windows.ps1 -Configuration Release