Repo sponsors: Recall.ai - API for desktop recording

If you’re looking for a hosted desktop recording API, consider checking out Recall.ai, an API that records Zoom, Google Meet, Microsoft Teams, in-person meetings, and more.

LocalVocal - Speech AI assistant OBS Plugin


Introduction

LocalVocal lets you transcribe speech to text locally on your machine and simultaneously translate it into any language. ✅ No GPU required, ✅ no cloud costs, ✅ no network and ✅ no downtime! Privacy first - all data stays on your machine.

The plugin runs OpenAI's Whisper to process real-time speech and predict a transcription, utilizing Whisper.cpp from ggerganov to run the model efficiently on CPUs and GPUs. Translation is done with CTranslate2.

Usage

   
https://youtu.be/ns4cP9HFTxQ https://youtu.be/4llyfNi9FGs https://youtu.be/R04w02qG26o


Current Features:

  • Transcribe audio to text in real time in 100 languages
  • Display captions on screen using text sources
  • Send captions to a .txt or .srt file (to be read by external tools or for video playback), with or without aggregation
  • Captions synced with OBS recording timestamps
  • Send captions over an RTMP stream to e.g. YouTube or Twitch
  • Bring your own Whisper model (any GGML)
  • Translate captions in real time to major languages (using cloud providers, Whisper's built-in translation, or NMT models)
  • CUDA, hipBLAS (AMD ROCm), Apple Arm64, AVX & SSE acceleration support
  • Filter out or replace any part of the produced captions
  • Partial transcriptions for a streaming-captions experience
  • Hundreds of fine-tuned Whisper models for dozens of languages from HuggingFace

Download

Check out the latest releases for downloads and install instructions.

Available Versions

LocalVocal is available in multiple versions to cater to different hardware configurations and operating systems. Below is a brief explanation of the different versions you can download:

  • Windows (please ensure you have the latest MSVC runtime installed)
  • macOS
    • Intel (x86_64): This version is for Mac computers with Intel processors. See macOS variants
    • Apple Silicon (arm64): This version is optimized for Mac computers with Apple Silicon (M1, M2, etc.) processors. See macOS variants
  • Linux x86_64: This version is for Linux systems with x86_64 architecture.

Make sure to download the version that matches your system's hardware and operating system for the best performance.

Whisper backends are now loaded dynamically when the plugin starts, which has two major benefits:

  • Better CPU performance and compatibility - Whisper can automatically select the best CPU backend that works on your system out of all the ones available. This means the plugin can take full advantage of newer CPUs with more features, while also remaining usable on older hardware than before (prior to v0.5.0, users were assumed to have at least an AVX2-capable CPU)
  • More stability - if a backend cannot be used on your system, whether due to unavailable CPU features, missing dependencies, or something else, it simply isn't loaded instead of causing a crash

To ensure the plugin works "out-of-the-box", it is configured by default to use the CPU only (this also applies to users upgrading from versions older than v0.5.0). This avoids immediate crashes on startup if, for any reason, your GPU cannot be used by one of the Whisper backends (e.g. the Metal backend on Apple crashes outright if it cannot allocate a buffer to load a model into).

If you want to use GPU acceleration, open the plugin settings and select your desired GPU acceleration backend.

Generic variants

These variants should run well on any system regardless of hardware configuration. They contain the following Whispercpp backends:

  • CPU
    • Generic x86_64
    • Generic x86_64 with SSE4.2
    • Sandy Bridge (CPU with SSE4.2, AVX)
    • Haswell (CPU with SSE4.2, AVX, F16C, AVX2, BMI2, FMA)
    • Sky Lake (CPU with SSE4.2, AVX, F16C, AVX2, BMI2, FMA, AVX512)
    • Ice Lake (CPU with SSE4.2, AVX, F16C, AVX2, BMI2, FMA, AVX512, AVX512_VBMI, AVX512_VNNI)
    • Alder Lake (CPU with SSE4.2, AVX, F16C, AVX2, BMI2, FMA, AVX_VNNI)
    • Sapphire Rapids (CPU with SSE4.2, AVX, F16C, AVX2, BMI2, FMA, AVX512, AVX512_VBMI, AVX512_VNNI, AVX512_BF16, AMX_TILE, AMX_INT8)
  • OpenBLAS - Used in conjunction with a CPU backend to accelerate processing speed
  • Vulkan - Standard cross-platform graphics library allowing GPU-accelerated processing on GPUs that aren't supported by CUDA or ROCm (can also work with integrated GPUs)
  • OpenCL (currently Linux only) - Industry standard parallel compute library that may be faster than Vulkan on supported GPUs

NVIDIA optimized variants

These variants contain all the backends from the generic variant, plus a CUDA backend that provides accelerated performance on supported NVIDIA GPUs. If the OpenCL backend is available on your platform, it also uses the CUDA OpenCL library instead of the generic one.

Make sure you have the latest NVIDIA GPU drivers installed; you will likely also need CUDA Toolkit v12.8.0 or newer.
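You can confirm the driver is present, and see the highest CUDA version it supports, with the nvidia-smi utility that ships with the NVIDIA driver:

nvidia-smi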

If installing on Linux and you don't need the entire CUDA toolkit, you can instead install either the cuda-runtime-12-8 package to get all the runtime libraries and drivers, or the cuda-libraries-12-8 package to get just the runtime libraries.
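On Debian/Ubuntu systems this might look like the following, assuming NVIDIA's CUDA apt repository is already configured:

sudo apt install cuda-libraries-12-8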

AMD optimized variants

These variants contain all the backends from the generic variant, plus a hipBLAS backend using AMD's ROCm framework that accelerates computation on supported AMD GPUs.

Please ensure you have a compatible AMD GPU driver installed.

macOS variants

These variants come with the following backends available:

  • CPU
    • The same x86_64 variants as listed in Generic variants for Intel CPUs
    • m1, m2/m3, and m4 variants for ARM CPUs
  • Accelerate - Used in conjunction with a CPU backend to accelerate processing speed
  • Metal - Uses the system's GPU for accelerated processing
  • CoreML - Special backend that uses Apple's CoreML instead of Whisper's normal model processing, running on either the Metal or CPU backends

Models

The plugin ships with the Tiny.en model, and will automatically download other Whisper models through a dropdown in the settings. There's also an option to select an external GGML Whisper model file if you have one on disk.

If using CoreML on Apple, it will also automatically download the appropriate CoreML encoder model for your selected model.

Get more models from https://ggml.ggerganov.com/ and HuggingFace, or follow the instructions on whisper.cpp to create your own models or download others, such as distilled models.
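For example, one way to fetch a stock GGML model is directly from the ggerganov/whisper.cpp HuggingFace repository (swap the filename for the model you want):

$ curl -L -o ggml-small.bin https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin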

Building

The plugin was built and tested on macOS (Intel & Apple Silicon), Windows (with and without NVIDIA CUDA) and Linux.

Start by cloning this repo to a directory of your choice.

macOS

The CI pipeline scripts can be used locally: just call the zsh build script, which builds for the architecture specified in $MACOS_ARCH (either x86_64 or arm64).

$ MACOS_ARCH="x86_64" ./.github/scripts/build-macos -c Release

Install

The above script should succeed, and the plugin files (e.g. obs-localvocal.plugin) will reside in the ./release/Release folder under the repository root. Copy the .plugin file to the OBS plugins directory, e.g. ~/Library/Application Support/obs-studio/plugins.
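Assuming the default paths above, the copy is:

$ cp -R release/Release/obs-localvocal.plugin ~/Library/Application\ Support/obs-studio/plugins/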

To get a .pkg installer file, run for example:

$ ./.github/scripts/package-macos -c Release

(Note that the outputs may end up in the Release folder rather than the install folder that package-macos expects, in which case you will need to rename the folder from build_x86_64/Release to build_x86_64/install.)
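In that case, the rename is simply:

$ mv build_x86_64/Release build_x86_64/install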

Linux

Using pre-compiled variants

  1. Clone the repository and, if not using Ubuntu, install the development versions of these dependencies using your distribution's package manager (see the example after this list):

    • libcurl
    • libsimde
    • libssl
    • icu
    • openblas (preferably the OpenMP variant rather than the pthreads variant)
    • OpenCL
    • Vulkan

    Installing ccache is also recommended if you are likely to build the plugin multiple times.
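    Package names vary by distribution; on a Debian/Ubuntu derivative the set above might look like this (the package names are assumptions, check your distribution):

    # package names may differ on your distribution
    sudo apt install libcurl4-openssl-dev libssl-dev libsimde-dev libicu-dev \
        libopenblas-openmp-dev ocl-icd-opencl-dev libvulkan-dev ccache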

  2. Install Rust via rustup (recommended) or your distribution's package manager

  3. Set the ACCELERATION environment variable to one of generic, nvidia, or amd (defaults to generic if unset)

    export ACCELERATION="nvidia"
  4. Then, from the repo directory, build the plugin by running:

    ./.github/scripts/build-linux

    If you can't use the CI build script for some reason, you can build the plugin as follows:

    cmake -B build_x86_64 --preset linux-x86_64 -DCMAKE_INSTALL_PREFIX=./release
    cmake --build build_x86_64 --target install
  5. Installing

    If you're using Ubuntu and the plugin was previously installed from a .deb package, copy the results to the standard OBS folders:

    sudo cp -R release/RelWithDebInfo/lib/* /usr/lib/
    sudo cp -R release/RelWithDebInfo/share/* /usr/share/

    Otherwise, follow the official OBS plugins guide and copy the results to your user plugins folder:

    mkdir -p ~/.config/obs-studio/plugins/obs-localvocal/bin/64bit
    cp -R release/RelWithDebInfo/lib/x86_64-linux-gnu/obs-plugins/* ~/.config/obs-studio/plugins/obs-localvocal/bin/64bit/
    mkdir -p ~/.config/obs-studio/plugins/obs-localvocal/data
    cp -R release/RelWithDebInfo/share/obs/obs-plugins/obs-localvocal/* ~/.config/obs-studio/plugins/obs-localvocal/data/

    Note: The lib path in the release folder varies depending on your Linux distribution (e.g. on Gentoo the plugin libraries are found in release/RelWithDebInfo/lib64/obs-plugins) but the destination directory to copy them into will always be the same.
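    If you're unsure where the plugin libraries ended up under release/, a quick search will locate them:

    find release/RelWithDebInfo -name 'obs-localvocal*'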

Building Whispercpp from source along with the plugin

If you can't use the CI build script for some reason, or simply prefer to build the Whispercpp dependency from source along with the plugin, follow the steps above but build the plugin using the following commands:

cmake -B build_x86_64 --preset linux-x86_64 -DLINUX_SOURCE_BUILD=ON -DCMAKE_INSTALL_PREFIX=./release
cmake --build build_x86_64 --target install

When building from source, the Vulkan and OpenCL development libraries are optional and will only be used in the build if they are installed. Similarly if the CUDA or ROCm toolkits are found, they will also be used and the relevant Whisper backends will be enabled.

The default for a full source build is to build both Whisper and the plugin optimized for the host system. To change this behaviour, add one or both of the following options to the CMake configure command (the first of the two commands above); a combined example follows the list:

  • to build all CPU backends add -DWHISPER_DYNAMIC_BACKENDS=ON
  • to build all CUDA kernels add -DWHISPER_BUILD_ALL_CUDA_ARCHITECTURES=ON
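For instance, a configure command enabling both options would be:

cmake -B build_x86_64 --preset linux-x86_64 -DLINUX_SOURCE_BUILD=ON -DWHISPER_DYNAMIC_BACKENDS=ON -DWHISPER_BUILD_ALL_CUDA_ARCHITECTURES=ON -DCMAKE_INSTALL_PREFIX=./release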

Windows

Use the CI scripts again, for example:

> .github/scripts/Build-Windows.ps1 -Configuration Release

The build should exist in the ./release folder under the repository root. You can manually install the files into the OBS directory:

> Copy-Item -Recurse -Force "release\Release\*" -Destination "C:\Program Files\obs-studio\"

Building with CUDA support on Windows

LocalVocal will now build with CUDA support automatically through a prebuilt binary of Whisper.cpp from https://github.com/locaal-ai/locaal-ai-dep-whispercpp. The CMake scripts will download all necessary files.

To build with CUDA, set the ACCELERATION environment variable (to cpu, hipblas, or cuda) and build regularly:

> $env:ACCELERATION="cuda"
> .github/scripts/Build-Windows.ps1 -Configuration Release