This repository contains the YARP plugin for voice synthesis with VoiceBox APIs.
🚧 This repository is currently a work in progress. 🚧
🚧 The software contained in this repository is currently under testing. 🚧 APIs may change without any warning. 🚧 This code should not be used before its first official release. 🚧
Documentation of the individual devices is provided in the official YARP documentation page:
- Ubuntu/Debian: `sudo apt-get install libcurl4-openssl-dev nlohmann-json3-dev`
- macOS (homebrew): `brew install curl nlohmann-json`
You need to download and install VoiceBox on your PC (check the repo documentation for how to install the package).
**Hint:** Launch the VoiceBox app and check whether your GPU is listed in the GPU tab of the options section.

If it is not, check whether the torch version in the Python virtual env created by VoiceBox is compatible with your CUDA version by typing:
> python3 -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"

If you get an error, uninstall torch in the virtual env and reinstall a version compatible with your CUDA version. This should allow you to use VoiceBox with your GPU.
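The one-liner above raises an exception when torch is missing from the active environment. As a slightly more forgiving variant, the sketch below reports the same two values but also tells you when torch cannot be found at all. The helper name `torch_cuda_status` is hypothetical and is not part of VoiceBox or of this plugin; it only uses `torch.cuda.is_available()` and `torch.version.cuda`, the same calls as the command above. Run it from inside the virtual env created by VoiceBox.

```python
import importlib.util


def torch_cuda_status():
    """Return (torch_installed, cuda_available, cuda_version) without raising
    when torch is absent. Hypothetical helper, for illustration only."""
    if importlib.util.find_spec("torch") is None:
        # torch is not importable from this interpreter (wrong virtual env?)
        return (False, False, None)
    import torch

    # torch.version.cuda is None for CPU-only builds of torch
    return (True, bool(torch.cuda.is_available()), torch.version.cuda)


if __name__ == "__main__":
    installed, available, cuda_version = torch_cuda_status()
    if not installed:
        print("torch is not installed in this environment")
    elif cuda_version is None:
        print("torch is a CPU-only build; reinstall a CUDA-enabled version")
    elif not available:
        print(f"torch targets CUDA {cuda_version}, but no usable GPU was detected")
    else:
        print(f"GPU usable; torch targets CUDA {cuda_version}")
```

If the reported CUDA version does not match the one installed on your system, that is the case in which reinstalling torch, as described above, is likely to help.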
# Configure, compile and install
cmake -S. -Bbuild -DCMAKE_INSTALL_PREFIX=<install_prefix>
cmake --build build
cmake --build build --target install
To use the device, you need to create at least one voice profile in VoiceBox.
To do so, follow these steps:
(For this example we will use a built-in voice)
3. Select the voice, fill in the small form on the right, and click Create Profile

NB: Alternatively, two bash scripts in `src/devices/tests`, named `create_test_profile.sh` and `create_second_test_profile.sh`, will create an English-based voice profile named "test_API_001" based on the preset voices "af_sarah" and "bm_daniel" respectively.
To launch the VoiceBox server, either launch the app or start the server manually by going to the VoiceBox repo folder and typing:
source backend/venv/bin/activate
bun run dev:server

The recommended way to launch the device is via yarprobotinterface.
The following lines contain a simple configuration file that launches the device with the voice profile created previously.
<?xml version="1.0" encoding="UTF-8"?>
<!-- SPDX-FileCopyrightText: 2023 Istituto Italiano di Tecnologia (IIT) -->
<!-- SPDX-License-Identifier: BSD-3-Clause -->
<!DOCTYPE robot PUBLIC "-//YARP//DTD yarprobotinterface 3.0//EN" "http://www.yarp.it/DTD/yarprobotinterfaceV3.0.dtd">
<robot name="voiceBoxSynthesizer" build="2" portprefix="/ttsBot" xmlns:xi="http://www.w3.org/2001/XInclude">
<devices>
<device name="voiceBoxSpeechSynthesis" type="voiceBoxSynthesizer">
<param name="voice" extern-name="voiceBox_voice">
Sarah_001
</param>
</device>
<device name="synthesizerWrap" type="speechSynthesizer_nws_yarp">
<action phase="startup" level="5" type="attach">
<paramlist name="networks">
<elem name="subdeviceVoiceBox">
voiceBoxSpeechSynthesis
</elem>
</paramlist>
</action>
<action phase="shutdown" level="5" type="detach" />
</device>
</devices>
</robot>
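A common mistake when editing a configuration like the one above is attaching the wrapper to a device name that is not declared in the file, or leaving the wrong voice profile name in the `voice` parameter. The sketch below is one way to catch this before launching yarprobotinterface: it parses a copy of the configuration with Python's standard `xml.etree.ElementTree` and lists the configured voice, the attached device names, and any attached name with no matching `<device>`. The function `check_attachments` is a hypothetical helper, not part of YARP or of this plugin, and the XML declaration, DOCTYPE and comments are left out of the embedded copy for brevity.

```python
import xml.etree.ElementTree as ET

# Copy of the configuration above, without the XML declaration and DOCTYPE.
CONFIG = """
<robot name="voiceBoxSynthesizer" build="2" portprefix="/ttsBot">
  <devices>
    <device name="voiceBoxSpeechSynthesis" type="voiceBoxSynthesizer">
      <param name="voice" extern-name="voiceBox_voice">
        Sarah_001
      </param>
    </device>
    <device name="synthesizerWrap" type="speechSynthesizer_nws_yarp">
      <action phase="startup" level="5" type="attach">
        <paramlist name="networks">
          <elem name="subdeviceVoiceBox">
            voiceBoxSpeechSynthesis
          </elem>
        </paramlist>
      </action>
      <action phase="shutdown" level="5" type="detach" />
    </device>
  </devices>
</robot>
"""


def check_attachments(xml_text):
    """Return (voice, attached, missing) for a yarprobotinterface-style config.

    Hypothetical helper: 'missing' lists attached names that do not match
    any <device name="..."> declared in the same file.
    """
    root = ET.fromstring(xml_text.strip())
    declared = {device.get("name") for device in root.iter("device")}
    attached = [elem.text.strip() for elem in root.iter("elem")]
    missing = [name for name in attached if name not in declared]
    voice = root.find(".//param[@name='voice']").text.strip()
    return voice, attached, missing


if __name__ == "__main__":
    voice, attached, missing = check_attachments(CONFIG)
    print("voice profile:", voice)
    print("attached devices:", attached)
    print("undeclared attachments:", missing)
```

For the configuration shown here the list of undeclared attachments is empty. The check does not verify that the voice profile actually exists in VoiceBox; that still has to be confirmed in the app.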
This repository is maintained by:
| @elandini84 |


