Encoding and Decoding of Audio

Compressed Audio

Unfortunately the available memory on Microcontrollers is quite restricted, and we do not get very far by storing an uncompressed WAV file, e.g. in program/flash memory, so I started to look into compressed audio formats. The compression and decompression are done with the help of Codecs. Codecs are also important if you need to transmit audio data at a high sampling rate over a line with limited capacity.
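
To put numbers on the memory problem: the size of uncompressed PCM audio follows directly from the sampling rate, the number of channels and the sample width. The following is a minimal sketch; pcmBytes is a name made up for this illustration and is not part of any library.

```cpp
#include <cstdint>

// Size in bytes of uncompressed PCM audio.
// Hypothetical helper for illustration only.
constexpr uint64_t pcmBytes(uint32_t sampleRate, uint32_t channels,
                            uint32_t bitsPerSample, uint32_t seconds) {
  return static_cast<uint64_t>(sampleRate) * channels * (bitsPerSample / 8) * seconds;
}

// One minute of 44.1 kHz stereo with 16 bits: 44100 * 2 * 2 * 60 bytes
static_assert(pcmBytes(44100, 2, 16, 60) == 10584000ULL, "roughly 10 MB per minute");
// One minute of 8 kHz mono speech with 16 bits: 8000 * 1 * 2 * 60 bytes
static_assert(pcmBytes(8000, 1, 16, 60) == 960000ULL, "roughly 1 MB per minute");
```

Many boards only offer a few megabytes of flash, so even a single minute of 44.1 kHz stereo audio is unlikely to fit without compression.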

On the desktop we can use the FFmpeg project which comes with a rich set of functionality. Unfortunately the situation is much more fragmented for Microcontrollers.

I started to collect the relevant libraries and in order to make things simple to use I also added a simple C++ API on top of the available libraries:

AudioDecoder converts an encoded format to 16 bit PCM data.
AudioEncoder converts 16 bit PCM data to the encoded format.
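
The division of labour between the two can be illustrated with a toy codec written only for this page: the encoder turns 16 bit PCM samples into a smaller representation, and the decoder turns that representation back into 16 bit PCM. ToyEncoder8 and ToyDecoder8 are hypothetical names and do not reflect the actual class interfaces of the library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy encoder: keeps only the upper 8 bits of each 16 bit sample (lossy, 2:1).
struct ToyEncoder8 {
  std::vector<int8_t> encode(const int16_t* pcm, size_t count) const {
    std::vector<int8_t> out;
    out.reserve(count);
    for (size_t i = 0; i < count; ++i) {
      out.push_back(static_cast<int8_t>(pcm[i] / 256));
    }
    return out;
  }
};

// Toy decoder: scales each 8 bit value back to the 16 bit range.
struct ToyDecoder8 {
  std::vector<int16_t> decode(const int8_t* encoded, size_t count) const {
    std::vector<int16_t> out;
    out.reserve(count);
    for (size_t i = 0; i < count; ++i) {
      out.push_back(static_cast<int16_t>(static_cast<int>(encoded[i]) * 256));
    }
    return out;
  }
};
```

The round trip is lossy: the lower 8 bits of every sample are gone after decoding. Real codecs make a much better trade between size and quality, but the direction of the data flow is the same.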

Supported Codecs

Library	Class	Include	Function	Format	Application
-	DecoderL8	AudioCodecs/CodecL8.h	Decoding	PCM	Audio
-	EncoderL8	AudioCodecs/CodecL8.h	Encoding	PCM	Audio
-	DecoderL16	AudioCodecs/CodecL16.h	Decoding	PCM	Audio
-	EncoderL16	AudioCodecs/CodecL16.h	Encoding	PCM	Audio
-	DecoderFloat	AudioCodecs/CodecFloat.h	Decoding	PCM	Audio
-	EncoderFloat	AudioCodecs/CodecFloat.h	Encoding	PCM	Audio
-	WAVDecoder	AudioCodecs/CodecWAV.h	Decoding	WAV	Audio
-	WAVEncoder	AudioCodecs/CodecWAV.h	Encoding	WAV	Audio
-	WavIMADecoder	AudioCodecs/CodecWavIMA.h	Decoding	WAV/IMA	Audio
-	DecoderBase64	AudioCodecs/CodecBase64.h	Decoding	Base64
-	ADTSDecoder	AudioCodecs/CodecADTS.h	Decoding	ADTS	Audio
-	EncoderBase64	AudioCodecs/CodecBase64.h	Encoding	Base64
-	MetaDataFilterDecoder		Decoding	MP3	Audio
libhelix	MP3DecoderHelix	AudioCodecs/CodecMP3Helix.h	Decoding	MP3	Audio
libmad	MP3DecoderMAD	AudioCodecs/CodecMP3MAD.h	Decoding	MP3	Audio
minimp3	MP3DecoderMini	AudioCodecs/CodecMP3Mini.h	Decoding	MP3	Audio
liblame	MP3EncoderLAME	AudioCodecs/CodecMP3LAME.h	Encoding	MP3	Audio
libhelix	AACDecoderHelix	AudioCodecs/CodecAACHelix.h	Decoding	AAC	Audio
libfaad	AACDecoderFAAD	AudioCodecs/CodecAACFAAD.h	Decoding	AAC	Audio
fdk-aac	AACDecoderFDK	AudioCodecs/CodecAACFDK.h	Decoding	AAC	Audio
fdk-aac	AACEncoderFDK	AudioCodecs/CodecAACFDK.h	Encoding	AAC	Audio
libflac	FLACDecoder	AudioCodecs/CodecFLAC.h	Decoding	FLAC	Audio
libflac	FLACEncoder	AudioCodecs/CodecFLAC.h	Encoding	FLAC	Audio
libfoxenflac	FLACDecoderFoxen	AudioCodecs/CodecFLACFoxen.h	Decoding	FLAC	Audio
libvorbis-tremor	VorbisDecoder	AudioCodecs/CodecVorbis.h	Decoding	OGG Vorbis	Audio
libsbc	SBCDecoder	AudioCodecs/CodecSBC.h	Decoding	SBC	Audio
libsbc	SBCEncoder	AudioCodecs/CodecSBC.h	Encoding	SBC	Audio
liblc3	LC3Decoder	AudioCodecs/CodecLC3.h	Decoding	LC3	Audio
liblc3	LC3Encoder	AudioCodecs/CodecLC3.h	Encoding	LC3	Audio
libopenaptx	APTXDecoder	AudioCodecs/CodecAPTX.h	Decoding	APTX	Audio
libopenaptx	APTXEncoder	AudioCodecs/CodecAPTX.h	Encoding	APTX	Audio
codec-opus	OpusAudioDecoder	AudioCodecs/CodecOpus.h	Decoding	Opus	Audio
codec-opus	OpusAudioEncoder	AudioCodecs/CodecOpus.h	Encoding	Opus	Audio
codec-opus	OpusOggDecoder	AudioCodecs/CodecOpusOgg.h	Decoding	Opus	Audio
codec-opus	OpusOggEncoder	AudioCodecs/CodecOpusOgg.h	Encoding	Opus	Audio
adpcm	ADPCMDecoder	AudioCodecs/CodecADPCM.h	Decoding	ADPCM	Audio
adpcm	ADPCMEncoder	AudioCodecs/CodecADPCM.h	Encoding	ADPCM	Audio
adpcm-xq	ADPCMDecoderXQ	AudioCodecs/CodecADPCM.h	Decoding	ADPCM	Audio
adpcm-xq	ADPCMEncoderXQ	AudioCodecs/CodecADPCM.h	Encoding	ADPCM	Audio
alac	EncoderALAC	AudioCodecs/CodecALAC.h	Encoding	ALAC	Audio
alac	DecoderALAC	AudioCodecs/CodecALAC.h	Decoding	ALAC	Audio
libgsm	GSMDecoder	AudioCodecs/CodecGSM.h	Decoding	GSM	Speech
libgsm	GSMEncoder	AudioCodecs/CodecGSM.h	Encoding	GSM	Speech
libg7xx	G711_ALAWDecoder	AudioCodecs/CodecG7xx.h	Decoding	ALAW	Speech
libg7xx	G711_ALAWEncoder	AudioCodecs/CodecG7xx.h	Encoding	ALAW	Speech
libg7xx	G711_ULAWDecoder	AudioCodecs/CodecG7xx.h	Decoding	ULAW	Speech
libg7xx	G711_ULAWEncoder	AudioCodecs/CodecG7xx.h	Encoding	ULAW	Speech
libg7xx	G721Decoder	AudioCodecs/CodecG7xx.h	Decoding	G.721	Speech
libg7xx	G721Encoder	AudioCodecs/CodecG7xx.h	Encoding	G.721	Speech
libg722	G722Decoder	AudioCodecs/CodecG722.h	Decoding	G.722	Speech
libg722	G722Encoder	AudioCodecs/CodecG722.h	Encoding	G.722	Speech
libg7xx	G723_24Decoder	AudioCodecs/CodecG7xx.h	Decoding	G.723	Speech
libg7xx	G723_24Encoder	AudioCodecs/CodecG7xx.h	Encoding	G.723	Speech
libg7xx	G723_40Decoder	AudioCodecs/CodecG7xx.h	Decoding	G.723	Speech
libg7xx	G723_40Encoder	AudioCodecs/CodecG7xx.h	Encoding	G.723	Speech
codec2	Codec2Decoder	AudioCodecs/CodecCodec2.h	Decoding	Codec2	Speech
codec2	Codec2Encoder	AudioCodecs/CodecCodec2.h	Encoding	Codec2	Speech
codec-amr	AMRNBEncoder	AudioCodecs/CodecAMRNB.h	Encoding	AMR narrowband	Speech
codec-amr	AMRNBDecoder	AudioCodecs/CodecAMRNB.h	Decoding	AMR narrowband	Speech
codec-amr	AMRWBEncoder	AudioCodecs/CodecAMRWB.h	Encoding	AMR wideband	Speech
codec-amr	AMRWBDecoder	AudioCodecs/CodecAMRWB.h	Decoding	AMR wideband	Speech

Container

An audio container, also known as a container format, is a file format that encapsulates audio data, along with metadata and other information.

Library	Class	Include	Function	Format	Application
libopus	OggContainerDecoder	AudioCodecs/ContainerOgg.h	Decoding	Ogg	Audio
libopus	OggContainerEncoder	AudioCodecs/ContainerOgg.h	Encoding	Ogg	Audio
-	BinaryContainerEncoder	AudioCodecs/ContainerBinary.h	Encoding	-	Audio
-	BinaryContainerDecoder	AudioCodecs/ContainerBinary.h	Decoding	-	Audio
-	AVIDecoder	AudioCodecs/ContainerAVI.h	Decoding	AVI	Video
-	MTSDecoder	AudioCodecs/CodecMTS.h	Decoding	MPEG-TS (MTS)	Audio
tsdemux	MTSDecoderDemux	AudioCodecs/CodecTSDemux.h	Decoding	video/mp2t MPEG-TS MTS	Audio Video

Installation

If you want to use a codec, do not forget that you need to install the related library!

Decoding

Most decoders inherit from AudioDecoder. I am also providing an integration into my Arduino Audio Tools where you can use these libraries with the EncodedAudioStream class:

#include "AudioTools.h"
#include "AudioTools/AudioCodecs/CodecMP3Helix.h"
#include "BabyElephantWalk60_mp3.h"

MemoryStream data(BabyElephantWalk60_mp3, BabyElephantWalk60_mp3_len); // MP3 data source
I2SStream i2s; // final output of decoded stream
MP3DecoderHelix mp3; // Codec
EncodedAudioStream dec(&i2s, &mp3); // Decoding stream
StreamCopy copier(dec, data); // copy in to out

void setup(){
  Serial.begin(115200);

  i2s.begin();
  dec.begin();
}

void loop(){
  if (mp3) {
    copier.copy();
  } 
}

The above sketch implements the following flow: mp3 MemoryStream -copy-> EncodedAudioStream -> I2SStream. This method is used by most codecs.

The MultiDecoder

I have added the MultiDecoder class to support the decoding of multiple data formats. The actual decoder is only opened at its first use: the relevant decoder is selected dynamically at the first write, based on the detected mime type. You can add your own custom mime type determination logic.

When you change the data source to provide a new format you need to call end() on the EncodedAudioStream, to let the decoder know that it needs to determine a new format.
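
The reason for the end() call becomes clearer with a toy model of lazy selection. This is only a sketch under the assumption that the selection happens once, at the first write; ToyMultiDecoder and all of its members are hypothetical and much simpler than the real class.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Toy model of lazy decoder selection; all names are hypothetical.
class ToyMultiDecoder {
 public:
  // detect: maps the first data chunk to a mime type
  explicit ToyMultiDecoder(std::function<std::string(const std::string&)> detect)
      : detect_(std::move(detect)) {}

  // Registers a decoder (represented by its name) for a mime type.
  void addDecoder(const std::string& decoderName, const std::string& mime) {
    decoders_[mime] = decoderName;
  }

  // Returns the name of the decoder that handles the chunk.
  // The selection only happens on the first write after construction or end().
  const std::string& write(const std::string& chunk) {
    if (!selected_) {
      auto it = decoders_.find(detect_(chunk));
      active_ = (it != decoders_.end()) ? it->second : std::string();
      selected_ = true;
    }
    return active_;
  }

  // Forgets the selection so that the next write() detects the format again.
  void end() {
    selected_ = false;
    active_.clear();
  }

 private:
  std::function<std::string(const std::string&)> detect_;
  std::map<std::string, std::string> decoders_;
  std::string active_;
  bool selected_ = false;
};
```

In this model, data in a new format that is written without a preceding end() is still handed to the previously selected decoder.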

#include "AudioTools.h"
#include "AudioTools/AudioCodecs/CodecMP3Helix.h"
#include "AudioTools/AudioCodecs/CodecAACHelix.h"
#include "AudioTools/AudioLibs/AudioBoardStream.h"
#include "AudioTools/Communication/AudioHttp.h"

URLStream url("ssid","password");
I2SStream i2s;
MultiDecoder multi;
MP3DecoderHelix mp3;
AACDecoderHelix aac;
WAVDecoder wav;
EncodedAudioStream dec(&i2s, &multi); // Decoding stream
StreamCopy copier(dec, url); // copy url to decoder


void setup(){
  Serial.begin(115200);
  AudioToolsLogger.begin(Serial, AudioToolsLogLevel::Info);  

  // register supported codecs with their mime type
  multi.addDecoder(mp3, "audio/mpeg");
  multi.addDecoder(aac, "audio/aac");
  multi.addDecoder(wav, "audio/vnd.wave");

  // setup i2s
  auto config = i2s.defaultConfig(TX_MODE);
  // you could define e.g your pins and change other settings
  //config.pin_ws = 10;
  //config.pin_bck = 11;
  //config.pin_data = 12;
  //config.mode = I2S_STD_FORMAT;
  i2s.begin(config);

  // setup I2S based on sampling rate provided by decoder
  dec.begin();

  // mp3 radio
  url.begin("http://stream.srg-ssr.ch/m/rsj/mp3_128","audio/mpeg");

}

void loop(){
  copier.copy();
}

Please note:

The distinction between mp3 and aac just from their content is difficult and can't be 100% reliable. Files are more reliable because mp3 usually starts with some metadata.
You can provide the URLStream object in the constructor of the MultiDecoder, so that the system can look up the mime type from the http response.
The MimeDetector contains several predefined alternative implementations that you can use when you add a decoder.
If you don't want to rely on the automatic determination you can select the decoder by calling the selectDecoder method.
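
To give an idea of what content-based detection involves, here is a simplified sketch that inspects the first bytes of the data. guessMime is a hypothetical helper and is not the logic of the MimeDetector class; it only checks well-known signatures (RIFF/WAVE, OggS, fLaC, ID3) and the sync bits of MPEG audio and ADTS frames.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Guess a mime type from the first bytes of a stream.
// Simplified, hypothetical helper; returns nullptr if nothing matches.
inline const char* guessMime(const uint8_t* d, size_t len) {
  if (len >= 12 && std::memcmp(d, "RIFF", 4) == 0 &&
      std::memcmp(d + 8, "WAVE", 4) == 0) {
    return "audio/vnd.wave";
  }
  if (len >= 4 && std::memcmp(d, "OggS", 4) == 0) return "audio/ogg";
  if (len >= 4 && std::memcmp(d, "fLaC", 4) == 0) return "audio/flac";
  if (len >= 3 && std::memcmp(d, "ID3", 3) == 0) return "audio/mpeg";  // ID3v2 tag
  if (len >= 2 && d[0] == 0xFF) {
    // ADTS (AAC): 12 sync bits, then the two layer bits are always 00
    if ((d[1] & 0xF6) == 0xF0) return "audio/aac";
    // MPEG audio frame: 11 sync bits, the layer bits must not be 00
    if ((d[1] & 0xE0) == 0xE0 && (d[1] & 0x06) != 0) return "audio/mpeg";
  }
  return nullptr;
}
```

MP3 and ADTS frames both start with a byte of 0xFF and differ only in a few bits of the second byte, and the same bit pattern can also occur by chance inside a stream that is joined mid-frame. This is why a leading ID3 tag or the mime type from an http response is the more dependable signal.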

Decoding on the Input Side

You can also decode on the input side, which is less efficient, but sometimes more convenient. This implements the following flow: mp3 MemoryStream -> EncodedAudioStream -copy-> I2SStream.

#include "AudioTools.h"
#include "AudioTools/AudioCodecs/CodecMP3Helix.h"
#include "BabyElephantWalk60_mp3.h"

MemoryStream data(BabyElephantWalk60_mp3, BabyElephantWalk60_mp3_len); // MP3 data source
MP3DecoderHelix mp3;
EncodedAudioStream dec(&data, &mp3); // Decoding stream
I2SStream i2s; // final output of decoded stream
StreamCopy copier(i2s, dec); // copy dec to out

void setup(){
  Serial.begin(115200);

  i2s.begin();
  dec.addNotifyAudioChange(i2s);
//dec.resizeReadResultQueue(1024 * 10);
  dec.begin();
}

void loop(){
  copier.copy();
}

Please note that the read functionality is implemented by reading the data from the indicated source and calling the write functionality on the decoder to store the result in a queue. Calling readBytes() is then providing the data from this queue.
The MP3 decoder provides the PCM data in big arrays (e.g. 4608 bytes). The default decoding queue might not be big enough to store this result, so you might need to specify the queue size yourself. You can set it to a fixed size by calling dec.resizeReadResultQueue(1024 * 10);
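
The 4608 bytes mentioned above can be derived from the frame structure: an MPEG-1 Layer III frame decodes to 1152 samples per channel. The following sketch shows the arithmetic; the helper names are made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// MPEG-1 Layer III delivers 1152 samples per channel for each decoded frame.
constexpr size_t kSamplesPerMp3Frame = 1152;

// Bytes of 16 bit PCM produced by one decoded frame (hypothetical helper).
constexpr size_t pcmBytesPerFrame(size_t channels) {
  return kSamplesPerMp3Frame * channels * sizeof(int16_t);
}

// Number of complete decoded frames that fit into a queue of the given size.
constexpr size_t framesThatFit(size_t queueBytes, size_t channels) {
  return queueBytes / pcmBytesPerFrame(channels);
}

static_assert(pcmBytesPerFrame(2) == 4608, "stereo frame: 1152 * 2 * 2 bytes");
static_assert(framesThatFit(1024 * 10, 2) == 2, "a 10 KiB queue holds two stereo frames");
static_assert(framesThatFit(4096, 2) == 0, "a 4 KiB queue cannot hold one stereo frame");
```

A queue that is smaller than one decoded frame cannot take the result of a single decoding step, so the queue size should be at least pcmBytesPerFrame(channels) with some headroom.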

Because of this complexity I do not recommend this functionality for beginners: only use it if you have a good understanding of the result sizes provided by the involved functionality.

Streaming Decoding

The FLACDecoder and the VorbisDecoder use an alternative method that pulls the data directly from an input stream. In this case a StreamingDecoder is used.

#include "AudioTools.h"
#include "AudioTools/AudioCodecs/CodecVorbis.h"

const char* ssid = "ssid";
const char* pwd = "password";
URLStream url(ssid, pwd);
VorbisDecoder dec;
I2SStream i2s;

void setup() {
  Serial.begin(115200);
  AudioLogger::instance().begin(Serial, AudioLogger::Info);  

  i2s.begin(i2s.defaultConfig(TX_MODE));

  url.begin("http://marmalade.scenesat.com:8086/bitjam.ogg","application/ogg");

  // setup decoder
  dec.setInputStream(url);
  dec.setOutputStream(i2s);
  dec.begin();
}

void loop() {
  dec.copy();
}

You can transform any AudioDecoder into a StreamingDecoder with the help of a StreamingDecoderAdapter.
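
Conceptually such an adapter turns a push interface into a pull interface: each copy() call reads a chunk from the input and writes it to the wrapped decoder. The sketch below models this idea with stand-in types. ToyPushDecoder, ToySource and ToyStreamingAdapter are hypothetical and are not the library classes, and the real adapter may work differently in detail.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Push style: the caller hands encoded data to the decoder.
struct ToyPushDecoder {
  std::vector<uint8_t> received;  // stands in for "decoded and sent to the output"
  size_t write(const uint8_t* data, size_t len) {
    received.insert(received.end(), data, data + len);
    return len;
  }
};

// Minimal input source that hands out a byte buffer in pieces.
struct ToySource {
  std::vector<uint8_t> data;
  size_t pos = 0;
  size_t readBytes(uint8_t* buffer, size_t len) {
    size_t n = std::min(len, data.size() - pos);
    std::copy(data.begin() + pos, data.begin() + pos + n, buffer);
    pos += n;
    return n;
  }
};

// Pull style adapter: copy() fetches one chunk and pushes it into the decoder.
class ToyStreamingAdapter {
 public:
  ToyStreamingAdapter(ToyPushDecoder& decoder, ToySource& source, size_t chunkSize = 4)
      : decoder_(decoder), source_(source), buffer_(chunkSize) {}

  // Returns false once the source is exhausted.
  bool copy() {
    size_t n = source_.readBytes(buffer_.data(), buffer_.size());
    if (n == 0) return false;
    decoder_.write(buffer_.data(), n);
    return true;
  }

 private:
  ToyPushDecoder& decoder_;
  ToySource& source_;
  std::vector<uint8_t> buffer_;
};
```

With this shape the calling code only needs to call copy() in its loop, in the same way as dec.copy() in the sketch above.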

MultiStreamingDecoder

The MultiStreamingDecoder class is able to handle multiple decoders: it supports both StreamingDecoders and regular AudioDecoders efficiently. The mime type is determined internally with the help of the MimeDetector class. Alternatively you can specify the mime source: this is helpful if you want to determine the mime type e.g. from a URLStream.

#include "AudioTools.h"
#include "AudioTools/AudioCodecs/CodecMP3Helix.h"
#include "AudioTools/AudioCodecs/CodecAACHelix.h"
#include "AudioTools/AudioCodecs/CodecVorbis.h"
#include "AudioTools/AudioLibs/AudioBoardStream.h"

URLStream url("ssid","password");
AudioBoardStream i2s(AudioKitEs8388V1); // final output of decoded stream
MP3DecoderHelix mp3;
AACDecoderHelix aac;
VorbisDecoder ogg;
MultiStreamingDecoder multi;

void setup(){
  Serial.begin(115200);
  AudioToolsLogger.begin(Serial, AudioToolsLogLevel::Info);  

  // setup i2s
  auto config = i2s.defaultConfig(TX_MODE);
  i2s.begin(config);

  // register supported codecs with their mime type
  multi.addDecoder(mp3, "audio/mpeg");
  multi.addDecoder(aac, "audio/aac");
  multi.addDecoder(ogg, "audio/ogg; codec=vorbis"); // from mime detector
  multi.addDecoder(ogg, "application/ogg"); // from url

  // setup I2S based on sampling rate provided by decoder
  multi.setInput(url);
  multi.setOutput(i2s);
  multi.setMimeSource(url); // get mime from url
  multi.begin();

  // select a url
  //url.begin("http://stream.srg-ssr.ch/m/rsj/mp3_128");
  //url.begin("https://audio.wavefarm.org/pondstation.mp3");
  url.begin("https://locus.creacast.com:9443/santander_bay.ogg");

}

void loop(){
  multi.copy();
}

Encoding

The encoding of audio data to a different format is also done with the help of the EncodedAudioStream class. The only difference (to the decoding examples) is that we pass an Encoder as argument.

Here is the related Arduino sketch:

#include "AudioTools.h"
#include "SdFat.h"

AudioInfo info(44100, 2, 16);                                          // The stream will have 2 channels 
WhiteNoiseGenerator<int16_t> noise(32000);                             // subclass of SoundGenerator with max amplitude of 32000
GeneratedSoundStream<int16_t> in_stream(noise);                   // Stream generated from white noise
SdFat SD;
File audioFile;                                                   // final output stream
WAVEncoder encoder;
EncodedAudioStream out_stream(&audioFile, &encoder);             // encode as wav file
StreamCopy copier(out_stream, in_stream);                                // copies sound to out

void setup(){
  Serial.begin(115200);
  AudioLogger::instance().begin(Serial, AudioLogger::Info);  

  auto cfg = noise.defaultConfig();
  cfg.copyFrom(info);
  noise.begin(info);

  in_stream.begin();

  // we need to provide the audio information to the encoder
  
  out_stream.begin(info);
  // open the output file
  SD.begin(SdSpiConfig(PIN_CS, DEDICATED_SPI, SD_SCK_MHZ(2)));
  audioFile = SD.open("/test/002.wav", O_WRITE | O_CREAT);
}

void loop(){
    copier.copy();  
    // audioFile.flush(); // force write down of data
}

We create an input stream which is based on some sound generator. In the out_stream we indicate the final output stream (which is a file in the example) and the encoder that is used when the data is written: GeneratedSoundStream -copy-> EncodedAudioStream -> File.

Please note the following:

You need to make sure that the file content is actually written to the file by calling audioFile.close() at the end, or by flushing the individual writes.
Call out_stream.begin() before you write to a new file: this makes sure that the header is written to the file if necessary (e.g. for WAV files).
Before you start to write to a file, delete the file or move to the beginning. Otherwise the content is just appended!
This example uses the SdFat library, but you can use any other Arduino file library implementation.
MP3 and AAC are quite popular audio formats, but they require a lot of memory and are at the edge of what a Microcontroller can do. I recommend avoiding them and preferring a lean format like ADPCM.
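
To illustrate why the audio information must be known before the first write to a WAV file, here is a sketch that builds the canonical 44 byte header of a PCM WAV file. makeWavHeader is a hypothetical helper written for this page; it is not how the WAVEncoder is implemented.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Builds the canonical 44 byte header of a PCM WAV file (little endian fields).
// Hypothetical helper for illustration only.
inline std::array<uint8_t, 44> makeWavHeader(uint32_t sampleRate, uint16_t channels,
                                            uint16_t bitsPerSample, uint32_t dataBytes) {
  std::array<uint8_t, 44> h{};
  size_t pos = 0;
  auto tag = [&](const char* s) {
    for (int i = 0; i < 4; ++i) h[pos++] = static_cast<uint8_t>(s[i]);
  };
  auto u16 = [&](uint16_t v) {
    h[pos++] = static_cast<uint8_t>(v & 0xFF);
    h[pos++] = static_cast<uint8_t>(v >> 8);
  };
  auto u32 = [&](uint32_t v) {
    for (int i = 0; i < 4; ++i) h[pos++] = static_cast<uint8_t>((v >> (8 * i)) & 0xFF);
  };
  const uint16_t blockAlign = static_cast<uint16_t>(channels * bitsPerSample / 8);

  tag("RIFF"); u32(36 + dataBytes); tag("WAVE");  // RIFF chunk
  tag("fmt "); u32(16);                            // fmt chunk with 16 bytes of content
  u16(1);                                          // format 1 = PCM
  u16(channels);
  u32(sampleRate);
  u32(sampleRate * blockAlign);                    // bytes per second
  u16(blockAlign);                                 // bytes per sample frame
  u16(bitsPerSample);
  tag("data"); u32(dataBytes);                     // data chunk header
  return h;
}
```

The sample rate, channel count and sample width are all stored at the very start of the file, ahead of the audio data. The header also contains the size of the data, which is not known yet when a stream of unknown length starts.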
