Self healing audio streams with codecs? #1069
-
Hi, I'd like to use a codec to reduce the bandwidth needed to transfer audio over RS485. Currently the bandwidth is fine for a single one-way stream, but I want to send in both directions without collisions on the half-duplex line, where it will get tight. As a first step I tried ADPCM (the docs sounded promising) and it basically works (much easier than I thought), but serial communication is not error free, so sooner or later the encoded stream gets corrupted. Are there other lightweight yet efficient (low CPU cost, high compression, lossy) codecs that would self-heal shortly after a transmission error?

EncodedAudioStream dec(&sink, new ADPCMDecoder(AV_CODEC_ID_ADPCM_IMA_WAV));
StreamCopy copier(dec, Serial1, 1024);

Currently I detect how many samples per second I get in the sink and restart the decoder if it is zero. That works, but not well enough...
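
To put the bandwidth concern into numbers, here is a rough sketch of the arithmetic. It assumes the 16 kHz, mono, 16-bit format that appears later in this thread, 8N1 UART framing (10 bits on the wire per payload byte), and IMA ADPCM at 4 bits per sample with block headers ignored. The function names are made up for illustration.

```cpp
#include <cstdint>

// Bytes per second of raw PCM audio.
constexpr uint32_t pcmBytesPerSecond(uint32_t sampleRate, uint32_t channels,
                                     uint32_t bitsPerSample) {
  return sampleRate * channels * bitsPerSample / 8;
}

// IMA ADPCM stores each 16-bit sample in 4 bits (block headers ignored here).
constexpr uint32_t imaAdpcmBytesPerSecond(uint32_t sampleRate, uint32_t channels) {
  return sampleRate * channels * 4 / 8;
}

// Minimum baud rate for one direction, assuming 8N1 framing:
// 1 start bit + 8 data bits + 1 stop bit = 10 bits per payload byte.
constexpr uint32_t minBaud8N1(uint32_t payloadBytesPerSecond) {
  return payloadBytesPerSecond * 10;
}
```

With these assumptions, raw PCM needs 32000 bytes/s (at least 320000 baud) and ADPCM needs 8000 bytes/s (at least 80000 baud) per direction. On a half-duplex line both directions share the same wire, so the real requirement is more than twice that once turnaround gaps are included.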
-
I think the issue is that if you lose some bytes, the decoding stops working because it uses the number of bytes to determine the frame start. I would expect that it helps to pack the encoded data into some container. Recently I tried to add some forward error correction that might also help. However, I have not had the time to test this yet...
-
Cool, this should work. I'll try it. I had hoped the decoder could detect that a frame start is not what it should be (e.g. by not seeing some magic number in a header) and reinitialize as it does at startup, because if I reset the receiver, the decoder (usually) finds a valid entry point in the incoming stream. Btw., what I described as restarting the decoder is effectively a full ESP32 reset, because this always crashes:
Shouldn't that work? In the meantime I tried some other codecs (I found your blog):
-
You could check the result of copier.copy() to determine whether you are getting any data. But if you don't have a reliable way to determine the start of a frame, just restarting will not help. I am curious whether e.g. my BinaryContainerEncoder/BinaryContainerDecoder resolves this issue...
-
I haven't done any heavy testing on my ContainerBinary, so there might still be some bugs in it...
-
By the way, if you use the 8-bit codec you can still compress the audio by half, and a lost byte will not disturb the stream at all. The only side effect is that left and right might get switched if you play stereo.
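
A minimal sketch of why an 8-bit stream tolerates byte loss: every byte is a complete sample, so there is no decoder state to corrupt. The conversion below (keep the top 8 bits as an offset-binary byte) is an assumption for illustration and not necessarily what the library's 8-bit codec does internally; all function names are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Reduce a signed 16-bit sample to one unsigned byte (top 8 bits, offset binary).
uint8_t to8Bit(int16_t s) {
  return static_cast<uint8_t>((static_cast<int32_t>(s) + 32768) >> 8);
}

// Expand a byte back to a signed 16-bit sample; the low 8 bits are lost.
int16_t to16Bit(uint8_t b) {
  return static_cast<int16_t>((static_cast<int32_t>(b) - 128) * 256);
}

// Encode a block of PCM samples: one byte per sample.
std::vector<uint8_t> encode8(const std::vector<int16_t>& pcm) {
  std::vector<uint8_t> out;
  out.reserve(pcm.size());
  for (int16_t s : pcm) out.push_back(to8Bit(s));
  return out;
}

// Decode received bytes: each byte is independent of all others.
std::vector<int16_t> decode8(const std::vector<uint8_t>& bytes) {
  std::vector<int16_t> out;
  out.reserve(bytes.size());
  for (uint8_t b : bytes) out.push_back(to16Bit(b));
  return out;
}
```

If one byte is dropped on the wire, decode8 simply returns one sample fewer and every other sample decodes exactly as before. With interleaved stereo the remaining samples shift by one position, which is the left/right swap mentioned above.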
-
Maybe the problem is somewhere else.
-
There is no real sync. The full pipeline is:

Sender:
I2SStream i2s; // INMP441 delivers 24-bit samples in 32-bit words
Convert024to16 cvt(i2s); // convert 2ch 24bit to 1ch 16bit
BinaryContainerEncoder bcd(new ADPCMEncoder(AV_CODEC_ID_ADPCM_IMA_WAV));
EncodedAudioStream enc(&Serial1, &bcd);
StreamCopy copier(enc, cvt, 1024); // data pump

Receiver:
I2SStream i2s; // sink MAX98357A mono amp
Convert cvt(i2s); // 1ch -> 2ch as required for the mono amp - go figure... :)
BinaryContainerDecoder bcd(new ADPCMDecoder(AV_CODEC_ID_ADPCM_IMA_WAV));
EncodedAudioStream dec(&cvt, &bcd);
StreamCopy copier(dec, Serial1, 1024);

I guess the amp will just get garbage if data does not arrive at the pace it expects. In fact, that is what happened while I still used standard sample rates with 2 channels and no codec (the RS485 link maxed out with that). This did not cause program errors.
-
If you want to look up any details, I just created these GitHub repos:
-
No instant success with L8: I updated the repos. It looks like throughput is very low (high load?). Usually, with ADPCM I see the expected 32 kB/s of uncompressed audio; now it is only about 5 kB/s, and the sound is not recognizable, just loud cracks... Will be offline for a bit. Thanks for your support so far!
-
The simplest solution is to send each audio sample as 1 byte. If some bytes get lost, this will not be audible at all, and in terms of audio quality it is very hard to distinguish between 8 bits and 16 bits. Here is the related example. To test this I increased the baud rate to a very high value...

For ADPCM I have extended my binary container: it stores the audio info, metadata and audio in different record structures/segments. Each encoded frame is stored in a separate segment, and a checksum is calculated to determine if the audio is still valid. The records start with a CRLF so that the beginning of a segment can be found easily. Here is the related example.
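
The resynchronization idea described above can be sketched as follows. This is not the library's actual wire format: the record layout (CRLF, one length byte, payload, one additive checksum byte) and the names packRecord and unpackRecords are assumptions chosen to keep the example small.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical record layout (NOT the library's real format):
//   CR LF | length (1 byte) | payload (length bytes) | checksum (1 byte)
constexpr uint8_t CR = 0x0D;
constexpr uint8_t LF = 0x0A;

// Simple additive checksum over the payload (sum modulo 256).
uint8_t checksum(const uint8_t* data, size_t len) {
  uint8_t sum = 0;
  for (size_t i = 0; i < len; ++i) sum = static_cast<uint8_t>(sum + data[i]);
  return sum;
}

// Wrap one encoded frame (at most 255 bytes) into a record.
Bytes packRecord(const Bytes& frame) {
  Bytes rec{CR, LF, static_cast<uint8_t>(frame.size())};
  rec.insert(rec.end(), frame.begin(), frame.end());
  rec.push_back(checksum(frame.data(), frame.size()));
  return rec;
}

// Scan received bytes and return the payloads of all records that verify.
// A damaged record is skipped byte by byte until the next valid CRLF start.
std::vector<Bytes> unpackRecords(const Bytes& rx) {
  std::vector<Bytes> frames;
  size_t pos = 0;
  while (pos + 4 <= rx.size()) {  // smallest record: CR LF length checksum
    if (rx[pos] != CR || rx[pos + 1] != LF) { ++pos; continue; }
    const size_t len = rx[pos + 2];
    const size_t end = pos + 3 + len + 1;  // one past the checksum byte
    if (end <= rx.size() &&
        checksum(&rx[pos + 3], len) == rx[pos + 3 + len]) {
      frames.emplace_back(rx.begin() + pos + 3, rx.begin() + pos + 3 + len);
      pos = end;  // continue right after the verified record
    } else {
      ++pos;  // false start or damaged record: keep searching
    }
  }
  return frames;
}
```

If a byte is lost inside one record, only that record fails its checksum; the scan then finds the next CRLF and decoding continues with the following frame. Payload bytes can themselves contain a CRLF pair, which is why the checksum is needed to reject false starts. A streaming receiver would also keep an incomplete tail record in its buffer until more data arrives, which this sketch omits.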
-
For reference, this is the code that works:
/**
* Derived from
*
* @file send-adpcm-receive.ino
* @author Phil Schatzmann
* @brief Sending and receiving audio via Serial. You need to connect the RX pin
* with the TX pin!
*
* We send encoded ADPCM audio over the serial wire: The higher the transmission rate
* the higher the risk of data loss!
*
* @version 0.1
* @date 2023-11-25
*
* @copyright Copyright (c) 2022
*/
#include "AudioTools.h"
#include "AudioCodecs/CodecADPCM.h" // https://github.com/pschatzmann/adpcm
//#include "AudioLibs/AudioKit.h"
AudioInfo info(16000, 1, 16);
I2SStream out; // or AnalogAudioStream, AudioKitStream etc
I2SStream in;
// SineWaveGenerator<int16_t> sineWave(32000);
// GeneratedSoundStream<int16_t> sineStream(sineWave);
auto &serial = Serial2;
ADPCMEncoder enc(AV_CODEC_ID_ADPCM_IMA_WAV);
ADPCMDecoder dec(AV_CODEC_ID_ADPCM_IMA_WAV);
EncodedAudioStream enc_stream(&serial, &enc);
EncodedAudioStream dec_stream(&out, &dec);
// Throttle throttle(enc_stream);
static int frame_size = 256;
// StreamCopy copierOut(throttle, sineStream, frame_size); // copies sound into Serial
StreamCopy copierOut(enc_stream, in, frame_size); // copies mic into Serial
StreamCopy copierIn(dec_stream, serial, frame_size); // copies sound from Serial
void inputTask( void * parameter ){
  Serial.printf("input() on core %d\n", xPortGetCoreID());
  while( true ) {
    // copy from serial
    copierIn.copy();
    delay(0); // nop?
  }
}
void outputTask( void * parameter ){
  Serial.printf("output() on core %d\n", xPortGetCoreID());
  while( true ) {
    // copy to serial
    copierOut.copy();
    delay(0); // nop?
  }
}
void setup() {
  Serial.begin(115200);
  AudioLogger::instance().begin(Serial, AudioLogger::Warning);
  BaseType_t coreId = xPortGetCoreID();
  Serial.printf("setup() on core %d\n", coreId);
  Serial.printf("ESP model: %s\n", ESP.getChipModel());
  Serial.printf("ESP cores: %u\n", ESP.getChipCores());
  Serial.printf("ESP rev: %u\n", ESP.getChipRevision());
  Serial.printf("ESP freq: %u\n", ESP.getCpuFreqMHz());
  // getEfuseMac() returns a 64-bit value, so print it with %llx
  Serial.printf("ESP mac: %012llx\n", ESP.getEfuseMac());
  Serial.printf("ESP fsize: %u\n", ESP.getFlashChipSize());
  Serial.printf("ESP fspeed: %u\n", ESP.getFlashChipSpeed());
  Serial.printf("ESP fmode: %u\n", ESP.getFlashChipMode());
  Serial.printf("ESP heap: %u\n", ESP.getFreeHeap());
  Serial.printf("ESP psram: %u\n", ESP.getFreePsram());
  // Note the format for setting a serial port is as follows:
  // Serial.begin(baud-rate, protocol, RX pin, TX pin);
  Serial2.begin(921600, SERIAL_8N1, 18, 19);
  // sineWave.begin(info, N_B4*AMP);
  // throttle.begin(info);
  enc_stream.begin(info);
  dec_stream.begin(info);
  // PCM5102 SCK -> GND
  pinMode(22, OUTPUT);
  digitalWrite(22, LOW);
  // start I2Sin
  auto configIn = in.defaultConfig(RX_MODE);
  configIn.copyFrom(info);
  configIn.pin_data = 23;
  configIn.pin_bck = 5;
  configIn.pin_ws = 26;
  configIn.port_no = 0;
  in.begin(configIn);
  // start I2Sout
  auto configOut = out.defaultConfig(TX_MODE);
  configOut.copyFrom(info);
  configOut.pin_data = 17;
  configOut.pin_bck = 21;
  configOut.pin_ws = 16;
  configOut.port_no = 1;
  out.begin(configOut);
  // better visibility in logging
  copierOut.setLogName("out");
  copierIn.setLogName("in");
  // NOTE: AMP is not defined in this listing; presumably it comes from the
  // build flags in platformio.ini.
  xTaskCreatePinnedToCore(
    inputTask,        /* Task function. */
    "inputTask",      /* String with name of task. */
    10000,            /* Stack size in bytes (ESP-IDF counts bytes, not words). */
    NULL,             /* Parameter passed as input of the task */
    AMP,              /* configMAX_PRIORITIES - 1, Priority of the task. */
    NULL,             /* Task handle. */
    coreId ? 1 : 0);  /* same core id as main task */
  xTaskCreatePinnedToCore(
    outputTask,       /* Task function. */
    "outputTask",     /* String with name of task. */
    10000,            /* Stack size in bytes (ESP-IDF counts bytes, not words). */
    NULL,             /* Parameter passed as input of the task */
    3-AMP,            /* configMAX_PRIORITIES - 2, Priority of the task. */
    NULL,             /* Task handle. */
    coreId ? 0 : 1);  /* other core id than main task */
}
void loop() {
  static bool first = true;
  if( first ) {
    first = false;
    Serial.printf("loop() on core %d\n", xPortGetCoreID());
  }
  delay(100);
}
with this platformio.ini