Realtime audio processing with Linux: part 3

Introduction

In the previous articles we have introduced basic concepts about audio processing and got ourselves acquainted with command line ALSA tools.

If you didn't have the chance to read the previous articles, you can find them here:

After so much “theory”, albeit put in some sort of practice with command line tools, it's finally time to dig into actual code.

A quick note about code quality: this code is not meant to be production-ready. It's not even worthy of an alpha release. Many common practices, like error checking, pointer checking, exception management and so on, have intentionally been omitted to keep the code short, clear, and straight to the point.

A simple example

We will focus on a simple example whose only purpose, initially, will be to implement the loopback command we have created in the second article:


$ arecord -c 2 -r 44100 -f S16_LE -D default | aplay -c 2 -r 44100 -f S16_LE -D default

This command reads audio from the “default” input device (which, as we have seen, maps to the pulse server) and pipes it to the “default” output device (again, mapped to the pulse server).

Internally, this pipeline performs the following sequence:

  1. Open the default input device
  2. Configure the input device with 2 channels, 44.1 kHz sampling rate, and S16_LE (signed 16-bit integers in little endian format)
  3. Read samples from the input device and write them to stdout
  4. Open the default output device
  5. Configure the output device with 2 channels, 44.1 kHz sampling rate, and S16_LE (signed 16-bit integers in little endian format)
  6. Read samples from stdin and write them to the output device

Our code will do more or less the same, with some exceptions, as we will need to:

  • configure more parameters, which arecord and aplay configure by default (for instance, the buffer size, the period size, etc). See the first article if you don't remember the meaning of these parameters.
  • pre-fill the output buffer (more about this later)
  • enable the input and output devices just before we are ready to process data
  • use an internal buffer to transfer data from the input device to the output device
  • manage exceptions like underrun and overrun (again, see the first article if you don't remember the meaning)

General project structure

Our example is going to be a Qt-based application. Although this is not needed right now, it will prove useful later, when we add more complex features to the application.

We create the project using the Qt wizard, generating a simple Qt Widgets application with the main window class derived from QMainWindow.

This will generate the following files:

MainWindow.h / MainWindow.cpp / MainWindow.ui: these files implement a very simple MainWindow class (derived from QMainWindow) and its UI file (consisting of an empty window).

#include <QMainWindow>

namespace Ui {
class MainWindow;
}

class MainWindow : public QMainWindow
{
    Q_OBJECT

public:
    explicit MainWindow(QWidget *parent = 0);
    ~MainWindow();

private:
    Ui::MainWindow *ui;

private slots:
};




#include "MainWindow.h"
#include "ui_MainWindow.h"

MainWindow::MainWindow(QWidget *parent) :
    QMainWindow(parent),
    ui(new Ui::MainWindow)
{
    ui->setupUi(this);
}
///////////////////////////////////////////////////////////////////////////////

MainWindow::~MainWindow()
{
    delete ui;
}

main.cpp: this creates an instance of QApplication, an instance of MainWindow, and runs the application loop.

#include "MainWindow.h"
#include <QApplication>
int main(int argc, char *argv[])
{
    QApplication a(argc, argv);
    MainWindow w;
    w.show();
    return a.exec();
}

If we run this very simple application we will get a plain old window doing nothing.


Enter audio processing

Our application will need to periodically read samples from the input device and copy them to the output device. This could be done in a number of ways, for example in a timer inside the MainWindow, but this would not guarantee that we will wake up in time to process samples as they become available.

The best solution is to use a separate thread which will run on its own, leaving the main thread to manage the UI undisturbed. The audio processing thread will constantly read from the input device, waking up when samples are available, and copy data to the output device. If the application is run with root privileges, we will also be able to increase the thread's priority to realtime, making it less likely (but not impossible) that ticks will be lost because of CPU time being stolen by other tasks.

We derive a class from QThread, which unsurprisingly is Qt's implementation of threads:

#include <QThread>
class AudioThread : public QThread
{
    Q_OBJECT
    
public:
    AudioThread();
    virtual ~AudioThread();
    virtual void run();
};
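The article does not show where this thread gets started; a minimal sketch, assuming we add an AudioThread * member called audioThread to MainWindow (this member is my own addition, not part of the generated code), could be to create and start it in the constructor:

#include "MainWindow.h"
#include "ui_MainWindow.h"
#include "AudioThread.h"

MainWindow::MainWindow(QWidget *parent) :
    QMainWindow(parent),
    ui(new Ui::MainWindow)
{
    ui->setupUi(this);

    // Create and start the audio processing thread:
    // QThread::start() spawns a new thread and calls run() in it.
    audioThread = new AudioThread();
    audioThread->start();
}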

The “run” function needs to perform device opening, initialisation, and audio processing:

void AudioThread::run()
{
    // 1. Initialise thread priority
    // …
    // …

    // 2. Open audio devices
    // …
    // …

    // 3. Configure audio devices
    // …
    // …

    // 4. Prefill output buffer
    // …
    // …

    // 5. Start audio devices
    // …
    // …
    
    // 6. Periodic processing 
    while(true)
    {
        // Read audio samples
        // …
        // …

        // Write audio samples
        // …
        // …
    }
}

Let's now investigate these steps in more detail.

1. Initialise thread priority

This is rather straightforward.

    // requires <sched.h>
    struct sched_param param;
    param.sched_priority = 99;
    int ret = sched_setscheduler(0, SCHED_RR, &param);

Here we are setting the priority to 99, the maximum allowed priority, with the SCHED_RR (Round Robin) scheduling policy. This is one of the two realtime scheduling policies available on Linux, the other being SCHED_FIFO, under which a thread keeps the CPU until it blocks, yields explicitly, or is preempted by a higher-priority task. Please note that, without a realtime patch to the kernel, Linux will do its best to give our audio thread the highest priority, but this cannot be strictly enforced, as other system tasks in the kernel might still interrupt us in an unregulated manner.

If ret is < 0 it means the operation failed. This can happen if the user running the program does not have enough privileges, since setting a thread or process to realtime priority can freeze the entire system.
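For illustration, a minimal check might look like this (a sketch, assuming <sched.h>, <stdio.h>, <string.h> and <errno.h> are included):

    if(ret < 0)
    {
        // Typically fails with EPERM when the process lacks the required
        // privilege (root, CAP_SYS_NICE or a suitable rtprio limit).
        // The application can still run, just without realtime scheduling.
        fprintf(stderr, "sched_setscheduler failed: %s\n", strerror(errno));
    }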

2. Open audio devices

For anything related to ALSA we need to include the SDK's include file:

#include <alsa/asoundlib.h>

ALSA devices are represented by pointers to an opaque structure called snd_pcm_t. Devices can be opened by name, using the same identifiers we supplied to the -D option of aplay and arecord:

snd_pcm_t *playbackHandle = nullptr;
snd_pcm_t *recordingHandle = nullptr;
snd_pcm_open(&playbackHandle, "default", SND_PCM_STREAM_PLAYBACK, 0);
snd_pcm_open(&recordingHandle, "default", SND_PCM_STREAM_CAPTURE, 0);

Again, as usual, error checking should be done on the return value of snd_pcm_open, as we might have specified a wrong device name.
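A minimal sketch of such a check, using snd_strerror to turn ALSA's negative error codes into readable messages:

int err = snd_pcm_open(&recordingHandle, "default", SND_PCM_STREAM_CAPTURE, 0);
if(err < 0)
{
    // err is a negative errno-style code, e.g. when the device name is wrong
    fprintf(stderr, "cannot open capture device: %s\n", snd_strerror(err));
    return;
}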

3. Configure audio devices

This portion of code is lengthy and boring, as we need to specify all the parameters which constitute an audio device's configuration. We should also be doing error checking on every call, which we won't do in this example.

Configuration consists of two blocks: configuration of hardware parameters (sampling rate, format, etc) and software parameters (how many samples should be available before the reading process unblocks, etc).

Hardware parameters

Here is a portion of code which sets the most important hardware parameters. We are setting a sampling rate of 44.1 kHz, 2 channels (stereo), interleaved access, S16_LE format, a period time of 10 ms (441 frames), and a buffer size of 4 periods. We do this for both the recording and the playback device.

#define AUDIO_SAMPLE_SIZE_IN_MICROSEC   10000
#define AUDIO_SAMPLE_SIZE_IN_SEC        0.01
#define AUDIO_SAMPLE_SIZE_IN_MILLISEC   10
#define AUDIO_BUFFER_SIZE               4

#define AUDIO_BUFFER_SIZE_IN_MICROSEC   (AUDIO_BUFFER_SIZE * AUDIO_SAMPLE_SIZE_IN_MICROSEC)
#define AUDIO_BUFFER_SIZE_IN_SEC        (AUDIO_BUFFER_SIZE * AUDIO_SAMPLE_SIZE_IN_SEC)

static const int samplingRate = 44100;

// pPCMHandle is either recordingHandle or playbackHandle:
// the same configuration is applied to both devices
snd_pcm_hw_params_t *pHwparams = nullptr;
snd_pcm_hw_params_alloca(&pHwparams);

// fill with full configuration space
snd_pcm_hw_params_any(pPCMHandle, pHwparams);
snd_pcm_hw_params_set_access(pPCMHandle, pHwparams, SND_PCM_ACCESS_RW_INTERLEAVED);
snd_pcm_hw_params_set_format(pPCMHandle, pHwparams, SND_PCM_FORMAT_S16_LE);
snd_pcm_hw_params_set_channels(pPCMHandle, pHwparams, 2);
// sampling rate
snd_pcm_hw_params_set_rate(pPCMHandle, pHwparams, samplingRate, 0);
// set the period time
// ALSA requires the value in microseconds
unsigned int periodTime = AUDIO_SAMPLE_SIZE_IN_MICROSEC;
snd_pcm_hw_params_set_period_time(pPCMHandle, pHwparams, periodTime, 0);
// set the ring buffer size
unsigned int bufferTime = AUDIO_BUFFER_SIZE_IN_MICROSEC;
snd_pcm_hw_params_set_buffer_time(pPCMHandle, pHwparams, bufferTime, 0);
// apply hardware parameters
snd_pcm_hw_params(pPCMHandle, pHwparams);
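One caveat: real hardware may not support exactly the requested rate or period time. A common alternative (a sketch, not what the code above does) is to use the *_near variants, which pick the closest supported value and report back what was actually chosen:

unsigned int actualRate = samplingRate;
unsigned int actualPeriodTime = AUDIO_SAMPLE_SIZE_IN_MICROSEC;
int dir = 0;

// both calls adjust their value argument to the nearest supported
// configuration, so we can check what the device actually gave us
snd_pcm_hw_params_set_rate_near(pPCMHandle, pHwparams, &actualRate, &dir);
snd_pcm_hw_params_set_period_time_near(pPCMHandle, pHwparams, &actualPeriodTime, &dir);
printf("actual rate: %u Hz, actual period: %u us\n", actualRate, actualPeriodTime);

With the pulse “default” device this hardly matters, as PulseAudio will resample as needed, but it becomes relevant when talking to a hw: device directly.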

Software parameters

The only parameter we care about in this simple application is the number of samples we require to complete a read request. We set this value to the number of samples within a period, so that when a period is complete our audio thread will wake up and process it. We do this only on the recording handle.

// here pPCMHandle is the recordingHandle
snd_pcm_sw_params_t *pSwparams = nullptr;
snd_pcm_sw_params_alloca(&pSwparams);
snd_pcm_sw_params_current(pPCMHandle, pSwparams);
snd_pcm_sw_params_set_avail_min(pPCMHandle, pSwparams, (AUDIO_SAMPLE_SIZE_IN_SEC * samplingRate));
// apply the sw params
snd_pcm_sw_params(pPCMHandle, pSwparams);

Once we have properly configured the device we need to get it ready to run:

    snd_pcm_prepare(playbackHandle);
    snd_pcm_prepare(recordingHandle);

4. Prefill output buffer

If we start the read/write loop with the audio devices as they are now we are likely to get into an underrun condition on the output device: the output device starts with an “empty stomach”, i.e. its buffer is empty, and as it starts running it will complain about not having anything to stream. So we need to prefill it with silence (all-zero samples).

When we run our processing loop we will need an input buffer and an output buffer; here are their declarations:

    // zero-initialised: txBuf is also used to prefill the output with silence
    int16_t txBuf[reqFrames * 2] = {0};
    int16_t rxBuf[reqFrames * 2] = {0};

A couple of notes here:

  • int16_t is a type for signed 16-bit integers. The data format is S16_LE, which matches this type. As I am running this example on a PC with an Intel CPU, this format already has the correct endianness. If we were to run this example on a big endian CPU, we would need to switch to S16_BE (if the device supports it), perform the format conversion in software, or use the plug plugin (see the sketch after this list).
  • reqFrames is defined as follows:
    static const int FREQUENCY = 44100;
    static const int reqFrames = FREQUENCY / 100;

So it defines the number of frames stored in one 10 ms period. The “*2” is because we have a stereo device, so each frame holds two samples.

  • These buffers are NOT the buffers used internally by ALSA. They just represent a local copy of one period as returned / expected by ALSA. The internal buffer contains 4 periods as defined in the hardware configuration.
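As anticipated in the first bullet above, here is a minimal sketch of how the sample format could be chosen according to the host's byte order (the sampleFormat variable is my own addition; ALSA also provides SND_PCM_FORMAT_S16, which already means “signed 16 bit, native endian”):

#include <endian.h>

#if __BYTE_ORDER == __LITTLE_ENDIAN
static const snd_pcm_format_t sampleFormat = SND_PCM_FORMAT_S16_LE;
#else
static const snd_pcm_format_t sampleFormat = SND_PCM_FORMAT_S16_BE;
#endif

// then, instead of hard-coding S16_LE:
// snd_pcm_hw_params_set_format(pPCMHandle, pHwparams, sampleFormat);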

Now that we have the definition of our buffers we can inject 4 periods of silence into the output device so it won't underrun immediately:

snd_pcm_writei(playbackHandle, txBuf, reqFrames);
snd_pcm_writei(playbackHandle, txBuf, reqFrames);
snd_pcm_writei(playbackHandle, txBuf, reqFrames);
snd_pcm_writei(playbackHandle, txBuf, reqFrames);

5. Start audio devices

This is very easy, and it needs to be done only on the output device:

 snd_pcm_start(playbackHandle);

6. Periodic processing 

Now that everything is ready, we just need to start a periodic loop in which we read samples from the input device into the input buffer, copy the input buffer into the output buffer, and write the output buffer into the output device.

Theoretically we wouldn't need two separate buffers for this application: we could use the same buffer to gather samples and spit them out. However, we are using two separate buffers in preparation for the more complex processing of future articles.

Without further ado let's have a look at the processing loop:


  while(true)
  {
    int availFramesRec = snd_pcm_readi(recordingHandle, rxBuf, reqFrames);
    printf("availFramesRec: %d\n", availFramesRec);

    memcpy(txBuf, rxBuf, sizeof(txBuf));

    int availFramesPlay = snd_pcm_writei(playbackHandle, txBuf, reqFrames);
    printf("availFramesPlay: %d\n", availFramesPlay);
    if(availFramesPlay < reqFrames)
    {
      // the playback buffer ran dry (or the write failed): restart the device
      printf("Underrun!\n");
      snd_pcm_prepare(playbackHandle);
      snd_pcm_writei(playbackHandle, txBuf, reqFrames);
      snd_pcm_writei(playbackHandle, txBuf, reqFrames);
      snd_pcm_writei(playbackHandle, txBuf, reqFrames);
      snd_pcm_writei(playbackHandle, txBuf, reqFrames);
      snd_pcm_start(playbackHandle);
    }
  }

We read from the input device and we print how many frames we have been able to fetch: given the hardware / software configuration we have chosen for the device, we should always get 441 frames (10 ms with a sampling frequency of 44.1 kHz).

We then copy the input buffer into the output buffer, and check whether we managed to write it all into the output device (which should normally be the case). If we didn't, the playback buffer has underrun, and we restart the output device to recover.
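Note that the capture side can suffer an overrun too, for instance if our thread is delayed long enough for ALSA's capture ring buffer to fill up; the loop above does not handle that case. A possible sketch, keeping the same style, is to also check the return value of snd_pcm_readi:

    int availFramesRec = snd_pcm_readi(recordingHandle, rxBuf, reqFrames);
    if(availFramesRec < 0)
    {
        // -EPIPE signals an overrun on a capture stream;
        // snd_pcm_recover() re-prepares the stream when possible
        printf("Capture xrun: %s\n", snd_strerror(availFramesRec));
        snd_pcm_recover(recordingHandle, availFramesRec, 0);
        continue;   // rxBuf holds no valid samples this time around
    }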

When we run this application we should hear the microphone input mirrored to the output.

What's next ?

Let's face it: this was a lot of effort to achieve absolutely nothing. An echo application, come on.

In the following articles we will try to do something a bit more interesting, introducing some processing into the audio flow - stay tuned for more !






