Audio processing systems are known in the state of the art in various forms, for example for the playback of MIDI files or for computer games.
Audio processing is very time-critical by nature. An audio subsystem typically produces a block of audio samples while simultaneously playing out a previously produced block. If the processing or generation of a new block takes more time than playing out one block, then a gap referred to as “drop-out” can be heard in the audio playback. In order to avoid such a gap, it is possible to queue up more than one produced block for playback in a buffer.
In addition, interactive audio applications require a low latency between an interaction event and the response in the audio playback. In principle, this can be achieved with short audio frames. A large buffer for avoiding gaps, however, adds latency between a user input and the resulting audio output, since the user input can only influence the blocks which are still to be produced.
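The relationship between block length, queue depth and added latency can be illustrated with simple arithmetic. The following is a minimal sketch only; the function names and the example values (480-sample blocks at 48 kHz) are assumptions chosen for illustration and do not originate from any particular system.

```python
def block_duration_ms(block_size, sample_rate):
    """Playback time of one block of audio samples, in milliseconds."""
    return 1000.0 * block_size / sample_rate


def queue_latency_ms(block_size, sample_rate, queued_blocks):
    """Latency added by blocks that are already produced and queued.

    A user input can only influence blocks still to be produced, so every
    queued block delays the audible response by one block duration.
    """
    return queued_blocks * block_duration_ms(block_size, sample_rate)


# Assumed example values: 480-sample blocks at 48 kHz.
one_block = block_duration_ms(480, 48000)
four_queued = queue_latency_ms(480, 48000, 4)
```

Under these assumed values one block lasts 10 ms, so queuing four blocks adds 40 ms before a user input can become audible.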
Further, an interaction event can happen any time during the lifetime of the application and often requires additional processing. While an interaction is processed, the generation of new output blocks may be slowed down. It is thus a difficult task to find the shortest possible buffer size which results in a low latency but which does not produce audible gaps in any usage situation.
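The trade-off described above can be explored with a small simulation. The sketch below is a simplified model under stated assumptions, not a description of any existing audio subsystem: the queue is pre-filled with `depth` blocks, the producer may start a new block only when a queue slot has been freed by the start of a playback, and a drop-out is counted whenever a block is not ready at the time it is due. All names and timing values are hypothetical.

```python
def count_dropouts(processing_ms, block_ms, depth):
    """Count blocks that are not ready when their playback is due.

    processing_ms: time needed to produce each new block, in milliseconds.
    block_ms:      playback duration of one block.
    depth:         number of blocks queued before playback starts (assumed >= 1).
    """
    ready = [0.0] * depth      # pre-filled blocks are available at time 0
    play_start = []            # actual playback start time of each block
    producer_free = 0.0        # time at which the producer finished its last block
    dropouts = 0

    for n in range(depth + len(processing_ms)):
        if n >= depth:
            # A slot is freed when block n - depth starts playing.
            start = max(producer_free, play_start[n - depth])
            producer_free = start + processing_ms[n - depth]
            ready.append(producer_free)
        due = play_start[n - 1] + block_ms if n > 0 else 0.0
        if ready[n] > due:
            dropouts += 1      # audible gap: playback has to wait for the block
        play_start.append(max(due, ready[n]))
    return dropouts


# Assumed load: 5 ms per block, with one 25 ms spike caused by an interaction.
load = [5.0, 5.0, 25.0, 5.0, 5.0]
results = {depth: count_dropouts(load, 10.0, depth) for depth in (1, 2, 3)}
```

In this model the 25 ms spike still causes one gap with one or two queued blocks of 10 ms each, and is only absorbed with three, at the cost of the correspondingly higher latency.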
User inputs also result in a very uneven distribution of the total processing load as a function of time. The software design for a Digital Signal Processor (DSP) taking care of the processing is more complicated if varying loads have to be dealt with.
There are several audio processing software systems in which at least a part of the processing is split up into units that conform to a unified interface, irrespective of the nature of the processing. These units are also referred to as components. A component is thus a building block for a software system framework and implements an audio processing feature, such as a mixer, a sampling rate converter, or a reverberation effect. The components can usually be plugged into the system without recompilation, and are hence called “plug-ins”. Two such systems are the VST (Virtual Studio Technology) API (Application Programming Interface) by Steinberg, and the LADSPA (Linux Audio Developer's Simple Plugin API) by the Linux audio community.
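The idea of components conforming to a unified interface can be sketched as follows. This is an illustrative assumption only and does not reproduce the VST or LADSPA interfaces; the class names `Component`, `Gain` and `Clip` and the function `run_chain` are hypothetical, and a simple gain and a clipper stand in for the more elaborate features named above.

```python
from abc import ABC, abstractmethod


class Component(ABC):
    """Hypothetical unified interface: every component maps a block to a block."""

    @abstractmethod
    def process(self, block):
        """Return a new list of samples computed from the input block."""


class Gain(Component):
    """Scales every sample by a constant factor."""

    def __init__(self, factor):
        self.factor = factor

    def process(self, block):
        return [sample * self.factor for sample in block]


class Clip(Component):
    """Limits every sample to the range [-limit, +limit]."""

    def __init__(self, limit=1.0):
        self.limit = limit

    def process(self, block):
        return [max(-self.limit, min(self.limit, s)) for s in block]


def run_chain(components, block):
    """The host only relies on process(), whatever the component does."""
    out = list(block)
    for component in components:
        out = component.process(out)
    return out


output = run_chain([Gain(2.0), Clip(1.0)], [0.25, -0.75, 0.5])
```

Because the host calls only `process()`, a further component can be added to the chain without changing the host code, which is the property that allows plug-ins to be used without recompilation of the system.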
The Steinberg VST plug-in architecture enables the integration of virtual effect processors and instruments into a digital audio system, for example into a VST mixer. The audio system can be run on a PC or on a Macintosh computer.
LADSPA is an open Linux activity that provides a standard way of using plug-in audio processors with Linux audio synthesis and recording software.
In both solutions, the control calculations for interactions and the real-time signal processing calculations are carried out in the same process. This means that the total load of the audio processing varies according to the user interaction.
In the document “Design of Low Latency Audio Software for General Purpose Operating Systems”, University of Turku, Department of Information Technology, Computer Science Master's Thesis of December 2002 by Kai Vehmanen, it is proposed to separate the audio processing code into real-time and non-real-time parts and to use a real-time safe mechanism for designing low latency audio applications. It is further proposed to use separate execution contexts for the user interface and the audio code, for instance multiple threads. A thread is a special case of a process. Each thread has its own execution context that can be independently scheduled, like other processes, but threads of one logical group have a shared memory space.
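The separation into execution contexts described above can be sketched with two threads that communicate through a queue. The sketch shows the structure only and rests on several assumptions: the names `control_thread`, `render_block` and `demo` are hypothetical, the "audio" output is a constant signal scaled by a gain parameter, and Python's `queue.SimpleQueue` merely stands in for a real-time safe mechanism, which it is not in the strict sense.

```python
import queue
import threading


def control_thread(events, gains):
    """Non-real-time context: posts parameter changes for the audio code."""
    for gain in gains:
        events.put_nowait(("gain", gain))


def render_block(events, state, block_size):
    """Audio context: applies pending events without blocking, then renders."""
    while True:
        try:
            name, value = events.get_nowait()
        except queue.Empty:
            break              # nothing pending, never wait for the user interface
        state[name] = value
    return [state["gain"]] * block_size


def demo():
    events = queue.SimpleQueue()   # shared between the two threads
    state = {"gain": 1.0}          # touched only by the audio context
    first = render_block(events, state, 4)

    worker = threading.Thread(target=control_thread, args=(events, [0.5, 0.25]))
    worker.start()
    worker.join()                  # joined here only to keep the example deterministic

    second = render_block(events, state, 4)
    return first, second
```

The audio context polls for events once per block and never waits for the control context, so the time needed to render a block does not depend on how long the handling of the user input takes.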