When capturing sound for live concerts, studio recording sessions or other applications, multiple microphones are typically used. Some of these microphones are dedicated to capturing sound coming from an individual sound source; these are called close microphones. Their name originates from the fact that they are set close to the desired source. In real-life applications, the close microphones cannot perfectly isolate the desired sound source and capture other simultaneous sounds too. This phenomenon is called microphone leakage, microphone bleed or microphone spill, and it has been a well-known problem to sound engineers since the early days of sound capturing. In typical sound engineering setups, there can also be microphones aiming to capture sound coming from a plurality of sources; these are sometimes called ambient or room microphones.
Although microphones are initially set according to a specific acoustic setup, the acoustic conditions during sound capturing may change. For example, microphones and/or sound sources sometimes move around the acoustic scene. In other cases, microphones and/or sources are accidentally displaced. In all of the above cases, the original number of microphones and the initial microphone setup might not be sufficient from the sound engineering point of view. Therefore, there is a need for exploring the output of as many microphones as possible.
The characteristics of the captured sound mainly depend on the acoustic path between the microphone and each source, the microphone specifications (e.g. frequency response, microphone directivity, etc.), the sound source properties, the room acoustic characteristics (when not in open spaces), etc. Therefore, each sound signal captured by each microphone (either close or ambient) is unique and, from the signal processing point of view, has distinctive spectral and temporal properties. While processing and mixing sounds, a sound engineer takes advantage of these distinctive characteristics of each captured signal. The diversity of captured signals often allows for a successful final result. Therefore, careful microphone selection and placement, as well as the decision on the number of microphones in each setup, are very important in sound engineering.
The cost of professional microphones, the available space, the cabling infrastructure, the need to avoid acoustic feedback and other practical limitations reduce the number of microphones that can be practically used in real-world setups. On the other hand, the more microphones are set for sound capturing, the more options the engineer has when mixing or processing sound. Therefore, there is a need for methods and systems that provide new ways of using every available microphone in a concert or studio setup.
Multiple microphones are also used in speech applications, typically in order to improve the performance of speech enhancement algorithms. These microphones are sometimes assembled in devices like mobile phones, tablets, laptop or desktop computers, wearables, smart TVs and other smart appliances, etc. Multiple microphones can also be found pre-installed in specific environments (e.g. smart homes, conference centers, meeting rooms, outdoors, etc.) or become available via distributed microphone networks. In such cases, there is a need for methods and systems that provide new ways of taking into account every available microphone when enhancing speech, improving automatic speech recognition performance, etc.
Signal decomposition methods are a set of techniques that decompose a signal into its “intrinsic” parts, and they are often used for the extraction of desired signals from a mixture of sound sources (i.e. source separation). In some cases, signal decomposition can be performed with methods such as: non-negative matrix factorization, non-negative tensor factorization, independent component analysis, principal component analysis, singular value decomposition, dependent component analysis, low-complexity coding and decoding, stationary subspace analysis, common spatial pattern, empirical mode decomposition, tensor decomposition, canonical polyadic decomposition, higher-order singular value decomposition, or Tucker decomposition. Although there are single-channel signal decomposition methods, multichannel approaches (where separation is performed in each or some of the available channels) can be beneficial. Such techniques can be applied to multichannel recordings resulting from multi-microphone setups of concerts and studio sessions, where high-quality sound processing is needed. In addition, there is a need to develop methods that fully take into account multichannel information when identifying the desired parts of decomposed signals. Overall, there is a need for improved multichannel signal decomposition methods and systems.
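As an illustration of one of the listed techniques, the following minimal sketch applies non-negative matrix factorization to a single non-negative matrix, such as the magnitude spectrogram of one microphone channel. It is not the method of this document: the function names `nmf` and `component_estimates`, the Euclidean cost, the multiplicative update rule, the rank and the iteration count are all assumptions chosen for illustration, and the toy input is synthetic rather than real audio.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (frequency x time) as W @ H.

    Assumed variant: multiplicative updates for the Euclidean cost.
    W holds spectral templates, H holds their activations over time.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def component_estimates(V, W, H, eps=1e-9):
    """Split V into one non-negative part per component using soft masks."""
    model = W @ H + eps
    return [V * np.outer(W[:, k], H[k]) / model for k in range(W.shape[1])]


# Toy stand-in for a magnitude spectrogram: two hypothetical sources,
# each with a fixed spectral shape and its own activation over time.
rng = np.random.default_rng(1)
true_W = rng.random((64, 2))
true_H = rng.random((2, 100))
V = true_W @ true_H

W, H = nmf(V, rank=2)
parts = component_estimates(V, W, H)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Each entry of `parts` is one "intrinsic" part of the input in the sense described above. Deciding which parts belong to the desired source, and how information from the other microphones should inform that decision, is left open in this sketch.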