Detecting components of signals is a fundamental objective of signal processing. Detected components of acoustic signals can be used for myriad purposes, including speech detection and recognition, background noise subtraction, and music transcription, to name a few. Most prior art acoustic signal representation methods have focused on human speech and music where detected component is usually a phoneme or a musical note. Many computer vision applications detect components of videos. Detected components can be used for object detection, recognition and tracking.
There are two major types of approaches to detecting components in signals, namely knowledge based, and unsupervised or data driven. Knowledge-based approaches can be rule-based. Rule-based approaches require a set of human-determined rules by which decisions are made. Rule-based component detection is therefore subjective, and decisions on occurrences of components are not based on actual data to be analyzed. Knowledge based system have serious disadvantages. First, the rules need to be coded manually. Therefore, the system is only as good as the ‘expert’. Second, the interpretation of inferences between the rules often behaves erratically, particularly when there is no applicable rule for some specific situation, or when the rules are ‘fuzzy’. This can cause the system to operate in an unintended and erratic manner.
The other major types of approach to detecting components in signals are data driven. In data driven approaches, the components are detected directly from the signal itself, without any a priori understanding of what the signal is, or could be in the future. Since input data is often very complex, various types of transformations and decompositions are known to simplify the data for the purpose of analysis.
U.S. Pat. No. 6,321,200, “Method for extracting features from a mixture of signals,” issued to Casey on Nov. 20, 2001 describes a system that extracts low level features from an acoustic signal that has been band-pass filtered and simplified by a singular value decomposition. However, some features cannot be detected after dimensionality reduction because the matrix elements lead to cancellations, and obfuscate the results.
Non-negative matrix factorization (NMF) is an alternative technique for dimensionality reduction, see, Lee, et al, “Learning the parts of objects by non-negative matrix factorization,” Nature, Volume 401, pp. 788-791, 1999.
There, non-negativity constraints are enforced during matrix construction in order to determine parts of faces from a single image. Furthermore, that system is restricted within the spatial confines of a single image, that is, the signal is stationary.