Systems and methods for determining a source of a sound in an audio signal are known. Generally, these techniques focus on high-level information that is non-trivial to derive from a raw signal. Speech, for example, can be largely characterized by the pitch frequency and the frequencies of the vocal tract formants. Existing techniques usually discard detected features whose energy falls below a threshold. These approaches, however, may lose a great deal of information that would otherwise be useful for source identification.
Furthermore, in “noisy” conditions (e.g., sound noise or processing noise), the accuracy and/or precision of conventional techniques may drop off significantly. Since many of the settings in which these techniques are applied, and many of the audio signals to which they are applied, may be considered noisy, conventional processing to identify a source of a sound in an audio signal may be only marginally useful.