Sound processing may be utilized to provide a wide range of functionality. One example of this functionality is sound decomposition, which may be leveraged to identify sources of sound data in a recording. For example, sound data may be captured for use as part of a movie, a recording of a song, and so on. The sound data, however, may be captured in a noisy environment or may include both desirable and undesirable parts. The sound data, for instance, may include dialog for a movie, which is desirable, but may also include sound data of an unintended ringing of a cell phone. Accordingly, the sound data may be decomposed according to different sources such that the sound data corresponding to the dialog may be separated from the sound data that corresponds to the cell phone.
However, conventional techniques that are employed to automatically perform this decomposition may be inaccurate as well as resource intensive. For example, conventional techniques typically do not address the temporal evolution of sound from different sources and thus ignore temporal characteristics of the sound from the respective sources, which may result in mislabeling of portions of the sound data.