Internet media sharing enables users to share media content virtually anywhere at any time, as long as they have access to a media-capable device with an internet connection. The convenience of viewing media content via the internet, essentially on demand, has resulted in explosive growth of internet media viewing. Internet media traffic currently accounts for nearly a majority of consumer internet traffic, and demand is projected to continue increasing.
People can quickly recognize known media content that has undergone a transformation, such as a popular song that has been slowed down, or a known song covered by someone other than the original artist in a user-created video or audio recording. However, recognizing media content that has undergone transformations such as temporal stretches, aspect ratio alterations, and so forth has proven difficult and computationally expensive for computer recognition systems.
Conventional systems for media content matching typically extract features from the media content using fixed reference frames. Because the reference frames are fixed, the extracted features are brittle under transformations such as time stretching: the same content, once stretched, yields different feature values. As a result, many conventional media matching systems experience performance degradation when the media content has been transformed.
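The brittleness described above can be illustrated with a minimal sketch. It is not the method of any particular conventional system. It assumes a hypothetical feature built from the spacing, counted in fixed 100 ms frames, between consecutive salient events in a signal. The event times, the frame length, and all function names are assumptions chosen for illustration only.

```python
# Illustrative sketch only: a toy feature extractor that quantizes event
# times onto fixed-length frames. All names and values are hypothetical.

FRAME_MS = 100  # assumed fixed reference frame length, in milliseconds


def fixed_frame_features(event_times_ms, frame_ms=FRAME_MS):
    """Return a set of features built from frame-quantized event spacing.

    Each feature is a pair of consecutive inter-event gaps, measured in
    whole frames, so it depends directly on the fixed frame length.
    """
    frames = [int(t) // frame_ms for t in sorted(event_times_ms)]
    gaps = [b - a for a, b in zip(frames, frames[1:])]
    return set(zip(gaps, gaps[1:]))


def time_stretch(event_times_ms, factor):
    """Simulate a temporal stretch by scaling every event time."""
    return [round(t * factor) for t in event_times_ms]


def match_score(features_a, features_b):
    """Jaccard similarity between two feature sets (1.0 means identical)."""
    union = features_a | features_b
    if not union:
        return 0.0
    return len(features_a & features_b) / len(union)


# Hypothetical salient-event times (ms) in a reference recording.
reference_events = [500, 1200, 1450, 2300, 3100]

reference = fixed_frame_features(reference_events)
stretched = fixed_frame_features(time_stretch(reference_events, 1.25))

print(match_score(reference, reference))  # unmodified copy
print(match_score(reference, stretched))  # same content, played 25% slower
```

In this toy setup the unmodified copy matches the reference exactly, while the copy stretched by 25% shares no features with it, even though a listener would recognize both as the same content.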