Every compact disc (CD) stores a table of contents (TOC) that includes the start position of the individual tracks on the CD. When a CD is inserted into an audio CD player, the player automatically reads the TOC data, and from that, is able to display where the individual tracks start and how long the CD is.
It would be desirable, however, to have other information about the CD automatically displayed on a display in addition to the track length information. For example, it would be desirable to display the artist name, album name, track titles, and the like.
Prior art systems generate identifiers for audio CDs based on approximate track length information calculated from the TOC data. The identifiers are stored in a central database along with information about the CD, such as the artist name and album title. However, different pressings of the same release can have slightly different track starting positions stored in the TOC. The prior art deals with these slight differences by calculating the length of each track and reducing the timing accuracy of each calculated length by truncating the length information. The truncated length data of the various tracks is then used as the CD identifier.
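The identifier scheme described above may be sketched as follows. This is an illustrative sketch only, not the actual prior art implementation: the function names, the frame offsets, and the truncation step of 100 are all assumptions chosen for the example.

```python
def track_lengths(start_offsets, lead_out):
    """Length of each track: the next boundary minus the track's own start."""
    boundaries = list(start_offsets) + [lead_out]
    return [nxt - cur for cur, nxt in zip(boundaries, boundaries[1:])]


def truncated_identifier(lengths, step=100):
    """Reduce timing accuracy by flooring each length to a multiple of `step`."""
    return tuple((n // step) * step for n in lengths)


# Hypothetical TOC start offsets for two pressings of the same release.
pressing_a = track_lengths([150, 18250, 40375], lead_out=61480)
pressing_b = track_lengths([150, 18262, 40390], lead_out=61491)

print(pressing_a)                        # [18100, 22125, 21105]
print(pressing_b)                        # [18112, 22128, 21101]
print(truncated_identifier(pressing_a))  # (18100, 22100, 21100)
print(truncated_identifier(pressing_b))  # (18100, 22100, 21100)
```

In this assumed case the small differences between pressings fall within the same truncation bucket, so both pressings receive the same identifier.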
The prior art determines whether a particular CD matches a CD in the central database by comparing the CD identifiers generated from the truncated lengths. A drawback of this mechanism, however, is that truncating the length information results in data loss that may lead to erroneous results. For example, if the same song has a track length of 19999 in one CD pressing and a length of 20001 in a second CD pressing, and the length accuracy is reduced by zeroing the last 4 digits of the length data, the resultant lengths are 10000 and 20000, respectively. Although the original lengths differ by only 2, the truncated lengths differ by 10000. A comparison of these two truncated lengths may exceed a set threshold and contribute to an erroneous conclusion that the CDs containing the respective tracks are different.
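The boundary effect in this example can be reproduced with a few lines of arithmetic. The helper name `truncate` is hypothetical, and truncation is assumed here to mean zeroing the low-order decimal digits, which is consistent with the figures given above.

```python
def truncate(length, digits):
    """Zero the last `digits` decimal digits of a non-negative length."""
    step = 10 ** digits
    return (length // step) * step


first_pressing = truncate(19999, 4)
second_pressing = truncate(20001, 4)

raw_gap = abs(19999 - 20001)
truncated_gap = abs(first_pressing - second_pressing)

print(first_pressing, second_pressing)  # 10000 20000
print(raw_gap, truncated_gap)           # 2 10000
```

Two lengths that sit on opposite sides of a truncation boundary are pushed apart by a full truncation step, no matter how close the original values were.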
In another example, a particular CD may have a track length sequence of (15, 5, 15) which is to be matched against a database having a first CD with a track length sequence of (20, 10, 20) and a second CD with a track length sequence of (10, 10, 10). The prior art computes the difference between the particular CD and each CD in the database as the sum of the absolute differences of the corresponding track lengths. Under this formula, it is impossible to distinguish between the first and second CDs because both yield the same difference of 15 (5 + 5 + 5). However, when considering the “shape” of the sequences, the first CD has a shape similar to that of the particular CD in question: in both, the second track is shorter than the first and third tracks by the same amount. The prior art mechanism does nothing to preserve information about the “shape” of a sequence.
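The tie produced by the sum of absolute differences can be checked directly. In the sketch below, `sum_abs_diff` follows the prior art formula as described above. The `steps` helper, which lists the change from each track to the next, is only one assumed and simple way to make the "shape" of a sequence visible; it is not presented as the method of any particular system.

```python
def sum_abs_diff(a, b):
    """Sum of absolute differences between corresponding track lengths."""
    return sum(abs(x - y) for x, y in zip(a, b))


def steps(seq):
    """Change from each element to the next; a crude view of the sequence's shape."""
    return [nxt - cur for cur, nxt in zip(seq, seq[1:])]


query = (15, 5, 15)
first_cd = (20, 10, 20)
second_cd = (10, 10, 10)

print(sum_abs_diff(query, first_cd))   # 15
print(sum_abs_diff(query, second_cd))  # 15

print(steps(query))      # [-10, 10]
print(steps(first_cd))   # [-10, 10]
print(steps(second_cd))  # [0, 0]
```

The distance measure reports an identical score for both candidates, while the track-to-track changes show that only the first CD rises and falls in the same way as the CD in question.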
Thus, the problem of the prior art may be summarized as follows. Two number sequences are to be compared and assigned a measure of similarity for the purpose of determining whether they come from the same or different “sources” (e.g., the same or different CDs). The differences, in the case of a common source, are assumed to be small random perturbations of the underlying “real” data. If these sequences are examined as plotted curves, an intuitive determination may be made based on the large-scale features of the shape of each curve, while ignoring the small details. This may be analogized to zooming out on a map of a coastline.
Accordingly, what is desired is a system and method for determining whether two number sequences are from the same source that does not rely on truncated track length information and that retains information about the relationship between the different values.