The present invention relates to apparatus and methods for searching for (or retrieving) waveform data of sound fragments in music piece data for purposes such as editing of the music piece, and to a storage medium containing a computer program therefor.
Among the conventionally-known techniques pertaining to music piece editing is a technique called “audio mosaicing”. According to the audio mosaicing technique, various music pieces are divided into fragments each of a short time length, and fragment data indicative of waveforms of the individual fragments are collected to build a fragment database. Desired fragment data are selected from the fragment database, and then the selected fragment data are interconnected on the time axis to thereby create a new music piece. Examples of literature pertaining to this type of technique include:
Ari Lazier, Perry Cook, “MOSIEVIUS: FEATURE DRIVEN INTERACTIVE AUDIO MOSAICING”, [on line], Proc. of the 6th Int. Conference on Digital Audio Effects (DAFx-03), London, UK, Sep. 8-11, 2003 [searched Mar. 6, 2007], Internet <URL: http://soundlab.cs.princeton.edu/publications/mosievius_dafx_2003.pdf> (non-patent literature 1)
Further, among the conventionally-known music-piece editing styles is a style in which an operation for replacing a fragment of an existing or original music piece with a fragment of another music piece is repeated to make a music piece having a different impression from the existing music piece. If the fragments of the existing music piece are replaced with fragments having completely different characters, coherence will be lost between the fragments, so that the thus-made music piece may completely differ from the music piece intended by the user. In order to avoid such an undesired situation, it is preferable to select, from among replacing fragments prepared in advance, fragments similar in character to the fragments of the existing music piece to be edited and then use the selected fragments for replacement. Thus, in earlier-filed Japanese Patent Application No. 2006-311325 (laid-open as Japanese Patent Application Laid-open Publication No. 2008-129135), which corresponds to U.S. patent application Ser. No. 11/985,212 and European Patent Application No. 07120926.6 (Publication No. 1923863), the assignee of the instant application proposed a music piece processing apparatus which comprises: a storage section that stores, for each of a plurality of music pieces, respective tone data of a plurality of fragments of the music piece and respective character values of the fragments; a similarity determination section that calculates a similarity index value indicative of a degree of similarity between a character value of each of the fragments of one of the plurality of music pieces (i.e., a main music piece) and a character value of each individual one of the fragments of a plurality of sub music pieces other than the main music piece; and a processing section that processes the tone data of the fragments of the main music piece on the basis of the tone data of those fragments of the one or more sub music pieces for which the similarity index value indicates high similarity. Because the tone data of the fragments similar in character value to the individual fragments of the main music piece are used in processing the tone data of the fragments of the main music piece, the proposed music piece processing apparatus can prevent, to some extent, the processed music piece from becoming a music piece not intended or desired by the user.
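The selection of similar fragments described above can be illustrated with a minimal sketch. It is not the implementation of the earlier-filed application: the function names, the use of Euclidean distance as the similarity index value, and the two-dimensional character values are all assumptions made purely for illustration.

```python
import math

def similarity_index(a, b):
    """Hypothetical similarity index: Euclidean distance between two
    character-value vectors (a smaller value means higher similarity)."""
    return math.dist(a, b)

def select_replacements(main_values, sub_values):
    """For each main-piece fragment, return the index of the sub-piece
    fragment whose character value is closest to it."""
    return [
        min(range(len(sub_values)),
            key=lambda j: similarity_index(m, sub_values[j]))
        for m in main_values
    ]

# Illustrative (made-up) character values, one vector per fragment.
main_values = [(0.2, 0.9), (0.8, 0.1)]
sub_values = [(0.7, 0.2), (0.1, 1.0), (0.5, 0.5)]

replacements = select_replacements(main_values, sub_values)
print(replacements)
```

In this sketch each fragment is represented by a single character-value vector, so the cost of one comparison does not depend on how long the fragment is.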
As one example of a method for obtaining character values to be evaluated for similarity, it is conceivable to divide a fragment into frames each of a predetermined time length, calculate a character value per divided frame, average the character values of these frames and then set the averaged character value as a character value of the fragment. With this method, fragments to be used for tone data processing can be selected with a considerably reduced number of arithmetic operations. However, this method may present the problem that fragments clearly auditorily different in character from original fragments would be undesirably selected as replacing fragments, as detailed below.
For example, there is conceivable a case where a plurality of percussion instrument tones are being generated within a fragment. If, in such a case, all character values of the entire fragment are averaged, information indicative of presence of the plurality of percussion instrument tones will be abstracted away from the averaged character value. Also conceivable is a case where tone color variation occurs partway through a fragment, e.g. where a tone volume relatively small in the former half of the fragment gets relatively great in the latter half of the fragment. In this case, information indicative of the tone color variation partway through the fragment will be abstracted away from the averaged character value.
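The loss of information caused by averaging can be shown with a small numerical sketch. The per-frame character value used here (RMS level), the frame length, and the sample values are assumptions chosen only for illustration; the text above does not prescribe any particular character value.

```python
import math

def frame_rms(samples, frame_len):
    """Split samples into non-overlapping frames of frame_len samples
    and return the RMS level of each frame."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    return [math.sqrt(sum(x * x for x in f) / len(f)) for f in frames]

def averaged_value(frame_values):
    """Average the per-frame values into one value for the fragment."""
    return sum(frame_values) / len(frame_values)

quiet = [0.1, -0.1] * 4   # 8 samples at low level
loud = [0.8, -0.8] * 4    # 8 samples at high level

crescendo = quiet + loud      # small in the former half, great in the latter
decrescendo = loud + quiet    # the reverse

rms_up = frame_rms(crescendo, 4)
rms_down = frame_rms(decrescendo, 4)

# The per-frame values differ in order, yet the averages coincide.
print(rms_up, averaged_value(rms_up))
print(rms_down, averaged_value(rms_down))
```

The two fragments sound clearly different, but after averaging they carry the same character value, so a search based on that value alone cannot tell them apart.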
As noted above, the averaged character value may sometimes fail to represent the auditory character of a fragment. Thus, even if a scheme is employed for replacing an original fragment with a selected fragment similar in character value to the original fragment, the original fragment may sometimes be undesirably replaced with a fragment that is not at all similar to the original fragment.
Further, in Tristan Jehan, “Creating Music by Listening”, [on line], PhD Thesis, Massachusetts Institute of Technology, September, 2005 [searched Sep. 20, 2007], Internet <URL: http://web.media.mit.edu/%7Etristan/phd/pdf/Tristan_PhD_MIT.pdf> (non-patent literature 2), there is proposed a technique which divides each fragment into frames and evaluates degrees of similarity between the fragments using all of the character values of the individual frames. This proposed technique permits accurate evaluation of the degrees of similarity between the fragments because none of the characters of the fragments is abstracted away. With the proposed technique, however, the amount of information indicative of the character values per fragment would become enormous, and thus the memory capacity necessary for implementing the technique and the quantity of arithmetic operations for evaluating the degrees of similarity between the fragments would also become enormous. For these reasons, it is impractical to use the proposed technique in an apparatus for music piece editing.
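The trade-off described above can be sketched as follows. This is not the method of non-patent literature 2: the frame-by-frame distance, the helper names, and the database figures (10,000 fragments, 50 frames per fragment, 20 values per frame) are hypothetical and serve only to make the comparison concrete.

```python
import math

def framewise_distance(frames_a, frames_b):
    """Sum of Euclidean distances between corresponding frames of two
    fragments (an equal frame count is assumed for simplicity)."""
    if len(frames_a) != len(frames_b):
        raise ValueError("fragments must have the same number of frames")
    return sum(math.dist(a, b) for a, b in zip(frames_a, frames_b))

def stored_values(num_fragments, frames_per_fragment, dims):
    """Count of numbers that must be stored for the whole database."""
    return num_fragments * frames_per_fragment * dims

# Per-frame character values of a rising and a falling fragment.
rising = [(0.1,), (0.1,), (0.8,), (0.8,)]
falling = [(0.8,), (0.8,), (0.1,), (0.1,)]

# Unlike the averaged values, the frame-wise comparison separates them.
d = framewise_distance(rising, falling)
print(d > 0)

# Hypothetical database: storage with all frames kept versus one
# averaged vector per fragment.
full = stored_values(10_000, 50, 20)
averaged = stored_values(10_000, 1, 20)
print(full, averaged, full // averaged)
```

Under these assumed figures, keeping every frame multiplies both the stored data and the work per comparison by the number of frames per fragment, which is the cost the passage above identifies.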