This invention relates to search methodologies and search engines for retrieving information from a global computer network. More particularly, this invention relates to a method for selecting among equivalent files, documents, pages or other information resources on a global computer network.
Global computer networks, such as the INTERNET and its World Wide Web (WWW), store data in many formats among many different sites. There are text files, binary files, audio files, video files, multimedia files, and other types of data files and executable files. The data typically is stored in a file format, but also is referred to as a page, document, graphic, video clip, audio clip, program, or data base. All of these formats of storing information on a global computer network, and other formats for storing data on such a network, are referred to herein as a data file unit, or simply, a file.
A common way of finding and accessing information of unknown location on a global computer network is to use a search engine. There are many conventional search engines available on the WWW that allow one to log onto a search site and execute a search engine. The user typically inputs parameters and/or keywords to define what the search engine is to look for. Some search engines organize resources on the WWW into categories and allow one to limit the search to files organized under the category. Some allow searching only of web site titles, or keywords from a web site or other resource.
Common search engines for searching among digitized audio files are MP3-based search engines. MP3 refers to a conventional encoding standard defining the formats for recording and storing audio files in digital format. There are many MP3 audio files which may be accessed and downloaded from the global computer network. For example, many songs and other audio works are stored on the global computer network. In particular, it is common for there to be many different copies of the same audio work located at different sites on a global computer network. Accordingly, there is a need for a method of choosing one copy from among the many equivalent copies of the same audio work.
As used herein ‘equivalent copy’ means identical and non-identical copies of the same audio work. Such copies may be different because they resulted from different digital encodings of the same audio work (e.g., different MP3 encoding bit rate, mono versus stereo encoding). Further, by ‘same audio work’ it is meant the same song as performed by the same artist in a given performance or prerecorded event (e.g., ultimately derived from a common master or duplicate master; a ‘dubbing’; one of many ‘bootlegs’ of the same performance). For example, several end users may have an analog audiotape of a song. Such audiotape copies may be derived from a common master or duplicate master. Different operators may own digital encoding equipment which allows them to record the song in digital format and upload it onto the global computer network. Such uploaded files are equivalent copies, as the term equivalent is used herein. Whether the operators use the same or different encoder systems, or the same or different encoding protocols, the uploaded files are considered herein to be equivalent.
For some embodiments, ‘equivalent copy’ also may mean the identical and non-identical copies of a similar audio work of the same song. For example, the term may cover copies of the same song by the same artist from different performances or prerecorded masters (i.e., different versions of the same song by the same artist). Or in another example, ‘equivalent copies’ also may mean copies of the same song as performed by different artists.
According to the invention, at least one file among equivalent files (preferably equivalent audio files) identified during a search of a global computer network is selected to be downloaded to a local computer.
According to one aspect of the invention, downloading of a plurality of the equivalent files to the local computer is begun for a brief trial period. Such period is predetermined empirically and is expected to be less than the time required to download the entire contents of a file. Typically the equivalent files are located at different sites on the global computer network. The bandwidth of the data pathway from the local computer to these different sites may vary. The performance of these different pathways is estimated based upon the throughput bandwidth for the downloading process during the trial period.
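By way of illustration only, the throughput estimate for one pathway during the trial period might be sketched as follows. The function name `measure_trial_throughput`, its parameters, and the injectable `clock` are assumptions introduced for this example and do not appear in the description above; the sketch reads chunks from any iterable source rather than from a particular network library.

```python
import time
from typing import Callable, Iterable


def measure_trial_throughput(
    chunks: Iterable[bytes],
    trial_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Estimate throughput (bytes per second) over a brief trial period.

    `chunks` is any iterable yielding downloaded data blocks (hypothetical
    stand-in for a partially read network response). Reading stops once
    the trial period has elapsed or the source is exhausted.
    """
    start = clock()
    received = 0
    for chunk in chunks:
        received += len(chunk)
        if clock() - start >= trial_seconds:
            break  # trial period is over; do not download the whole file
    elapsed = clock() - start
    return received / elapsed if elapsed > 0 else 0.0
```

In such a sketch, one call would be made per equivalent file, and the resulting figures compared across pathways.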
According to another aspect of the invention, there are multiple criteria for selecting one of the equivalent files to completely download at an end user computer. First, with regard to files for which downloading attempts result in a message “file not found” (or a similar message), the files are eliminated from consideration. Second, there may be some selectable parameters which may be automatically set or adjusted by the end user (e.g., stereo versions only; server must support ‘resume’; stereo versus mono version; MP3 encoding bit rate). Third, the downloading bandwidth determined for the trial period is to be an acceptable bandwidth.
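The first two criteria can be pictured as a simple filter applied to each equivalent file. The following is a minimal sketch under stated assumptions: the `Candidate` and `Preferences` records, their field names, and the `passes_criteria` function are hypothetical and chosen only for illustration; the third criterion (acceptable bandwidth) is not handled here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    url: str
    found: bool            # False if the attempt returned "file not found"
    stereo: bool
    bit_rate_kbps: int
    supports_resume: bool


@dataclass
class Preferences:
    # Hypothetical end-user settings; defaults impose no restriction.
    stereo_only: bool = False
    require_resume: bool = False
    min_bit_rate_kbps: int = 0


def passes_criteria(c: Candidate, prefs: Preferences) -> bool:
    """Apply the 'file not found' check and the selectable parameters."""
    if not c.found:
        return False  # first criterion: eliminate missing files
    if prefs.stereo_only and not c.stereo:
        return False
    if prefs.require_resume and not c.supports_resume:
        return False
    return c.bit_rate_kbps >= prefs.min_bit_rate_kbps
```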
According to another aspect of the invention, there is a minimum desired throughput bandwidth. Any of the equivalent files downloaded during the trial period which meet the minimum desired throughput bandwidth are acceptable. When none are acceptable, then the equivalent file meeting the other criteria whose pathway is performing with the highest bandwidth is selected to be completely downloaded. When multiple files are acceptable and meet the other criteria, then the file whose pathway is performing with the highest bandwidth is selected to be completely downloaded.
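The selection rule just described can be sketched as below. This is one reading of the rule, not a definitive implementation: because the highest-bandwidth pathway is chosen both when none of the files meets the minimum and when several do, the choice reduces to taking the maximum, and the minimum only determines whether the chosen file is labeled acceptable. The function name `select_download` and its return shape are assumptions for this example, and the input is assumed to contain only files that already meet the other criteria.

```python
from typing import Dict, Optional, Tuple


def select_download(
    throughputs: Dict[str, float], min_bps: float
) -> Optional[Tuple[str, bool]]:
    """Pick the file whose pathway showed the highest trial throughput.

    `throughputs` maps a file location to its measured trial throughput.
    Returns (location, meets_minimum), or None if there are no candidates.
    """
    if not throughputs:
        return None
    best = max(throughputs, key=lambda url: throughputs[url])
    return best, throughputs[best] >= min_bps
```

For example, with trial results of 40,000 and 25,000 bytes per second and a minimum of 50,000, the first file would still be chosen, but flagged as falling below the desired minimum.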
According to another aspect of the invention, each time a file is selected to be downloaded a popularity ranking is incremented. In one embodiment the end user computer sends a message to the search web site which performs the search and which maintains the popularity counters. The counter refers to the file and its storage location on the global computer network. In some embodiments an additional counter is maintained to track the popularity of a given song, as distinct from the counter of the song/site combination.
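The two kinds of counter might be kept, for illustration, as follows. The class name `PopularityTracker`, its method names, and the use of in-memory counters are assumptions for this sketch; the description does not specify how the search web site stores its counters or the format of the message sent by the end user computer.

```python
from collections import Counter
from typing import Tuple


class PopularityTracker:
    """Hypothetical server-side store of popularity counters."""

    def __init__(self) -> None:
        self.song_site: Counter = Counter()  # keyed by (song, site)
        self.song: Counter = Counter()       # keyed by song alone

    def record_selection(self, song_id: str, site: str) -> None:
        """Called when a file at `site` is selected for full download."""
        self.song_site[(song_id, site)] += 1
        self.song[song_id] += 1

    def rating(self, song_id: str, site: str) -> Tuple[int, int]:
        """Return (song/site count, song count); zero if never selected."""
        return self.song_site[(song_id, site)], self.song[song_id]
```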
According to another aspect of this invention, when a search lists multiple equivalent files, a number of equivalents are chosen to be evaluated. Some are chosen at random from the list returned from the search. Some are chosen because they have a song/site popularity rating. Among those that have a song/site popularity rating, those with the highest rating are selected to be evaluated during the downloading trial period. These correspond to copies of the song whose download pathway was previously selected.
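One possible way to assemble the set of equivalents to be evaluated is sketched below. The function name `choose_trial_candidates`, the split into `n_popular` and `n_random`, and the treatment of a zero count as "no rating" are all assumptions for this example; the description does not fix how many equivalents of each kind are chosen.

```python
import random
from typing import Dict, List, Optional


def choose_trial_candidates(
    urls: List[str],
    popularity: Dict[str, int],
    n_popular: int,
    n_random: int,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Pick the top-rated locations plus a random sample of the rest."""
    rng = rng or random.Random()
    # Locations with a song/site popularity rating, highest first.
    rated = [u for u in urls if popularity.get(u, 0) > 0]
    top = sorted(rated, key=lambda u: popularity[u], reverse=True)[:n_popular]
    # Remaining locations are eligible for the random picks.
    remaining = [u for u in urls if u not in top]
    picks = rng.sample(remaining, min(n_random, len(remaining)))
    return top + picks
```

Including random picks alongside the highest-rated ones gives locations that have never been selected a chance to be tried.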
According to another aspect of the invention, the equivalent files correspond to equivalent copies of the same audio work.
According to one advantage of this invention, a method for selecting among equivalent files meeting some threshold criteria is provided. According to another advantage of the invention, end user waiting time is reduced because an optimally-downloadable equivalent is selected to be downloaded.
These and other aspects and advantages of the invention will be better understood by reference to the following detailed description taken in conjunction with the accompanying drawings.