The present invention relates to identifying a file or set of files sharing a particular characteristic, and in particular, identifying such files while minimizing data transfer over a low-bandwidth communications pathway.
The prior art is replete with various schemes for providing data-file updates over a network connection. Ever since the days of large centralized corporate mainframes, maintaining software has been a pressing concern. With the proliferation of fast and powerful workstations, maintaining current software has become even more complex. Recently, software maintenance has been somewhat automated by the creation of automatic software update programs such as "Oil Change" by CyberMedia.
There is a fundamental problem, however, with present-day automatic update schemes. With the explosion in program size, where a single complex application may have hundreds or thousands of program modules, one ends up with a prohibitively large data-file listing all files and associated version tracking information for all application program modules being tracked by a server. (For the purposes of this application, the term "server" refers to the entity or entities providing the updating service, and the term "client" refers to the computer or organization receiving updated files.) In order for a client to determine whether there are program updates available, a potentially huge volume of data has to be transferred between client and server. If the client only has a low-bandwidth connection to the server, such coordination can be very time consuming.
In accordance with a preferred embodiment of the invention, the foregoing and other disadvantages of the prior art are overcome.
In the preferred embodiment, a set of software programs on the client computer is compared against a set of updates on the server computer to determine which updates are applicable and should be transferred from the server to the client. A many-to-one mapping function (e.g. a hash function) is applied to update identifiers to generate a table of single bit entries indicating the presence of particular updates on the server. This table is transferred to the client over the slow link. At the client, the same mapping function is applied to program identifiers, and corresponding entries of the transferred table are checked to determine whether the server has a potential update. If such a potential update is noted, a second transmission is requested by the client from the server, this one conveying additional data by which hash collisions can be identified by the client and disregarded. If availability of an actual update (versus a hash collision) is thereby confirmed, the client requests a third transmission from the server, this one conveying the actual update data. By this arrangement, optimized use is made of the low bandwidth link, with successively more information transferred as the likelihood of an applicable update is successively increased. (The same arrangement can be employed in reverse, with the bit table generated at the client and identifying program files available for possible updating, transferred to the server, etc.)
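The three-stage exchange described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `slot_of`, `UpdateServer`, and `find_updates`, the use of SHA-256 as the many-to-one mapping, the 4096-bit default table size, and the assumption that an update applies when its identifier equals a program identifier are all hypothetical choices made for illustration. The "additional data" of the second transmission is modeled here as the full identifiers that hash to the requested table entries.

```python
import hashlib


def slot_of(identifier: str, table_bits: int) -> int:
    """Many-to-one mapping (assumed: SHA-256 reduced modulo the table size)."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % table_bits


class UpdateServer:
    """Hypothetical in-memory stand-in for the server side of the slow link."""

    def __init__(self, updates: dict, table_bits: int = 4096):
        self.updates = updates        # update identifier -> update data
        self.table_bits = table_bits  # assumed to be known to the client too

    def bit_table(self) -> bytes:
        """First transmission: one bit per entry, set if any update maps there."""
        table = bytearray((self.table_bits + 7) // 8)
        for identifier in self.updates:
            s = slot_of(identifier, self.table_bits)
            table[s // 8] |= 1 << (s % 8)
        return bytes(table)

    def identifiers_in(self, slots: set) -> set:
        """Second transmission: full identifiers behind the requested entries."""
        return {i for i in self.updates if slot_of(i, self.table_bits) in slots}

    def fetch(self, identifier: str) -> bytes:
        """Third transmission: the actual update data."""
        return self.updates[identifier]


def find_updates(installed: list, server: UpdateServer) -> dict:
    """Client side: escalate from bit table to identifiers to update data."""
    table = server.bit_table()
    bits = server.table_bits

    # Stage 1: a set bit means only that an update *may* exist.
    candidates = {}
    for identifier in installed:
        s = slot_of(identifier, bits)
        if (table[s // 8] >> (s % 8)) & 1:
            candidates[identifier] = s
    if not candidates:
        return {}

    # Stage 2: compare full identifiers so hash collisions are disregarded.
    known = server.identifiers_in(set(candidates.values()))
    confirmed = [i for i in candidates if i in known]

    # Stage 3: request update data only for confirmed matches.
    return {i: server.fetch(i) for i in confirmed}
```

Setting `table_bits=1` forces every identifier into the same table entry, which makes the collision path easy to exercise: a program with no matching update still passes the bit-table check but is discarded once the second transmission is compared.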
The foregoing and other features and advantages will be more readily apparent from the following detailed description, which proceeds with reference to the accompanying drawings.