A variety of available web browsers are used to access the Internet. Microsoft's Internet Explorer and Netscape's Netscape 6 are two widely used browsers. Other browsers, such as Konqueror, Opera, Mosaic, and Mozilla, enjoy wide popularity. Typically, a browser stores data files associated with Internet sites and web pages, or other network sites, such as a LAN, WAN, etc., on a user's hard drive, usually in a dedicated storage folder or directory. Such a storage folder or directory is often referred to as a cache. The data files are usually graphics files, such as .jpg and .gif files, and text files, such as .html, .txt, .asp, and cookie files. Usually the text files define the content of an Internet site or web page, and reference the graphics files to be included in the browser display. Each data file stored in the storage folder includes a file name and data stored within the file. Also, there is additional data stored and used by the browser's data file management system. This additional data is usually in the form of data fields indicating the Internet address, size, when the file was last modified, and when it was last accessed. Commonly used names for these data fields are “Internet Address”, “Size”, “Expires”, “Last Modified”, “Last Accessed”, and “Last Checked”.
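By way of illustration only, the data fields named above could be modeled as a single record per cached data file. The following is a minimal sketch; the class name `CacheEntry`, the attribute names, and the example values are assumptions made for illustration and do not describe the internal format of any particular browser.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CacheEntry:
    """One cached data file plus the bookkeeping fields named in the text."""
    file_name: str                     # name of the file in the storage folder
    internet_address: str              # "Internet Address"
    size: int                          # "Size", in bytes
    expires: Optional[datetime]        # "Expires" (None if no date was supplied)
    last_modified: Optional[datetime]  # "Last Modified"
    last_accessed: datetime            # "Last Accessed"
    last_checked: datetime             # "Last Checked"


# Hypothetical entry for a graphics file referenced by a web page.
entry = CacheEntry(
    file_name="logo.gif",
    internet_address="http://www.example.com/images/logo.gif",
    size=2048,
    expires=None,
    last_modified=datetime(2001, 5, 1, 12, 0, 0),
    last_accessed=datetime(2001, 6, 1, 9, 30, 0),
    last_checked=datetime(2001, 6, 1, 9, 30, 0),
)
```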
Several of these fields can be used in conjunction with a “Conditional Get” command as specified in the Hypertext Transfer Protocol specification defined as “HTTP/1.1”, June 1999, published by the World Wide Web Consortium, the disclosure of which is incorporated herein by reference. When a user browses the Internet, the data files for each Internet address the user accesses are automatically stored on the user's hard drive in the storage folder. The data files for an Internet address are stored and are accessed on repeated visits to that address to reduce the amount of data that must be downloaded. Illustratively, this is accomplished by the browser searching for a data file on the user's hard drive that is identified by the .html or .asp file for that address. Once found, the browser sends the “Last Modified” date and file name to the server. If the file has not been modified since the “Last Modified” date, the server need not send the file but only an acknowledgment. This significantly reduces the bandwidth requirement of the system. Similarly, the “Expires” field, which contains a date, can be checked against the current date. If the current date is later than the “Expires” field, the client will request a new file from the server.
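The exchange described above can be sketched as follows. This is an illustrative approximation, not the implementation of any particular browser or server: the function names are hypothetical, and the server is reduced to a toy function that returns only a status code. The `If-Modified-Since` header and the 304 (Not Modified) status are the HTTP/1.1 mechanisms that correspond to sending the “Last Modified” date and receiving only an acknowledgment.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional


def is_expired(expires: Optional[datetime], now: datetime) -> bool:
    """True when the current date is later than the "Expires" field."""
    return expires is not None and now > expires


def conditional_headers(last_modified: datetime) -> Dict[str, str]:
    """Build the request header carrying the cached "Last Modified" date.

    The datetime must be timezone-aware and in UTC.
    """
    return {"If-Modified-Since": format_datetime(last_modified, usegmt=True)}


def server_status(resource_modified: datetime, headers: Dict[str, str]) -> int:
    """Toy server (assumption): 304 if the file is unchanged, else 200."""
    since = headers.get("If-Modified-Since")
    if since is not None and resource_modified <= parsedate_to_datetime(since):
        return 304  # acknowledgment only; the file is not resent
    return 200      # the full file would be sent


cached = datetime(2001, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
headers = conditional_headers(cached)
unchanged = server_status(cached, headers)
changed = server_status(datetime(2001, 5, 2, tzinfo=timezone.utc), headers)
```

In this sketch a 304 result means the browser reuses its stored copy, while a 200 result means the stored copy is replaced by the newly downloaded file.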
Computers have the ability to store many megabytes (MB) of browser downloaded data, thus allowing data from more addresses to be stored locally and thereby reducing bandwidth requirements and increasing server response speed for the user. As a user continues to browse various web pages, the number of data files stored on the user's hard drive increases proportionally. However, while browsing the Internet for products and/or information, the user will access data at many Internet addresses that are of no value or interest to the user. Nevertheless, the data files for each address are automatically stored on the user's hard drive regardless of value or interest. Over time the amount of data stored by the browser can become excessive and tend to slow down the response of the browser, as the browser must search through possibly tens of thousands of stored data files contained in the cache.
Illustratively, most web browsers allow a user to specify the amount of space allotted to this storage area, usually as a percentage of total hard drive space. However, as this data increases in quantity, the speed with which the user's computer can search for and find these stored data files will decrease. Additionally, current hard drives have a capacity of 20 gigabytes (GB) or more. Thus, setting the allotted storage space to even a small capacity, such as 1% of the total hard drive space, results in a storage space of approximately 200 MB. Thus, over time tens of thousands of files may be stored in the drive space, resulting in slower access to data files stored on the computer and difficulty in searching the storage area.
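The arithmetic behind these figures can be checked with a short calculation. The sketch below uses decimal units, and the average data file size of 10 KB is purely an assumption introduced here to show how a 200 MB storage space can translate into tens of thousands of files; the text itself does not state an average file size.

```python
KB = 1000
MB = 1000 ** 2
GB = 1000 ** 3


def cache_budget(drive_bytes: int, percent: int) -> int:
    """Storage space, in bytes, allotted as a whole percentage of the drive."""
    return drive_bytes * percent // 100


cache_bytes = cache_budget(20 * GB, 1)   # 1% of a 20 GB drive
cache_mb = cache_bytes // MB             # size of the storage space in MB

ASSUMED_AVG_FILE_BYTES = 10 * KB         # assumption, not taken from the text
approx_files = cache_bytes // ASSUMED_AVG_FILE_BYTES
```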
Browser data file management systems attempt to compensate for this problem by allowing the user to “empty” the designated storage area on the hard disk by deleting all of these files. However, this also deletes files for addresses that the user would be expected to access in the future. Additionally, if a user desires to delete certain cookies to ensure privacy, the user cannot easily determine which cookies should be saved and which cookies should be deleted, especially if the cookies do not contain descriptive names. Thus, the user is often left to “guess” as to which cookie to delete. Often, users will err on the side of caution and delete all cookies, thus deleting cookies that the user would otherwise keep. Accordingly, when the user visits a web page of interest, such as an on-line shopping page, the user must again input relevant information that would otherwise be stored in the cookie, such as mailing address, e-mail address, and other contact information.
Browser data file management systems also attempt to compensate for this storage problem by adding a least recently used (“LRU”) algorithm in conjunction with the disk space limitation for the storage of the data files. The browser, through the LRU algorithm, will remove the oldest data files when the allocated hard disk space becomes full. The LRU algorithm determines which file has been least recently used by interrogating the “Last Accessed” field. By allowing the user to limit the amount of disk space allocated for the browser storage of data files to a smaller amount, the effects of handling large quantities of data files are minimized. Thus, if a user elects to specify a storage space smaller than 1% of the disk capacity, such as 5 MB, access degradation is minimized. However, the small storage space tends to become full soon after browsing a number of addresses, and the LRU algorithm removes those files most likely to be accessed in the future. Thus, the bandwidth requirement is no longer minimized, and access time between the client PC and server is increased.
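A minimal sketch of such LRU removal is given below, assuming the cache is represented as a mapping from file name to a pair of size and “Last Accessed” timestamp. The function name `evict_lru` and this representation are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def evict_lru(entries: Dict[str, Tuple[int, float]], limit_bytes: int) -> List[str]:
    """Remove least recently accessed files until the total size fits the limit.

    `entries` maps file name -> (size in bytes, "Last Accessed" timestamp).
    The mapping is modified in place; removed names are returned oldest first.
    """
    total = sum(size for size, _ in entries.values())
    removed: List[str] = []
    # sorted() returns a new list, so deleting from the dict in the loop is safe.
    for name in sorted(entries, key=lambda n: entries[n][1]):
        if total <= limit_bytes:
            break
        total -= entries[name][0]
        del entries[name]
        removed.append(name)
    return removed


cache = {
    "a.gif": (3000, 100.0),   # oldest access
    "b.html": (2000, 300.0),  # most recent access
    "c.jpg": (1500, 200.0),
}
removed = evict_lru(cache, limit_bytes=4000)
```

Note that the selection depends only on the “Last Accessed” field, so a file the user values but has not opened recently is removed before a file of no interest that happened to be opened a moment ago.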
Finally, a user may interrogate the contents of the storage area directly, searching through addresses stored therein and selectively deleting each data file not of interest to the user. However, this method is labor intensive and highly inefficient, as hundreds, if not thousands, of data files are usually stored in the cache.
In addition to deleting selected files, a user may desire to interrogate the contents of the stored data files. Illustratively, a user may want to ascertain all stored data files associated with a particular Internet address. One method currently available to the user is to selectively sort the contents of the stored data files by the Internet address, and thereafter scroll to the particular Internet address of an associated data file to examine related files. However, this method is time consuming and often does not list all associated data files together, e.g., the cookie file associated with a particular Internet address will be listed separate and apart from other files associated with that same Internet address, as the cookie file name often begins with “Cookie:”. Thus, the user experiences difficulty in ascertaining the full set of data files associated with a particular Internet address.
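The sorting difficulty can be illustrated with a short sketch that groups stored entries by host rather than by the literal text of the address. The cookie entry format used here, `Cookie:user@host/`, is an assumption made for illustration, as are the function names; the text above states only that cookie names often begin with “Cookie:”.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import urlsplit

COOKIE_PREFIX = "Cookie:"


def host_of(address: str) -> str:
    """Extract the host from an ordinary URL or from an assumed cookie entry."""
    if address.startswith(COOKIE_PREFIX):
        rest = address[len(COOKIE_PREFIX):]
        rest = rest.split("@", 1)[-1]        # drop an optional "user@" part
        return rest.split("/", 1)[0].lower()
    return (urlsplit(address).hostname or "").lower()


def group_by_host(addresses: Iterable[str]) -> Dict[str, List[str]]:
    """Collect all entries, cookies included, under their common host."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for address in addresses:
        groups[host_of(address)].append(address)
    return dict(groups)


stored = [
    "http://www.example.com/index.html",
    "Cookie:user@www.example.com/",
    "http://www.example.com/logo.gif",
    "http://other.example.org/a.jpg",
]
groups = group_by_host(stored)
```

A plain alphabetical sort of `stored` would place the cookie entry away from the two `http://www.example.com/` entries, which is the separation described above; grouping by host keeps all three together.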