The invention relates to Internet web browsers, and, more particularly, to data file management systems for browsers.
A variety of available web browsers are used to access the Internet. Microsoft's Internet Explorer and Netscape's Netscape 6 are two widely used browsers. Other browsers, such as Konqueror, Opera, Mosaic, and Mozilla, enjoy wide popularity. Typically, a browser stores data files associated with Internet sites and web pages, or other network sites, such as a LAN, WAN, etc., on a user's hard drive, usually in a dedicated storage folder or directory. Such a storage folder or directory is often referred to as a cache. The data files are usually graphics files, such as .jpg and .gif files, and text files, such as .html, .txt, .asp and cookie files. Usually the text files define the content of an Internet site or web page, and reference the graphics files to be included in the browser display. Each data file stored in the storage folder includes a file name and data stored within the file. Also, there is additional data stored and used by the browser's data file management system. This additional data is usually in the form of data fields indicating the Internet address, size, when the file was last modified, and when it was last accessed. Commonly used names for these data fields are “Internet Address”, “Size”, “Expires”, “Last Modified”, “Last Accessed”, and “Last Checked”.
Several of these fields can be used in conjunction with a “Conditional Get” command as specified in the Hypertext Transfer Protocol specification defined as “HTTP/1.1”, June 1999, published by the World Wide Web Consortium, the disclosure of which is incorporated herein by reference. When a user browses the Internet, the data files for each Internet address the user accesses are automatically stored on the user's hard drive in the storage folder. The data files for an Internet address are stored and are accessed on repeated visits to that address to reduce the amount of data that must be downloaded. Illustratively, this is accomplished by the browser searching for a data file on the user's hard drive that is identified by the .html or .asp file for that address. Once found, the browser sends the “Last Modified” date and file name to the server. If the file has not been modified since the “Last Modified” date, the server need not send the file but only an acknowledgment. This significantly reduces the bandwidth requirement of the system. Similarly, the “Expires” field, which contains a date, can be checked against the current date. If the current date is later than the “Expires” field, the client will request a new file from the server.
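Illustratively, the “Conditional Get” exchange and the “Expires” check described above can be sketched as follows. This is a minimal sketch and not any browser's actual implementation. The function names are hypothetical; the `If-Modified-Since` header and the 304 (Not Modified) status code are those defined by HTTP/1.1, and the server side is simulated by a plain function rather than a network call.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def build_conditional_headers(last_modified: datetime) -> dict:
    """Headers a client could send to revalidate a cached data file.

    `last_modified` is the cached "Last Modified" value and must be an
    aware datetime in UTC.
    """
    return {"If-Modified-Since": format_datetime(last_modified, usegmt=True)}


def server_status(headers: dict, resource_modified: datetime) -> int:
    """Simulated server decision: 304 if unchanged, 200 if the file must be sent."""
    since = parsedate_to_datetime(headers["If-Modified-Since"])
    return 304 if resource_modified <= since else 200


def is_expired(expires: datetime, now: datetime) -> bool:
    """True when the current date is later than the "Expires" field."""
    return now > expires
```

A status of 304 corresponds to the acknowledgment described above, in which the server does not resend the file; a status of 200 corresponds to the server sending a new copy.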
Computers have the ability to store many megabytes (MB) of browser downloaded data, thus allowing data from more addresses to be stored locally and thereby reducing bandwidth requirements and increasing server response speed for the user. As a user continues to browse various web pages, the number of data files stored on the user's hard drive increases proportionally. However, while browsing the Internet for products and/or information, the user will access data at many Internet addresses that are of no value or interest to the user. Nevertheless, the data files for each address are automatically stored on the user's hard drive regardless of value or interest. Over time the amount of data stored by the browser can become excessive and tend to slow down the response of the browser, as the browser must search through possibly tens of thousands of stored data files contained in the cache.
Illustratively, most web browsers allow a user to specify the amount of space allotted to this storage area, usually as a percentage of total hard drive space. However, as this data increases in quantity, the responsiveness of the user's computer in searching for and finding these stored data files will decrease. Additionally, current hard drives have a capacity of 20 gigabytes (GB) or more. Thus, setting the allotted storage space to even a small capacity, such as 1% of the total hard drive space, results in a storage space of approximately 200 MB. Consequently, over time tens of thousands of files may be stored in the drive space, resulting in increased access time to data files stored on the computer and difficulty in searching the storage area.
Browser data file management systems attempt to compensate for this problem by allowing the user to “empty” the designated storage area on the hard disk by deleting all of these files. However, this also deletes files for addresses that the user would be expected to access in the future. Additionally, if a user desires to delete certain cookies to ensure privacy, the user cannot easily determine which cookies should be saved and which cookies should be deleted, especially if the cookies do not contain descriptive names. Thus, the user is often left to “guess” as to which cookie to delete. Often, users will err on the side of caution and delete all cookies, thus deleting cookies that the user would otherwise keep. Accordingly, when the user visits a web page of interest, such as an on-line shopping page, the user must again input relevant information that would otherwise be stored in the cookie, such as mailing address, e-mail address, and other contact information.
Browser data file management systems also attempt to compensate for this storage problem by adding a least recently used (“LRU”) algorithm in conjunction with the disk space limitation for the storage of the data files. The browser, through the LRU algorithm, will remove the oldest data files when the allocated hard disk space becomes full. The LRU algorithm determines which file has been least recently used by interrogating the “Last Accessed” field. By allowing the user to limit the amount of disk space allocated for the browser storage of data files to a smaller amount, the effects of handling large quantities of data files are minimized. Thus, if a user elects to specify a storage space smaller than 1% of the disk capacity, such as 5 MB, access degradation is minimized. However, the small storage space tends to become full soon after browsing a number of addresses, and the LRU algorithm removes those files most likely to be accessed in the future. Thus, the bandwidth requirement is no longer minimized, and access time between the client PC and server is increased.
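Illustratively, an LRU removal pass of the kind described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the algorithm of any particular browser: the `CacheEntry` type and the `evict_lru` function are hypothetical names, the “Last Accessed” field is modeled as a numeric timestamp, and the function only reports which entries would be removed rather than deleting anything from disk.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    name: str             # file name in the storage area
    size: int             # size in bytes
    last_accessed: float  # "Last Accessed" field as a timestamp


def evict_lru(entries, limit_bytes):
    """Remove least recently accessed entries until the total size fits.

    Returns (kept, removed); `kept` is ordered oldest access first.
    """
    kept = sorted(entries, key=lambda e: e.last_accessed)
    removed = []
    total = sum(e.size for e in kept)
    while kept and total > limit_bytes:
        victim = kept.pop(0)  # entry with the oldest "Last Accessed" value
        total -= victim.size
        removed.append(victim)
    return kept, removed
```

Because the selection depends only on the “Last Accessed” value, a small limit causes files for addresses of continuing interest to be removed along with files of no interest, which is the drawback noted above.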
Finally, a user may interrogate the contents of the storage area directly, searching through addresses stored therein and selectively deleting each data file not of interest to the user. However, this method is labor intensive and highly inefficient, as hundreds, if not thousands, of data files are often stored in the cache.
In addition to deleting selected files, a user may desire to interrogate the contents of the stored data files. Illustratively, a user may want to ascertain all stored data files associated with a particular Internet address. One method currently available to the user is to selectively sort the contents of the stored data files by the Internet address, and thereafter scroll to the particular Internet address of an associated data file to examine related files. However, this method is time consuming and does not often list all associated data files together, e.g. the cookie file associated with a particular Internet address will be listed separate and apart from other files associated with that same Internet address, as the cookie file name often begins with “Cookie:”. Thus, the user experiences difficulty in ascertaining the full set of data files associated with a particular Internet address.
According to the invention, a system for managing a plurality of data files for web browsers is provided. The system includes a storage area on a computer storage medium, the storage area storing the data files; a computer configured to access the storage area; a first database configured to index the data files stored in the storage area; and a program configured to generate automated search strings, the program further configured to search the database index according to the automated search strings and identify data files associated with the automated search strings.
Also according to the invention, a system for managing a plurality of data files for a web browser includes a computer; a storage area on a computer storage medium, the storage area storing the data files and accessible to the computer; a database configured to index data files stored in the storage area during a single browsing session; and a program configured to search the database and identify data files indexed by the database.
Also according to the invention, a method for managing a plurality of data files stored in a storage area for a web browser is provided. The method comprises the steps of indexing the stored data files in a database to provide a database index; generating automated search strings based on the stored data files in the storage area; searching the database according to the automated search strings; and identifying data files associated with the search strings.
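Illustratively, the steps of indexing, generating automated search strings, searching, and identifying can be sketched as follows. This is a minimal sketch and not a definitive implementation. The summary above does not state how the automated search strings are derived, so the rule used here (any host name with at least `min_files` stored files becomes a search string) is purely an assumption made for illustration, as are the function names and the use of an in-memory dictionary in place of a database.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def build_index(files):
    """Index (file_name, internet_address) pairs by host name."""
    index = defaultdict(list)
    for name, address in files:
        index[urlsplit(address).hostname or ""].append(name)
    return dict(index)


def generate_search_strings(index, min_files=2):
    """Assumed heuristic: hosts with at least `min_files` stored files."""
    return sorted(host for host, names in index.items() if len(names) >= min_files)


def search(index, search_strings):
    """Identify the data files associated with each search string."""
    return {s: index.get(s, []) for s in search_strings}
```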
Another system according to the invention comprises a computer storage medium; a computer configured to access the storage medium; a first list of network addresses stored on the computer storage medium; a storage area on the computer storage medium, the storage area storing the data files; and a program executable on the computer, the program configured to identify data files associated with the first list of network addresses and delete data files not associated with the first list of network addresses.
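Illustratively, the identification of data files associated with a list of network addresses can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, association is assumed to mean that the host name of the file's Internet address appears in the list, and the function returns the files to delete rather than removing them from the storage area.

```python
from urllib.parse import urlsplit


def split_by_address_list(files, listed_hosts):
    """Partition (file_name, internet_address) pairs by an address list.

    Returns (keep, delete): files whose host is in `listed_hosts` are kept,
    and all other files are candidates for deletion.
    """
    keep, delete = [], []
    for name, address in files:
        host = urlsplit(address).hostname
        (keep if host in listed_hosts else delete).append(name)
    return keep, delete
```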
Another system according to the invention includes a computer storage medium; a computer configured to access the storage medium; a list of network addresses stored on the computer storage medium; a storage area on the computer storage medium, the storage area storing the data files; and a program executable on the computer, the program configured to determine an access frequency associated with one of the data files and modify the list of network addresses based on the access frequency of the data file.
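Illustratively, modifying the list of network addresses based on access frequency can be sketched as follows. This is a minimal sketch, not a definitive implementation: the summary above states only that the list is modified based on access frequency, so the rule assumed here (an address is added once it has been accessed at least `threshold` times) and the function name are hypothetical.

```python
from collections import Counter


def update_address_list(address_list, access_log, threshold=3):
    """Return a new list with frequently accessed addresses appended.

    `access_log` holds one network address per recorded access.
    """
    counts = Counter(access_log)
    updated = list(address_list)  # existing entries keep their order
    for address, n in counts.items():
        if n >= threshold and address not in updated:
            updated.append(address)
    return updated
```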
Also according to the invention, a system for managing a plurality of data files for a web browser includes a storage area on a computer storage medium, the storage area storing the data files; a computer configured to access the storage area; and a program executable on the computer, the program configured to determine an access time associated with the computer accessing the storage area, and further configured to delete data files in the storage area if the access time exceeds a threshold value.
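Illustratively, deleting data files when the access time exceeds a threshold value can be sketched as follows. This is a minimal sketch under stated assumptions: the access time is assumed to be the time taken to list the storage area, all listed files are assumed to be deletion candidates, and the `list_files`, `delete_file`, and `clock` parameters are hypothetical callables supplied by the caller so that the sketch does not touch a real file system.

```python
import time


def prune_if_slow(list_files, delete_file, threshold_seconds, clock=time.perf_counter):
    """Time one pass over the storage area and delete files if it is too slow.

    Returns (elapsed_seconds, deleted_files).
    """
    start = clock()
    files = list(list_files())  # the measured access to the storage area
    elapsed = clock() - start
    if elapsed <= threshold_seconds:
        return elapsed, []
    for f in files:
        delete_file(f)
    return elapsed, files
```

In practice the selection of files to delete could be combined with a list of network addresses, so that only files of no continuing interest are removed.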
Also according to the invention, a method for managing a plurality of data files stored in a storage area for web browsers is disclosed. The method comprises the steps of indexing the stored data files in a database; storing user-defined search strings in a database; generating automated search strings based on the stored data files in the storage area; storing the automated search strings in the database; searching the database index according to the user-defined search strings and automated search strings; and identifying data files associated with the search strings.
Another method in accordance with the present invention includes the steps of inputting a search string, searching a data field in a storage area, associating data files in the storage area with the data field, and deleting the data files associated with the data field having the search string.
Another method according to the present invention includes the steps of searching a data field in a storage area, establishing a common address in the data field, and associating data files with the common address.
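Illustratively, establishing a common address and associating data files with it can be sketched as follows. This is a minimal sketch and not a definitive implementation: the function names are hypothetical, the common address is assumed to be the lower-cased host name, and cookie entries are assumed to take the form `Cookie:user@host/path`, which is one way in which an address beginning with “Cookie:” may be laid out.

```python
from collections import defaultdict
from urllib.parse import urlsplit

COOKIE_PREFIX = "Cookie:"


def common_address(address):
    """Reduce an "Internet Address" field to a host name shared by related files."""
    if address.startswith(COOKIE_PREFIX):
        # Assumed layout: Cookie:user@host/path
        rest = address[len(COOKIE_PREFIX):]
        host = rest.split("@", 1)[-1].split("/", 1)[0]
    else:
        host = urlsplit(address).hostname or ""
    return host.lower()


def group_by_common_address(addresses):
    """Associate every address, including cookie entries, with its common address."""
    groups = defaultdict(list)
    for address in addresses:
        groups[common_address(address)].append(address)
    return dict(groups)
```

Grouping in this way lists a cookie entry together with the other data files for the same Internet address, rather than separately under its “Cookie:” prefix.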
Another method according to the present invention includes the steps of indexing data files in a storage area, and indexing network addresses associated with the data files in the storage area to increase access speed to the information in the data files.
Additional features of the invention will become apparent to those skilled in the art upon consideration of the following detailed description of the illustrated embodiment exemplifying the best mode of carrying out the invention as presently perceived.