The World Wide Web is the total set of interlinked hypertext documents residing on servers all around the world. Documents on the World Wide Web, called Web pages, are written in HTML (Hypertext Markup Language), identified by URLs (Uniform Resource Locators) that specify the particular machine and path name by which a file can be accessed, and transmitted from server to end user under HTTP (Hypertext Transfer Protocol). Codes, called tags, embedded in an HTML document associate particular words and images in the document with URLs so that a user can access another file, which may be halfway around the world, at the press of a key or the click of a mouse. These files may contain text (in a variety of fonts and styles), graphics, images, movie files, and sound, as well as Java applets, ActiveX controls, or other small embedded software programs that execute when the user activates them by clicking on a link. Like all computer networks, the World Wide Web connects two types of computers—clients, which reside at a user's site, and servers, which reside at a remote site—using a client/server architecture.
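As a brief illustration of how a URL specifies both a machine and a path name, the following Python sketch splits a URL into its parts with the standard library. The URL shown is hypothetical and was chosen only for illustration; it does not appear in the system described here.

```python
from urllib.parse import urlsplit

# Hypothetical URL, assumed for illustration only.
url = "http://www.example.com/users/alice/home.html"

parts = urlsplit(url)

# The scheme names the transfer protocol, the hostname names the machine,
# and the path names the file on that machine.
print("protocol:", parts.scheme)
print("machine: ", parts.hostname)
print("path:    ", parts.path)
```
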
Client/server architecture is an arrangement that makes use of distributed intelligence to treat both the server and the individual workstations as intelligent, programmable devices, thus exploiting the full computing power of each. This is done by splitting the processing of an application between two distinct components: a “front-end” client and a “back-end” server. The client component, itself a complete, standalone personal computer (versus the “dumb” terminal found in older architectures), offers the user its full range of power and features for running applications. The server component, which can be another personal computer, a mini-computer, or a mainframe, enhances the client component by providing the traditional strengths offered by mini-computers and mainframes in a time-sharing environment, such as data storage, data management, information sharing between clients, and sophisticated network administration and security features.
One of the valuable functions provided by the server component is its storage capability. The server component's disk drives and other external storage media are facilities for holding information on a permanent basis, allowing later retrieval by either the server component or the client component. Over time, more and more files or data are stored on the server component, some of which will never be used again or will be forgotten. This condition can eventually slow read and write access times if the server component's storage media are very full and storage is badly fragmented. If the data is stored in a database, the size of the database also affects performance. Moreover, additional storage media will need to be purchased to accommodate the increased storage load, hence raising the cost of operating the server component. A system 100 in FIG. 1 illustrates this problem as well as other problems in greater detail.
The system 100 includes a personal computer 104 representative of the client component and a server 126 representative of the server component. The personal computer 104 allows a user 102 to access on-line services offered by the server 126 via a network 124, which is a group of computers and associated devices that are connected by communication facilities. Such a network can range from only a few computers, printers, and other devices to many small and large computers, which can even be distributed over a vast geographic area.
A Web browser 106 is software running on the personal computer 104 that lets the user 102 view HTML documents and access files and software related to those documents on the server 126. The browser 106 includes a number of tools for navigation, such as a Back button 112, a Forward button 114, and a Home button 116. These buttons are positioned on a navigation bar 110 that contains the name of the Web page (“HOME”) being displayed. The Web page 108 has an identifying line 118 for textually describing the service provider that operates the server 126 (“NUTY ON-LINE SERVICES!”). Two lines 120, 122 on the Web page 108 act as hyperlinks, which are connections between an element in a hypertext document, such as a word, phrase, symbol, or image, and a different element in the document, another document, a file, or a script. The user 102 activates a link by clicking on a link element, which is usually underlined or in a color different from the rest of the document to indicate that the element is linked. For example, line 120 presents a phrase “CREATE WEB PAGES” which is underlined. Line 122 presents a phrase “PUBLISHED WEB PAGES” which is also underlined.
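To make the notion of a hyperlink concrete, the following Python sketch parses a small HTML document modeled on the Web page 108 and lists each link element together with its target. The markup and the target paths "/create" and "/published" are assumptions made for illustration; the actual markup of the Web page 108 is not given here.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (target, text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []    # completed (href, text) pairs
        self._href = None  # href of the anchor currently open, if any
        self._text = []    # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical markup loosely modeled on the Web page 108.
page = """
<html><body>
<h1>NUTY ON-LINE SERVICES!</h1>
<a href="/create">CREATE WEB PAGES</a>
<a href="/published">PUBLISHED WEB PAGES</a>
</body></html>
"""

collector = LinkCollector()
collector.feed(page)
for href, text in collector.links:
    print(f"{text} -> {href}")
```

In this sketch, the tag attribute `href` plays the role of the embedded code that associates the displayed phrase with a URL.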
One of the services provided by the server 126 to the user 102 is the ability to create personalized home pages via the activation of the hyperlink at line 120. When the user 102 is satisfied with the design of the personalized home page, the user 102 can publish the personalized home page by activating the hyperlink at line 122. By publishing the personalized home page, the user 102 allows other users to access the personalized home page and view its contents. The key to allowing other users to access a personalized home page is the ability of the server 126 to store the personalized home page, such as personalized home pages 128A, 128B, and 128C.
The storage capacity of the server 126 has a physical limit beyond which the server 126 can no longer accommodate additional personalized home pages. As more and more personalized home pages are created and stored in the server 126, read or write access to personalized home pages may slow down, hence affecting performance. In other words, the user 102 as well as other users may be frustrated while waiting for a desired Web page to display on the browser 106. The physical storage limit of the server 126 can be overcome by adding storage media, such as additional disk drives. However, this tends to increase both the cost and the complexity of administering multiple storage media by the server 126.
One deleterious effect that can occur over time is that the user 102 and other users may lose interest in the created personalized home pages and abandon them, hence clogging the server 126. Another similar problem occurs when the user 102 simply forgets about or is unable to recall how to access the created personalized home pages. For example, the user 102 may forget the URL by which to access the created personalized home page even though the personalized home page is still being stored by the server 126. A third problem is that the user 102 may access the server 126 and create yet another personalized home page. Thus, the server 126 may have to retain indefinitely not only abandoned personalized home pages but also those that are forgotten, unused, or redundant. The most pernicious problem of all, however, occurs when hackers create a program that automates the creation of multiple personalized home pages to overload the server 126 and then forsake them.
Each of the problems discussed above, alone or in combination, tends to increase the cost of operating the server 126 because more storage capacity must be added to accommodate abandoned, forgotten, unused, and redundant information as well as useful information. These problems also decrease efficiency because resources are devoted to retaining information that is perceived as needed but that in fact no one cares about. Diminished performance of the server 126 is also expected because of slower read or write access times.
While the above problems are discussed in the context of personalized home pages, any piece of information stored in the server 126 may become abandoned, forgotten, or unused. Without a resolution to the problem of abandoned, forgotten, or unused information, users may eventually no longer trust the system 100 to provide a desired computing experience that can reproduce stored pieces of information within a short period of time, and demand for the system 100 in the marketplace will diminish. Thus, there is a need for a system, method, and computer-readable medium for removing stale information while avoiding or reducing the foregoing and other problems associated with existing systems.