1. Field of the Invention
The present invention relates to a method and apparatus for automatically providing hypertext anchor codes and destination addresses for a user-readable text file. The destination addresses are intermittently updated under the control of a central server to ensure that the destination addresses remain current. The invention is particularly suitable for use with text files which are stored on a server in a computer network such as the Internet.
2. Discussion
As the volume of information stored on computers continues to dramatically increase, new methods are sought to organize the information in an easy, intuitively retrievable way. Hypertext, which may include Hypertext Markup Language (HTML), Extensible Markup Language (XML), or other forms of Standard Generalized Markup Language (SGML), is a common method of linking related computer files or pages. A file that references other information stored on a computer, whether directly or indirectly, generally displays an icon for the referenced information in some form of distinguished or highlighted text, usually colored or underlined. A computer user viewing the page can access the referenced document simply by selecting the highlighted text in the instant file, e.g., by clicking on the highlighted text with a mouse or other pointing device. A markup language anchor, or markup language hyperlink, is the reference icon on a Web page which links a user's Web browser to relevant information.
An HTML anchor, or HTML hyperlink, is the underlined text on a Web page which links a user's Web browser to another location. An HTML file includes text and HTML tags, and may also include graphics (e.g., hypermedia). Inside an HTML file, a tag is surrounded by angle braces "<. . . >". Text is displayed on the browser's screen with selected attributes such as font size and style. Tags are used to designate the current font, style, location, or to add images or convey other formatting details about the Web page to the browser.
Stand-alone tags and container tags may be used. Stand-alone tags involve one set of braces. For example, to put an image on the Web browser's screen, one might use:
<IMG SRC="picture.gif">
"IMG" refers to "image". "SRC", which refers to "source", is an attribute whose value is the name (i.e., source) of the file containing the image, e.g., "picture.gif". Container tags involve two sets of braces, namely one set to mark the beginning of a field, and another set of braces to mark the end of the field. HTML anchors are container tags. For example, to link the text "IBM" to the Uniform Resource Locator (URL) "www.ibm.com", one might use:
<A HREF="http://www.ibm.net">IBM</A>
"<A>" is an anchor code in HTML. Note how the "</A>" indicates the end of the container tag that began with the "<A . . . >" tag. "HREF" refers to a hypertext reference attribute.
This form of hypertext, illustrated in FIG. 1, was originally conceived in March of 1989 by Tim Berners-Lee at the European Organization for Nuclear Research (CERN) as a method to disseminate information to geographically distributed researchers in high energy physics.
FIG. 1 is a block diagram of a static link architecture for linking a primary computer file to one or more destination files. Computer files, such as the primary computer file 100, are stored locally on individual Web servers, but the hypertext links are capable of referencing documents on distant servers. For example, the primary computer file 100 includes two hypertext words, "A" and "B". The traversal of "A" (i.e., the user selecting "A") links the user to a destination file 110, which contains text related to A. Similarly, the traversal of "B" links the user to a destination file 120, which contains text related to B. Generally, destination file "A" (110), destination file "B" (120) and the primary computer file 100 are each stored on physically separate servers, or computers.
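The static arrangement of FIG. 1 can be illustrated with a short sketch that reads a primary file and lists each hypertext word together with its hard-coded destination. This is only an illustration under stated assumptions: the file contents, the server names ending in ".example", and the names AnchorCollector and static_links are hypothetical and do not appear in the figures.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (link text, destination URL) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (text, href) pairs
        self._href = None   # href of the anchor currently open, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text), self._href))
            self._href = None


# Hypothetical primary file with two hard-coded links, "A" and "B".
PRIMARY_FILE = (
    'Text about <A HREF="http://server-a.example/a.html">A</A> and '
    '<A HREF="http://server-b.example/b.html">B</A>.'
)


def static_links(html_text):
    """Return the list of (hypertext word, destination) pairs in html_text."""
    parser = AnchorCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for text, href in static_links(PRIMARY_FILE):
        print(text, "->", href)
```

Because each destination is written directly into the primary file, changing where "A" or "B" points requires editing the file itself.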
The now familiar World Wide Web was launched publicly in January of 1992 when CERN opened its Web server to allow researchers to access data from the CERN server. Since then, the World Wide Web has seen incredible growth. Its uses now reach well beyond the international physics community.
The unprecedented growth in the World Wide Web has hastened the creation of more advanced methods of linking computer-represented information. Graphics objects can now achieve the same linking functionality as traditional hypertext. However, these links are "hard coded". That is, the developer of a computer file using hypertext links (e.g., a Web developer) establishes connections for the links that remain static. The developer can manually reposition the links, but their static nature remains. One important problem facing the developer, then, is where to point the hard coded hypertext or graphics links. The developer must choose wisely, because the link will have to be manually changed later if the developer's preferences change.
Fortunately, the growth of the World Wide Web has also led to the development of multiple search engines, such as Yahoo™ and Lycos™, that allow a user to find needles of Web documents in the haystack of available information. The Web developer can locate URLs of desired computer files by entering keywords in the search engine and manually filtering the results. These search engines use primarily voluntary site registrations and Web user suggestions to develop and categorize large databases of URLs. These databases allow a user to find a desired Web document, and allow a developer to find a desired URL for static hypertext and graphics links.
However, even the capability of these search engines leaves the Web developer unsatisfied. Practical considerations preclude using static links for all available information because of screen size and storage limits. Information organized in real time when requested or "on the fly" according to a user's preferences overcomes the static hypertext limitation. Therefore, a primary area of development has been interactivity with Java™, ActiveX™, and Common Gateway Interface (CGI) scripts. Java™ and ActiveX™ enable a personal computer to run applications that help interactively retrieve and format requested information from a local or distant Web server. Similarly, CGI scripts allow the computer to launch an application on the currently accessed Web server that interactively retrieves and formats information. The Web developer can use these methods to give the user who accesses the page some control over which files are retrieved by various links.
For example, FIG. 2 is a block diagram of a dynamic link architecture for linking a primary computer file to one or more destination files. Here, a CGI script, Java Applet, or ActiveX control "A" (210) is responsive to a user input (200) for linking the hypertext "A" in the primary computer file 100 to the destination file "A" (110). Likewise, a CGI script, Java Applet, or ActiveX control "B" (220) is responsive to a user input (230) for linking the hypertext "B" in the primary computer file 100 to the destination file "B" (120).
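The dynamic arrangement of FIG. 2 can be sketched in the same spirit: the destination is chosen when the page is requested, based on a user input, rather than being written into the primary file. The sketch below merely stands in for the CGI script, Java applet, or ActiveX control. The CANDIDATES table, the preference values, and the function names are hypothetical.

```python
import html

# Hypothetical catalogue of candidate destinations for each hypertext word,
# keyed by the kind of user preference the script accepts.
CANDIDATES = {
    "A": {
        "news": "http://server-a.example/news.html",
        "reference": "http://server-a.example/reference.html",
    },
    "B": {
        "news": "http://server-b.example/news.html",
    },
}


def resolve_link(word, user_preference):
    """Return the destination selected by the user's input, or None."""
    return CANDIDATES.get(word, {}).get(user_preference)


def render_link(word, user_preference):
    """Return an anchor for word, or the plain word if nothing matches."""
    url = resolve_link(word, user_preference)
    if url is None:
        return html.escape(word)
    return '<A HREF="{}">{}</A>'.format(
        html.escape(url, quote=True), html.escape(word)
    )


if __name__ == "__main__":
    print(render_link("A", "reference"))
    print(render_link("B", "reference"))
```

In this sketch the destination changes only because the user supplies a different preference. The developer still has to maintain the candidate table by hand.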
Thus, the Web developer has two options for providing hypertext links in a primary computer file. The developer can insert static hypertext or graphics links using the search engines to determine the precise destination of the links. Alternatively, the developer can use an interactive method that allows the current user viewing the computer page to input preferences. These preferences are then used to filter, in real time, available files and retrieve the desired information.
However, these options suffer from two important disadvantages. First, the manual process by which static links are entered is tedious. A Web developer must find the desired destination URLs using available search engines and manually annotate the hypertext file with those URLs. If the developer's preferences later change, or if the URL is changed, the process must be repeated.
FIG. 3 illustrates the manual insertion of hyperlinks into a primary computer file. A primary computer file 300 contains text, such as a news article. At 310, manual link insertion must be performed by manually identifying the particular words in the primary computer file 300 which are to have links. Next, corresponding anchor codes and URLs which are written in an HTML format must be inserted into the primary computer file. Finally, the primary computer file 100 with the hypertext "A" and "B" is obtained.
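The result of the annotation step at 310 can be shown with a small sketch. It assumes that the words to be linked and their URLs have already been chosen by hand, which is the tedious part of the process, and it wraps each chosen word in an anchor code. The function name insert_links and the sample sentence are hypothetical. The sketch illustrates the output of manual insertion, not the method of the invention.

```python
import html
import re


def insert_links(text, destinations):
    """Wrap each chosen word in text with an anchor to its chosen URL.

    destinations maps a word (or phrase) to the URL selected for it.
    """
    if not destinations:
        return html.escape(text)
    # Try longer phrases first so a phrase wins over a word it contains.
    words = sorted(destinations, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b"
    )
    pieces = []
    position = 0
    for match in pattern.finditer(text):
        # Copy the unlinked text before the match, escaped for HTML.
        pieces.append(html.escape(text[position:match.start()]))
        word = match.group(1)
        pieces.append('<A HREF="{}">{}</A>'.format(
            html.escape(destinations[word], quote=True), html.escape(word)
        ))
        position = match.end()
    pieces.append(html.escape(text[position:]))
    return "".join(pieces)


if __name__ == "__main__":
    article = "IBM shares rose & fell."
    print(insert_links(article, {"IBM": "http://www.ibm.net"}))
```

If the chosen URL later changes, the mapping must be edited and the file processed again, which is the repetition described above.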
A second disadvantage with existing techniques for providing hypertext links is that a Web developer must either provide static links or allow the user some control over the destination of those links. Dynamic links created with Java, ActiveX, or CGI scripts can disallow user input, but under current methods such restricted dynamic links are effectively static links. That is, the developer would have to modify such links manually, and that manual modification is the essence of a static link.
Accordingly, it would be desirable to provide a system which allows a Web developer to automatically enter hypertext links into a computer file such as a news article or other sequence of user-readable character strings. The system should also provide simple and central control over the destination of previously static links. The system should allow updating of the links without requiring further processing of the computer file. The system should also provide pre-assigned preferred destination addresses for specific character strings.
For destination addresses which are not pre-assigned, the system should provide the capability to search a computer network to assign appropriate destination addresses. This search should be performed in accordance with preference criteria. The system should provide the capability to assign class codes to the specific character strings. Additionally, the system should assign expiration periods or dates to the destination addresses. The system should maintain a hit count of the character strings at each content server, and provide a capability for transmitting hit count data to the central server.
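The information such a system would need to keep for each character string can be sketched as a simple record. This is an assumed layout offered only for illustration. The field names, the LinkRecord and hit_report names, and the sample values are hypothetical and are not taken from the description.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class LinkRecord:
    """One character string and the data the desired system would track."""

    character_string: str
    destination: Optional[str]      # pre-assigned address, or None if a search is still needed
    class_code: str                 # hypothetical category label for the string
    expires: Optional[date] = None  # expiration date of the destination, if any
    hit_count: int = 0              # hits recorded at a content server

    def is_current(self, today):
        """True if a destination is assigned and has not expired."""
        if self.destination is None:
            return False
        return self.expires is None or today <= self.expires

    def record_hit(self):
        self.hit_count += 1


def hit_report(records):
    """Hit counts a content server might transmit to the central server."""
    return {r.character_string: r.hit_count for r in records}


if __name__ == "__main__":
    record = LinkRecord("IBM", "http://www.ibm.net", "company",
                        expires=date(2000, 1, 1))
    record.record_hit()
    print(record.is_current(date(1999, 6, 1)), hit_report([record]))
```

A record whose destination is missing or expired would be a candidate for the network search performed in accordance with the preference criteria.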
The present invention provides a system having the above and other advantages.