The present invention generally relates to data processing. The invention relates more specifically to methods and apparatus for automatically updating website content. One common technique for providing website content is via hypertext systems, although other techniques for providing website content are also known to those skilled in the art. Hypertext systems are used herein for exemplary descriptions; however, other suitable techniques known in the art for providing website content may also be used and are equally applicable. For example, database records may be used to provide the content, without the use of hypertext links.
One particularly popular system involves the combination of hypertext information technology and distributed network technology and is known as the World Wide Web. A “site” on the World Wide Web (“Web site”) is a set of hypertext information stored on a server that is coupled to the global, packet-switched set of internetworks known as the Internet. The hypertext information typically is created and stored in one or more electronic documents that are prepared using Hypertext Markup Language (HTML). The server and the network operate according to agreed-upon protocols such as Transmission Control Protocol (TCP) and Internet Protocol (IP). A networked end station such as a personal computer, workstation, or other device (“client”) connects to the Internet using the same protocols, executes a browser program that can interpret and display HTML documents, and requests one or more pages of the Web site. The server locates the requested pages and delivers them over the network to the client, which displays them.
Any number of clients may access the server. When most of the clients are located geographically close to the server, or are within the region served by the server, this approach works adequately. However, when a significant number of clients are located geographically distant from the server, or outside a region that is served by the server, latency in the intermediate networks, nodes, and telecommunications systems that make up the Internet may cause unacceptably long delays in receiving pages from the server at the clients. This problem, along with rapid expansion of the international economy and the globalization of online business, has led many enterprises to establish multiple geographically distributed Web sites.
In one approach, a first Web site (often located in the United States or created using the English language) is established in a first location. One or more “mirror” Web sites are established in one or more other locations. Alternatively, the mirror sites are co-located but logically separated and serve different audiences within the same location or region. Each mirror Web site stores and serves an exact copy of the content of the first Web site. Periodically the mirror sites are updated by copying changed content from the first Web site to all the mirror sites. However, a significant disadvantage of this approach is that the mirror sites are not localized or customized according to the clients that they serve. Often the mirror sites are not rendered in the language used for communication by clients in the region served by the mirror sites. This results in an undesirable end user experience.
In a related approach, in a process called “localization,” each local or regional Web site is customized according to local practices. Localization may also involve translation, in which each regional or local Web site is translated into the language of the region served by the site. However, the process of updating a large Web site that is mirrored and localized for multiple countries is complicated and time-consuming.
A related problem is that many Web sites now store and serve dynamic content. First generation Web sites generally consisted of a collection of static HTML pages; every client on every connection received the same pages from the server. Many Web sites now use HTML templates that are filled in with content from a database by the server dynamically just before delivery to a client. When the database is updated, pages delivered to clients automatically reflect the updates. Further, enterprises that still rely on static content often update the static pages frequently to reflect changes in information, products and services.
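By way of illustration only, the template-filling behavior described above may be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular server: the template text, the placeholder names, the in-memory dictionary standing in for a database, and the function name `render_page` are all hypothetical.

```python
from string import Template

# Hypothetical page template; the placeholder names are assumptions.
PAGE_TEMPLATE = Template(
    "<html><body><h1>$title</h1><p>$body</p></body></html>"
)

# In-memory stand-in for a content database, keyed by page identifier.
content_db = {
    "home": {"title": "Welcome", "body": "Our products are listed here."},
}


def render_page(page_id: str) -> str:
    """Fill the template with the current database record at request time."""
    record = content_db[page_id]
    return PAGE_TEMPLATE.substitute(record)
```

Because the page is assembled on each request, a later change to the record for "home" appears in the next page returned by `render_page("home")` without any edit to the template itself.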
Content management systems have been developed to address this problem. A content management system seeks to organize Web site content and automate processes of data input, editing and revision, quality control review, and publication to multiple Web sites. Some content management systems also provide personalization services, decision support services, and integration with other systems and applications. An example of a content management system is Vignette StoryServer, commercially available from Vignette Corporation, of Austin, Tex.
A drawback of known content management systems, however, is that they do not adequately address the problem of how to automatically update a translated or localized Web site when source documents or other source information underlying the translation or localization changes. In one approach, all Web site content is stored in a database. Translators access the database and translate content from the database into other languages for use in regional or localized Web sites. The databases tend to require storage of content in a highly structured form. This approach is adequate when an entire Web site is translated, but it is inadequate for carrying out selective changes on a site, or for translating a site that has extensive dynamic content.
In another approach, updated content of a first site is copied to a second site and then translated. This approach is best suited to making country-by-country modifications. However, it often results in inconsistent, ad hoc changes.
Based on the foregoing, there is a clear need in this field for an improved method or apparatus that provides automated updating of website content, in conjunction with, or without, a content management system.
There is a specific need for a method or apparatus that provides automated updating of translated Web sites or localized Web sites having highly dynamic, complicated and extensive site content.
There is also a need for such a method or apparatus that can automatically detect changes in a first set of hypertext information and automatically propagate the changes to a second set of hypertext information, with further changes that are appropriate for or associated with the context of the second set of hypertext information.
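For illustration only, the kind of change detection and propagation referred to above may be sketched as follows. This is a hedged sketch and not the method of the invention: it assumes each set of hypertext information can be represented as a mapping from page name to text, it uses content digests as one possible way of detecting changes, and the names `detect_changes`, `propagate`, and `adapt` are hypothetical. The `adapt` callable stands in for whatever translation or localization step suits the context of the second set.

```python
import hashlib
from typing import Callable, Dict, List


def digest(text: str) -> str:
    """Return a SHA-256 digest of the page text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_changes(source: Dict[str, str],
                   known_digests: Dict[str, str]) -> List[str]:
    """List pages whose current digest differs from the last recorded one."""
    return sorted(
        page for page, text in source.items()
        if known_digests.get(page) != digest(text)
    )


def propagate(source: Dict[str, str],
              target: Dict[str, str],
              known_digests: Dict[str, str],
              adapt: Callable[[str], str]) -> List[str]:
    """Copy changed pages to the target, applying a context-specific step."""
    changed = detect_changes(source, known_digests)
    for page in changed:
        # 'adapt' is a placeholder for translation or localization.
        target[page] = adapt(source[page])
        known_digests[page] = digest(source[page])
    return changed
```

In this sketch, only pages whose text has changed since the previous run are reprocessed, so pages of the second set that correspond to unchanged source pages are left untouched.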