1. Field of the Invention
This invention generally relates to page display software languages for programmers on the World Wide Web. More specifically, this invention relates to providing uniform content information from a central location to display pages.
2. Description of the Related Art
A significant development in computer networking is the Internet, which is a sophisticated worldwide network of computer systems. A user at an individual PC (i.e., workstation) who wishes to access the Internet typically does so using a software application known as a web browser. A web browser uses a standardized interface protocol, such as HyperText Transfer Protocol (HTTP), to make a connection via the Internet to other computers known as web servers, and to receive information from the web servers that is displayed on the user's display. Information displayed to the user is typically organized into pages that are constructed using a specialized language such as Hypertext Markup Language (HTML), Extensible Markup Language (XML), or Wireless Markup Language (WML) (hereinafter "markup languages"). Markup languages are typically based on the Standard Generalized Markup Language (SGML), which was created with the original purpose of having one standard language that could be used to share documents among all computers, regardless of hardware and operating system configurations. To this end, markup language files use a standard set of code tags embedded in their text that describe the elements of a document. The web browser interprets the code tags so that each computer, having its own unique hardware and software capabilities, is able to display the document while preserving the original format of the document. Each document typically resides in a separate file on the server.
For companies doing worldwide business over the Internet, web pages are translated into the appropriate language and stored as hard-coded HTML and/or active server pages (ASP). Further, business units in different countries or regions often target specific products and/or services for that particular area, requiring customized information on the web pages. Updating the pages may quickly entail an overwhelming amount of overhead for the business organization. Additional overhead is incurred with the proliferation of specialized markup languages having unique syntax for different types of computer systems, such as WML for portable wireless telephones and personal communication systems. In many instances, the format or style of the page may be common across servers, especially when a company strives for a unified appearance across its pages, but data on the page may be unique to a specific server.
There are a number of different web browsers available, each supporting its own extensions to markup languages such as HTML. Thus, a document written for one browser may not be interpreted as intended on another browser if it does not support the same extensions. In many situations, software developers are forced to create unique documents for each browser, or to include logic in the markup language that bypasses or executes certain portions of code, depending on which browsers are being supported. This adds another layer of complexity to developing and updating these documents.
XML was designed to meet the requirements of large-scale web content providers for industry-specific markup (i.e., encoded descriptions of a document's storage layout and logical structure), vendor-neutral data exchange, media-independent publishing, one-on-one marketing, workflow management in collaborative authoring environments, and the processing of web documents by intelligent clients. XML is also used in certain metadata applications. XML supports European, Middle Eastern, African, and Asian languages, and all conforming processors support the Unicode character set encodings.
It is therefore desirable to provide a mechanism for using XML that allows customized web pages to share format and other content/behavior information while providing the capability to store data in structured but flexible collections associated with owners. It is also desirable for the markup language to allow users to recombine and re-use data on many different pages, and to draw on different sources for data. An inheritance mechanism that allows pages to be grouped into classes, and sub-classes of pages to be derived, is also desired. It is also desirable for such a system to support standards provided in XML.
In the prior art, there are a variety of systems that provide limited content management capability. Some commercially available content management systems, such as Vignette StoryServer and Inso Dynabase, typically use templates or page components that are dynamically populated from Structured Query Language (SQL) databases and recombined into pages using pre-defined templates. These systems generally fit well with highly structured sites having many identically formatted pages, such as a news site; however, the template structures are generally fixed and not flexible. Further, in these systems, the data storage paradigm is based upon filling named slots in the templates, which does not lend itself to a flexible data format that prioritizes the expression of data and its relationships. The template model for such systems is typically based on either Java or a scripting language such as VBScript or Tcl/Tk, and limited support is typically provided for XML as a data type.
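By way of illustration only, the slot-filling paradigm described above can be sketched as follows. This is a minimal sketch and does not reproduce any of the named products; the template text, the `articles` table, and the function names `build_pages` and `demo` are hypothetical and were chosen for the example. It shows that every page shares one fixed layout and that the data model is limited to the named slots.

```python
import sqlite3
from string import Template

# Hypothetical fixed page template; the only variable parts are the named slots.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$headline</h1><p>$body</p></body></html>"
)


def build_pages(conn):
    """Fill the fixed template once for each row returned by the SQL query."""
    rows = conn.execute(
        "SELECT title, headline, body FROM articles ORDER BY id"
    )
    return [
        PAGE_TEMPLATE.substitute(title=t, headline=h, body=b)
        for t, h, b in rows
    ]


def demo():
    """Create an in-memory database with two sample rows and build the pages."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        "id INTEGER PRIMARY KEY, title TEXT, headline TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO articles (title, headline, body) VALUES (?, ?, ?)",
        [
            ("News 1", "First story", "Text of the first story."),
            ("News 2", "Second story", "Text of the second story."),
        ],
    )
    return build_pages(conn)
```

Adding a new kind of page element in such an arrangement requires changing both the template and the table layout, which is the inflexibility noted above.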
Another variety of systems that provide limited content management capability are internet application servers such as ColdFusion. These application servers are primarily designed to support development of interactive applications. Most of the site template structures are hard-coded as server scripts, often using a mixture of standard HTML tags and proprietary tags that are pre-processed on the server. Each script is independent of the others; there is no inheritance mechanism. Even though the scripts are based on tags, the scripts are not well-formed XML, but rather customized HTML, and the separation of form and data is limited. Further, use of XML in these systems is limited to complete source data files.
Web-enabled object/XML databases such as ObjectStore/eXcelon, Poet, etc., provide a platform for high-performance application development around a flexible repository, but provide limited development tools. The data modeling capabilities are flexible and well-suited to free-form web content; however, there is no high-level scripting language to provide a framework for managing content.
Traditional non-web content management systems such as Interleaf, ArborText, and TexCel are designed for generic, media-neutral content management, and are frequently SGML-based, therefore leading to a natural evolution towards XML. These systems are typically deployed for maintaining major documentation projects. The output of these systems is normally customized for a particular customer, and may be delivered online, on compact disc, or in print. These systems are designed to assemble explicit documents, however, and do not include capabilities for providing data-driven, script-aided document delivery.
One other system for populating pages includes using ASP and SQL with content selection rules supported by personalization/recommendation software components. This is a relatively simple approach to content management; however, most of the site template structures are hard-coded in HTML, and thus there is no inheritance mechanism. Additionally, most of the data is embedded in the pages, and pages are personalized by populating pre-defined slots with targeted data. Mass customization is possible, but there is little flexibility.
A method and computer program product are provided for generating XML documents using a script language that extends the capabilities of XML. The script language includes control statements for including data content and style information from a plurality of sources. One or more scripts may be developed that include script language control statements. A script processor processes the scripts and generates a content document and a style document. The content document specifies the content to be included in the XML document, and the style document specifies the style for displaying the content in the XML document. One set of program instructions transforms the content document and the style document into an XML document. Another set of program instructions converts the XML document to an output document for a selected type of display. The script language and script processor provide facilities for gathering content and style information from a plurality of sources. Numerous scripts may be generated to override and/or extend information in one or more of the other scripts, thereby allowing a developer to customize selected portions of the output document while using shared content and style for the remaining portions of the output document.
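The flow summarized above may be illustrated, purely as a sketch under stated assumptions, by the following example. It does not use the script language of the invention, whose syntax is not given in this section. Plain dictionaries stand in for scripts, and the names `BASE_SCRIPT`, `REGIONAL_SCRIPT`, `process_script`, `to_xml`, and `to_html`, along with the `page` and `item` element names, are hypothetical. The sketch shows a derived script overriding one entry of a shared script, the resulting content and style information being combined into an XML document, and that document being converted to HTML as one possible output type.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical stand-ins for scripts: each supplies content and style entries
# and may name a parent script whose entries it inherits.
BASE_SCRIPT = {
    "parent": None,
    "content": {"title": "Welcome", "footer": "Contact us"},
    "style": {"title": "bold", "footer": "small"},
}
REGIONAL_SCRIPT = {
    "parent": BASE_SCRIPT,
    "content": {"title": "Bienvenue"},  # overrides only the title
    "style": {},                        # inherits all style entries
}


def process_script(script):
    """Resolve a script chain into (content, style); a child overrides its parent."""
    if script is None:
        return {}, {}
    content, style = process_script(script["parent"])
    content.update(script["content"])
    style.update(script["style"])
    return content, style


def to_xml(content, style):
    """Combine the content and style information into one XML element tree."""
    page = ET.Element("page")
    for name in sorted(content):
        attrs = {"name": name, "style": style.get(name, "default")}
        item = ET.SubElement(page, "item", attrs)
        item.text = content[name]
    return page


def to_html(page):
    """Convert the XML tree to HTML, one possible output type."""
    parts = [
        '<div class="{}">{}</div>'.format(
            escape(item.get("style")), escape(item.text)
        )
        for item in page
    ]
    return "<html><body>" + "".join(parts) + "</body></html>"
```

In this sketch the regional script changes only the title, while the footer text and all style entries continue to come from the shared base script. A second converter written alongside `to_html` could target another display type, such as WML, without changing the scripts.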