The present invention relates generally to a markup language and system for processing electronic documents using the markup language. Specifically, the invention relates to a markup language used for generating news story documents and a system and method for editing and processing news story documents.
The television broadcast news industry has evolved from communicating information with paper and teletype formats to computer based information transfer systems that transfer electronic documents. Currently, specialized computer systems assist in news production, promotion and distribution of electronic documents to allow newsroom personnel to perform their functions more efficiently. Also, these specialized computer systems can store a wide variety of news media such as text, still images, and broadcast motion video for distribution within the newsroom and for transmission to external destinations.
News story information is generally shared by a number of different users with different information requirements. Generally, journalists, producers, directors, and announcers share news story information. In addition, news story information is made available on a wide variety of distribution media, such as video, teleprompters, journalist edit stations, and the like. Thus, a computer system that stores news story information should be able to provide news story information to different information consumers. In addition, with the rise in distribution of news information over the Internet via the World Wide Web (WWW), a larger audience of consumers of news story information exists. A common format for news story information that meets the requirements of a wide variety of consumers would be beneficial.
The Standard Generalized Markup Language ("SGML") is used to represent a wide variety of document types such as books, electronic software documentation, and equipment specifications, among other applications. SGML is an international standard (ISO 8879) published in 1986 for the electronic publication of documents. SGML defines a markup language wherein content of a document is structured using markup, i.e., tags or codes encapsulating the content. The markup defines elements which form a logical, predictable structure. SGML defines a strict markup scheme with a syntax for defining document elements and an overall framework for marking up documents. A document type definition (DTD) of SGML establishes the structure of a markup document of a particular type, and provides a framework for the kinds of elements that constitute a document of that type. The markup of a document is interpreted as an ordered hierarchy of markup elements which, taken together, form a tree or similar hierarchical object. A markup element describes the function or meaning of the content which it includes.
In such a document, markup elements include tags and their content, such as text, graphics, still images or other media. An SGML document includes markup tags that may be described as start tags, end tags, or empty tags. A start tag begins a markup element. An end tag ends the corresponding markup element. These start tags and end tags define the element in SGML, such as a book, library, or body of a document. An empty tag is understood as being both a start tag and an end tag with no content between the start and end tags. Between a start tag and an end tag, other start tags and corresponding end tags may be arranged in a hierarchical manner such that there are child elements and parent elements having a defined relationship to each other.
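The parent and child relationships described above can be illustrated with a minimal sketch. It assumes an XML-style document as a stand-in for SGML (Python's standard library has no SGML parser), and the tag names `library`, `book`, `title`, `body`, and `cover` are hypothetical, chosen only to show start tags, end tags, and an empty tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: tag names are illustrative only.
# <cover/> is an empty tag: a start tag and an end tag with no content.
DOC = (
    "<library><book>"
    "<title>Example</title><body>Some text.</body><cover/>"
    "</book></library>"
)


def outline(element, depth=0):
    """Return one indented line per element, visiting parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOC)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one level of nesting between a start tag and its matching end tag.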
Also in SGML documents, there are elements that contain metadata, or information about the document. Metadata may describe document information such as location, name, and creation date of an electronic document that may accompany the document or may be embedded in the document itself. Metadata is typically used to catalogue electronic documents or otherwise identify information relative to an electronic document.
The Hypertext Markup Language (HTML) is a particular document type that conforms to SGML by having a definitive DTD. HTML is widely used over the Internet for distributing information between servers and clients. Both SGML and HTML documents can be edited, viewed and verified according to their respective DTDs. By distributing HTML documents through networks such as the Internet, information providers can rapidly disseminate information to a large number of consumers.
HTML and SGML documents are generally viewed using a software program referred to in the art as a browser or viewer. A viewer program interprets a series of elements of a markup language document as viewer instructions. The elements contain text or images and a number of formatting commands that, when interpreted, change the appearance of the text or images within the viewer program. Some viewer programs also provide the capability for editing a markup language document in an environment described in the art as a "what-you-see-is-what-you-get" (WYSIWYG) environment. In a WYSIWYG editing environment, the formatting commands of markup language document elements, which would appear as plain text in an ordinary ASCII text editor, are interpreted in the same manner as in a viewer program.
HTML provides a limited subset of elements within its DTD. The HTML DTD defines a set of tags that support document structures such as lists and emphasis of document elements. The HTML DTD also provides a relatively presentation-oriented model for small documents with limited internal structure. Thus, HTML has fewer features than its more complex counterpart, SGML.
As discussed above, there are many consumers of news story information. Consumers include people with different roles in the news production environment and different equipment types such as teleprompters, viewers, video equipment, and editing terminals. News story documents should contain sufficient information to identify and represent content of a news story for all likely consumers. For example, it may be desired to provide story information from an editor to a teleprompter to display the story to an announcer.
Since a number of different information consumers exist with different information requirements, a news story document format that supports a wide variety of news story information in a structured manner would be desirable.
For example, when presenting news story information during a news story broadcast, there may be a particular timing relationship between news stories. The timing relationship of a story should be tracked to provide additional information to a director or producer during the news story broadcast.
In another example, elements within a news story may have an explicit timing relationship, such as a synchronization. For example, after a certain amount of story text is read, say for a lead-in to an interview, a video tape must be played directly after the text for the lead-in is read. The director of the news broadcast must perform the correct command that plays the video tape.
In summary, both SGML and HTML are inadequate for presenting news story information. SGML is too general in that there are insufficient constraints on the content of a document, while HTML is too limited in structure. In particular, existing markup languages do not provide sufficient constraints to define timing information that may be used to properly sequence news story information, to define machine control commands that may be used to automate control functions, or to associate multiple elements within one or more documents for the purpose of synchronizing the elements.
The present invention defines a news story document format that supports a wide variety of news story information in a structured manner. The news story markup language of the present invention provides constraints to define timing information for a news story, to define machine control commands that may be used to automate control functions, and to associate multiple elements within one or more documents for the purpose of synchronizing the elements. The present invention defines a system and method for editing and processing news story documents.
According to one aspect of the present invention, a news story document includes machine control elements for controlling news story production equipment such as a VCR or digital video device. In another embodiment, the present invention provides a news story markup language document that includes story timing information used for sequencing news stories.
According to another aspect of the present invention, a process for processing markup language documents relating to a news story comprises the steps of reading an input file having a first file format including a plurality of elements, the input file further including at least one of timing information for representing timing of the news story and synchronization information for synchronizing one of the plurality of elements with another of the plurality of elements. The input file also includes news story information for representing content of the news story. The process further includes a step of verifying the first file format of the input file based on a document type definition defining a news story markup language.
According to another aspect of the present invention, the process further includes the steps of producing output data having a second file format wherein the second file format is formatted according to the document type definition, and creating an output file based on the output data.
According to another aspect of the present invention, the process further comprises a step of converting the output file to a document file having a format different than the format of the output file. According to another aspect, the format of the document file is HTML format. Also, according to another aspect, the step of converting includes the step of excluding information from the output file when converting the output file to the document file format.
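The conversion step that excludes information can be sketched as follows. This is an illustration under stated assumptions, not the converter of the invention: the element names `story`, `head`, `title`, `body`, `p`, and `mc` are hypothetical, the input is XML-style, and the machine control element is assumed to be the information left out of the HTML output.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical news story document; element names are assumptions.
STORY = (
    "<story><head><title>Harbor fire</title></head><body>"
    "<p>Crews responded overnight.</p>"
    "<mc id=\"1\">PLAY VTR-1</mc>"
    "<p>No injuries were reported.</p>"
    "</body></story>"
)

# Elements assumed to be meaningless to an HTML reader and therefore dropped.
EXCLUDED = {"mc"}


def to_html(story_xml):
    """Convert the story to a small HTML page, skipping excluded elements."""
    root = ET.fromstring(story_xml)
    title = root.findtext("head/title", default="")
    parts = ["<html><head><title>%s</title></head><body>" % escape(title)]
    body = root.find("body")
    for child in (body if body is not None else []):
        if child.tag in EXCLUDED:
            continue  # excluded information never reaches the HTML output
        if child.tag == "p":
            parts.append("<p>%s</p>" % escape(child.text or ""))
    parts.append("</body></html>")
    return "".join(parts)


html_page = to_html(STORY)
print(html_page)
```

The story text survives the conversion, while the hypothetical machine control command does not appear in the resulting page.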
According to another aspect of the present invention, the process further comprises a step of importing an import file having a file format different than the first and second file formats to produce an imported file having a format according to the document type definition. According to another aspect, the step of importing includes the step of adding import file information to a template document having a format according to the document type definition.
According to another aspect of the present invention, the process further comprises a lexical analysis step of analyzing the input file format for a plurality of elements and identifiers, and of producing an output token stream based on the plurality of elements and identifiers.
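A lexical analysis step of this kind might be sketched as below. The token kinds (`START`, `END`, `EMPTY`, `TEXT`) and the sample tag names are assumptions made for illustration; the sketch also discards attributes to stay short, which a real lexer would not do.

```python
import re

# One alternative matches a tag, the other matches a run of text.
TOKEN_RE = re.compile(r"<(/?)([A-Za-z][\w.-]*)([^<>]*?)(/?)>|([^<]+)")


def tokenize(source):
    """Split markup into a flat stream of (kind, value) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        closing, name, _attrs, empty, text = match.groups()
        if text is not None:
            if text.strip():  # ignore whitespace-only text between tags
                tokens.append(("TEXT", text.strip()))
        elif closing:
            tokens.append(("END", name))
        elif empty:
            tokens.append(("EMPTY", name))
        else:
            tokens.append(("START", name))
    return tokens


# Hypothetical input; <a idref="1"/> is an empty tag carrying an attribute.
token_stream = tokenize('<story><p>Hello <a idref="1"/>world</p></story>')
for token in token_stream:
    print(token)
```

The resulting flat token stream carries no nesting information; checking that the tags are properly nested and permitted is left to a later step.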
According to another aspect of the present invention, the step of verifying further includes the steps of checking usage of a plurality of elements and identifiers according to the document type definition to produce a parse tree from the plurality of elements and identifiers, and generating an output file having a hierarchical file structure based on the parse tree that conforms to the document type definition. According to another aspect, the process further includes a step of interpreting the output file by a viewer.
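The verification step can be illustrated with a minimal sketch that checks element usage while building a parse tree. The dictionary `ALLOWED_CHILDREN` is a hypothetical, greatly simplified stand-in for a document type definition: it records only which child elements each element may contain, whereas a real DTD also constrains order, repetition, and attributes. The token list is likewise an assumed input of the kind a lexical analysis step might produce.

```python
# Hypothetical stand-in for a DTD: element -> child elements it may contain.
ALLOWED_CHILDREN = {
    "story": {"head", "body"},
    "head": {"title"},
    "title": set(),
    "body": {"p"},
    "p": set(),
}


def build_tree(tokens):
    """Check nesting against ALLOWED_CHILDREN and return the root node."""
    root = None
    stack = []  # currently open elements, innermost last
    for kind, value in tokens:
        if kind == "START":
            if value not in ALLOWED_CHILDREN:
                raise ValueError(f"undeclared element <{value}>")
            node = {"tag": value, "text": "", "children": []}
            if stack:
                parent = stack[-1]
                if value not in ALLOWED_CHILDREN[parent["tag"]]:
                    raise ValueError(f"<{value}> not allowed inside <{parent['tag']}>")
                parent["children"].append(node)
            elif root is None:
                root = node
            else:
                raise ValueError("more than one root element")
            stack.append(node)
        elif kind == "TEXT":
            if not stack:
                raise ValueError("text outside the root element")
            stack[-1]["text"] += value
        elif kind == "END":
            if not stack or stack[-1]["tag"] != value:
                raise ValueError(f"unexpected end tag </{value}>")
            stack.pop()
        else:
            raise ValueError(f"unknown token kind {kind!r}")
    if stack:
        raise ValueError(f"unclosed element <{stack[-1]['tag']}>")
    return root


TOKENS = [
    ("START", "story"), ("START", "head"), ("START", "title"),
    ("TEXT", "Harbor fire"), ("END", "title"), ("END", "head"),
    ("START", "body"), ("START", "p"), ("TEXT", "Crews responded overnight."),
    ("END", "p"), ("END", "body"), ("END", "story"),
]
tree = build_tree(TOKENS)
print(tree["tag"], [child["tag"] for child in tree["children"]])
```

A document that places an element where the content model does not permit it, such as a `p` directly inside `head`, is rejected with a `ValueError` instead of producing a tree.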
According to another aspect of the present invention, a data processing system for interpreting a news story markup language document comprises means for obtaining a news story markup language document from a storage location, means for parsing the news story markup language document to produce a plurality of markup language tags and associated text, and means for converting the plurality of markup language tags and associated text to system instructions.
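As a rough sketch of converting tags and associated text to system instructions, the following walks a parsed document in document order and emits one instruction per recognized element. The element names `p` and `mc` and the instruction names `DISPLAY_TEXT` and `MACHINE_CONTROL` are hypothetical and do not come from the specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical story: two paragraphs with a machine control element between them.
STORY = (
    "<story><body>"
    "<p>Lead-in text.</p>"
    "<mc>PLAY VTR-1</mc>"
    "<p>Back in studio.</p>"
    "</body></story>"
)


def to_instructions(xml_text):
    """Map each recognized element to a (instruction, argument) pair."""
    root = ET.fromstring(xml_text)
    instructions = []
    for element in root.iter():  # iter() visits elements in document order
        text = (element.text or "").strip()
        if element.tag == "p" and text:
            instructions.append(("DISPLAY_TEXT", text))
        elif element.tag == "mc" and text:
            instructions.append(("MACHINE_CONTROL", text))
    return instructions


instructions = to_instructions(STORY)
for instruction in instructions:
    print(instruction)
```

Because the instructions keep document order, a downstream consumer could route display instructions to a viewer and control instructions to a separate server without losing their relative sequence.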
According to another aspect of the present invention, the data processing system further comprises means for rendering the system instructions as a visual interpretation of the news story markup language document. According to another aspect, the data processing system further comprises a machine control server and means for converting one of said plurality of markup language tags and associated text to a machine control instruction for execution by the machine control server. According to another aspect, the data processing system further comprises a teleprompter and means for displaying story information in the teleprompter. According to another aspect, the machine control server controls a media presentation device based upon the machine control instruction.
According to another aspect, the present invention defines a method of anchoring document text to a control field in the same electronic document in an electronic document editor, comprising the steps of creating an electronic document in a markup language, the electronic document including a declarative tag enclosing the document text, creating the control field in the electronic document, the control field having a unique identification and containing machine control information, and referencing, at a location within the document text, the control field by the unique identification.
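The anchoring method can be sketched with a small example. All names here are assumptions for illustration: `mc` stands for a control field with a unique `id`, `a` with an `idref` attribute stands for the reference placed within the document text, and `p` stands for the declarative tag enclosing that text.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the control field lives apart from the text,
# and the text refers to it by its unique identification.
DOC = """<story>
  <fields><mc id="mc1">PLAY CLIP interview-01</mc></fields>
  <body><p>Our reporter spoke with the mayor.<a idref="mc1"/> More after the break.</p></body>
</story>"""


def resolve_anchors(xml_text):
    """Pair the text preceding each anchor with the control field it references."""
    root = ET.fromstring(xml_text)
    controls = {mc.get("id"): (mc.text or "").strip() for mc in root.iter("mc")}
    resolved = []
    for paragraph in root.iter("p"):
        preceding = (paragraph.text or "").strip()
        for anchor in paragraph.iter("a"):
            ref = anchor.get("idref")
            if ref not in controls:
                raise KeyError(f"anchor references unknown control field {ref!r}")
            resolved.append((preceding, controls[ref]))
            # Text after this anchor precedes the next anchor, if any.
            preceding = (anchor.tail or "").strip()
    return resolved


anchors = resolve_anchors(DOC)
print(anchors)
```

Keeping the control field separate from the text and linking the two by identifier means the machine control information can be edited without touching the story text, and the anchor position still marks where in the text the command applies.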
According to another aspect of the present invention, the markup language is a news story markup language having news story information for representing content of the news story, the news story information including look information for controlling appearance of news story information and head information for identifying the news story. According to another aspect, the machine control information identifies a presentation element associated with a media presentation device. Also, according to another aspect, the presentation element is a video element.