It is known that documents may be simultaneously published in multiple versions, where the versions are intended for different audiences or different uses. Examples of such multi-versioned documents are: (i) books that are published in their regular versions, and in alternate versions such as alternative language, large-print, or deluxe versions with additional, premium content; and (ii) movies (e.g., DVDs) that are released in their regular versions, and often, shortly thereafter, in alternate versions such as a “director's cut,” a version “edited for airline use,” or versions edited (and perhaps sub-titled) for foreign markets. Furthermore, various documents, some with sensitive content, perhaps business-sensitive, or portions deemed for particular audiences, are alternately released in expurgated versions. However, all of these examples have in common the fact that multiple, simultaneous versions of these documents are handled and conveyed, throughout most of their production, distribution, and use, almost as if they were entirely different, separate documents, thereby increasing the cost of producing and distributing such documents in different versions. In other words, compiling/composing, publishing/producing and distributing multiple versions of documents gives rise to a cost structure that approaches the cost of an entirely separate document for each version.
Heretofore, a number of publications have disclosed methods for managing different versions of documents, particularly in the realm of software development. Similarly, word processing software such as Microsoft® Word is capable of altering the display or printing of information in response to user-specified preferences (e.g., hidden text, comments, track/highlight changes, views).
When documents are created, many decisions must be made as to style, content, layout, and the like. The text, images, and graphics must be organized and laid out in a two-dimensional format with the intention of providing a presentation to the viewer or user that will capture and preferably maintain their attention for a time sufficient to get the intended message across. Different style options are available for the various content elements and choices must be made. The best choices for style and layout depend upon content, intent, viewer interests, etc. In order to tell if a set of choices made as to the look and feel of the final version of the document were good or bad, one might request feedback from a set of viewers after viewing the document and compile the feedback into something meaningful from which the document's creators or developers can make alterations, changes, or other improvements. This cycle repeats until the document's owners are satisfied that the final version achieves the intended result.
One factor that contributes to the quality and effectiveness of layout and style decisions for a document, or a particular document version, is the handling of groups of content elements, since style and layout choices affect such groups. A group is a collection of content elements. Group membership is a property of the logical structure of the document and may be impacted by a particular document version. The neighborhood of groups can be considered a layout property. While layout structure often matches the logical structure, there is no requirement that it do so.
Preferably, one would like to have a quantitative measure of various value properties of the document (measures of the document “goodness”) based on properties inherent in the document itself. In this manner the document itself provides a level of quantitative feedback. For instance, one property that developers would like to be able to measure is how easy it is to use a document. A measure for the ease of use of a document can be used in evaluating or making document design decisions.
One aspect of the ease of use of a document, or version of a document, is one's ability to tell which elements belong to a group and which do not. The style and layout decisions that are made in the presentation of a document can affect the degree of group identity that it conveys. In evaluating a document's design for its ease of use, it is useful to have a measure of the degree of group identity. Considerations for ease-of-use with respect to groups include spatial coherence, spatial separation, alignment separation, heading separation, background separation, and/or style separation. Measures for various characteristics of content, feature, and the like could be weighted by intent, relevance, and other parameters and these could then be combined to obtain one or more overall measures for the document itself. If one had a method for evaluating properties inherent in the document itself then such a measure could be used during the document development process to help determine optimal presentation.
For different uses of a document it may be desirable to have different versions or presentations of the document. For example, a document presented on a PDA or cell phone may need a different appearance from its presentation on a full-size display or print. These differences may go beyond layout and style changes and affect the actual document content. Another example of a multi-versioned document would be different versions that are displayed or depicted depending upon the user or the user's intended use of the document. In a first encounter with a document, a user may simply wish to review the document to determine its relevance to the research she is conducting. However, upon later review, the same user may wish to see underlying data, and perhaps even citations in the form of links to related materials (“live” footnotes or citations in an on-line version of the document that can re-direct the user to the content of a related document that was cited by the author).
An aspect of the ease of use of a document is its searchability. Searchability can be defined as the degree to which the document structurally supports the finding of a desired content element. A document with high searchability provides aids that help in finding desired content. In general, a document with a high searchability measure is easier to use because it is easy to locate the portion of the document containing the information of interest.
Another aspect of a document's ease of use is the document's degree of distinguishability. The distinguishability of content can be defined as the ability to identify one particular content element from another content element within the document. Distinguishability is important in establishing the context for the information disclosed by the element. It can reduce confusion about what that element is and to what group or setting it belongs. It can also aid in locating a desired element. The distinguishability of the document elements is therefore a contributing factor to the ease of use of the document.
Another property that would be desirable to be able to quantitatively measure is the ability of the document to hold the viewer's attention and interest. While much of the document's ease of use depends upon the actual content and its relevance to the viewer, there can also be a contribution from the style with which that content is presented. If a measure of the effect of style decisions on ease of use could be defined it could be used in determining a measure of optimal presentation.
Documents can present content in ways that make it easier to locate individual items. This can be referred to as ‘locatability’. A way to distinguish one content object from another object is to evaluate the target object's locatability, i.e., how easy it is to find an object within the document. This differs slightly from distinguishability, which tells how well an item can be differentiated from its neighbors. Structural aids such as the layout of tables or bullet lists help the document viewer to locate objects. Presenting content in a structure such as a table allows its location to be identified by row or column. The presence of headings for the rows and columns can further increase the ease of locating items. Presenting content items in a structure such as a list introduces an ordering that aids in locating them, and the use of list bullets or item numbers aids further. Separability and distinguishability contribute to the locatability of an object.
Measures for various aspects of content, features, and the like could be weighted by intent, relevance, and other parameters that may reflect a weighting based upon the medium used to render or represent a version of the document, and these could then be combined to obtain one or more overall measures for the document itself. If one had a method for evaluating such properties inherent in the document itself then such a measure could be used during the document development process to help determine optimal presentation.
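By way of illustration only, the weighted combination of measures described above might be sketched as follows. The use of a weighted average, the assumption that each measure is normalized to the range 0 to 1, the function name combined_measure, and all scores and weights shown are hypothetical choices made for this sketch; the description above does not prescribe a particular combining function.

```python
from typing import Dict


def combined_measure(measures: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-property measures (each assumed to lie in [0, 1]).

    A measure with no entry in `weights` is given weight 0 and is ignored.
    """
    total_weight = sum(weights.get(name, 0.0) for name in measures)
    if total_weight == 0:
        raise ValueError("at least one measure needs a non-zero weight")
    weighted_sum = sum(value * weights.get(name, 0.0) for name, value in measures.items())
    return weighted_sum / total_weight


# Hypothetical per-property scores for one rendered version of a document.
measures = {"searchability": 0.8, "distinguishability": 0.6, "locatability": 0.4}

# Hypothetical weightings reflecting the rendering medium.
small_screen_weights = {"searchability": 2.0, "distinguishability": 1.0, "locatability": 1.0}
print_weights = {"searchability": 1.0, "distinguishability": 1.0, "locatability": 2.0}

small_screen_score = combined_measure(measures, small_screen_weights)
print_score = combined_measure(measures, print_weights)
```

Under these assumed weights the same set of measures yields a different overall score for each medium, which is the sense in which the weighting may reflect the medium used to render a version.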
Therefore, it is desirable to provide a methodology to measure the quality of a document in a quantifiable way. Moreover, it is desirable to provide a quantifiable measurement of quality which is useable in evaluating the document and improving its quality so as to add value to the information being conveyed through the document.
What is disclosed is a method of document processing to analyze content and to automatically generate, for rendering, content alternatives embedded within a single, multi-versioned document. Such a document would include document structure such as content selection nodes that allow the various content alternatives to be collected within a single document rather than being captured in different document versions. The alternative content would be distinguished by tags (e.g., XML) or similar identifiers. The purpose of content alternatives is to enable, for example, different content to be displayed in different situations, e.g. depending on the display device, the user's requirements or preferences, etc. One important aspect is the architecture and use of tags to identify content alternatives.
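As a minimal sketch of how tagged content alternatives might be collected within a single document, the following example uses XML parsed with Python's standard xml.etree.ElementTree module. The element names select and alt, the attributes id and version, and the sample text are assumptions introduced for illustration and are not a tag vocabulary defined by this disclosure.

```python
import xml.etree.ElementTree as ET

# One document holding two alternatives under a single selection node.
# Element and attribute names are hypothetical.
DOCUMENT = """\
<document>
  <para>Overview of the study.</para>
  <select id="detail">
    <alt version="phone">Short summary.</alt>
    <alt version="print">Full summary with underlying data.</alt>
  </select>
</document>
"""


def list_alternatives(root: ET.Element) -> dict:
    """Map each selection node id to the version labels of its alternatives."""
    return {
        node.get("id"): [alt.get("version") for alt in node.findall("alt")]
        for node in root.iter("select")
    }


root = ET.fromstring(DOCUMENT)
alternatives = list_alternatives(root)
print(alternatives)
```

Because both alternatives live under one selection node, the shared paragraph is stored once rather than being duplicated in separate per-version files.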
A method of automatically choosing from content alternatives is disclosed, and includes methods to carry out content choices when the alternatives are “offspring” of a content selection node hierarchy within a document structure. In addition to the candidate content alternatives, each content selection node contains an indicator of which candidates are to be selected. By altering this indicator, the optimization techniques described herein can then examine different possible content choices to determine which work best. The technique can be used to optimize the presentation or rendering of the document, where content choices are examined along with layout and style choices.
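The idea of altering each selection node's indicator and examining the resulting content choices can be illustrated with the sketch below. The SelectionNode class, the exhaustive enumeration of indicator values, and the character-budget scoring function are hypothetical stand-ins chosen for brevity; an actual optimization would likely use richer quality measures and a more efficient search.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class SelectionNode:
    candidates: List[str]   # candidate content alternatives
    selected: int = 0       # indicator of which candidate is chosen

    def content(self) -> str:
        return self.candidates[self.selected]


def best_choice(
    nodes: Sequence[SelectionNode],
    score: Callable[[List[str]], float],
) -> Tuple[int, ...]:
    """Try every combination of indicator values and keep the best-scoring one."""
    best_indices: Optional[Tuple[int, ...]] = None
    best_score = float("-inf")
    for indices in product(*(range(len(n.candidates)) for n in nodes)):
        for node, i in zip(nodes, indices):
            node.selected = i
        s = score([n.content() for n in nodes])
        if best_indices is None or s > best_score:
            best_indices, best_score = indices, s
    # Leave the nodes set to the winning combination.
    for node, i in zip(nodes, best_indices):
        node.selected = i
    return best_indices


def score_for_budget(contents: List[str], budget: int = 40) -> float:
    """Hypothetical measure: prefer the most text that fits a character budget."""
    length = sum(len(c) for c in contents)
    return float(length) if length <= budget else float(-length)


nodes = [
    SelectionNode(["Short intro.", "A much longer introduction with background."]),
    SelectionNode(["Data omitted.", "Full data table follows."]),
]
chosen = best_choice(nodes, score_for_budget)
```

Exhaustive enumeration grows multiplicatively with the number of nodes, so it serves here only to make the role of the indicator concrete.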
In accordance with the present invention, there is provided a method for creating a multi-version document, comprising the steps of: identifying a first section of said document intended for representation in a first document version; identifying a second section of said document intended for representation in an alternative document version; tagging at least the second section of said document to indicate its association with said alternative version; and storing at least a portion of the multi-version document as a single digital file including the first and second sections.
In accordance with another aspect of the present invention, there is further provided a method for creating a multi-version document, comprising the steps of: identifying a first section of said document intended for representation in a first document version; identifying a second section of said document intended for representation in an alternative document version; tagging at least the second section of said document to indicate its association with said alternative version; storing at least a portion of the multi-version document as a single digital file including the first and second sections; using the multi-version document to render at least two document versions, wherein at least one alternative selection input is determined by a variable of a constraint optimization problem, and the document versions are created in response to alternative values for at least one variable; and analyzing the document versions created to determine which of the at least two document versions is a preferred version in accordance with at least one predetermined criterion.
In accordance with yet another aspect of the present invention, there is provided a method for creating a multi-version document, comprising the steps of: identifying a first section of said document intended for representation in a first document version; identifying a second section of said document intended for representation in an alternative document version; tagging at least the second section of said document to indicate its association with said alternative version; storing at least a portion of the multi-version document as a single digital file including the first and second sections; and further comprising the step of rendering a version of the multi-version document for a user, including locating, within the stored digital document, a selection node, inputting, to the selection node, information sufficient to cause the identification of at least one of the plurality of alternative content nodes associated therewith, and outputting, in a user-determined format, a version of the multi-version document including content from the identified at least one of the plurality of alternative content nodes.
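The rendering step recited above (locating a selection node, supplying it with selection information, and outputting the chosen content in a user-determined format) might be sketched as follows. The XML vocabulary (select, alt, id, version), the render function, and the two output formats are assumptions made for this example, and the sketch handles only selection nodes that are direct children of the document root.

```python
import xml.etree.ElementTree as ET
from html import escape
from typing import Dict

# Hypothetical stored multi-version document.
SOURCE = """\
<document>
  <para>Overview of the study.</para>
  <select id="detail">
    <alt version="summary">Key findings only.</alt>
    <alt version="full">Key findings with underlying data.</alt>
  </select>
</document>
"""


def render(xml_text: str, choices: Dict[str, str], fmt: str = "text") -> str:
    """Render one version: `choices` maps a selection node id to a version label."""
    root = ET.fromstring(xml_text)
    parts = []
    for child in root:
        if child.tag == "select":
            wanted = choices[child.get("id")]
            matches = [alt for alt in child.findall("alt") if alt.get("version") == wanted]
            if not matches:
                raise KeyError(f"no alternative {wanted!r} under node {child.get('id')!r}")
            parts.append(matches[0].text or "")
        else:
            parts.append(child.text or "")
    if fmt == "html":
        return "\n".join(f"<p>{escape(p)}</p>" for p in parts)
    return "\n".join(parts)


full_text = render(SOURCE, {"detail": "full"})
summary_html = render(SOURCE, {"detail": "summary"}, fmt="html")
```

The same stored document thus yields either version, in either format, depending only on the selection input and the requested output format.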
One aspect of the invention deals with a basic problem in the creation and storage of documents having more than one intended version, particularly for documents to be used for different purposes and rendered using different media. This aspect is further based on the discovery of a technique that alleviates this problem. The technique employs a tagging structure, incorporating selection and content nodes within a larger document so as to represent a plurality of possible or alternative versions within a document. It is believed that such a structure, and a method for use of the structure, is more convenient and will lead to lower cost manipulation (e.g., storage, retrieval, and distribution) of multiple versions of a document as a single unit.
The techniques described herein are advantageous because they provide an inexpensive means for storing and controlling different versions of a document for use by different users and on different media. When compared to other approaches, the invention makes it unnecessary to store and maintain multiple disparate documents in order to provide different versions of the document. The techniques of the invention are advantageous because they are believed to provide a range of alternatives, each of which is useful in appropriate situations, while permitting the efficient storage and management of the alternative versions of the document.
The present invention will be described in connection with a preferred embodiment; however, it will be understood that there is no intent to limit the invention to the embodiment(s) described. On the contrary, the intent is to cover all alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims.