A document is, roughly, a combination of textual and graphical elements that are rendered together. Two ways of rendering a document are printing it or displaying it on a display device such as a computer monitor. In professional settings, documents often have versions. Different versions of a document can arise from editing it or correcting it over time. Different versions of a document can also arise by changing the content to focus on different audiences. For example, one version of a document can be intended for an English-speaking audience while a different version is intended for a Spanish-speaking audience. Versions intended for different audiences are concurrently valid while versions arising from editing over time are sequentially valid.
Concurrently valid document versions are difficult to track and maintain. Originally, every concurrent version was treated individually. Changing one version often resulted in laboriously making a related change to every other version. For example, if every version contained the same watermark, then changing the watermark entailed changing it for every version.
Layering is a technique that eases the process of making similar changes to many concurrent document versions. Returning to the example above, the different versions can all share a common watermark layer while each version has a unique text layer. The English version has an English text layer while the Spanish version has a Spanish text layer. Rendering the common watermark layer and a unique text layer can produce a version of the document. Some documents have many common layers and unique layers.
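The sharing described above can be sketched in Python. This is a minimal illustration only: the names `Layer` and `render` and the sample content are hypothetical, and rendering is modeled as simply joining layer content.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """A named layer; `content` stands in for whatever would be rendered."""
    name: str
    content: str


def render(layers):
    """Model rendering as joining layer content in order (an assumption)."""
    return " | ".join(layer.content for layer in layers)


# One common watermark layer object is shared by every version.
watermark = Layer("watermark", "DRAFT")
versions = {
    "english": [watermark, Layer("english", "Hello")],
    "spanish": [watermark, Layer("spanish", "Hola")],
}

# Changing the shared layer once changes it for every version.
watermark.content = "FINAL"
rendered = {name: render(layers) for name, layers in versions.items()}
```

Because both versions refer to the same watermark layer, the single assignment replaces what would otherwise be one edit per version.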
FIG. 4, labeled as “prior art”, illustrates a document 401 containing layers. The document 401 is presented as having a first layer 402 and a second layer 405 although many layered documents have many more layers. The first layer has layer content 404 and a layer type 403. The layer content 404 is what will be rendered when the layer is printed or displayed. The layer type 403 is a property of the layer 402. The layer type is usually inferred because most layers do not contain information that specifically indicates the layer type. The second layer 405 also has a layer type 406 and layer content 407. The layer content 407 of the second layer 405, however, contains numerous objects. The objects are two text layers 408, 409 and two graphics objects 410, 411.
Layers such as the first layer 402 have only one type of content, such as text or graphics. A layer containing only text has a “text only” layer type. A layer containing graphics has a “graphics” layer type. A layer, such as the second layer 405, containing both text and graphics has a “text containing” layer type. Layers that are “text only” are also “text containing” but are commonly treated differently because they are often kept in different, often simpler, formats.
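One way the layer type could be inferred from layer content is sketched below. The function name `infer_layer_type` and the representation of a layer as a list of object kinds are assumptions made for illustration. The sketch returns the most specific type, so a layer holding only text is reported as “text only” even though it is also “text containing”.

```python
def infer_layer_type(object_kinds):
    """Infer a layer type from the kinds of objects in a layer.

    `object_kinds` is a hypothetical list of strings, each either
    "text" or "graphics", one entry per object in the layer content.
    """
    kinds = set(object_kinds)
    if not kinds:
        raise ValueError("cannot infer a type for an empty layer")
    if kinds == {"text"}:
        return "text only"
    if "text" in kinds:
        return "text containing"
    return "graphics"


# A layer like the second layer 405: two text objects and two graphics objects.
second_layer_type = infer_layer_type(["text", "text", "graphics", "graphics"])
```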
FIG. 5, labeled as “prior art”, illustrates three layers. The English layer 501 has English language content 504. The Mandarin layer 502 has Mandarin language content 505. The Spanish layer 503 has Spanish language content 506. All three layers contain text and nothing else. As such, all three layers are text only layers having the text only layer type. The Spanish language content 506 and the Mandarin language content 505 are computer-generated translations of the English language content 504. The natural language of a phrase is, simply, the language it is in. English is the natural language of an English phrase. All three layers are illustrated as graphic elements. Typically, the layer type and the language of a text only layer are inferred although type and language can be explicitly marked as layer properties.
Machine translation between languages often requires the user to input a phrase and its natural language. The user then chooses a destination language. A computer then translates the phrase from the natural language to the destination language. The natural language is input because the language identification algorithms run by computers are not 100% accurate in identifying languages. Language identification is difficult for very short phrases and becomes progressively easier as the amount of text increases. Many Internet search engines are capable of identifying a web page's natural language and provide utilities to translate web pages into a desired language.
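The difficulty with short phrases can be seen in a toy identifier. The sketch below is not a production algorithm: it assumes a tiny hand-picked list of common words per language and simply counts matches, and the names `STOPWORDS` and `guess_language` are hypothetical.

```python
# Tiny, illustrative word samples; real identifiers use far richer models.
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in"},
    "spanish": {"el", "la", "y", "es", "de", "en"},
}


def guess_language(text):
    """Return the language with the most word hits, or None if undecided."""
    words = text.lower().split()
    scores = {
        lang: sum(word in stops for word in words)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores.values())
    winners = [lang for lang, score in scores.items() if score == best]
    # No evidence at all, or a tie, means the language cannot be decided.
    if best == 0 or len(winners) > 1:
        return None
    return winners[0]


short_guess = guess_language("taxi")
long_guess = guess_language("la vaca es de la granja")
```

The one-word phrase offers no evidence for either language, while the longer phrase contains several common Spanish words and is identified.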
The alphanumeric characters in text are often encoded using a standardized encoding. ASCII is an early character encoding used mainly for English language text. Other languages, however, use characters and punctuation that the English language does not use. Unicode is an international standard that assigns code points to the characters of most of the world's writing systems and defines encodings for them, such as UTF-8 and UTF-16. Other character encodings also exist.
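The difference between ASCII and a Unicode encoding can be illustrated with Python's built-in `str.encode`. The sample words are chosen only for illustration.

```python
english = "cow"
spanish = "año"

# ASCII can encode the English word, one byte per character.
ascii_bytes = english.encode("ascii")

# ASCII has no code for "ñ", so encoding the Spanish word fails.
try:
    spanish.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8, an encoding defined by the Unicode standard, handles both words.
# The "ñ" occupies two bytes, so three characters become four bytes.
utf8_bytes = spanish.encode("utf-8")
```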
FIG. 6, labeled as “prior art”, illustrates rendering a version of a document 301 having multiple concurrent versions. The document 301 has five layers. The watermark layer 303 is a common layer that is intended to be included in all the document versions. The cow layer 302 contains a graphic of a cow and is also a common layer. The English layer 304 is a text only layer containing English language content. The Spanish layer 305 is a text only layer containing Spanish language content. The Mandarin layer 306 is a text only layer containing Mandarin language content.
A document version specification 601 specifies a version of the document containing English, cow, and watermark. Layering 602 is a task, often performed manually, of taking the document version specification 601, assembling the layers, and passing them to a rendering device 316. The rendering device then produces the English, cow, watermark document version 317.
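The layering task can be sketched as follows. This is a simplified model under stated assumptions: the layer names, the placeholder content strings, and the functions `assemble` and `render` are all hypothetical, and the rendering device is modeled as stacking layer content line by line.

```python
# Layers of document 301, keyed by hypothetical names.
# The values are placeholders standing in for real layer content.
DOCUMENT_LAYERS = {
    "cow": "<cow graphic>",
    "watermark": "<watermark>",
    "english": "English text",
    "spanish": "Spanish text",
    "mandarin": "Mandarin text",
}


def assemble(spec, layers):
    """Collect the layers named in a version specification, in order."""
    missing = [name for name in spec if name not in layers]
    if missing:
        raise KeyError(f"specification names unknown layers: {missing}")
    return [layers[name] for name in spec]


def render(assembled):
    """Stand-in for a rendering device: stack the layer content as lines."""
    return "\n".join(assembled)


# A specification like 601: English, cow, and watermark.
spec = ["watermark", "cow", "english"]
version = render(assemble(spec, DOCUMENT_LAYERS))
```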
The document 301 has two common layers and three text only layers. As such, there are three concurrently valid document versions. Proofing and producing all three versions of the document 301 is fairly straightforward because there are only three concurrently valid versions. In a production environment, however, even this document is complicated enough for production personnel to err. They can forget to include a common layer. They can forget to produce one of the versions. Furthermore, many documents are far more complicated than the illustrated document 301. Systems and methods to address the shortcomings of current solutions are needed.
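The bookkeeping involved, and the two kinds of error mentioned above, can be made concrete with a rough sketch. The layer names and the functions `expected_versions` and `find_problems` are hypothetical and are not drawn from any existing system.

```python
COMMON_LAYERS = ["watermark", "cow"]
TEXT_LAYERS = ["english", "spanish", "mandarin"]


def expected_versions(common, unique):
    """Each concurrent version is every common layer plus one unique layer."""
    return [common + [text] for text in unique]


def find_problems(produced, common, unique):
    """Report missing common layers and text layers with no produced version."""
    problems = []
    for version in produced:
        for layer in common:
            if layer not in version:
                problems.append(f"{version} is missing common layer '{layer}'")
    covered = {layer for version in produced for layer in version}
    for text in unique:
        if text not in covered:
            problems.append(f"no version produced for '{text}'")
    return problems


# A production run that forgot the cow layer once and skipped Mandarin entirely.
produced = [["watermark", "cow", "english"], ["watermark", "spanish"]]
problems = find_problems(produced, COMMON_LAYERS, TEXT_LAYERS)
```

With two common layers and three text only layers, `expected_versions` yields three versions, and the sample run above is flagged for both a forgotten common layer and a forgotten version.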