Electronic documents may be created using a variety of techniques. Thus, it may be desirable to store data from an electronic file in a format that is independent of the process used to create it, so that the electronic document is accessible to a range of users. One format that allows such access is the portable document format (“pdf”). The pdf format represents documents in a manner independent of the application software, hardware, and operating system used to create them, and independent of the output device on which they are displayed or printed.
A pdf workflow assumes a one-way production process in which the pdf document contains a rendition of document elements that are laid out for final presentation, i.e., logical structural information is typically not preserved for the document elements. Consequently, one problem with storing documents in a pdf format is that it is difficult to reuse parts of documents, because elements with semantic affinity are not stored as one logical group. It is therefore difficult to select the related elements of a pdf document that the user wishes to reuse.
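The lack of logical grouping can be illustrated with a minimal sketch. The model below is a deliberate simplification and not the actual pdf object model: the `PageElement` class, the coordinates, and the `elements_in_region` helper are all hypothetical names introduced for illustration. It assumes a page is held as a flat list of positioned elements, so the only way to pick out a chart is by geometry.

```python
from dataclasses import dataclass


@dataclass
class PageElement:
    """Hypothetical, simplified stand-in for one laid-out page element."""
    kind: str      # e.g. "text" or "path"
    x: float       # horizontal position on the page
    y: float       # vertical position on the page
    content: str


# A flat page: nothing records which elements belong to the same figure.
page = [
    PageElement("text", 72, 700, "Quarterly results were strong."),
    PageElement("path", 100, 400, "bar 1"),
    PageElement("text", 100, 380, "Q1"),
    PageElement("path", 140, 400, "bar 2"),
    PageElement("text", 72, 300, "Figure 1: Revenue by quarter"),
]


def elements_in_region(elements, x0, y0, x1, y1):
    """Return the elements whose anchor point lies inside a rectangle."""
    return [e for e in elements if x0 <= e.x <= x1 and y0 <= e.y <= y1]


# A user drags a rectangle around the bars of the chart.
selected = elements_in_region(page, 90, 370, 200, 410)
caption = page[4]
caption_selected = caption in selected
```

In this sketch the rectangle captures the two bars and the axis label but misses the caption, which belongs to the same figure semantically yet lies outside the selected region. With no stored grouping, nothing indicates that the caption should have been included.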
For example, it may be desirable for a user to insert a graph or chart from a pdf document into a document of the user's own creation, or to make a slide presentation with the graph or chart. However, pdf documents generally do not support sharing or repurposing their content, and it is generally difficult to select all the elements of a figure, an illustration, or a paragraph for reuse as an integrated object from a pdf document.
There are a few techniques available for reusing pdf document content. However, the available techniques are complicated and may require extensive user interaction. For example, one such technique may extract a raster rendition of a selected document portion from a display bitmap. This approach loses all the original document structure and attribute information, and the resolution of the extracted portion is usually limited to the 72 dpi screen resolution. Therefore, the selected portion may not be readily assembled into a new document.
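The resolution limit can be made concrete with a short calculation. The sketch below is illustrative only: the 72 dpi figure comes from the text above, while the 4 in by 3 in figure size and the 300 dpi print target are assumptions chosen for the example.

```python
SCREEN_DPI = 72    # screen resolution named in the text
PRINT_DPI = 300    # assumed print target, for comparison only


def pixels(inches: float, dpi: int) -> int:
    """Number of pixels needed to cover a length at a given resolution."""
    return round(inches * dpi)


# Hypothetical figure size on the page, in inches.
width_in, height_in = 4.0, 3.0

# Pixels available when the figure is captured from the display bitmap.
captured = (pixels(width_in, SCREEN_DPI), pixels(height_in, SCREEN_DPI))

# Pixels needed to reproduce the same area at the assumed print resolution.
needed = (pixels(width_in, PRINT_DPI), pixels(height_in, PRINT_DPI))

# Factor by which the capture must be enlarged along each axis.
scale_up = PRINT_DPI / SCREEN_DPI

print(captured, needed, round(scale_up, 2))
```

Under these assumptions the capture holds 288 by 216 pixels, whereas 1200 by 900 pixels would be needed, so the raster would have to be enlarged by a factor of roughly 4.17 along each axis, with a corresponding loss of sharpness.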