Products and services are commonly accompanied by online documentation that teaches users how to use them. For example, a software application release commonly includes a detailed online manual that describes how to use the application. These documents may be stored in a content management system, which is a database from which content may be retrieved. One form of content management system is the linear document, in which the content is organized sequentially according to an ordered list of topics (e.g., headings), each of which may be associated with textual content that provides detailed information about the topic. Further, certain elements of a topic's content (such as words, phrases, or graphic symbols) may be hypertext, each denoting an outgoing relation to another topic that is described by the content (e.g., text or images) associated with that other topic.
Because a linear document presents its content sequentially, heading by heading, as a single document, it is updated as a whole. Another form of content management system is the component content management system (CCMS). Compared to the linear document, a CCMS manages content at a much finer level of granularity: the component level rather than the document level. Each component may represent a single topic, concept, or asset (such as an image, table, product, or procedure) that may be updated independently of other components. Each component may further include blocks of content, each containing elements such as words, phrases, and images that may change in an update. Thus, each component may be associated with its own version rather than with a version of the document. Further, each component may include one or more outgoing relationships that link to other components in the document. For example, a topic may include hyperlinks (in the form of hypertext) to other topics that further illustrate it. A CCMS may therefore track not only the versions of individual components but also the relationships among them. In this way, a CCMS document may be generated from components that are easily updated and reused by other CCMS documents.
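The component model described above can be illustrated with a minimal Python sketch. It is not taken from any particular CCMS product; the names `Component`, `Store`, `update`, `assemble`, and `incoming` are hypothetical and chosen only for illustration. The sketch assumes that a component's content is a list of text blocks, that outgoing relationships are stored as the identifiers of the target components, and that a version is a simple counter.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One independently versioned unit: a topic, concept, or asset (hypothetical model)."""
    component_id: str
    blocks: list[str]                             # blocks of content (text, image references, ...)
    links: set[str] = field(default_factory=set)  # outgoing relationships, as target component ids
    version: int = 1                              # per-component version, not a document version

    def update(self, blocks: list[str]) -> None:
        """Replace the content and bump only this component's version."""
        self.blocks = list(blocks)
        self.version += 1


@dataclass
class Store:
    """A toy in-memory stand-in for a CCMS (assumption: no persistence, no history)."""
    components: dict[str, Component] = field(default_factory=dict)

    def add(self, component: Component) -> None:
        self.components[component.component_id] = component

    def assemble(self, ordered_ids: list[str]) -> str:
        """Generate a document by concatenating components in the given order."""
        return "\n\n".join("\n".join(self.components[i].blocks) for i in ordered_ids)

    def incoming(self, target_id: str) -> list[str]:
        """Ids of the components that hold an outgoing relationship to target_id."""
        return sorted(c.component_id for c in self.components.values()
                      if target_id in c.links)


store = Store()
store.add(Component("install", ["Run the installer."]))
store.add(Component("overview", ["See the install topic."], links={"install"}))

# Two documents reuse the same "install" component.
user_guide = store.assemble(["overview", "install"])
quick_start = store.assemble(["install"])

# Updating one component leaves the versions of the others untouched.
store.components["install"].update(["Run the installer as administrator."])
```

Because both documents are assembled from the store on demand, regenerating them after the update would pick up the new `install` content, while `overview` stays at its original version.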
While CCMS documents may present certain advantages over linear documents, a large number of legacy linear documents exist outside the CCMS and need to be imported into it. These legacy linear documents may contain topics that duplicate topics already in the CCMS. Therefore, they need to be imported in a way that avoids duplicating content.
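To make the duplication problem concrete, the following Python sketch shows one simple way an import could avoid storing the same topic twice. The text above does not specify a detection method; this sketch assumes exact matching on a hash of the topic body after normalizing case and whitespace, which is only one possible technique. The names `fingerprint` and `import_topics` and the `component-N` identifier scheme are hypothetical.

```python
import hashlib


def fingerprint(text: str) -> str:
    """SHA-256 hash of the text with case and whitespace normalized."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def import_topics(existing: dict[str, str],
                  legacy_topics: list[tuple[str, str]]) -> dict[str, str]:
    """Import (heading, body) pairs from a linear document.

    `existing` maps a content fingerprint to the id of the component that
    already holds that content; it is extended in place for new content.
    Returns a mapping from each heading to the component id it resolves to.
    """
    resolved = {}
    for heading, body in legacy_topics:
        key = fingerprint(body)
        if key not in existing:
            # New content: create a component id (hypothetical naming scheme).
            existing[key] = f"component-{len(existing) + 1}"
        # Duplicate content reuses the component that is already stored.
        resolved[heading] = existing[key]
    return resolved


existing = {fingerprint("Run the installer."): "component-1"}
legacy = [
    ("Installing", "Run  the INSTALLER."),   # same content, different spacing and case
    ("Removing", "Run the uninstaller."),    # content not yet in the store
]
resolved = import_topics(existing, legacy)
```

In this example the "Installing" topic resolves to the existing component and only "Removing" creates a new one. Exact-match hashing misses topics that are reworded rather than copied, so a real import would likely need a looser similarity measure.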