An ontology is a model of the important entities and relationships in a domain. Ontologies have been used to capture the semantics of a system. Much of the focus in developing semantics and ontologies has been on handcrafting them. That may be appropriate for high-level, domain-independent concepts that will be used by virtually all domain-specific ontologies, but there is a need for a less time-consuming approach to generating domain ontologies where extensive knowledge is already captured in unstructured texts.
Machine-processable domain knowledge has been captured in a variety of formats ranging from expert systems to procedural code to UML models. Regardless of the representation paradigm, the knowledge has in general been hand-coded. On the other hand, large quantities of knowledge have been captured in unstructured texts such as conference proceedings, technical papers, books, and more recently web pages and electronic documents. The latter have generally been written by subject matter experts, or domain experts, while the former have traditionally required a tight collaboration between domain experts and knowledge engineers or programmers. It has generally been difficult or impossible to obtain the former from the latter by automated, computer-assisted means.
There are a number of ontology extraction tools mentioned in the literature, including:
ASIUM: a system based on clustering and cooperative methods that is targeted at extracting taxonomic relationships.
ExtrAKT: extracts ontology information from structured databases such as Prolog knowledge bases.
OntoLT: a Protégé plugin for ontology extraction from annotated text.
OntoLearn: learns domain concepts and detects taxonomic relationships among them to produce a domain concept forest.
TextToOnto: part of the KAON tool suite. It extracts terms that can potentially be included in the ontology as concepts, and it also performs rule extraction, either on the basis of the proximity of two terms or by looking for common patterns such as “term like term” to identify hierarchy relationships.
XRA: uses a software reverse engineering approach to extract an initial ontology from given data sources and their application programs.
DOODLE-OWL: a system for on-the-fly ontology construction that relies heavily on user interaction during construction.
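The pattern-based identification of hierarchy relationships described above for TextToOnto can be illustrated with a minimal sketch. This is not TextToOnto's implementation or API. The function name `candidate_hierarchy_pairs`, the regular expression, and the restriction to single-word terms are assumptions made purely for illustration.

```python
import re

# Hypothetical lexical pattern: "<general term> like <specific term>" or
# "<general term> such as <specific term>". Only single-word terms are
# handled; this is an illustrative assumption, not how any tool works.
PATTERN = re.compile(r"\b([A-Za-z]+) (?:like|such as) ([A-Za-z]+)\b")


def candidate_hierarchy_pairs(text):
    """Return (specific, general) term pairs suggested by the pattern."""
    pairs = []
    for match in PATTERN.finditer(text):
        general = match.group(1).lower()
        specific = match.group(2).lower()
        pairs.append((specific, general))
    return pairs


if __name__ == "__main__":
    sample = "Vehicles like trucks need fuel. Fuels such as diesel are common."
    # Each pair reads as "<specific> may be a kind of <general>".
    print(candidate_hierarchy_pairs(sample))
```

A pattern this simple also matches "like" used as a verb ("I like apples"), so its output is best treated as a list of candidates for review, not as confirmed hierarchy relationships.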
In view of the large quantities of information available in unstructured hard copy texts, web pages, and electronic documents, there is a continuing need to be able to extract concepts or relationships from existing, unstructured documents. In view of the existing volumes of unstructured documentation and their rate of increase, domain-independent automated methods and systems are preferable. It would also be preferable if the required human involvement could be minimized.