The present invention concerns ontologies in general. It relates more particularly to a computer-implemented method for developing an ontology from a text in natural language.
In the present description, the following terms are employed with the meaning indicated, unless otherwise indicated:
“Ontology”: an ontology is a structured set of organized concepts, for example organized into a graph whose relations can be semantic relations or relations of composition and inheritance (in the object sense). An objective of an ontology is to model a set of knowledge in a given domain.
“OWL” is a Web ontology language designed for applications that must not only present users with information but also process its content. OWL is an XML “dialect” based on an RDF (Resource Description Framework) syntax, RDF designating a graph model for describing metadata and for certain automatic processing of that metadata. OWL provides means for defining structured Web ontologies. Thanks to a supplementary vocabulary and formal semantics, the OWL language offers machines a greater capacity for interpreting Web content than is usually available, for example with XML alone. OWL is made up of three sub-languages offering increasing expressivity: OWL Lite (or OWL), OWL DL and OWL Full. OWL-S (where S stands for “semantic”) is semantics-oriented; it exists as yet only as a proposal and has not been standardized.
“Web service” designates an application accessible on the Internet, via a standard interface, that can interact dynamically with applications or other Web services using communication protocols, for example based on XML, independently of the operating system and the programming languages used. At the level of its interfaces as such, a Web service comprises processing operations that supply results based on input data or “input parameters”. To use a Web service, one of its operations is called and supplied with the expected input data, and the output result is retrieved.
“UML” (Unified Modeling Language): designates a notation (rather than a language) for object-oriented modeling, used to determine and to present the components of an object system during its development, and where applicable to generate its documentation. UML is currently the OMG standard. It results from merging the work of Jim Rumbaugh, Grady Booch and Ivar Jacobson, and has evolved in numerous ways.
“Semantic Web” designates an extension of the World Wide Web used to publish, consult and above all automate the processing of formalized knowledge, which means that documents processed by the semantic Web contain, instead of texts in natural language, formalized information to be processed automatically.
“XML” (Extensible Markup Language): an evolution of the SGML language, which is used in particular by HTML document designers to define their own tags, with the aim of personalizing the data structure.
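By way of illustration only, the “Ontology” definition above (concepts organized into a graph whose relations are semantic relations or relations of composition and inheritance) can be sketched as a small labeled directed graph. This is a minimal sketch and not the method of the invention; the class name `Ontology`, the relation labels `is_a`, `has_part` and `located_in`, and the concepts `Hotel`, `Accommodation`, `Service`, `Room` and `City` are all hypothetical and chosen purely for the example.

```python
from collections import defaultdict


class Ontology:
    """Toy ontology: concepts are nodes, labeled relations are directed edges."""

    def __init__(self):
        # Maps each concept to a list of (relation label, target concept) pairs.
        self.edges = defaultdict(list)

    def add_relation(self, source, label, target):
        """Record a labeled relation from one concept to another."""
        self.edges[source].append((label, target))
        self.edges[target]  # registers the target as a concept, even without outgoing edges

    def ancestors(self, concept):
        """Follow 'is_a' (inheritance) relations transitively from a concept."""
        found = []
        stack = [concept]
        while stack:
            current = stack.pop()
            for label, target in self.edges[current]:
                if label == "is_a" and target not in found:
                    found.append(target)
                    stack.append(target)
        return found


# Hypothetical concepts, for illustration only.
onto = Ontology()
onto.add_relation("Hotel", "is_a", "Accommodation")      # inheritance
onto.add_relation("Accommodation", "is_a", "Service")    # inheritance
onto.add_relation("Hotel", "has_part", "Room")           # composition
onto.add_relation("Hotel", "located_in", "City")         # semantic relation

print(onto.ancestors("Hotel"))  # ['Accommodation', 'Service']
```

In practice such a graph would typically be serialized in a language such as OWL rather than held in an ad hoc structure; the sketch only shows the kind of information (concepts and typed relations) that a semantic description carries.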
Modern telecommunication technologies, in particular the Internet, enable users to access a variety of services quickly. In this field, the semantic Web is expanding rapidly, especially as regards applications that use a semantic approach to develop new services from existing services. In this regard, more and more Web services are provided with an ontology or, more generally, a semantic description.
In this context, the present inventor has set himself the objective of finding a solution to the following problem: automatically producing a semantic description (for example via a semantic graph or, in other words, an ontology) of a text in natural language. That text could, for example, correspond to a user enquiry written in natural language. Having a semantic description of such an enquiry would facilitate the search for a Web service corresponding to that enquiry, for example.
At present there is no automatic solution to this problem. A manual solution is known, which consists in “manually” establishing semantic descriptions using a semantic tool such as Protégé or MindManager, or even a UML modeling tool such as Rational Rose, Softeam Objecteering, IBM-Rational XDE or Microsoft UML Visio. In fact, by virtue of its specific construction, UML can cover all the conceptual elements required for a semantic description (inheritance, relations of aggregation or association, attributes, stereotypes, elementary data and labeled values, constraints, etc.).
However, such a solution is not satisfactory, mainly because it is entirely manual. It is consequently time-consuming to implement and prone to errors. Moreover, this solution is subjective: the result depends on the user, which leads to a lack of uniformity in the descriptions obtained.
There is therefore a need for a solution for automatically (i.e. by computer) producing a semantic description of a text in natural language from text data corresponding to that text.