1. Field of the Invention
This invention relates to a method and apparatus for the input of data into computers and, in some embodiments, to the subsequent retrieval thereof. Particularly but not exclusively, in one embodiment the invention relates to the input of data to, and data retrieval from, a database, and in another to the input of data defining a specification for a computer program.
The problem of providing communication between humans and computers has occupied those in the fields of computing hardware and software since the birth of computing. For decades, the goal has been to provide computers which can communicate with a human "naturally" by understanding free-form speech or text input. However, despite continued progress, this goal has not been reached yet.
Human-computer interaction is used for many things. For example, it is used to input immediate instructions for action by a computer (which is at present mainly provided by the combination of a cursor control device such as a mouse and icons displayed on the screen, or by the use of menus). It is also used to input instructions for subsequent execution (which is mainly achieved at present by forcing human beings to use tightly constructed programming languages or descriptive languages which, despite their superficial resemblance to human languages, bear little relationship to the way human beings actually communicate). Finally, it is used for data storage and retrieval (which is at present typically performed by storage of a textual document, and retrieval by searching for the occurrence of character strings within the document).
Those skilled in the art have approached this problem by the development of artificial intelligence techniques, with the aim either of providing a sufficiently comprehensive set of rules that a machine can eventually understand natural language input, or of providing a "self learning" machine capable of developing the same ability by repeated exposure to natural language.
In one aspect the present invention seeks to address the same technical problem, but from a different direction. In the present invention, an input document (which may be spoken or in text form, or indeed in any other form representative of natural language) is input to an input apparatus (which may be provided by a general purpose computer) and is analysed, to separate the meaningful concepts within the document and record these together with their inter-relationships. The present invention has this in common with most attempted artificial intelligence systems.
For example, EP-A-0118187 discloses a natural language input system which is menu driven, allowing a user to select one word at a time from a menu, which prompts the next possible choices based on what has previously been input.
U.S. Pat. No. 5,677,835 discloses an authoring system in which documents to be translated are input and analysed, and where ambiguities are detected, the user is prompted to resolve them.
In this aspect of the invention, however, these meaningful entities (for example, the concepts described by nouns) are displayed on an output screen, in a graphical form, which represents them as separate icons and meaningfully indicates their interconnection or relationship.
This apparently simple step provides a number of benefits. The first is that it gives immediate feedback to the human inputting the data of the "understanding" gained by the computer. Natural human language is full of ambiguities which human beings are normally able to resolve without conscious thought because of their shared knowledge base, but which remain, at best, ambiguous to a computer and are, at worst, mis-recognised by it.
To take an English example, "Mary was kissed by the lake" is ambiguous, since it can be interpreted either as indicating that the lake is the active party (the kisser) or that the lake is the location at which Mary is kissed by an (unknown) active party.
Whereas a human immediately understands the correct meaning, and may not even see the presence of an ambiguity, a computer is unable to do so unless programmed by a rule or conditioned by experience.
By displaying the understood construction graphically, however, the present invention makes such ambiguities immediately recognisable to the user, so that they can be avoided.
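The two readings of the example sentence can be sketched as follows. This is a minimal illustration only: the `Reading` structure, the role names (agent, patient, location) and the textual rendering that stands in for the graphical display are assumptions made for the sketch, not structures defined in this specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Reading:
    """One candidate interpretation of a clause (hypothetical structure)."""
    action: str
    patient: str
    agent: Optional[str] = None      # who performs the action, if known
    location: Optional[str] = None   # where the action takes place, if known


def readings_for_by_phrase(action: str, patient: str, by_object: str) -> List[Reading]:
    """Return both readings of a passive clause ending in 'by <by_object>'."""
    return [
        Reading(action=action, patient=patient, agent=by_object),
        Reading(action=action, patient=patient, location=by_object),
    ]


def describe(reading: Reading) -> str:
    """Render a reading as labelled links, standing in for the icon display."""
    parts = [f"[{reading.action}] --patient--> [{reading.patient}]"]
    if reading.agent is not None:
        parts.append(f"[{reading.action}] --agent--> [{reading.agent}]")
    if reading.location is not None:
        parts.append(f"[{reading.action}] --location--> [{reading.location}]")
    return "; ".join(parts)


# "Mary was kissed by the lake": both interpretations are shown to the user.
candidates = readings_for_by_phrase("kiss", "Mary", "lake")
for index, reading in enumerate(candidates):
    print(index, describe(reading))

# The user resolves the ambiguity by choosing one of the displayed readings.
chosen = candidates[1]
```

Because each reading names the role played by "lake" explicitly, the difference that a human resolves unconsciously becomes visible and selectable.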
Very preferably, the invention provides a graphical user interface to enable the user to manipulate the graphical display, and means to interpret the results of such manipulation. Indeed, it would in principle be possible to allow the user to directly input the document graphically without previous direct document input (although this is not preferred, for reasons of speed, for most applications).
Thus, the user is able to allow the computer to extract as much meaning as possible from the input document and then to correct the ambiguities or errors graphically.
The invention will be understood to differ from so called "visual programming" systems, as described, for example, in EP-A-0473414. Such visual programming systems provide a graphic environment in which operations to be specified are represented visually, and a user may specify a sequence of such operations by editing the display to create and alter linkages between the elements. However, in visual programming, as in other known methods of creating or specifying programs, the user is constrained to select from a limited number of predefined operations and connections therebetween. By contrast, the present invention accepts documents as input and analyses the documents to provide the graphical display which may subsequently be edited.
The resulting semantic structures, corresponding to the graphical representation (corrected where necessary), are stored for subsequent processing or retrieval. In one embodiment, data retrieval apparatus is provided. In another embodiment, the stored data is employed by a code generator, to generate a computer program.
The invention is advantageously used for this latter application, because the detection of ambiguities eliminates one of the difficulties in existing software specification and automatic code generation from such specifications.
In either case, in the preferred embodiment there is a stored lexical table which stores data relating to the meanings of words which will be encountered in the source document (analogously to an entry in a well-structured dictionary).
Preferably, in this case, the apparatus is arranged to perform "reasoning" utilising this semantic information, by comparing the meanings of groups of words (i.e. clauses or sentences) of the document to locate inconsistencies, or by performing the same operation between multiple different documents.
This is particularly advantageous in embodiments where the source document is to act as a specification for the generation of computer code, because it enables the location of conflicting requirements.
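One way such a comparison might locate conflicting requirements is sketched below. It assumes, purely for illustration, that each clause has already been reduced to a (subject, attribute, value) triple; the triple form and the function `find_conflicts` are hypothetical and are not taken from this specification.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Set, Tuple

Statement = Tuple[str, str, Hashable]  # (subject, attribute, value), assumed form


def find_conflicts(statements: Iterable[Statement]) -> Dict[Tuple[str, str], Set[Hashable]]:
    """Group statements by (subject, attribute) and report any pair of
    subject and attribute that has been given more than one distinct value."""
    seen: Dict[Tuple[str, str], Set[Hashable]] = defaultdict(set)
    for subject, attribute, value in statements:
        seen[(subject, attribute)].add(value)
    return {key: values for key, values in seen.items() if len(values) > 1}


# Statements may come from one document or from several different documents.
specification = [
    ("password", "minimum length", 8),
    ("password", "minimum length", 12),   # conflicts with the line above
    ("session", "timeout minutes", 30),
]
conflicts = find_conflicts(specification)
print(conflicts)
```

Real inconsistencies are rarely this literal, so this sketch shows only the simplest case, in which two clauses assign different values to the same property.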
Since the present invention, in this embodiment, has some "understanding" of the "meaning" of words, it is able to store the content data (for example in the form of semantic structures representing groups of words such as clauses or sentences) by reference to such "dictionary entries", i.e. by reference to their "meaning", rather than the source language word which was input. This makes it possible to use a multilingual embodiment of the present invention, where the lexical entries are mapped onto corresponding words in each of a plurality of languages, so that data may be input in one language and output in one or multiple different languages.
In embodiments for data retrieval, or similar applications, each such lexical entry may have an associated code indicating the "difficulty", "obscurity" or "unfamiliarity" of the concept described. For example, concepts may be labelled as familiar to children upwards; familiar to adults; or familiar only to particular specialists such as physicists, chemists, biologists, or lawyers.
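Storage by reference to meaning rather than to the source word can be sketched as follows. The concept identifiers, the tiny three-language table and the word-for-word rendering are all assumptions made for illustration; a practical generator would also have to handle word order, articles and inflection, which this sketch ignores.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical lexical table: concept identifier -> word in each language.
LEXICON: Dict[str, Dict[str, str]] = {
    "C_DOG": {"en": "dog", "fr": "chien", "es": "perro"},
    "C_BITE": {"en": "bites", "fr": "mord", "es": "muerde"},
    "C_POSTMAN": {"en": "postman", "fr": "facteur", "es": "cartero"},
}


def concept_for(word: str, language: str) -> str:
    """Look up the concept identifier for a word of the given language."""
    for concept_id, forms in LEXICON.items():
        if forms.get(language) == word:
            return concept_id
    raise KeyError(f"no lexical entry for {word!r} in {language!r}")


def store_clause(words: Sequence[str], language: str) -> Tuple[str, ...]:
    """Store a clause as concept identifiers, not as source-language words."""
    return tuple(concept_for(word, language) for word in words)


def render_clause(clause: Sequence[str], language: str) -> str:
    """Render stored concepts word for word in the requested language."""
    return " ".join(LEXICON[concept_id][language] for concept_id in clause)


stored = store_clause(["dog", "bites", "postman"], "en")
print(stored)
print(render_clause(stored, "fr"))
print(render_clause(stored, "es"))
```

Since the stored form contains no English words, the same record can be rendered in any language for which the table holds an entry.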
With knowledge of the level of familiarity of the data retriever, the present invention is in this embodiment able to utilise such ratings to output data appropriate to the understanding of the retriever so as not to output information which is too facile for an advanced user, or too complex for a casual user.
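A minimal sketch of such rating-based selection is given below. The three levels mirror the examples above, but the numeric scale, the rule that a statement is as hard as its hardest concept, and the `tolerance` parameter that excludes overly facile material are assumptions introduced for illustration.

```python
from enum import IntEnum
from typing import Dict, List, Sequence, Tuple


class Familiarity(IntEnum):
    """Hypothetical rating scale for lexical entries."""
    CHILD = 1
    ADULT = 2
    SPECIALIST = 3


# Hypothetical ratings attached to lexical entries.
CONCEPT_LEVEL: Dict[str, Familiarity] = {
    "water": Familiarity.CHILD,
    "molecule": Familiarity.ADULT,
    "hydrogen bond": Familiarity.SPECIALIST,
}


def statement_level(concepts: Sequence[str]) -> Familiarity:
    """Assume a statement is as difficult as its most difficult concept."""
    return max(CONCEPT_LEVEL[concept] for concept in concepts)


def select_for_reader(
    statements: Sequence[Tuple[str, Sequence[str]]],
    reader_level: Familiarity,
    tolerance: int = 1,
) -> List[str]:
    """Keep statements that are neither above the reader's level (too complex)
    nor more than `tolerance` levels below it (too facile)."""
    lowest = reader_level - tolerance
    return [
        text
        for text, concepts in statements
        if lowest <= statement_level(concepts) <= reader_level
    ]


statements = [
    ("Water is wet.", ["water"]),
    ("Water is made of molecules.", ["water", "molecule"]),
    ("Hydrogen bonds hold water molecules together.",
     ["hydrogen bond", "water", "molecule"]),
]
for_child = select_for_reader(statements, Familiarity.CHILD)
for_specialist = select_for_reader(statements, Familiarity.SPECIALIST)
print(for_child)
print(for_specialist)
```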
Different semantic elements may be associated, explicitly or implicitly, with an access level rating. Thus, for example, classified items may be available for access only to properly identified users; "adult only" items may be classified as unavailable to identified children; and proper names may under some circumstances be suppressed (for example, the names of parties to litigation).
By associating a classification with each item, rather than with documents or materials as a whole, a much finer-grained control of information is obtained.
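The difference between per-item and per-document classification can be sketched as follows. The `Element` structure, the integer clearance scale and the use of `None` to mark a withheld element are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Element:
    """A semantic element carrying its own access rating (hypothetical)."""
    text: str
    required_clearance: int = 0   # 0 means available to everyone


def visible_elements(elements: Sequence[Element], user_clearance: int) -> List[Optional[str]]:
    """Return each element's text, or None where the user may not see it."""
    return [
        element.text if user_clearance >= element.required_clearance else None
        for element in elements
    ]


record = [
    Element("claimant: J. Smith", required_clearance=2),  # a suppressed name
    Element("claim type: breach of contract"),
    Element("outcome: settled"),
]
public_view = visible_elements(record, user_clearance=0)
court_view = visible_elements(record, user_clearance=2)
print(public_view)
print(court_view)
```

A document-level rating would hide the whole record from the public user, whereas here only the single restricted element is withheld.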
In data retrieval embodiments, the data retrieval apparatus preferably comprises a natural language generator, for generating a document from semantic structures produced as described above. This has several advantages over the mere supply of corresponding portions of the original document.
Firstly, as described above, it provides the possibility of a multilingual embodiment, since different generators may, from the same semantic structure, generate text in different languages.
Secondly, where access codes are employed as described above, the generator may in preferred embodiments be able to re-generate readable text from a reduced amount of information (for example, by using the passive voice where a name is suppressed, rather than the active voice). This aspect of the invention is also useful separately from the data input methods above.
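The switch from active to passive voice when a name is suppressed can be sketched as below. The `Event` structure and the pre-supplied verb forms are assumptions for illustration; a real natural language generator would derive the verb forms itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A stored semantic structure for one clause (hypothetical)."""
    agent: str
    patient: str
    verb_active: str    # e.g. "sued"
    verb_passive: str   # e.g. "was sued"


def generate(event: Event, agent_visible: bool) -> str:
    """Generate a sentence, using the passive voice when the agent's name
    must be suppressed so that the output remains readable."""
    if agent_visible:
        return f"{event.agent} {event.verb_active} {event.patient}."
    return f"{event.patient} {event.verb_passive}."


event = Event(agent="Smith", patient="Jones",
              verb_active="sued", verb_passive="was sued")
full_text = generate(event, agent_visible=True)
reduced_text = generate(event, agent_visible=False)
print(full_text)
print(reduced_text)
```

Merely deleting the suppressed name from the original text would leave the fragment "sued Jones.", whereas regeneration from the semantic structure yields a complete sentence.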
The input and/or output according to various embodiments of the invention may be in the form of speech, text or animated video, in which case the input and/or output apparatus comprises, as appropriate, speech recognisers and/or synthesisers; text input and output; and image pick up and analysis/video generation apparatus.