The functionality of any information system may be described as a series of information transformations. A simple database query transforms the contents of the database and the query information into a specific result set. Financial applications transform the contents of a database into meaningful financial metrics or graphical illustrations. Such information transformations are often carried out by artificial intelligence systems, which include expert systems, neural networks and case-based reasoning applications.
Another type of artificial intelligence system that is gaining in popularity is natural language processing. Natural language processing is concerned with designing and building software that will analyse, understand and generate the languages that humans naturally use. This is by no means easy: understanding what a language means, for example which concepts a word or phrase stands for, and knowing how to link those concepts together in a meaningful way, is an incredibly difficult task.
A major challenge in any natural language application is to understand and appreciate the context of a word, phrase or sentence. This involves analysing the sentence to determine its context, which is influenced by many factors, including the preceding content, the location of the speaker, and the particular environment of interest. For example, consider the two sentences below:
        James, the door is open
        James, is the door open?
Both sentences are almost identical, but the first sentence, ‘James, the door is open’, is an action and the second sentence, ‘James, is the door open?’, is a request. To interpret them correctly, the natural language application would have to analyse both sentences and determine the context of each. Therefore the context of the sentence must be understood.
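The point that the two example sentences are almost identical can be made concrete with a minimal Python sketch. It is an illustration only, not part of any described system; the helper name `tokens` is hypothetical. It shows that the two sentences contain exactly the same words and differ only in word order, so a representation that ignores order cannot tell them apart.

```python
import re
from collections import Counter

def tokens(sentence):
    """Lower-case a sentence and return its alphabetic words in order."""
    return re.findall(r"[a-z]+", sentence.lower())

first = "James, the door is open"
second = "James, is the door open?"

# Comparing word counts ignores order: both sentences use the same five words.
same_words = Counter(tokens(first)) == Counter(tokens(second))

# Comparing the token sequences keeps order, which is where they differ.
same_order = tokens(first) == tokens(second)
```

Here `same_words` is true while `same_order` is false, which suggests why an application has to look beyond the vocabulary of a sentence to interpret it.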
Therefore, for a natural language application to work successfully in a particular environment, i.e. to produce better results, it needs to be customized to that environment. For example, in a legal environment the difference between the use of the words ‘shall’ and ‘will’ is of great significance.
Another example arises when a natural language application is used in a banking environment, where the employees of the bank use particular terminology to describe the various processes carried out in their day-to-day activities. The natural language application would need to be customized to comprise an understanding of the banking terminology, for example the definition and contextual meaning of terms such as debit, credit and overdraft.
In a business environment, as more and more processes change or more and more different scenarios develop for a given situation, there is a need to customise the natural language application by adding new rules or modifying an existing training set. This is incredibly time consuming: writing just one rule can take days. Using conventional technologies, this creates a burden on the organizations that use these types of applications and has hindered the development and adoption of natural language applications into everyday practice.
Existing techniques for customizing natural language applications include editing domain-specific dictionaries and taxonomies to include the terminology expected in the final application. More advanced techniques include the development and implementation of specific language processing rules using systems such as VISUALTEXT®, from Text Analysis International, Inc. VISUALTEXT provides functionality for information extraction, natural language processing and text analysis for a variety of applications. For example, business events may be extracted from web sites and a searchable database can be built for searching for the latest business information.
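As a rough illustration of the dictionary-editing approach, the following Python sketch overlays a domain-specific dictionary on a general one. All names and dictionary entries here are hypothetical and chosen only for illustration; the sketch does not represent the VISUALTEXT interface or any particular product.

```python
# Hypothetical general-purpose lexicon mapping a term to a sense label.
GENERAL_LEXICON = {
    "credit": "acknowledgement",
    "balance": "physical stability",
}

# Hypothetical banking overlay, edited by hand for the target domain.
BANKING_LEXICON = {
    "credit": "money paid into an account",
    "debit": "money taken out of an account",
    "overdraft": "negative account balance permitted by the bank",
}

def customise(base, domain):
    """Return a new lexicon in which domain senses override or extend the base."""
    merged = dict(base)
    merged.update(domain)
    return merged

def annotate(text, lexicon):
    """Pair each whitespace-separated token with its sense, or None if unknown."""
    return [(token, lexicon.get(token)) for token in text.lower().split()]

banking = customise(GENERAL_LEXICON, BANKING_LEXICON)
annotated = annotate("credit the overdraft", banking)
```

Every new term or changed sense requires a further manual edit to the overlay, which is one way to picture the maintenance burden of this kind of customization.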
Even with tools such as VISUALTEXT, the development of new language processing rules requires in-depth expertise in the field of natural language processing.
Most types of artificial intelligence systems use a variety of mathematical techniques to automatically or semi-automatically discover patterns and develop rules. Whilst the specific form of these rules differs depending on the technique used, the general principle of mapping some form of input pattern to some form of output pattern is common across most techniques. For example, rule induction systems employ rule induction algorithms, i.e. statistical modelling techniques, to perform recursive functions through a tree structure and construct sub-structures and rules as a result. Rule induction systems work better when the feature space in which they operate is fixed and pre-defined. The same is true of a technique called N-grams: N-grams are a separate technique from rule induction systems, but both share the limitation of working best with well-defined and constrained feature spaces. Such techniques are described in the book C4.5: Programs for Machine Learning, by John Ross Quinlan, Morgan Kaufmann, 1993.
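The dependence on a fixed feature space can be illustrated with a minimal Python sketch of word N-grams. It is a simplified illustration under stated assumptions, not the method of the cited book: the function names are hypothetical, N is taken to be 2, and tokenization is a plain whitespace split.

```python
from collections import Counter

def word_ngrams(text, n=2):
    """Return the word n-grams (as tuples) of a lower-cased text."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def build_feature_space(training_texts, n=2):
    """Fix the feature space: one index per n-gram seen in the training texts."""
    grams = sorted({g for text in training_texts for g in word_ngrams(text, n)})
    return {gram: index for index, gram in enumerate(grams)}

def vectorize(text, feature_space, n=2):
    """Count n-grams over the fixed feature space and tally unseen n-grams."""
    vector = [0] * len(feature_space)
    unseen = 0
    for gram, count in Counter(word_ngrams(text, n)).items():
        if gram in feature_space:
            vector[feature_space[gram]] = count
        else:
            unseen += count  # no slot exists for a feature outside the space
    return vector, unseen

space = build_feature_space(["the door is open", "is the door open"])
known_vector, known_unseen = vectorize("the door is open", space)
new_vector, new_unseen = vectorize("the account is overdrawn", space)
```

A sentence drawn from the training vocabulary maps cleanly onto the feature space, whereas the banking sentence produces an all-zero vector with every N-gram counted as unseen. Representing it would require rebuilding the feature space, which is the limitation described above.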
Neural networks are modeled on a cellular structure; however, the methods of specifying the cell structure are limited. For example, in some types of neuro-fuzzy systems it is possible to define a structure based on knowledge of the order of the underlying functions. Similarly, with lattice-based neurons (e.g. CMAC) there are formal strategies or algorithms for knot placement. However, in many areas further work is still needed in understanding how to structure large systems with many thousands of cells. Many researchers have focused on the scale of large systems in the belief that self-organizing rules will enable the emergence of structures. This has not proved to be the case. Garis's work on the emulation of billions of neurons focuses almost entirely on the processing load issues and not on the definition of the cell structure. This is still a problem that needs to be solved.
Finally, most artificial intelligence technologies are applied most successfully in constrained applications where a static feature space is formally defined and sufficient training data exists to fully populate that feature space. They are less successful in areas where the feature space is dynamic and unconstrained, for example when new features are frequently introduced into the natural language system, or where a sub-structure of a feature changes.
Therefore, there is a need in the art to alleviate the above-mentioned problems.