The present invention relates to a method for controlling a series of processes with a human centered interface. More precisely, the present invention relates to integrating a plurality of processes into a common user interface which is controlled by voice activated commands. The method further includes a common framework which allows hands-free control of each process within the framework. A multitude of processes can be easily integrated into the common framework. All processes which are controlled in the common framework can be executed in a multitasking environment.
Recent advances in computer technology have prompted an expansion in the use of personal computers for both business and home use. The widespread use of personal computers has led to a migration away from centralized computing on mainframes to distributed computing on personal computers. Business applications often share common databases and system utilities across an interoffice network. With the growth in the use of the internet, distributed computing models have become increasingly important. By distributing the resources necessary to accomplish a given task, the amount of data required to be transferred across a network can be reduced.
The desire to distribute processing and databases has produced an industry of object-based programming architectures and languages. The proliferation of programming architectures and languages such as Java, ActiveX, C++, COM, OpenDoc and CORBA is a testament to this increased interest in distributed computing. Many prior art software designs have been implemented on personal computers based on these object-oriented programming models.
The Common Object Request Broker Architecture (CORBA) provides an object-based programming architecture which operates under a client/server topology. In a CORBA-based application program, every task is handled as an object which is a self-contained program. An Object Request Broker (ORB) serves as a mechanism for communicating client requests to target objects. Client requests appear as local procedure calls. When a client invokes an operation, the ORB finds the object, sends a request to the object and, once the object completes the request, returns any responses to the client. Each object operates independently of the others within the system.
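The request flow described above (find the object, send the request, return the response) can be sketched in a few lines of Python. This is an illustration only and not the CORBA API; the names ToyORB, register, invoke and AddressBook, as well as the sample address, are hypothetical.

```python
class ToyORB:
    """Hypothetical stand-in for an Object Request Broker (not the CORBA API)."""

    def __init__(self):
        self._objects = {}  # maps an object name to its target object

    def register(self, name, obj):
        """Make a target object reachable through the broker."""
        self._objects[name] = obj

    def invoke(self, name, operation, *args):
        """Looks like a local call to the client; the broker does the routing."""
        target = self._objects[name]         # the broker finds the object
        method = getattr(target, operation)  # resolves the requested operation
        return method(*args)                 # sends the request, returns the response


class AddressBook:
    """A self-contained target object."""

    def __init__(self):
        self._entries = {}

    def add(self, person, address):
        self._entries[person] = address
        return len(self._entries)

    def lookup(self, person):
        return self._entries[person]


orb = ToyORB()
orb.register("address_book", AddressBook())
count = orb.invoke("address_book", "add", "Bob", "12 Elm Street")
address = orb.invoke("address_book", "lookup", "Bob")
```

In this sketch the client never holds a reference to the AddressBook object; every request passes through the broker, which mirrors the indirection described above.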
In each object-based programming model it is common for each executing object to "pop-up" a "window" when any type of input or output (I/O) access is required by the user. When an object is executing a request, focus (an active attention within its window) is granted to the object. Object-oriented systems running on personal computers are generally limited to a single active focus on a single object (within its window) at any given time.
Object based programming architectures like CORBA provide very complex standards with which to work. A programmer must adhere to very stringent programming requirements in order to follow the CORBA standard. In order to allow multiple objects to be used together, CORBA uses a scripting language which queues objects in a sequence. A CORBA architecture does not permit parameter passing directly between objects and requires all parameters to pass through the common request broker.
Current computer technology allows application programs to execute their procedures within individual process-oriented graphical user interfaces (i.e. a "window"). Each process is encapsulated in such a manner that all services required by the process are generally contained within the encapsulated process. Thus each object is an entity unto itself. Each process generally contains all of its own I/O within its own operating window. When a process requires I/O, such as a keyboard input, mouse input or the like, the operating system passes the input data to the application or object. It is conventionally known that a process window (a parent window) spawns a child window when the application calls for specific data entry (I/O). This presents certain problems in that the child window does not release focus until it is terminated. When a keyboard and mouse are used as the primary interface, keyboard and mouse control remains focused in the child window for as long as the child window is active. The viewing area becomes cluttered with child windows, and it is difficult to read and parse all the information on the computer screen.
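The focus behavior described above can be modeled as a stack in which only the topmost window receives input. The following Python sketch is a simplified illustration under that assumption; FocusManager and its methods are hypothetical names and do not correspond to any particular windowing system.

```python
class FocusManager:
    """Hypothetical model of parent/child window focus as a stack."""

    def __init__(self, parent):
        self._stack = [parent]  # the parent window starts with focus

    def spawn_child(self, name):
        """A child window takes focus as soon as it is spawned."""
        self._stack.append(name)

    def close_child(self):
        """Focus returns to the previous window only when the child terminates."""
        if len(self._stack) > 1:
            self._stack.pop()

    def route_input(self, event):
        """All keyboard and mouse input goes to the window holding focus."""
        return (self._stack[-1], event)


focus = FocusManager("main window")
focus.spawn_child("data entry dialog")
while_child_open = focus.route_input("key press")
focus.close_child()
after_child_closed = focus.route_input("key press")
```

While the dialog is open, the parent window receives no input at all, which is the limitation the passage describes.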
Current voice driven software technology is useful for little more than a dictation system which types what is spoken on a display screen. Although many programs have attempted to initiate command sequences, this involves an extensive training session to teach the computer how to handle specific words. Since those words are not maintained in a context based model that is intelligent, it is easy to confuse such voice command systems. In addition, the systems are limited in capability to the few applications that support the voice interface.
One program, which was designed by the present inventor, allows for voice-activated commands to control a user interface. This program (sold under the name VOICE PILOT™) contains a voice interface which allows for voice-initiated execution of programs as well as recording dictation. However, the overall architecture of this program requires the use of child/parent windows as previously discussed. Every voice-initiated application maintains its own operating window as a "child window" of the parent process. The child window has to be satisfied before it releases control (active focus) and returns I/O access to the main program.
The child/parent window configuration does not allow for complex command processing. A complex command requires that more than one process be performed in a specific order based on a single spoken command phrase. For example, the spoken command phrase "add Bob to address book" is a multiple-step/multiple-process command. The individual commands required by the prior art are: "open address book", "new entry" and "name Bob". In the prior art, each operation must be completed one by one in sequential order. Although this methodology works at a minimally satisfactory level, it does not use natural language speech. The prior art is not capable of performing multiple-step operations with a single spoken command phrase.
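The relationship between the single phrase and the three prior art commands in this example can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than a description of any actual system: the function name expand_add_command and the fixed phrase pattern are hypothetical, and the pattern handles single-word names only.

```python
import re


def expand_add_command(phrase):
    """Expand 'add <name> to address book' into the ordered simple commands
    that the prior art requires to be spoken one at a time.

    Returns None when the phrase does not match the pattern.
    """
    match = re.fullmatch(r"add (\w+) to address book", phrase.strip(), re.IGNORECASE)
    if match is None:
        return None
    name = match.group(1)
    # The order matters: the address book must be open before an entry is created.
    return ["open address book", "new entry", f"name {name}"]


steps = expand_add_command("add Bob to address book")
```

Here steps holds the three commands in the order listed above; in the prior art the user, not the system, performs this expansion.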
In addition, the prior art does not provide that a single spoken command phrase causes multiple processes to be executed at the same time. For example, the spoken command phrase "Write a letter to Bob" requires multiple processes to be executed in order to effectuate the command. The prior art would have to do the following: "open address book", "select Bob", "copy address", "open editor", "new letter" and "paste address". The address book and text editor/word processor are generally different applications. Since these programs require the data to be organized in a specific order, the voice commands must be performed in a specific order to achieve the desired result. The prior art is not capable of performing operations simultaneously across multiple applications with a single spoken command phrase.
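The ordering dependency between the two applications in this example can be sketched in Python. The sketch only illustrates why the six steps must occur in sequence; the classes AddressBook and Editor, their methods and the sample address are hypothetical, and a plain variable stands in for the clipboard.

```python
class AddressBook:
    """Hypothetical address book application."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self._selected = None

    def select(self, name):
        self._selected = name

    def copy_address(self):
        if self._selected is None:
            raise RuntimeError("no entry selected")
        return self._entries[self._selected]


class Editor:
    """Hypothetical text editor application."""

    def __init__(self):
        self.document = None

    def new_letter(self):
        self.document = []

    def paste(self, text):
        if self.document is None:
            raise RuntimeError("no letter open")
        self.document.append(text)


book = AddressBook({"Bob": "12 Elm Street"})  # "open address book"
book.select("Bob")                            # "select Bob"
clipboard = book.copy_address()               # "copy address"
editor = Editor()                             # "open editor"
editor.new_letter()                           # "new letter"
editor.paste(clipboard)                       # "paste address"
```

Skipping or reordering a step fails: copying before selecting, or pasting before a letter is open, raises an error in this sketch, just as the corresponding voice commands would fail when spoken out of order.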
Current computer technologies are not well suited for use with a voice driven interface. The use of parent and child windows creates a multitude of problems since natural language modeling is best handled with complex command processing. Since child windows receive active focus as a single window, they tend to sequentially process simple (single process) voice commands.
The current invention seeks to overcome these limitations by providing a uniform speech-aware interface that is optimized for a hands-free, voice-driven environment. This is especially useful for contact management, for business professionals and for anyone looking to eliminate the time-wasting procedure of pushing and shoving windows around a video screen to find the useful data buried therein. The limitations of the prior art are overcome by utilizing a voice interface, an innovative natural language processor, a unique graphical user interface which supports true multi-tasking, and I/O access which eliminates the use of "child" windows.