1. Field of the Invention
This invention relates to the field of task automation. More specifically, the invention comprises a system allowing a user to teach a computational device repetitive tasks using natural language communication and task demonstrations.
2. Description of the Related Art
Computers have previously been used to automate many complex tasks. One of the simplest forms of task automation is the creation of a “macro.” A “macro” performs a series of keystrokes previously defined by a user. The “macro” is often triggered by a single keystroke or command phrase. Once triggered, the macro is then capable of carrying out a complex series of tasks.
Many software applications provide macro capabilities. However, those using macros readily understand their limitations. A macro contains no innate knowledge of the process it is performing. Rather, it is simply a rote issuance of a series of previously-demonstrated user actions. Thus, a macro created in one software application is of no use when running a different application. Even within the same application, a macro may not remain viable when the application is revised. A simple change in menu location or terminology will render the macro useless. Thus, while macros serve to illustrate the possibilities of automation, their utility is limited.
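The fragility described above can be illustrated with a minimal sketch. It assumes a hypothetical application whose menus are modeled as simple lists, and a macro recorded as menu positions rather than meanings. The names `replay`, `menus_v1`, and `menus_v2` are invented for illustration and do not correspond to any actual product.

```python
# Hypothetical illustration: a "macro" stored as menu positions, not meanings.

def replay(macro, menus):
    """Replay recorded (menu_index, item_index) steps against a menu layout."""
    return [menus[menu][item] for menu, item in macro]

# Menu layout of version 1 of an imaginary application.
menus_v1 = [["New", "Open", "Save"], ["Cut", "Copy", "Paste"]]

# The user recorded "Save" followed by "Copy" while using version 1.
macro = [(0, 2), (1, 1)]

# Version 2 inserts "Open Recent" into the first menu, shifting later items.
menus_v2 = [["New", "Open", "Open Recent", "Save"], ["Cut", "Copy", "Paste"]]

actions_v1 = replay(macro, menus_v1)  # the actions the user intended
actions_v2 = replay(macro, menus_v2)  # the first step now selects the wrong item

print(actions_v1)
print(actions_v2)
```

Because the recorded macro holds only positions, the revised layout causes it to select “Open Recent” where “Save” was intended, without any indication that an error has occurred.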
The present invention can be configured to apply to virtually any computational device employing a graphical user interface (“GUI”), including computers, cell phones, video recorders, iPods, ATMs, etc. Personal computers running operating systems such as Microsoft Windows will likely be its most common application. The techniques can apply to many hard-coded applications, such as spreadsheets and word processors. However, the invention will likely be most useful within web browsers. In order for the reader to appreciate the scope of the invention's application, some detailed background regarding computer operations over the Internet may be helpful.
The computing environment has undergone a substantial transformation in the past two decades. Computers were historically isolated from each other, with data communications only occurring at specified times and in a limited fashion. The advent of the worldwide computer network known as the Internet has irrevocably altered this paradigm. Most computers are now in constant communication with hundreds of other computers. A common format which allows data exchange across a virtually limitless number of platforms and operating systems was needed. HyperText Markup Language (“HTML”), transmitted using the HyperText Transfer Protocol (“HTTP”), has largely filled this need. As those skilled in the art will know, HTML allows communication using only a set of ASCII characters. A hosting computer, generally known as a “server,” transmits a series of instructions to a client computer that has connected to the server.
The instructions are typically a series of ASCII characters in the HTML format. The client computer runs an HTML-decoding application known as a “web browser.” The web browser takes the HTML code (which may be thought of as a series of instructions) and uses it to create the display of the web page on the client computer's monitor. The instructions can be used to create blocks of text, place photographs, etc.
Of course, the instructions can also be used to create interactivity. The person viewing the web page on a browser may be allowed to make certain selections, enter text data, and even upload additional data such as photographs or videos. Once the client computer is asked to submit the interactive responses, another ASCII transmission is sent from the client computer back to the hosting server.
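The return transmission can be sketched using the Python standard library. The field names `keywords` and `category` are assumptions chosen for illustration. The encoding shown is the common URL-encoded form format, which is only one of several ways a browser may package interactive responses.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical values a user typed or selected on a web form.
form_values = {"keywords": "garden hose", "category": "tools"}

# Encode the responses as a browser would for a simple URL-encoded form.
body = urlencode(form_values)
print(body)

# The encoded body consists solely of ASCII characters.
print(body.isascii())

# The receiving server can recover the original values from the ASCII text.
decoded = parse_qs(body)
print(decoded)
```

The encoded text is plain and readable, which is what allows the exchange between client and server to be observed and analyzed.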
Those skilled in the art will know that the formats used have now evolved beyond the original HTML. Extensible Markup Language (“XML”) was created to add more structure to the existing, somewhat freeform, world of HTML. Features of XML were eventually combined with existing HTML code to create XHTML. Cascading Style Sheets (“CSS”) are often used to control the presentation and layout of a page separately from its content. In addition, “scripts” are used to carry out a variety of functions, with JavaScript being the de facto standard for this purpose.
Because all these components must interact with a variety of platforms over the Internet, the code used tends to be open. A user viewing a website can typically open the code used to create the display of the site on his or her computer (though the CSS and JavaScript components may only be partially visible). Thus, whereas the source code of most applications running on a computer is difficult to open (and even more difficult to analyze), Internet transfer source code is easy to open (and readily understood by those skilled in the art). The open nature of HTML and comparable code allows a user to “see” many of the functions a particular website presents.
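As a minimal sketch of how the open code exposes a website's functions, the following uses the Python standard library's HTML parser to list the input fields a page presents. The sample page and the class name `FormFieldLister` are hypothetical and are offered only as an illustration.

```python
from html.parser import HTMLParser

class FormFieldLister(HTMLParser):
    """Collect (tag, name, type) for each form control found in a page."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "select", "textarea"):
            attributes = dict(attrs)
            self.fields.append((tag, attributes.get("name"), attributes.get("type")))

# Hypothetical page source, as a user might view it in a browser.
page = """
<form action="/search" method="get">
  <input type="text" name="keywords">
  <input type="submit" value="Go">
</form>
"""

lister = FormFieldLister()
lister.feed(page)
print(lister.fields)
```

Here the page source alone reveals that the site accepts a text entry named `keywords` and offers a submit control.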
This feature is especially useful in dealing with interactive websites. An interactive website is one which solicits input from the client-user. A good example is the website known as AMAZON.COM. This website allows a client-user to request searches of available inventory. The search request data is transmitted from the client-user to the host-server. The host-server, or more commonly another linked system, then conducts the search and reports the results back to the client-user.
The interaction between the client-user and the host-server can be observed and analyzed by studying the HTML (and other) code that is transmitted back and forth between the two systems. A “macro” can be created to interact with the host-server. It would function much like the prior art “macros” designed to run within specific applications. However, the open nature of the code used to transmit information over the Internet allows for functionality far beyond simply recording and repeating a sequence of user inputs. A much more sophisticated level of automation is possible. A system which identifies a user's intent and then “learns” how to carry out that intent is possible. Such a system is robust, in that it can adapt to changes in the host-server's website and potentially even apply the lessons learned in interacting with a first website to make choices regarding how best to interact with a second and different website. The present invention achieves these objectives.
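The difference between rote replay and adaptation can be suggested by a simple sketch. It is not the method of the present invention. It is a simplified heuristic, under stated assumptions, that locates a search field by descriptive hints in the page code rather than by a fixed position. The function `find_search_field`, the hint words, and both sample pages are hypothetical.

```python
from html.parser import HTMLParser

class InputCollector(HTMLParser):
    """Gather the attributes of every <input> element in a page."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.inputs.append(dict(attrs))

def find_search_field(html, hints=("search", "keyword", "query")):
    """Return the name of the first visible input whose attributes match a hint."""
    collector = InputCollector()
    collector.feed(html)
    for attrs in collector.inputs:
        # Skip controls that a user would never type a search term into.
        if attrs.get("type", "text") in ("hidden", "submit"):
            continue
        described = " ".join(
            str(attrs.get(key) or "") for key in ("name", "id", "placeholder")
        ).lower()
        if any(hint in described for hint in hints):
            return attrs.get("name")
    return None

# Two hypothetical websites with different layouts and field names.
site_a = ('<form><input type="hidden" name="session">'
          '<input type="text" name="field-keywords"></form>')
site_b = ('<form><input type="text" name="q" placeholder="Search products">'
          '<input type="submit"></form>')

print(find_search_field(site_a))
print(find_search_field(site_b))
```

The same routine finds the appropriate field on both sample pages even though their layouts and field names differ, whereas a macro recorded against the first page would have no basis for operating on the second.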