The Internet, and more specifically the World Wide Web, has hundreds of millions of pages available, presenting information on a wide variety of topics. In addition, many companies, organizations, and individuals have constructed true applications using the World Wide Web as the delivery medium. Consumers can now conduct on-line banking, shop for automobiles, participate in on-line auctions, and carry out many other activities of a transactional nature. World Wide Web based applications such as these all present the user with the screens and data that make up the application through a web browser such as NETSCAPE NAVIGATOR or MICROSOFT INTERNET EXPLORER. Software engineers term these World Wide Web transactional programs ‘Web Applications’, in contrast to software applications that run on older text-based terminals or that use older client-server technologies. Broadly, then, the World Wide Web has produced two classes of content. The first class is read-only content, in which the user is directed to a specific web page and can then read the content of that page. The second class is Web applications, wherein the user enters data in an interactive manner to perform some task, such as shopping for a product or performing a banking transaction.
All World Wide Web applications and content present information and retrieve user input using, broadly, two technologies. The first is the network protocol, which is hidden from the user. This protocol is widely known as the Hypertext Transfer Protocol, or simply ‘HTTP’. The second is the format of the information that flows over HTTP and is accepted by a web browser or, conversely, by a web server: the Hypertext Markup Language, or simply ‘HTML’.
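To make the division of labor between the two technologies concrete, the following is a minimal sketch of what travels over the network: a textual HTTP request, and a textual HTTP response whose body is HTML. The host name `www.example.com`, the path, the sample response, and the helper names `build_get_request` and `split_response` are assumptions made for illustration only; nothing is sent over a network.

```python
def build_get_request(host, path):
    """Return the text of a minimal HTTP/1.0 GET request."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"


def split_response(raw):
    """Split a raw HTTP response into (status line, headers, body)."""
    # A blank line separates the protocol header section from the HTML body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# A made-up response of the kind a web server might return (illustrative only).
SAMPLE_RESPONSE = (
    "HTTP/1.0 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body><h1>Welcome</h1></body></html>"
)

request_text = build_get_request("www.example.com", "/index.html")
status, headers, body = split_response(SAMPLE_RESPONSE)
```

Here the status line and headers belong to HTTP, while `body` holds the HTML that a web browser would render for the user.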
A very important aspect of web technology is that it functions over a public network. Thus almost anyone can, from a home computer, view web pages that are published all over the world. From a business perspective, the ‘publicness’ of the World Wide Web has removed a major barrier to moving data between companies, and between companies and individuals. Heretofore, if two companies wanted to exchange business information, they would typically construct a private network, agree on a protocol, implement software that supported the agreed-upon protocol, and only then transfer the data. That process usually took months or years of work. Today, using the World Wide Web, a company simply publishes the data to its web server, and any of its authorized partners can view the data using a web browser.
As the web grew in the 1990's, billions of web pages were produced. In addition, thousands of web applications were produced. While on the whole this has been a positive development for technologists, it has created a unique problem with regard to integrating the content and data from these many web pages and applications. Web technology today focuses on enabling human beings to interact with a web browser to view and modify web-based information. However, there is a great need for machine-to-machine communication over the Internet, as contrasted with the human-to-machine communication that constitutes the bulk of all activity on the Internet today.
Machine-to-machine communication over the Internet using HTTP and HTML is problematic. The fundamental reason is that HTML is more suitable as a publishing format than as a data exchange format. For example, it is trivial for a human to use a web browser to go to a banking site and check his or her account balance. It is much harder to write a piece of software that will access a bank web site, log in, and query for an account balance, returning the data not in HTML but in a more ‘machine friendly’ format such as the Extensible Markup Language (XML).
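The difficulty can be illustrated with a minimal sketch that pulls an account balance out of an HTML page and re-expresses it as XML. The sample page, the `id="balance"` marker, the output element names, and the names `BalanceExtractor` and `balance_page_to_xml` are all hypothetical; a real banking page would be laid out differently, and the extraction logic would have to be rewritten for each site, which is precisely the burden described above.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical page markup, invented for this sketch.
SAMPLE_PAGE = """
<html><body>
  <h1>Account Summary</h1>
  <p>Current balance: <span id="balance">$1,234.56</span></p>
</body></html>
"""


class BalanceExtractor(HTMLParser):
    """Collect the text inside the element marked id="balance"."""

    def __init__(self):
        super().__init__()
        self._in_balance = False
        self.balance_text = None

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("id") == "balance":
            self._in_balance = True

    def handle_data(self, data):
        if self._in_balance:
            self.balance_text = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_balance = False


def balance_page_to_xml(html_text):
    """Return the balance found in html_text as a small XML document."""
    parser = BalanceExtractor()
    parser.feed(html_text)
    if parser.balance_text is None:
        raise ValueError("no balance element found in page")
    # Strip presentation characters so only the numeric amount remains.
    amount = parser.balance_text.replace("$", "").replace(",", "")
    root = ET.Element("account")
    # The currency attribute is assumed from the "$" sign in the sample page.
    ET.SubElement(root, "balance", currency="USD").text = amount
    return ET.tostring(root, encoding="unicode")


xml_text = balance_page_to_xml(SAMPLE_PAGE)
```

The result separates the data (the amount) from its presentation (the dollar sign, the thousands separator, the surrounding paragraph), which is what makes it easier for another program to consume.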
Programs that access web sites that are normally accessed by human users sitting at a computer using a web browser are variously called ‘software robots’, ‘robots’, ‘software agents’, ‘agents’, ‘programs’, or other names. From the standpoint of the network protocol and the data presentation format, such a program appears to the web site exactly as an end user would. The web site or web application receives and sends information exactly as it would for a real human user, but the information is received, processed, and responded to by a piece of software running on a computer somewhere on the Internet, rather than by a live human user using a web browser. In these applications, the web site or web application does not know that it is communicating with anything other than a web browser, when in fact it is communicating with a piece of custom-written software.
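As a rough illustration of how such a program can look like a browser to the web site, the sketch below prepares a form submission of the kind a browser would produce when a user fills in a login form. The URL, the field names `username` and `password`, the User-Agent string, and the helper name `build_login_request` are assumptions for illustration; the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# An illustrative browser-style identification string (assumed, not prescribed).
BROWSER_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"


def build_login_request(url, username, password):
    """Build the HTTP POST a browser would issue for a simple login form."""
    # Form fields are encoded the same way a browser encodes them.
    form = urlencode({"username": username, "password": password}).encode("ascii")
    return Request(
        url,
        data=form,  # supplying data makes this a POST request
        headers={
            "User-Agent": BROWSER_USER_AGENT,
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


req = build_login_request("http://www.example.com/login", "alice", "secret")
```

Because the method, headers, and encoded body match what a browser would send, the receiving web application has no protocol-level indication that the sender is a program rather than a person.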
Today the majority, if not all, of these robotic applications are hand-written by skilled software engineers. New web sites and applications are being created at a rate much faster than software engineers can write corresponding robotic applications. As a result, web applications today generally allow human users to perform some function immediately, but do not support machine-to-machine communication.
No invention exists today that automatically generates software robots that can manipulate existing web sites and web applications. The net effect of the present invention is a dramatic reduction in the amount of time necessary to implement machine-to-machine communications that utilize existing web sites and web applications.
Moreover, no application or program exists today that uses a system of monitoring and analyzing functions attached to a distinct web browser or other network-based application to produce an extract of network transactions that can be manipulated by software to perform the desired operation automatically.
Further, no system or application exists by which a program can emulate the transactions of a network-based application and mimic those transactions in later accesses to the same network-based application or to other relevant network capabilities.
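The general idea of extracting a set of network transactions and mimicking them later can be pictured with the following minimal sketch. It is an illustration under stated assumptions only, not a description of any particular implementation: the `Transaction`, `Recorder`, and `replay` names, the URLs, and the stand-in `fake_send` function are all hypothetical, and no network traffic is generated.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """One observed network transaction: method, URL, and submitted form data."""
    method: str
    url: str
    form: dict = field(default_factory=dict)


class Recorder:
    """Accumulate transactions as they are observed (hypothetical monitor)."""

    def __init__(self):
        self.log = []

    def observe(self, method, url, form=None):
        self.log.append(Transaction(method, url, dict(form or {})))


def replay(log, send, overrides=None):
    """Re-issue each recorded transaction through send, substituting form values."""
    overrides = overrides or {}
    results = []
    for tx in log:
        # Replace any recorded field that the caller wants to vary on replay.
        form = {key: overrides.get(key, value) for key, value in tx.form.items()}
        results.append(send(tx.method, tx.url, form))
    return results


# Record a short, made-up session.
recorder = Recorder()
recorder.observe("GET", "http://www.example.com/login")
recorder.observe("POST", "http://www.example.com/login",
                 {"username": "alice", "password": "secret"})
recorder.observe("GET", "http://www.example.com/balance")


def fake_send(method, url, form):
    """Stand-in for a real network call; returns a description of the request."""
    return f"{method} {url} {sorted(form.items())}"


# Replay the same session for a different user.
replayed = replay(recorder.log, fake_send, overrides={"username": "bob"})
```

In this sketch the recorded log plays the role of the extract of network transactions, and the `overrides` argument shows one simple way in which software might manipulate that extract before mimicking the session.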