Terminology
Terminology related to the invention is explained below.
Navigation Command.
A basic instruction that performs navigation from one document (web page) to another, or prepares data for such navigation. A command may be one of: (i) following a hyperlink embedded in the page; (ii) filling in a form field contained in an online form; (iii) submitting basic authentication credentials; (iv) clicking an HTML element to invoke client-side JavaScript embedded in the web page.
Script.
A recorded or otherwise created sequence of navigation commands.
Starting Document.
The first document (web page) in a sequence of documents (web pages) visited under the control of a script. The starting document is identified by a static web address, such as a URL. For password-protected, session-based documents, this is usually the page where the user ID and password are entered.
Target Document.
The last document (web page) in a sequence of documents (web pages) visited under the control of a script. If a script has more than one command in it, then the target document is dynamically generated and will have a dynamic address.
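The definitions of navigation command, script, starting document, and target document above can be sketched as a small data model. This is only an illustrative sketch: the names CommandKind, NavigationCommand, Script, record, and target_is_dynamic, as well as the example URL and field names, are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class CommandKind(Enum):
    """The four kinds of navigation command listed in the definition."""
    FOLLOW_LINK = "follow_link"          # (i) hyperlink embedded in the page
    FILL_FORM_FIELD = "fill_form_field"  # (ii) form field in an online form
    SUBMIT_BASIC_AUTH = "basic_auth"     # (iii) basic authentication credentials
    CLICK_ELEMENT = "click_element"      # (iv) click that invokes client-side JavaScript


@dataclass
class NavigationCommand:
    kind: CommandKind
    locator: str                         # hypothetical identifier of the control element
    data: Dict[str, str] = field(default_factory=dict)


@dataclass
class Script:
    """A recorded or otherwise created sequence of navigation commands."""
    starting_url: str                    # static address of the starting document
    commands: List[NavigationCommand] = field(default_factory=list)

    def record(self, command: NavigationCommand) -> None:
        self.commands.append(command)

    def target_is_dynamic(self) -> bool:
        # Per the Target Document definition: more than one command means
        # the target document is dynamically generated.
        return len(self.commands) > 1


# Hypothetical example: log in, then open an account page.
script = Script(starting_url="https://bank.example/login")
script.record(NavigationCommand(CommandKind.FILL_FORM_FIELD, "user_id", {"value": "alice"}))
script.record(NavigationCommand(CommandKind.FILL_FORM_FIELD, "password", {"value": "secret"}))
script.record(NavigationCommand(CommandKind.FOLLOW_LINK, "account_details"))
```

Keeping the static starting address separate from the command list reflects the definitions above: only the starting document has a stable address, and every later document is reached by replaying commands.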
Control Element.
A control element represents a command or a group of commands. A control element of an online document may be clicked to cause navigation from one online document to another online document, or from one control element to another. A control element can be, for example: (i) a hyperlink; (ii) a form Submit button; or (iii) a clickable HTML element that causes navigation JavaScript to run.
Logical Tree (Document Tree).
The data structure that describes the structure of an online document or the relationships between control elements. A logical tree consists of nodes connected by edges. If there is an edge leading from node a to node b, then node b represents a part of the online document that is a subset of the part represented by node a, and b covers the largest element that is still a subset of a. The one node that has no edge directed to it is called the root node of the tree. A logical tree can be obtained for each control element in an online document, such as a hyperlink, a form submit button, or a clickable HTML element that causes navigation JavaScript to run; the tree describes the relationship between that control element and other control elements and online documents.
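The containment rule in this definition can be illustrated with a short sketch. It assumes, purely for illustration, that each part of a document is identified by a character range, so that "subset" becomes range containment; the names Node, strictly_contains, and build_logical_tree and the sample ranges are hypothetical. Each node is attached to the smallest part that strictly contains it, which makes every child one of the largest parts inside its parent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    label: str
    span: Tuple[int, int]  # assumed representation: half-open [start, end) range
    children: List["Node"] = field(default_factory=list)


def size(node: Node) -> int:
    return node.span[1] - node.span[0]


def strictly_contains(outer: Node, inner: Node) -> bool:
    return (outer.span[0] <= inner.span[0]
            and inner.span[1] <= outer.span[1]
            and outer.span != inner.span)


def build_logical_tree(nodes: List[Node]) -> Node:
    """Link each node to its smallest strict container and return the root."""
    # Larger parts first, so every possible parent precedes its descendants.
    ordered = sorted(nodes, key=size, reverse=True)
    root = ordered[0]
    for index, node in enumerate(ordered[1:], start=1):
        containers = [p for p in ordered[:index] if strictly_contains(p, node)]
        if not containers:
            raise ValueError(f"{node.label} is not contained in the root part")
        parent = min(containers, key=size)  # smallest container = direct parent
        parent.children.append(node)
    return root


# Hypothetical page: a login form with two controls, plus a separate hyperlink.
parts = [
    Node("document", (0, 100)),
    Node("form", (10, 60)),
    Node("user_field", (12, 20)),
    Node("submit_button", (40, 55)),
    Node("help_link", (70, 80)),
]
tree = build_logical_tree(parts)
```

In this example the root is the whole document, the form and the help link are its direct children, and the two form controls hang under the form.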
Document Object Model (DOM).
A standardized way of representing an online document as a tree. In the DOM, every HTML tag is represented by a tree node, and the top-level HTML tag is represented by the root of the document tree. More details are provided below.
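The tag-to-node mapping described here can be sketched with Python's standard html.parser module. This is a simplified illustration and not a conforming DOM implementation: the TagNode and TreeBuilder classes, the VOID_TAGS set, and the sample page are assumptions made for the example.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; a partial list for this sketch.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TagNode:
    def __init__(self, tag, attrs):
        self.tag = tag
        self.attrs = attrs
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds one tree node per HTML tag; the first tag seen becomes the root."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []  # currently open tags, innermost last

    def handle_starttag(self, tag, attrs):
        node = TagNode(tag, dict(attrs))
        if self.stack:
            self.stack[-1].children.append(node)
        elif self.root is None:
            self.root = node
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the innermost open tag with this name, plus anything nested in it.
        for index in range(len(self.stack) - 1, -1, -1):
            if self.stack[index].tag == tag:
                del self.stack[index:]
                break


# Hypothetical page containing a hyperlink and a login form.
page = (
    '<html><body><a href="/next">Next</a>'
    '<form action="/login"><input name="user"><input type="submit"></form>'
    '</body></html>'
)
builder = TreeBuilder()
builder.feed(page)
builder.close()
dom_root = builder.root
```

Here dom_root corresponds to the top-level html tag, and the hyperlink and form appear as child nodes of body, which is where a script would look for control elements.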
Structured online documents, such as HTML and XML documents, are widely available on the World Wide Web (WWW). While some web pages are linked using static online addresses, such as a URL (Uniform Resource Locator), other pages are linked to a starting page using dynamic addresses. These pages are dynamically generated by a server computer and, as such, every page received by a browser has a unique address and a unique set of links in it. Such documents usually contain data that may be periodically updated, where the update does not substantially change the format in which the data is presented.
These documents are usually dynamically generated by web servers. Data contained in the documents may be introduced from certain data sources, such as online databases. The documents may contain dynamic data, whose content changes from time to time. However, since these documents are automatically generated by computers, the document structure remains substantially the same for a relatively long period.
Oftentimes, documents that contain dynamic data or are security-sensitive are not identified by a static online address (one that does not change with time), such as a Uniform Resource Locator or URL. Instead, a computer user must start from a starting document that has a static address and go through several intermediate documents linked from the starting document before the user can finally retrieve the target document from the server. On some of these intermediate documents, the user must enter authentication data, such as a user ID and password. On other intermediate documents, the user must make choices presented as a variety of links or as an online form that collects data needed to continue navigation.
In many cases, servers deliberately generate web pages with globally unique addresses, and with globally unique links in these pages, in order to increase the security of online transactions and make replay attacks hard or impossible. As a result, every time users want to access the target document, they must navigate through a series of documents and enter numerous commands before they can access the target document.
Examples of online documents as described above include the following. (1) Shipment tracking information provided by couriers: users typically have to start from the starting document that contains a tracking request form, fill out the form, submit the required information, and follow links through several further pages before reaching the page containing the tracking results. (2) A bank account balance for an individual or corporation from its bank web site: users typically are asked to enter a User ID and Password on the starting page and are then brought to the Welcome page. The users are then requested to make an account selection and are brought to the page that contains the selected account details. (3) Stock trading accounts provided by online stock brokerage firms: users typically are asked to log in with a User ID and Password on the starting page. The users are then brought to the Welcome page and then to the Stock Trade page, where they are asked to fill out an online form that describes the desired transaction. The users are then brought to the confirmation page and asked to confirm the transaction.
These browsing procedures are often troublesome. Therefore, there is a need for retrieving target documents by automatic generation of navigation command sequences. There is also a need to automatically regenerate the sequence of online document navigation commands that leads from a starting document that has a static address to a desired target document that is dynamically generated and has a dynamically generated address.
Another need exists for generating navigation command sequences in an unattended mode and in a non-GUI environment. For instance, a user may use a wireless device such as a cell phone to instruct a server to run a bank account extraction or stock trade script (sequence of commands) on the user's behalf. There is a further need to eliminate the re-entry of verification information, such as User IDs and Passwords, every time a user wishes to retrieve a target document. There is also a need to automate the navigation steps and the entry of validation information.