In the early stages of the phenomenal growth of the Internet, fairly simplistic web sites predominated, permitting the web "surfer" to simply and quickly access information displayed on a graphical user interface screen, and then log off. Such early web sites were relatively unsophisticated. However, as the richness and utility of the Internet began to cause its well-chronicled and explosive growth, the potential of the Internet began to be recognized by web site designers, and commercial enterprises. Thus, not only did web sites increase in complexity in terms of such things as hot links and nesting of pages, but additional richer features began to appear, such as the ability to post and get data downloaded to files and the dynamic changing of the web site as a function of user interaction or profiles whereby custom web pages would appear.
As the complexity and need for dependability of applications running on web servers increased (such as with the advent of commercial on-line banking and shopping transactions), the need arose to be able to rigorously test these more complex applications for their integrity and robustness. Turning to FIG. 1, depicted therein is a simplified view of the Internet environment in which such an application resides, including the mechanisms typically utilized prior to the subject invention for effecting such testing and verification of the Internet application. Typically a web server 10 is provided on which one or more web applications such as the banking application 12 are running. The web server 10 is interconnected to end users 16 and a core controller 14, the latter, in turn facilitating access to information, for example, of a financial institution 18 reposing on its respective databases 20. In such a typical Internet application as the illustrated banking application, this system thereby enables huge numbers of end users to quickly, efficiently, and in a secured fashion, transact their financial business with the financial institution over the Internet.
With the advent of these more serious and complex applications of the Internet and with the corresponding increasingly dire consequences which might arise from misbehavior of the various components of FIG. 1 and, in particular, those consequences arising from faults with the application 12 and server 10, a need arose to rigorously test these web server applications 12. One can readily appreciate the disaster awaiting a commercial concern such as an airline ticket reservation company or bank handling large volumes of transactions and dollar amounts on a daily basis, should it be found that various logical program design flaws exist in the application 12. It is precisely for these reasons that a great deal of research and technology development was directed at ways to test these applications for integrity prior to their introduction as a live application into the Internet. The prior manner of doing so is further illustrated in FIG. 1. In effecting such testing, it was conventional to provide for one or more testers 22 who would manually exercise the given application 12, often employing "screen scrapers" so as to capture and store in storage 24, various screen images delivered to the tester 22 from the application 12 as the tester(s) traversed the various links of the application.
Several problems are associated with this approach to testing of web server applications. First, the process was extremely slow due to the associated screen draws and saving of GUI data as well as the manual nature of the process. This manual aspect of the testing further led to failure to fully test the application due to links missed either by mistake or the sheer number of such links involved, failure to exhaust all possibilities of data input/output, inability to retest and compare data, and the like.
Attempts to automate the screen scraping tasks yielded some improvement but did not address even more serious flaws with the approach to validating Internet applications. The data which was being tested and saved often was browser dependent and really resulted in merely testing the browser front end and the images being returned thereby. However, what was really needed was an efficient and thorough way to exercise all of the APIs and code itself of the application running on the web server. In a typical application such as a banking application there may be literally hundreds of APIs and reference pages associated therewith for performing various functions such as login, account summaries, and so forth. This "back-end" of a given web server application therefore has associated with it an immense amount of program logic and data "gets" and "puts" such that it would be extremely desirable to be able to test and verify not only whether images from a particular browser are as expected, but that this data being entered and returned as a result of traversing the web site and its associated links was in fact correct. Thus a system was needed which, in an automated fashion, could test and verify the logic and data and associated myriad permutations and combinations of APIs and reference pages associated with a web server application rather than merely testing and debugging the browser and associated GUIs per se. Such a system was needed which, unlike the prior systems, could avoid saving data such as these GUI images which were not critical to testing this "back-end" logic and information developed by the server application.
As the sophistication of web server applications increased, yet an additional problem surfaced in facilitating the testing of the web applications. The ability to fashion dynamically configuring web server application pages soon developed in the maturation of the web which could build HTML pages on the fly. While it was known in the art to fashion static test cases to test web pages either in a manual or automated fashion, a difficulty arose in essentially providing for dynamic test cases which could test these changing web pages. Even in a static sense, making changes from a prior test (such as adding fields, changing fonts, moving images, resizing screens, etc.) would break test pages, and necessitate the rewriting of such tests. Moreover, when these changes could occur in a dynamic way as is currently the state of the art, this compounded still further the difficulty of testing and verifying web server applications in a meaningful way.
A related problem to this dynamic nature of web pages is that it was highly desirable to perform repeated tests on server applications whereby comparisons could be made to prior data. However, in the case of prior art, the data which was being saved (e.g., screens) was not the data critical to testing of the integrity of the underlying data. Rather, it was associated with the browser per se and thus was not the data and logic of paramount importance in verifying the web application itself. Thus, it was highly desirable to provide a mechanism for verifying data from repeat visits by comparison to previously acquired data.
In summary then, a need existed to be able to efficiently and automatically request, capture, store, and verify data returned from web servers disassociated from the particular browser itself, wherein such data related more importantly to the actual underlying data and logic of the server application. An ability was needed to run in an automated fashion to avoid errors and lack of thoroughness associated with manual traversal of web sites. Further, an ability was needed to discard data not critical to such verification and testing. Moreover, an option was highly desired for saving and reusing this returned data for the testing of subsequent transactions. Still further, it was highly desirable to avoid use of a conventional browser per se, so as to avoid the browser interfering with the tests which were of real interest (due to compatibility issues and the adverse effect on application performance attributable solely to the browser). A web server application testing mechanism was further needed which could accept links automatically from a data file as well as from a GUI edit field, save the returned pages, and further have the ability to verify these pages automatically and tally the results--all without user intervention.
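By way of illustration only, the kind of browser-independent request, verify, and tally cycle summarized above might be sketched as follows. This is a minimal sketch under stated assumptions and is not the mechanism of the invention: the names `digest`, `parse_link_file`, and `verify_links`, the one-link-per-line file format, and the use of a SHA-256 digest as the saved form of a returned page are all hypothetical choices made for the example. The `fetch` argument stands in for whatever component actually requests a page from the web server.

```python
import hashlib


def digest(body):
    """Hash a returned page so runs can be compared without storing screens."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def parse_link_file(text):
    """Read one link per line, skipping blank lines and '#' comments.

    The file format is an assumption made for this sketch.
    """
    links = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            links.append(line)
    return links


def verify_links(links, fetch, baseline):
    """Fetch each link, compare it with the saved baseline, and tally results.

    fetch is any callable mapping a link to the returned page text.
    baseline maps links to digests saved from an earlier run; links with no
    entry are counted as "new" and their digest is saved for the next run.
    """
    tally = {"passed": 0, "failed": 0, "new": 0}
    for link in links:
        current = digest(fetch(link))
        if link not in baseline:
            baseline[link] = current
            tally["new"] += 1
        elif baseline[link] == current:
            tally["passed"] += 1
        else:
            tally["failed"] += 1
    return tally
```

In such a sketch the first run over a set of links only populates the baseline, and later runs report how many returned pages match or differ from it, with no user intervention between the link file and the tally.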
In achieving the foregoing advancements, mention has already been made of one inherent downside to the prior art technique of manual traversal of web sites, namely that due to the sometimes incredible number of permutations and combinations of links provided in web applications, it was frequently virtually impossible for any such manual traversal to exhaust, particularly in a reliable manner, the number of such links. Thus, a serious associated problem with providing for the aforementioned automated web server application verification and testing was the problem of devising a mechanism for extracting all known links on a given plurality of HTML pages in an automated fashion, and to format such link data so that it might subsequently be used in the verification and testing of the application.
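To make the link extraction problem concrete, the following is a minimal sketch, offered only as an illustration, of pulling links out of a plurality of HTML pages and formatting them for later use. It relies on Python's standard `html.parser` and `urllib.parse` modules. The class name `LinkCollector`, the functions `extract_links` and `format_link_file`, the particular set of tags treated as links, and the one-link-per-line output format are all assumptions of the example and are not taken from the text above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects link targets from anchor, area, frame, and form tags."""

    # Hypothetical choice of which tag/attribute pairs count as "links".
    LINK_ATTRS = {"a": "href", "area": "href", "frame": "src", "form": "action"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        wanted = self.LINK_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def extract_links(pages):
    """Return a sorted, de-duplicated list of links found in {url: html}."""
    found = set()
    for url, html in pages.items():
        collector = LinkCollector(url)
        collector.feed(html)
        found.update(collector.links)
    return sorted(found)


def format_link_file(links):
    """Format links one per line (an assumed format for downstream tools)."""
    return "\n".join(links) + "\n"
```

A parser-based approach of this kind visits every tag on every page supplied to it, which is the property a manual traversal could not reliably provide. Links produced only by script execution would not be found by this sketch.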
As will be hereinafter detailed, not only was it highly desirable to provide the aforementioned automated verification and testing of web server applications, but further to do so in a manner in which the particular web server and associated application could be stressed. It was necessary to invoke multiple instances of such testing so as to simulate real world conditions of multiple users accessing a web application in the same timeframe. Not only did the prior art fail to provide an efficient mechanism for extracting all such known links, but there was further no known comprehensive way to employ these automatically generated links in a common input which could be employed in combination by both (1) the web application verification and testing as well as (2) the web straining functionality just described.
In addition to the need for an efficient and reliable means for extracting links to be utilized for the aforementioned testing and verification, a need existed for a way to employ such links in a manner whereby they might be readily used and formatted in a manner so as to facilitate the testing of the transactions in question. Previously it was known to manually write test cases for various transactions of interest and further to manually transform these into HTML or Javascript pages which could thereafter be utilized in such testing.
However, due to the complexity of the links and transactions, such efforts were often futile, error prone, and not comprehensive. This complexity taught away from the possibility of an automated mechanism now made possible by the invention to traverse a large group of web transactions so as to build a cohesive set of HTML/Javascript pages which could in fact be employed in testing such transactions. Test cases were needed for testing transactions which could be stored in a definition file and run in conjunction with a tool to create web pages with all data needed to run tests and create setup files which could in turn be utilized by the other aforementioned automated tools for exercising the application APIs and straining the web server with replications of the virtual browser testing the web application and web server in question.
Yet an additional problem remained in providing a technology for testing a web server and associated web application in a realistic manner so as to know in advance in the real world how they will perform. Not only was there the problem of acquiring all relevant links, efficiently fashioning them into a format that could be utilized to test the server and application, and thereafter providing the mechanism for doing so which could test for the integrity of the real data of interest (rather than merely capturing GUIs, testing for browser inadequacies, and the like), but it was further necessary to ensure in the testing that the server was being tested in a realistic manner as might be expected in a real environment. In the prior art, test vehicles most assuredly existed for simulating a user's traversal of a web site. However, the actual behavior of a web site in real world conditions, wherein multiple users might be hitting the server in the same time interval, was such that the behavior characteristics of such a server and corresponding application might differ radically from the case in which the server/application are being tested by a single test program.
A conventional solution to this problem of more realistically simulating the real world environment in server-application testing was to, in a brute force manner, simply provide in real time for a multiplicity of human test users who, at the same time, might access the same server/application in order to "stress" it. Obvious limitations in availability of trained test personnel resulted in inadequate testing by a number of simultaneous users far less than might be expected in real world conditions. This thereby resulted in unreliable test results not mirroring what was to be expected in the actual environment in which the server/application would reside.
Accordingly, a more effective technology was sorely needed which could provide the ability to stress and exercise a web server by simulating multiple users accessing the server simultaneously or in a staggered fashion. This need included the ability to perform such web stressing employing fast, non-stop posts and gets from the web server, such a requirement not being met by conventional browsers which would require up to perhaps 50 or more manual users.
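As a rough illustration of simulating multiple users accessing a server simultaneously or in a staggered fashion, the sketch below runs a number of virtual users concurrently using Python's standard `concurrent.futures.ThreadPoolExecutor`. It is a sketch under stated assumptions rather than a description of any particular tool: the names `simulated_user` and `stress`, the `stagger` parameter, and the shape of the result rows are hypothetical, and `fetch` again stands in for whatever component issues the actual posts and gets.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def simulated_user(user_id, links, fetch, start_delay=0.0):
    """One virtual user: optionally wait, then request every link in order."""
    time.sleep(start_delay)
    results = []
    for link in links:
        started = time.perf_counter()
        try:
            fetch(link)
            ok = True
        except Exception:
            # Any failure of the request is recorded rather than raised.
            ok = False
        elapsed = time.perf_counter() - started
        results.append((user_id, link, ok, elapsed))
    return results


def stress(links, fetch, users=50, stagger=0.0):
    """Run `users` virtual users concurrently.

    With stagger == 0 all users start together; with stagger > 0 user n
    starts n * stagger seconds after user 0.
    """
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [
            pool.submit(simulated_user, uid, links, fetch, uid * stagger)
            for uid in range(users)
        ]
        # Collect (user_id, link, ok, elapsed_seconds) rows from every user.
        return [row for future in futures for row in future.result()]
```

Each row records which virtual user requested which link, whether the request succeeded, and how long it took, so that behavior under load can be compared with behavior under a single test program. The default of 50 users merely echoes the figure mentioned above and carries no other significance.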