Providing good quality of service (e.g., low response times) to end-users of distributed information systems is essential for e-commerce, among other applications. An important step in the performance management of such systems is modeling end-user behavior. A realistic end-user model makes it possible to: (a) better quantify end-user perception of performance; (b) create representative workloads; (c) provide better resource management; and (d) improve the system's security by recognizing potentially dangerous end-user behavior patterns.
A first step in building an end-user model is to characterize end-user transactions (EUTs). An EUT comprises a sequence of commands that an end-user issues to a workstation, for example, opening a database, opening a view, reading several records, and closing the database. In distributed systems, these commands typically cause remote procedure calls (RPCs) to be sent from the user's workstation to one or more tiers of servers that process the RPCs. To illustrate the foregoing, we use the Lotus Notes e-mail system. Common RPCs include OPEN_DB, READ_ENTRIES, and FIND_BY_KEY. Given a time-ordered sequence of such RPCs from the same end-user, we want to identify the beginning and end of each EUT and label it with its type. Examples of EUTs in Lotus Notes include: replication, search for a note, update notes, and resort view.
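To make the shape of the problem concrete, the following minimal sketch shows one user's time-ordered RPC trace being split into candidate EUTs. It is not the recognition method developed in this work: it uses a naive idle-gap heuristic purely for illustration, and it does not attempt the labeling step. The `Rpc` record, the `segment_by_idle_gap` function, the `max_gap` threshold, the user name, and the timestamps are all hypothetical; only the RPC names come from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rpc:
    timestamp: float  # seconds since start of trace (hypothetical unit)
    user: str         # end-user who issued the RPC
    name: str         # RPC type, e.g. "OPEN_DB"


def segment_by_idle_gap(rpcs: List[Rpc], max_gap: float = 2.0) -> List[List[Rpc]]:
    """Split one user's time-ordered RPCs wherever the idle time exceeds max_gap.

    This is a naive baseline for illustration only; the 2-second threshold
    is an arbitrary assumption.
    """
    segments: List[List[Rpc]] = []
    for rpc in rpcs:
        # Continue the current segment if this RPC follows the previous one closely.
        if segments and rpc.timestamp - segments[-1][-1].timestamp <= max_gap:
            segments[-1].append(rpc)
        else:
            segments.append([rpc])
    return segments


# Invented trace for a single user.
trace = [
    Rpc(0.0, "alice", "OPEN_DB"),
    Rpc(0.3, "alice", "READ_ENTRIES"),
    Rpc(0.5, "alice", "READ_ENTRIES"),
    Rpc(9.0, "alice", "OPEN_DB"),
    Rpc(9.2, "alice", "FIND_BY_KEY"),
]

segments = segment_by_idle_gap(trace)
segment_names = [[rpc.name for rpc in seg] for seg in segments]
print(segment_names)
```

With this invented trace, the heuristic yields two candidate segments. Assigning each segment a type such as "search for a note" is a separate step that the sketch does not perform, and a fixed time gap cannot be expected to match how users perceive units of work.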
Because end-user workstations are numerous and are often not the responsibility of the administrative staff, there is often little opportunity to collect information about EUTs from the workstations themselves. Rather, EUT information is obtained at the servers in the form of RPC sequences. Unfortunately, little information about end-user transactions is present at the server. In principle, client-server protocols could be instrumented to mark the beginning and end of user interactions. However, this is not sufficient to identify EUTs since users often view a sequence of application interactions as a single unit of work. In existing practice, this quandary is addressed either by using surrogates for EUTs (e.g., synthetic transactions generated by probing stations) or by labeling EUTs manually for post-processing. The former often leads to incorrect assessments of service quality. The latter is extremely time-consuming.
Therefore, it is highly desirable to have an automated system for recognizing EUTs from the RPC sequences recorded on servers.