The present invention relates generally to an integrated simulation tool, and more particularly to a method of an integrated simulation tool for estimating performance of complex computer systems using business patterns and scripts.
As e-business and its related requirements grow at "Web speed", a critical issue is whether the IT infrastructure supporting the Web sites has what it needs to provide available, scalable, fast, and efficient access to the company's information, products, and services. More than ever, CIOs (Chief Information Officers) and their teams struggle to minimize downtime and network bottlenecks and to maximize the use of the hardware and software that comprise their e-business infrastructure.
Capacity planning and performance modeling of complex computer systems generally require detailed information about the workload assumed to be running on those systems. For detailed performance studies of processors, a trace of the workload is typically used. Such a trace, combined with the right model of the processor hardware, can be used to accurately estimate the average number of cycles used per instruction. Combining this figure with the processor cycle time leads to an accurate estimate of the processor MIPS (Million Instructions Per Second) rate.
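The relationship between cycles per instruction, cycle time, and MIPS can be sketched as follows. This is a minimal illustration only; the function name and the example processor figures are hypothetical and do not come from any measurement described here.

```python
def processor_mips(cycles_per_instruction: float, cycle_time_ns: float) -> float:
    """Return the MIPS rate implied by an average CPI and a cycle time in nanoseconds."""
    # Average time to complete one instruction, in seconds.
    seconds_per_instruction = cycles_per_instruction * cycle_time_ns * 1e-9
    # Instructions per second, scaled to millions.
    return 1.0 / seconds_per_instruction / 1e6


# Hypothetical processor: 2 cycles per instruction, 1 ns cycle time (1 GHz clock).
example_mips = processor_mips(cycles_per_instruction=2.0, cycle_time_ns=1.0)
print(f"{example_mips:.1f} MIPS")
```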
For higher-level system modeling where the user throughput rate is to be estimated, the processor MIPS rate is typically taken as an input assumption for the model. This rate, combined with the path length (i.e. the number of instructions executed by a typical user), can be used to estimate the system throughput in terms of the number of users per second that can be served (i.e. the user arrival rate). Additional factors, such as the average number of network bytes transferred or disk I/O operations performed per user, can also be factored into the calculations.
A simple capacity plan can be made by calculating the number of users per second that can be processed without exceeding the utilization requirements of any of the system resources (i.e. processors, disks, network). More detailed estimates that also project the overall response time per user (factoring in queuing effects on various resources) can be made using well-known Mean Value Analysis techniques.
While these types of system analysis do not require detailed instruction traces, they still require path length, disk I/O, and network data rates for the average user. Often this information can be obtained from measurements or traces. However, for many studies of new workloads in the rapidly emerging world of web serving and e-business, such data often does not exist, either because the workloads are too new or because projections are needed for an application that has not yet been developed.
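The simple capacity calculation described above can be sketched with the utilization law (utilization = arrival rate x service demand per user). The sketch assumes one processor, one disk, and one network link; all function names and numeric inputs are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple


def service_demands(path_length_instr: float, mips: float,
                    disk_ios: float, disk_io_time_s: float,
                    net_bytes: float, net_bytes_per_s: float) -> Dict[str, float]:
    """Seconds of each resource consumed by one average user."""
    return {
        "processor": path_length_instr / (mips * 1e6),
        "disk": disk_ios * disk_io_time_s,
        "network": net_bytes / net_bytes_per_s,
    }


def max_user_rate(demands: Dict[str, float],
                  max_utilization: Dict[str, float]) -> Tuple[float, str]:
    """Largest arrival rate (users/s) that keeps every resource within its limit."""
    # Utilization law: U = rate * demand, so rate <= U_max / demand per resource.
    limits = {name: max_utilization[name] / d for name, d in demands.items() if d > 0}
    bottleneck = min(limits, key=limits.get)
    return limits[bottleneck], bottleneck


# Hypothetical inputs: 50 million instructions per user on a 500 MIPS processor,
# 4 disk I/Os at 10 ms each, 100 kB transferred over a 10 MB/s link.
demands = service_demands(50e6, 500.0, 4, 0.010, 100_000, 10e6)
rate, bottleneck = max_user_rate(
    demands, {"processor": 0.7, "disk": 0.7, "network": 0.7})
print(f"{rate:.2f} users/s, limited by the {bottleneck}")
```

This bound ignores queuing delay; it only says which resource saturates first and at what arrival rate.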
Studies of actual web site operations show that most applications can be grouped into a small number of basic "business patterns". The business patterns include user-to-data, user-to-business, user-to-online-buying, business-to-business, and user-to-user. Within each of these business patterns, most user behavior has been found to be dominated by a small number of basic usage patterns. These basic usage patterns are referred to here as user scripts or just "scripts". Each script consists of a number of basic steps or page visits. In this view of user activity, a user enters the system, executes one of these scripts, and leaves the system. The probability of choosing each script, along with the definition of the steps within each script, are among the key parameters defining a business pattern.
Studies done by IBM have identified several workload business patterns. These business patterns characterize the functions of most large web sites and e-business systems. Those of ordinary skill in the art will recognize that the present invention is not limited to those workload business patterns identified by IBM. The present invention will equally apply to business patterns identified by others or yet to be identified. The workload business patterns identified by the IBM studies are mapped to basic business patterns and defined as follows:
Publish/Subscribe (user-to-data). Sample publish/subscribe sites include search engines, media sites such as newspapers and magazines, and event sites such as those for the Olympics and for the championships at Wimbledon. The content of these sites changes frequently, driving the changes to page layouts. While the search traffic is low volume, the number of unique items sought is high resulting in the largest number of page visits of all site types. As an example, the Wimbledon site successfully handled a peak volume of 430,000 hits per minute using IBM's WebSphere Edge Server. Transaction traffic is lowest and security considerations are minor compared to other site types.
Online Shopping (user-to-online-buying). Sample sites include typical retail sites where users buy books, clothes, and even cars. The site content can be relatively static, such as a parts catalog, or dynamic, where items are frequently added and deleted as, for example, promotions and special discounts come and go. Search traffic is heavier than on publish/subscribe sites, though the number of unique items sought is not as large. Transaction traffic is moderate to high and always growing. Typically, between 1% and 5% of the traffic consists of buy transactions. When users buy, security requirements become significant and include privacy, non-repudiation, integrity, authentication, and regulations.
Customer Self-Service (user-to-business). Sample sites include online banking, tracking packages, and making travel arrangements. Data comes largely from legacy applications. Security considerations are significant for home banking and purchasing travel services, less so for other uses. Search traffic is low volume; transaction traffic is low to moderate, but growing.
Online Trading (user-to-business). Of all site types, online trading sites have the most volatile content, the highest transaction volumes (with significant swing), the most complex transactions, and are extremely time sensitive. Nearly all transactions interact with the back end servers. Security considerations are high, equivalent to online shopping, with an even larger number of secure pages. Search traffic is low volume.
Business-to-Business (business-to-business). Data comes largely from legacy systems. Security requirements are equivalent to online shopping. Transaction volume is moderate, but growing; transactions are typically complex, connecting multiple suppliers and distributors.
Those of ordinary skill in the art will recognize that the present invention is not limited to the above workload business patterns identified by IBM. The present invention will equally apply to business patterns identified by others or yet to be identified. For example, although business patterns such as Reservation System and Inventory Management System are not discussed here, the present invention can be used to simulate the performance of such systems as well.
A mixture of one or more predefined user scripts can characterize each of these business patterns. The user scripts consist of a number of basic steps or page visits made by a user within a given business pattern. Those of ordinary skill in the art will recognize that the present invention is not limited to the user scripts discussed herein. Rather, the present invention applies to any script of user behavior within a given business pattern. To illustrate by way of an example, a "user-to-online-buying" business pattern may have user behavior dominated by the following three predefined scripts of user behavior:
Part of the definition of a business pattern is specifying the relative proportion or "frequency" of scripts in the customer mix. A new user may have an equal probability of executing any script, or some very common scripts may be executed with a high frequency. Changing the relative frequency of scripts within the mix also allows different browse/buy ratios to be specified.
Given a set of predefined scripts, measurements of their path lengths, disk I/O operations, and network transfers can be made on an existing system. The actual content of each script, as well as the measured parameters, may change gradually over time as usage patterns and software characteristics change. Often the exact content of the scripts and their measured parameters will be considered a trade secret.
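As a minimal sketch of how relative script frequencies translate into per-user probabilities and a browse/buy ratio, consider the following. The script names ("browse", "search", "buy") and the weights are hypothetical placeholders, not the predefined scripts referred to above.

```python
from typing import Dict


def script_probabilities(relative_frequency: Dict[str, float]) -> Dict[str, float]:
    """Normalize relative script frequencies so that they sum to 1."""
    total = sum(relative_frequency.values())
    if total <= 0:
        raise ValueError("at least one script must have a positive frequency")
    return {name: weight / total for name, weight in relative_frequency.items()}


def browse_buy_ratio(probabilities: Dict[str, float], buy_script: str = "buy") -> float:
    """Ratio of users running any non-buy script to users running the buy script."""
    p_buy = probabilities[buy_script]
    return (1.0 - p_buy) / p_buy


# Hypothetical mix: for every buy, six browse and three search scripts are run.
mix = script_probabilities({"browse": 6, "search": 3, "buy": 1})
ratio = browse_buy_ratio(mix)
print(mix, f"browse/buy ratio = {ratio:.1f}")
```

Raising or lowering the weight of the buy script changes the ratio without touching the definition of any individual script.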
The present invention discloses a method, system and article of manufacture for estimating the performance of a computer system. Initially, a business pattern representative of the expected usage of the computer system is identified. Then, for each parameter associated with each predefined script that corresponds to the identified business pattern, a value is established. The computer system hardware characteristics and performance objectives are identified next. The performance estimate is then calculated utilizing the established parameter values, identified hardware characteristics and performance objectives. To calculate the performance estimate, the script measurement data is read from a table of previously measured values, and a weighted average number of page visits per user, a weighted average visit rate and a weighted average service time for each target device in the computer system are calculated. A total response time and a system throughput are calculated by varying each target device queue length and user arrival rate until the performance objectives are reached.
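The weighted averages named above can be sketched as follows. The table layout, script names, device names, and all numbers are hypothetical. The sketch also assumes that the average service time per device is weighted by both script frequency and visit count, so that visits multiplied by service time equals the total demand per user; the text does not spell out the exact weighting.

```python
from typing import Dict, Tuple

# Hypothetical measurement table. For each script: relative frequency, page visits,
# and per device a (visits per user, seconds per visit) pair.
SCRIPT_TABLE = {
    "browse": {"freq": 0.75, "pages": 4,
               "devices": {"cpu": (4, 0.010), "disk": (2, 0.020)}},
    "buy":    {"freq": 0.25, "pages": 8,
               "devices": {"cpu": (8, 0.020), "disk": (6, 0.020)}},
}


def composite_workload(table: dict) -> Tuple[float, Dict[str, float], Dict[str, float]]:
    """Return (avg page visits per user, avg visits per device, avg service time per device)."""
    total_freq = sum(script["freq"] for script in table.values())
    avg_pages = sum(s["freq"] * s["pages"] for s in table.values()) / total_freq
    visits: Dict[str, float] = {}
    busy: Dict[str, float] = {}  # frequency-weighted seconds of device time per user
    for script in table.values():
        p = script["freq"] / total_freq
        for device, (count, seconds) in script["devices"].items():
            visits[device] = visits.get(device, 0.0) + p * count
            busy[device] = busy.get(device, 0.0) + p * count * seconds
    service = {d: busy[d] / visits[d] for d in visits if visits[d] > 0}
    return avg_pages, visits, service


avg_pages, avg_visits, avg_service = composite_workload(SCRIPT_TABLE)
print(avg_pages, avg_visits, avg_service)
```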
The present invention provides an integrated simulation tool (i.e. a modeling tool) for projecting system performance without detailed knowledge of the workload characteristics being required. The present invention uses "business patterns" and "scripts" for typical computer installations to define the relevant workload characteristics. The "business patterns" describe the type of work that a computer installation will be used for (e.g. on-line shopping, on-line trading, etc.). The "scripts" describe typical operations within a business pattern (e.g. browse a catalog, buy an item, get a stock quote, etc.). Both the collection of business patterns and scripts are defined based on detailed studies of actual customer operations.
The user of the modeling tool in accordance with the present invention can define a workload by specifying a business pattern and the relative frequencies of scripts within that pattern that best match the workload on some current or future computer system. The modeling tool will then construct the needed description of a composite workload for the performance estimates based on a weighted average of previous data collected from actual measurements for various scripts on various hardware or software combinations. Abstracted data from previous measurements are kept in database tables within the integrated modeling tool.
This information is then used in an integrated analytic simulation model that employs variations of Mean Value Analysis techniques to produce performance estimates for a computer system.
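For orientation, the textbook exact Mean Value Analysis recursion for a single-class closed queueing network is sketched below. It is not the specific variation used by the modeling tool, which is not detailed here. The device names, demand values, and the `think_time_s` parameter are hypothetical, and each device's service demand is taken to be its visits per user multiplied by its service time per visit.

```python
from typing import Dict, Tuple


def mva(demands: Dict[str, float], users: int,
        think_time_s: float = 0.0) -> Tuple[float, float, Dict[str, float]]:
    """Exact single-class MVA.

    Returns (throughput in users/s, response time in s, mean queue length per device).
    """
    queue = {device: 0.0 for device in demands}
    throughput, response = 0.0, 0.0
    for n in range(1, users + 1):
        # An arriving user waits behind the queue seen with n - 1 users in the system.
        residence = {d: demand * (1.0 + queue[d]) for d, demand in demands.items()}
        response = sum(residence.values())
        throughput = n / (think_time_s + response)
        # Little's law applied to each device.
        queue = {d: throughput * r for d, r in residence.items()}
    return throughput, response, queue


# Hypothetical balanced system: 1 s of processor and 1 s of disk demand per user.
x, r, q = mva({"cpu": 1.0, "disk": 1.0}, users=2)
print(f"throughput={x:.3f}/s response={r:.2f}s queues={q}")
```

Calling `mva` with increasing user populations and comparing the returned response time against a target is one simple way to locate the load at which a performance objective is just met.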