Analyzing the access history of Web sites and applying the analysis results to sales strategies and the like is widely practiced, and a Uniform Resource Locator (URL) parameter analysis function is used as a function for analyzing dynamically generated Web pages (e.g. pages of certain types such as cgi, jsp, asp, php and the like).
On the other hand, a Web system is typically a multi-layer server system as depicted in FIG. 1. Namely, a load distribution apparatus is connected to the Internet or the like, and under this apparatus, a Web server as a first layer, an application server as a second layer, and a DB server as a third layer are provided. Then, when a message is transmitted from, for example, a user terminal to the Web server through the Internet and the load distribution apparatus by the Hyper Text Transfer Protocol (HTTP), the Web server carries out a processing A and transmits a request to the application server by the Internet Inter-ORB Protocol (IIOP). The application server carries out a processing C, and further transmits a SQL request to the DB server by the IIOP. The DB server carries out a processing D1, and transmits the result to the application server. The application server receives the processing result from the DB server, carries out a processing D2, and transmits the result to the Web server. The Web server receives the processing result from the application server, carries out a processing B, and transmits the final processing result to the user terminal through the load distribution apparatus, the Internet and the like.
At this time, the HTTP message sent from the user terminal includes, for example, an IP address corresponding to the machine name hogehoge.com, and “/ugogo.cgi?sid=1000&target=1”, which represents the path and parameter portion (“/ugogo.cgi” represents the path, and “sid=1000&target=1”, which follows “?” or “;”, represents the parameter portion). In this application, for ease of understanding, the following expression may be used, in which the IP address is replaced with the machine name and the machine name is concatenated with the path and parameter portion.
http://hogehoge.com/ugogo.cgi?sid=1000&target=1
Incidentally, “http” represents a scheme.
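For illustration only, the decomposition described above may be sketched in Python as follows. The sketch assumes the path of the example expression is “/ugogo.cgi” as stated in the text, and handles only the “?” separator; the function name split_url is hypothetical and is not part of any described system.

```python
from urllib.parse import urlsplit, parse_qsl


def split_url(url):
    """Split a URL into scheme, machine name, path, and parameter portion."""
    parts = urlsplit(url)
    # parse_qsl turns "sid=1000&target=1" into [("sid", "1000"), ("target", "1")]
    params = dict(parse_qsl(parts.query))
    return parts.scheme, parts.hostname, parts.path, params


# The example expression used in this application.
scheme, host, path, params = split_url(
    "http://hogehoge.com/ugogo.cgi?sid=1000&target=1"
)
print(scheme, host, path, params)
```

Here the path is the static portion of the expression, and the dictionary of variables is the dynamic portion.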
As depicted in FIG. 2, the processing to be carried out cannot be determined from the static portion of the URI up to the path alone. However, the parameter portion, which is the dynamic portion, and in particular the value of the target variable, makes it possible to distinguish which of the processing X and Y is carried out in response to the HTTP request. Thus, the parameter portion serves either as a means of selecting the processing content or as data.
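For illustration only, the role of the target variable as a selector may be sketched in Python as follows. The mapping from target values to the processing X and Y is an assumption made for this sketch (the text does not state which value selects which processing), and the function name classify is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed mapping for illustration: which target value selects which processing.
TARGET_TO_PROCESSING = {"1": "X", "2": "Y"}


def classify(url):
    """Return the processing selected by the target variable, or None."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("target")
    if not values:
        # The path alone does not reveal which processing is carried out.
        return None
    return TARGET_TO_PROCESSING.get(values[0])


base = "http://hogehoge.com/ugogo.cgi"
print(classify(base + "?sid=1000&target=1"))
print(classify(base + "?sid=1000&target=2"))
print(classify(base + "?sid=1000"))
```

All three requests share the same path, so only the parameter portion separates them.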
However, a conventional Web log analysis tool merely lists the variables or the like and displays their appearance frequencies or ratios. Accordingly, it cannot carry out an analysis that takes into account the meanings or functions of the variables.
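For illustration only, the kind of listing produced by such a conventional tool may be sketched in Python as follows. The function name variable_frequencies and the sample requests are hypothetical, and the ratio is assumed here to be taken over all variable occurrences.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def variable_frequencies(urls):
    """Count how often each variable name appears, with its ratio."""
    counts = Counter()
    for url in urls:
        for name, _value in parse_qsl(urlsplit(url).query):
            counts[name] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Each entry is (appearance count, share of all variable occurrences).
    return {name: (n, n / total) for name, n in counts.items()}


# Hypothetical sample requests.
sample = [
    "http://hogehoge.com/ugogo.cgi?sid=1000&target=1",
    "http://hogehoge.com/ugogo.cgi?sid=1001",
]
freqs = variable_frequencies(sample)
print(freqs)
```

Such a listing shows that sid appears more often than target, but says nothing about which variable selects the processing content and which merely carries data.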
In addition, even if it is possible to distinguish which of the processing X and Y is executed, it may be difficult to identify which HTTP message corresponds to the processing X and which corresponds to the processing Y. Basically, as depicted in FIG. 1, the time interval from the processing start time to the processing end time in the upper-layer server includes the corresponding time interval in the lower-layer server. However, when only the logs in each server are observed, especially from the Web server side, there are cases where the HTTP message cannot be correctly associated with the business processing.
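For illustration only, the inclusion relationship between time intervals, and the ambiguity it leaves, may be sketched in Python as follows. The message names and times are hypothetical values chosen for this sketch, and the function name containing_requests is likewise hypothetical.

```python
def containing_requests(upper, lower_interval):
    """Return the upper-layer requests whose interval includes the lower one."""
    start, end = lower_interval
    return [
        name
        for name, (s, e) in upper.items()
        if s <= start and end <= e
    ]


# Hypothetical (start, end) times observed on the Web server.
web_log = {"msg1": (0.0, 5.0), "msg2": (1.0, 6.0)}
# Hypothetical (start, end) time of one processing on the DB server.
db_interval = (2.0, 3.0)

candidates = containing_requests(web_log, db_interval)
print(candidates)
```

When concurrent messages overlap in this way, more than one upper-layer interval includes the same lower-layer interval, so the inclusion relationship alone does not identify the message.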
As described above, the conventional technique cannot appropriately extract the relationship between the business processing and the message from log data of the multi-layer server system, and cannot extract the path and parameters that characterize the business processing.
More specifically, no technique exists for extracting the feature portion of the URI that is included in the message to the upper-layer Web server and that characterizes the business processing to be carried out.