(1) Field of the Invention
The present invention relates to a troubleshooting technique for quickly diagnosing the cause of an exception that occurs in a network computing environment made up of a plurality of computers.
(2) Related Art of the Invention
Large-scale network computing environments that make use of a plurality of computers have been built for some time. A client/server system is one example of such an environment. Further, in the network computing environments expected to become widespread soon, distributed applications developed on different types of machines, OSs, and languages are linked to one another to carry out a single processing task.
Meanwhile, with presently available techniques it is impossible to completely prevent exceptions from occurring in an application program. It is therefore typical to record error logs for each application program and, when an exception occurs, to have an expert analyze these logs to diagnose its cause.
However, since the error logs are conventionally specific to respective application programs, there are problems in the network computing environment, as follows:
(1) To identify the application program in which an exception has occurred, it is necessary to examine error logs dispersed across a plurality of computers, which requires considerable effort.
(2) Error logs do not record the state, such as the communication state, at the time the exception occurred. The expert is therefore forced to infer that state from error logs specific to the respective application programs, which also requires considerable effort.
The present invention has been made in view of the conventional problems described above, and it is therefore an object of the present invention to provide a troubleshooting technique capable of diagnosing the cause of an exception by collecting, in one place, the exceptions that have occurred in a network computing environment.
It is a further object of the present invention to distribute a recording medium on which a troubleshooting program according to the present invention is recorded, so that those who obtain such a medium can readily construct a troubleshooting apparatus.
To attain the above objects, the present invention provides, as a first solution, a troubleshooting apparatus in a network computing environment, the apparatus comprising: a plurality of application program executing devices by which predetermined application programs are executed, respectively; and an exception information collecting device for collecting occurrence information of an exception that has occurred in the application programs, wherein the exception information collecting device comprises: an exception information accumulating device for accumulating the occurrence information; an exception occurrence detecting device provided in the application program executing device, so as to detect occurrence of an exception in the application programs; and an exception information transmitting device for transmitting the occurrence information to the exception information accumulating device when occurrence of an exception is detected by the exception occurrence detecting device.
According to such a constitution, when an exception has occurred in an application program constituting the network computing environment, the exception occurrence is detected by the exception occurrence detecting device. Upon this detection, the occurrence information is transmitted from the application program executing device to the exception information accumulating device, where it is accumulated. Namely, even in a system where a plurality of cooperating application programs are linked to one another to carry out a single processing task, the occurrence information is not accumulated in the respective application program executing devices, but is collected in one place, in the externally attached exception information accumulating device. As such, it is unnecessary to review the occurrence information of the respective distributed application programs one by one, so that the time and effort for diagnosing the cause of an exception can be remarkably reduced.
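The detect, transmit, and accumulate flow described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the implementation of the invention: the names `ExceptionAccumulator`, `report_exceptions`, `place_order`, and `charge` are hypothetical, and the transmission over the network is replaced by a direct in-process call.

```python
import functools
from typing import Callable, List


class ExceptionAccumulator:
    """Hypothetical stand-in for the exception information accumulating device."""

    def __init__(self) -> None:
        self.records: List[dict] = []

    def accumulate(self, record: dict) -> None:
        self.records.append(record)


def report_exceptions(device_name: str, accumulator: ExceptionAccumulator) -> Callable:
    """Wrap an application entry point so exceptions are detected and forwarded."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:  # detection step
                # Transmission step. Assumption: a real system would send this
                # record over the network instead of calling a local object.
                accumulator.accumulate(
                    {
                        "device": device_name,
                        "type": type(exc).__name__,
                        "message": str(exc),
                    }
                )
                raise  # the application still sees its own exception

        return wrapper

    return decorator


central = ExceptionAccumulator()


@report_exceptions("order-server", central)
def place_order(quantity: int) -> int:
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return quantity


@report_exceptions("billing-server", central)
def charge(amount: float) -> float:
    return 100.0 / amount  # raises ZeroDivisionError when amount is 0


for call in (lambda: place_order(0), lambda: charge(0)):
    try:
        call()
    except Exception:
        pass  # the record has already been collected centrally
```

After the loop, `central.records` holds one record per failing application, so a single store can be examined instead of one log per executing device.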
The occurrence information may include: exception occurrence place information for specifying the application program executing device on which the application program in which the exception occurred was executed; exception occurrence time point information for specifying the time point at which the exception occurred; and information exchange destination information for specifying the other application program executing device that was exchanging information with the application program executing device on which that application program was executed.
According to such a constitution, the occurrence information accumulated in the exception information accumulating device includes the exception occurrence place information, the exception occurrence time point information, and the information exchange destination information. Thus, by analyzing the centrally collected occurrence information, it becomes possible to understand in which application program the exception occurred, in what communication state it occurred, and when it occurred. Therefore, it becomes unnecessary to analyze each of the error logs recorded in the respective application program executing devices, so that the cause of the exception can be readily analyzed.
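As an illustration of the three kinds of information listed above, one occurrence record could be modeled as follows. This is a sketch only: the class name `OccurrenceInfo`, its field names, and the sample values are assumptions made for the example and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class OccurrenceInfo:
    # Exception occurrence place information: which executing device failed.
    place: str
    # Exception occurrence time point information (seconds; representation assumed).
    time_point: float
    # Information exchange destination information: the other executing devices
    # the failing device was exchanging information with.
    exchange_destinations: Tuple[str, ...]
    # Free-form description of the exception (hypothetical extra field).
    detail: str = ""


def summarize(info: OccurrenceInfo) -> str:
    """Answer the three questions: where, when, and in what communication state."""
    peers = ", ".join(info.exchange_destinations) or "no peer"
    return f"{info.place} failed at t={info.time_point} while exchanging with {peers}"


sample = OccurrenceInfo(
    place="order-server",
    time_point=1000.4,
    exchange_destinations=("billing-server",),
    detail="timeout waiting for reply",
)
summary = summarize(sample)
```

Keeping the exchange destinations in the record is what lets an analyst reconstruct the communication state without consulting the peer's own log.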
The troubleshooting apparatus may further comprise: a first time point obtaining device for obtaining the respective time points at the plurality of application program executing devices; and a second time point obtaining device for obtaining the unified time point in the network computing environment; wherein the exception occurrence time point information includes the time point obtained by the first time point obtaining device, and the unified time point obtained by the second time point obtaining device.
According to such a constitution, the exception occurrence time point information includes two types of time points, i.e., the exception occurrence time point at the application program executing device and the exception occurrence time point in the network computing environment. As such, even if the clocks of the respective application program executing devices differ from one another, the occurrence information can be ordered by the unified time point of the system, by sorting it with the exception occurrence time point in the network computing environment as the key. Thus, it becomes possible to relate the exception occurrences to one another in time series, thereby enabling easier analysis of the cause of the exception.
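The effect of sorting on the unified time point can be seen in a small sketch. The record type `TimedOccurrence` and the sample values are hypothetical; they were chosen so that the local clocks of the three devices disagree by several seconds.

```python
from typing import List, NamedTuple


class TimedOccurrence(NamedTuple):
    place: str
    local_time: float    # time point at the executing device (first time point)
    unified_time: float  # unified time point in the environment (second time point)


records: List[TimedOccurrence] = [
    TimedOccurrence("client-A", local_time=1005.0, unified_time=1000.4),
    TimedOccurrence("server-B", local_time=998.0, unified_time=1000.1),
    TimedOccurrence("server-C", local_time=1001.0, unified_time=1000.9),
]

# Ordering by the unsynchronized local clocks can misplace events.
by_local = [r.place for r in sorted(records, key=lambda r: r.local_time)]

# Ordering by the unified time point gives one consistent time series.
by_unified = [r.place for r in sorted(records, key=lambda r: r.unified_time)]
```

With these sample values, the local ordering places `client-A` last, whereas the unified ordering places it between `server-B` and `server-C`.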
Further, the second time point obtaining device may comprise: a third time point obtaining device for obtaining the time point at the exception information accumulating device; a time-point return request transmitting device provided in the application program executing device, for transmitting a time-point return request to the exception information accumulating device when exception occurrence is detected by the exception occurrence detecting device; a time-point returning device provided in the exception information accumulating device, for transmitting the time point obtained by the third time point obtaining device to the application program executing device which has transmitted the time-point return request, when the time point returning device has received the time-point return request; a returning time length measuring device for measuring the time length from transmission of the time-point return request up to the time at which the time point is returned; and a unified time point calculating device for calculating the unified time point based on the time point returned by the time-point returning device and the time length measured by the returning time length measuring device.
According to such a constitution, the exception occurrence time point in the network computing environment can be calculated based on: the time length from transmission of the time-point return request up to the time when the time point is returned; and the time point returned in response to the time-point return request. As such, it is unnecessary to newly provide a clock server, for obtaining a unified time point of the system. Thus, the cost for constituting the system is further reduced.
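One possible calculation is sketched below. The specification states only that the unified time point is calculated from the returned time point and the measured time length; the specific formula used here, subtracting half of the round-trip time from the returned time point on the assumption that the request and the reply take equally long, is an assumption of this example. The function and parameter names are likewise hypothetical.

```python
import time
from typing import Callable


def unified_time_of_exception(
    request_accumulator_time: Callable[[], float],
    monotonic: Callable[[], float] = time.monotonic,
) -> float:
    """Estimate the accumulator-clock time at which the request was sent.

    request_accumulator_time: hypothetical callable that sends the time-point
        return request and returns the accumulator's time point (seconds).
    monotonic: local clock used only to measure the returning time length.
    """
    sent_at = monotonic()                  # request sent when the exception is detected
    returned = request_accumulator_time()  # time point returned by the accumulator
    elapsed = monotonic() - sent_at        # returning time length (round trip)
    # Assumption: symmetric delay, so the accumulator stamped its time roughly
    # half a round trip after the request was sent.
    return returned - elapsed / 2.0


# Usage with fake clocks so the result is deterministic.
ticks = iter([10.0, 10.5])  # local clock before and after the round trip
estimate = unified_time_of_exception(
    request_accumulator_time=lambda: 5000.25,
    monotonic=lambda: next(ticks),
)
```

In this example the round trip takes 0.5 seconds and the accumulator returns 5000.25, so the estimate is 5000.0. Only the difference between two local readings is used, so the absolute value of the local clock does not affect the result.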
The troubleshooting apparatus may further comprise: an information filtering device for filtering the occurrence information collected by the exception information collecting device, using at least the exception occurrence place information, the exception occurrence time point information, and the information exchange destination information as keys; and an information display device for displaying the occurrence information filtered by the information filtering device.
According to such a constitution, the occurrence information collected by the exception information collecting device is automatically filtered by the information filtering device, using the exception occurrence place information, the exception occurrence time point information, and the information exchange destination information as keys. The filtered occurrence information is then displayed in a form that a user can understand by viewing it. Namely, by appropriately setting the exception occurrence place information, the exception occurrence time point information, and the information exchange destination information before operating the information filtering device, the occurrence information is automatically filtered and the result is displayed. Thus, the analysis of the cause of the exception can be conducted automatically.
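Filtering on the three keys followed by display might look like the following sketch. The names `Occurrence`, `filter_occurrences`, and `render`, and the choice of a closed time window as the time-point key, are assumptions made for illustration.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Occurrence(NamedTuple):
    place: str
    unified_time: float
    exchange_destination: str


def filter_occurrences(
    records: Iterable[Occurrence],
    place: Optional[str] = None,
    time_window: Optional[Tuple[float, float]] = None,
    exchange_destination: Optional[str] = None,
) -> List[Occurrence]:
    """Keep records matching every key that is set; a key left as None matches all."""
    selected = []
    for r in records:
        if place is not None and r.place != place:
            continue
        if time_window is not None and not (
            time_window[0] <= r.unified_time <= time_window[1]
        ):
            continue
        if exchange_destination is not None and r.exchange_destination != exchange_destination:
            continue
        selected.append(r)
    return sorted(selected, key=lambda r: r.unified_time)


def render(records: Iterable[Occurrence]) -> str:
    """Format the filtered records as one line each for display."""
    return "\n".join(
        f"{r.unified_time:.1f}  {r.place} -> {r.exchange_destination}" for r in records
    )


collected = [
    Occurrence("client-A", 1000.4, "server-B"),
    Occurrence("server-B", 1000.1, "server-C"),
    Occurrence("client-A", 1200.0, "server-C"),
]
hits = filter_occurrences(collected, place="client-A", time_window=(1000.0, 1100.0))
print(render(hits))
```

Setting only some of the keys narrows the view step by step, for example first by time window and then by exchange destination.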
The present invention provides, as a second solution, a troubleshooting method in a network computing environment, the method comprising: a plurality of application program executing processes by which predetermined application programs are executed, respectively; and an exception information collecting process for collecting occurrence information of an exception that has occurred in the application programs, wherein the exception information collecting process comprises: an exception information accumulating process for accumulating the occurrence information; an exception occurrence detecting process provided in the application program executing process, so as to detect occurrence of an exception in the application programs; and an exception information transmitting process for transmitting the occurrence information to the exception information accumulating process when occurrence of an exception is detected by the exception occurrence detecting process.
According to such a constitution, when an exception has occurred in an application program constituting the network computing environment, the exception occurrence is detected by the exception occurrence detecting process. Upon this detection, the occurrence information is transmitted from the application program executing process to the exception information accumulating process, where it is accumulated. Namely, even in a system where a plurality of cooperating application programs are linked to one another to carry out a single processing task, the occurrence information is not accumulated in the respective application program executing processes, but is collected in one place, in the externally attached exception information accumulating process. As such, it is unnecessary to review the occurrence information of the respective distributed application programs one by one, so that the time and effort for diagnosing the cause of an exception can be remarkably reduced.
The present invention provides, as a third solution, a recording medium recorded with a troubleshooting program in a network computing environment, the program being adapted to perform: a plurality of application program executing functions by which predetermined application programs are executed, respectively; and an exception information collecting function for collecting occurrence information of an exception that has occurred in the application programs, wherein the exception information collecting function comprises: an exception information accumulating function for accumulating the occurrence information; an exception occurrence detecting function provided in the application program executing function, so as to detect occurrence of an exception in the application programs; and an exception information transmitting function for transmitting the occurrence information to the exception information accumulating function when occurrence of an exception is detected by the exception occurrence detecting function.
It is noted that the term "recording medium" means any medium on which various kinds of information can be reliably recorded and from which the recorded information can be reliably retrieved as required. Concretely, media such as a paper card (punched card), paper tape, magnetic tape, magnetic disk, magnetic drum, IC card, and CD-ROM are applicable.
According to such a constitution, the recording medium is recorded with the troubleshooting program in the network computing environment for performing the application program executing function and the exception information collecting function. Further, the exception information collecting function is constituted to include the exception information accumulating function, the exception occurrence detecting function, and the exception information transmitting function. Thus, simply by providing a recording medium recorded with the program for performing the respective functions, it becomes possible to give a general-purpose computer, for example, those functions, so that the troubleshooting apparatus according to the present invention can be readily constructed.