1. Field of the Invention
The present invention generally relates to the capture and transmission of software trace information over the Internet, and particularly to systems involving complex software control that requires monitoring of processing.
2. Background Description
Complex real-time software-based systems are often controlled by extensive programming logic that requires developers and technicians to ascertain detailed information about the internal workings of a system as it is performing its intended functions. The systems involved vary in nature. Telephone systems, airline reservation systems, banking transactions, air traffic control systems, communication centers, and the like all have large, complex real-time software programming. This programming has extensive internal and external messaging and processing demands placed upon it. Often these systems are placed in a network of other similar systems or support systems in which shared or distributed control interaction is expected to proceed smoothly and reliably among the systems.
Technical personnel require information about the internal operation of systems for varying reasons. The purpose may be simply to verify the expected behavior of a system, inspect the internal software structures, view hardware characteristics, gather usage patterns, acquire metric information, gather statistical information, or track faults.
Scrutiny and ongoing verification of valid and correct operation of complex software-based systems is essential, particularly when such systems are engaged in mission-critical operations. Telephone systems, for example, are expected to operate with near flawless performance. Often, the degree of performance of these systems and the level of confidence in them are best established through detailed observation techniques.
Sometimes these systems do not perform as expected and the source of the problem must be located and resolved in an efficient manner. To accomplish this, detailed internal information concerning software operations must be made available to technical personnel.
These exemplary systems typically are controlled by one or more microprocessors running a software program or series of programs depending on the nature of the system. Every microprocessor in a system may be susceptible to faults, either because of hardware failures or because of software processing errors. Software errors can have many causes such as, for example, illogical conditions, incorrect messages from other systems, invalid operation control parameters, user action, extreme demands requiring excessive processing time, inadequate logic design, etc. It is these types of faults about which designers, engineers, or technicians require detailed information in order to remedy the situation.
Typically, these systems must also control extensive hardware interfaces such as line and trunk interfaces in a telephone switch, banks of operator positions in an airline reservation center, disk drives, motors, network interfaces, etc. The variations of possible types of hardware are quite extensive. The hardware that provides information to the system, or is the recipient of commands from the system processors, may itself be faulty and cause undesirable or illogical impacts on the system. Often these faults are transitory and of little consequence; however, even what seems to be a minor fault can compound and induce inappropriate system performance or even serious service disruptions.
These systems often have a critical mission to perform and whenever issues or problems arise, engineers, designers, technicians, or operations personnel require a system and method to locate or isolate useful information to debug the situation.
It is common for systems to have a debug or tracer capability built into the operating system software, whereby technical personnel can request specific detailed information concerning the performance of the system. These tracer facilities vary in scope and flexibility. Often, only simple processor register and memory status can be obtained. In more sophisticated tracer and debug facilities, extensive software instruction sequences can be requested, including the software subsystem environment, automatic data snapshots, processor stack histories, message histories, etc.
In real-time systems, tracer output is typically written into the system memory for storing, directed to a printer, or written to a local storage device. If system memory is the storage medium, limits are quickly imposed on the amount of data that can be captured. If a local printer is used, the printer may not be able to dispose of tracer information fast enough, causing loss of information. If a local storage device is employed, the information is kept near the system, which creates accessibility and management problems.
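The memory limit described above can be illustrated with a minimal sketch. It is not taken from any particular tracer; the class name `InMemoryTraceBuffer`, its `capacity` parameter, and the record strings are hypothetical and chosen only for illustration. The sketch assumes a fixed-size buffer that overwrites its oldest entries once full, which is one common way such a limit shows up in practice.

```python
from collections import deque


class InMemoryTraceBuffer:
    """Hypothetical fixed-capacity trace store held in system memory."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest entry when full.
        self.records = deque(maxlen=capacity)
        self.dropped = 0  # number of records lost to overwriting

    def write(self, record):
        # Count the record that is about to be overwritten, then store the new one.
        if len(self.records) == self.records.maxlen:
            self.dropped += 1
        self.records.append(record)


# Illustrative use: five trace records written into room for three.
buf = InMemoryTraceBuffer(capacity=3)
for i in range(5):
    buf.write(f"trace-{i}")
```

In this sketch only the three most recent records survive and the two earliest are lost, which is the kind of truncation that makes memory-only capture unsuitable for traces spanning long periods.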
However, the debug or tracer capability in large complex systems can potentially produce enormous amounts of output, particularly if engineers must track system operations for an extended period of time, sometimes for days or weeks. The magnitude and nature of the output are generally related to such things as the type of system involved, the nature of the internal message structures, the architecture of the system, the type of fault being tracked, how much processing activity is demanded of the system, and what type of tracer information has been requested by technical personnel.
In situations such as a telephone system network, where there may be more than one system under analysis, management of debug and tracer logs is very demanding and can be overwhelming. It is in this type of situation, where the tracer output from one or more systems could be very large, that storage of and flexible access to the tracer information become problematic.
An engineer or technician must often correlate processing activities such as message flow and user activity from more than one system over a significant period of time. Proper data capturing methods and storage techniques are needed so that efficient record management is available. Maintaining extensive logs of tracer output in a manner suitable for easy retrieval by one or more technical personnel is crucial. To complicate matters, engineering personnel who are required to analyze complex tracer information are often not in one geographic location, or even in the vicinity of the system to be analyzed. Therefore, ease of access by more than one person in any location is desirable.
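As a rough illustration of the correlation task described above, the following sketch merges trace logs from two systems into a single time-ordered sequence. The record layout (`TraceRecord`), the function `merge_traces`, the system names, and the messages are all hypothetical. The sketch also assumes that each log is already sorted by time and that the systems' clocks are synchronized, which may not hold in a real network.

```python
import heapq
from typing import NamedTuple


class TraceRecord(NamedTuple):
    """Hypothetical trace record; the fields are assumptions for illustration."""
    timestamp: float  # seconds, assumed comparable across systems
    system: str       # identifier of the system that produced the record
    message: str      # trace text


def merge_traces(*logs):
    """Merge per-system logs (each already time-ordered) into one timeline."""
    return list(heapq.merge(*logs, key=lambda record: record.timestamp))


# Illustrative logs from two cooperating systems.
switch_a = [
    TraceRecord(1.0, "switch-a", "call setup sent"),
    TraceRecord(4.0, "switch-a", "timeout waiting for answer"),
]
switch_b = [
    TraceRecord(2.5, "switch-b", "call setup received"),
    TraceRecord(3.0, "switch-b", "trunk fault detected"),
]

timeline = merge_traces(switch_a, switch_b)
for record in timeline:
    print(f"{record.timestamp:6.1f}  {record.system:9s}  {record.message}")
```

Interleaving the records this way places the fault reported by one system between the request and the timeout seen by the other, which is the kind of cross-system view the analysis requires.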
Providing a suitable system and method to initiate, collect, and store large amounts of tracer or debug information from multiple targeted systems, in a way that allows technical personnel to retrieve the information for analysis from any location, remains an ongoing problem.