The present invention relates generally to telephone switching systems, and in particular, to a method and apparatus for analyzing the state of a telephone switching system immediately prior to a software upgrade to determine whether the software upgrade should be performed.
Modern telephone switching systems are large-scale, highly complex systems incorporating one or more switching elements cooperatively controlled and supervised by one or more computing means. One commercial exemplar of a modern telephone switching system is the 5ESS ELECTRONIC SWITCHING SYSTEM, from Lucent Technologies Inc., 600 Mountain Avenue, Murray Hill, N.J. 07974. The 5ESS electronic switch is a distributed switching system. Both the switching system capabilities and the control, supervision and administration capabilities are distributed. Each of the computing facilities associated with these distributed capabilities includes appropriate computer programs or software to achieve the desired operation of the switching elements and other components of the switching system.
Periodically the software or computer programs used to control the components of the switching system are replaced by different software. This replacement of computer software is referred to as an upgrade or retrofit. The process of retrofitting a telephone switching system is complex. The complexity stems from the number of distributed computing facilities; the amount of software code involved; and the fact that the switching system availability must not be completely compromised for a retrofit. A typical retrofit of a switch may take from 10 to 12 hours. Resources must be employed days in advance of the retrofit for preparation. And of course, additional resources are required during the retrofit.
Problems occurring during a retrofit are obviously undesirable. Unfortunately, problems do occur. Some of these problems are readily fixed. Other problems prevent successful upgrade of the switch. Regardless of the nature of the problem, early detection of the problem is desirable. Early detection allows for early correction or rescheduling to avoid wasted resources.
The 5ESS switching system produces a report data stream containing text messages regarding the current state and recent operations of the switching system. The report data stream includes messages that (1) describe the state of the switch hardware; (2) report automatic actions taken by the switch; (3) report operations entered by a switch operator; (4) report results of routine or scheduled diagnostics; and (5) indicate non-routine events, abnormal conditions, errors or alarms. The report data stream is typically supplied to a "read-only" printer or "ROP" via a serial port. Hence, the report data stream is often referred to as the "ROP" or "ROP report". The report data stream is voluminous, sometimes generating 4 to 5 megabytes of data for storage in a day. Therefore, the ROP is often stored on a computer to permit searching and review for problem solving. In addition, a telecommunications switch owner often desires to have all ROP output from its switches collected in a single location for review.
A successful software retrofit requires a switch to be in a certain state. For example, faulty hardware or an incomplete upgrade of hardware components may prevent a successful software retrofit. The state of the switch required for retrofit is not typically the same as the required state of the switch for normal operation. In particular, telecommunications switches typically have redundant or fault-tolerant components and subsystems that permit normal operation in spite of some faults. Therefore, an audit of the state of the switch, beyond the typical audits for normal operation, is required prior to a software retrofit. These audits traditionally are conducted manually and begin several days or sometimes weeks in advance of a scheduled retrofit.
The ROP, or its equivalent, is typically reviewed manually, or with the assistance of a computer, as a part of an audit prior to a software retrofit. However, given the voluminous nature of the ROP, especially when considering multiple switches and the long period of time for which auditing is required, this method of auditing a switch prior to a retrofit is error-prone and inefficient. Moreover, this method requires substantial subject matter expertise from the person manually reviewing the ROP.
In addition, no matter how much auditing is done in advance of the scheduled retrofit, problems may appear immediately prior to the retrofit that place the switch in an undesirable state, thereby preventing a successful retrofit. For example, a hardware component may fail within hours of the scheduled retrofit. Since the retrofit consumes a considerable amount of time and resources, it is desirable to monitor for problems that may prevent a retrofit as closely as possible, including immediately prior to a retrofit, that is, within hours of a retrofit. This requires real-time observation of the switch, including the ROP, which is not feasible by manual means.
Therefore, a need exists for a more efficient and reliable method and apparatus for auditing the state of a switch, including immediately prior to a software retrofit.
In accordance with the present invention, a method is provided for determining whether to proceed with a software upgrade on a telecommunications switch. First, a report stream from the switch is received. The report stream includes messages associated with a state of the switch. In particular, the messages relate to the state of the hardware components of the switch. The report stream is searched as it is received for predetermined messages. The predetermined messages found in the search form a set of identified messages. Each predetermined message has a numerical value associated therewith. As predetermined messages are found in the report stream, an accumulated value is calculated by totaling the numerical values for each identified message in the report stream. If the accumulated value exceeds a predetermined threshold, a user interface notifies a person that the accumulated value exceeds the predetermined threshold. This indicates that the software upgrade should not proceed based on the present state of the switch.
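The accumulation step of the method can be illustrated with a minimal Python sketch. It is not the claimed implementation: the message patterns, their numerical values, the threshold, and the function name `audit_stream` are all hypothetical, since the text above gives no concrete messages or values. Plain substring matching is likewise an assumption about how a predetermined message might be recognized.

```python
from typing import Iterable, List, Tuple

# Hypothetical predetermined messages and their numerical values.
# The source does not list actual message texts or weights.
PREDETERMINED = {
    "OUT OF SERVICE": 5,
    "DIAGNOSTIC FAILED": 3,
    "MINOR ALARM": 1,
}
THRESHOLD = 6  # hypothetical predetermined threshold


def audit_stream(
    stream: Iterable[str],
    weights: dict = PREDETERMINED,
    threshold: int = THRESHOLD,
) -> Tuple[int, List[Tuple[str, str]], bool]:
    """Search each line as it arrives and total the values of identified messages."""
    accumulated = 0
    identified: List[Tuple[str, str]] = []
    for line in stream:
        for pattern, value in weights.items():
            if pattern in line:  # assumed matching rule: substring search
                identified.append((pattern, line))
                accumulated += value
    # True means the accumulated value exceeds the threshold,
    # i.e. the upgrade should not proceed in the present state.
    return accumulated, identified, accumulated > threshold


# Made-up report lines, for illustration only.
sample = [
    "UNIT A OUT OF SERVICE",
    "ROUTINE AUDIT COMPLETED",
    "UNIT B DIAGNOSTIC FAILED",
]
total, found, exceeds = audit_stream(sample)
```

Because `audit_stream` accepts any iterable, the same loop works on a stored file or on lines read incrementally from a serial port, which matches the requirement that the stream be searched as it is received.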
Preferably, a date and time for each occurrence of an identified message is stored and each identified message is also stored. The identified messages are grouped in relation to subsystems of the switch. The user interface provides hierarchical views of data relating to the determination of whether the accumulated value associated with the identified messages exceeds the predetermined threshold. At a top level in the user interface, a designated area is colored a predetermined color to indicate whether the accumulated value exceeds the predetermined threshold. Selecting the designated area reveals a second designated area, which indicates the groups or subsystems for which identified messages were found. Selecting a particular subsystem reveals a third designated area that lists the identified messages and the value each message contributed to the accumulated value. Selecting an identified message causes the user interface to reveal another designated area showing the date and time for each occurrence of the identified message. Selecting a date and time stamp for an occurrence reveals yet another designated area that shows the text stream from the report stream from the switch that produced the identified message.
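One way to picture the data behind these hierarchical views is a nested structure keyed by subsystem, then by identified message, with each message holding its contributed value and its time-stamped occurrences. The following Python sketch assumes that layout. The function name `build_hierarchy`, the tuple format of the input, and the colors "red" and "green" are hypothetical; the text only says that a predetermined color is used.

```python
from datetime import datetime
from typing import Iterable, Tuple

# Each occurrence: (subsystem, message, value, timestamp, raw report text).
Occurrence = Tuple[str, str, int, datetime, str]


def build_hierarchy(occurrences: Iterable[Occurrence], threshold: int) -> dict:
    """Group identified messages by subsystem and message for drill-down display."""
    subsystems: dict = {}
    total = 0
    for subsystem, message, value, when, raw in occurrences:
        messages = subsystems.setdefault(subsystem, {})
        node = messages.setdefault(message, {"value": 0, "occurrences": []})
        node["value"] += value  # value this message contributed
        node["occurrences"].append((when, raw))  # time stamp and source text
        total += value
    return {
        # Top level: a single indicator. The colors are assumed, not from the source.
        "color": "red" if total > threshold else "green",
        "total": total,
        # Second level: subsystems; third: messages; fourth: occurrences.
        "subsystems": subsystems,
    }


# Made-up data, for illustration only.
view = build_hierarchy(
    [
        ("switching module", "OUT OF SERVICE", 5, datetime(2000, 1, 1, 8, 0), "UNIT A OUT OF SERVICE"),
        ("switching module", "OUT OF SERVICE", 5, datetime(2000, 1, 1, 9, 30), "UNIT A OUT OF SERVICE"),
        ("administrative module", "MINOR ALARM", 1, datetime(2000, 1, 1, 9, 45), "MINOR ALARM DISK 2"),
    ],
    threshold=6,
)
```

Each selection in the user interface then corresponds to descending one level in `view`: from the colored indicator to the subsystems, to the messages and their values, to the time stamps, and finally to the raw text.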
In accordance with another aspect of the invention, the report stream is received for an actual period of time. The actual period of time is compared to an expected period of time. If the actual period of time does not exceed the expected period of time, then a user interface is updated to reflect this determination. In particular, the predetermined expected period of time is selected to ensure a sufficient audit of the report stream occurred. In other words, if the actual period of time does not exceed the expected period of time, then a favorable indication may not be accurate, due to a lack of information.
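The interaction between the threshold check and the observation-period check can be sketched as follows. This is an illustrative reading, not the specified logic: the function name `retrofit_indication` and the three result strings are hypothetical, and the ordering of the checks reflects the statement above that only a favorable indication is called into question by a short observation period.

```python
from datetime import timedelta


def retrofit_indication(
    accumulated: int,
    threshold: int,
    actual: timedelta,
    expected: timedelta,
) -> str:
    """Combine the accumulated value with the length of the audit window."""
    if accumulated > threshold:
        # Enough adverse messages were seen, regardless of window length.
        return "do not proceed"
    if actual <= expected:
        # The actual period does not exceed the expected period, so a
        # favorable result may simply reflect a lack of information.
        return "inconclusive"
    return "proceed"


# Hypothetical values: a 24-hour expected period and a threshold of 6.
expected = timedelta(hours=24)
short_window = retrofit_indication(2, 6, timedelta(hours=3), expected)
long_window = retrofit_indication(2, 6, timedelta(hours=30), expected)
over_threshold = retrofit_indication(9, 6, timedelta(hours=3), expected)
```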
An apparatus in accordance with an aspect of the present invention includes a report receiver, a processor and a user interface. The report receiver receives a stream of messages from a telecommunications switch and produces a received stream of messages. Included within the received stream of messages are messages reflecting the state of the telecommunications switch. The processor is coupled to the report receiver. The processor stores the received stream of messages and searches the received stream of messages for predetermined messages to produce identified messages. Each identified message has a numerical value. The processor accumulates the numerical values for each identified message found in the report stream. The accumulated value is compared to a predetermined threshold. A user interface is coupled to the processor to reflect whether the accumulated value exceeds the predetermined threshold. Preferably, the processor compares the actual period of time that the report receiver receives the stream to an expected period of time. The user interface reflects whether the actual period of time exceeded the expected period of time. The expected period of time is a minimum measure of time for making a reliable decision from the report stream. The user interface provides hierarchical viewing of the identified messages as described above with respect to the method.