Maintenance, including the reliable troubleshooting of complex systems, is a common issue in various industries, especially under time and economic constraints. In the aircraft industry, for example, maintenance of an aircraft is of paramount importance to ensure the continued safe and efficient operation of the aircraft. Aircraft maintenance can occur in several different manners. For example, scheduled maintenance generally includes a number of specific tasks, inspections and repairs that are performed at predetermined intervals. These events are scheduled in advance and rarely result in aircraft schedule interruption. In contrast, unscheduled maintenance is performed as required to maintain the aircraft's allowable minimum airworthiness during intervals between scheduled maintenance. Unscheduled maintenance is usually performed while the aircraft is on the ground between flights. However, unscheduled maintenance may be performed during a scheduled maintenance check if a mechanic identifies a problem that was not anticipated. Minimum ground time between flights is desirable to maximize airplane utilization and to meet the established flight schedules. Therefore, the time allocated to unscheduled maintenance is often limited to the relatively short time that the aircraft is required to be at the gate in order to permit passengers to unload and load, to refuel and to otherwise service the aircraft, all of which may take approximately 20 to 120 minutes on average depending on the aircraft type and route to be flown next. Clearly, if an unscheduled problem arises that cannot be addressed on the ground, the flight will be delayed or cancelled, or a replacement aircraft found. Also, many of the non-critical or deferrable problems are addressed at the end of the day and during the night when there is more time to do so.
As explained below, it is oftentimes difficult to complete the unscheduled maintenance in this timeframe, thereby leading to flight delays and/or cancellations. These flight delays and/or cancellations are extremely costly to an airline, both in terms of actual dollars and in terms of passenger perception. In this regard, an airline typically begins accruing costs related to a flight delay following the first five minutes of a delay, with substantial costs accruing if the flight must be cancelled. Moreover, as all air passengers are aware, airline dispatch reliability is a sensitive parameter that airlines often use to distinguish themselves from their competitors.
Notwithstanding the critical importance of performing unscheduled maintenance accurately and promptly, mechanics who perform the unscheduled maintenance on the flight line face a daunting challenge, given the complexity of an aircraft system. In this regard, in addition to the time pressures described above, these mechanics are generally required to troubleshoot the aircraft based upon a limited amount of information that has been provided by the flight, cabin or maintenance crew or by onboard computers, sensors, maintenance messages or the like. While troubleshooting any system based upon this limited information would be difficult, troubleshooting an aircraft, which is an extremely large and complex system composed of many interconnected subsystems, is particularly difficult. In this regard, each subsystem is also typically composed of many Line Replaceable Units (LRUs) that are designed to be individually replaced. An LRU may be mechanical, such as a valve or a pump; electrical, such as a switch or relay; or electronic, such as an autopilot or a flight management computer. Many LRUs are, in turn, interconnected within a particular system. As such, the symptoms described by flight deck effects or other observations may be explainable by the failure of more than one LRU. At that point, there is ambiguity about which LRU(s) have actually failed, and additional information is needed to disambiguate between the possibilities.
A mechanic must therefore troubleshoot the problem to one or more suspect LRUs, with the number of suspect LRUs preferably being minimized so that properly functioning LRUs are not replaced unnecessarily. A mechanic must then decide whether the suspect LRU(s) must be immediately repaired or replaced prior to further flight of the aircraft or whether the repair or replacement of such LRU(s) may be safely deferred until the completion of the day's flights for the aircraft in order to avoid further delay of the aircraft. In this regard, a minimum equipment list (MEL) is generally maintained for each model of aircraft. The MEL indicates which components must be functioning properly in order for the aircraft to be cleared for takeoff. As such, a mechanic generally determines if any of the suspect LRUs are on the MEL and, if so, must repair or replace each suspect LRU that is on the MEL. If a suspect LRU must be immediately replaced, the mechanic removes the LRU, obtains a replacement LRU and installs the replacement LRU. If the subsystem is capable of being tested while the aircraft is on the ground, the mechanic then generally tests the subsystem to ensure that the problem is corrected by the replacement LRU. Unfortunately, the more ambiguity there is among the suspect LRUs, the more difficult it is to single out the truly faulty LRUs and the more inclined a mechanic is to replace all suspect parts rather than continue troubleshooting to narrow the field of suspect LRUs.
Following departure of the aircraft, the LRUs that have been removed are generally tested to determine if the LRUs are defective and, if so, which component(s) of the LRU failed. These tests frequently determine that many of the LRUs that are replaced are actually functioning properly. However, a mechanic, in his/her haste to return an aircraft to service, may skip tests that are necessary to refine the troubleshooting from a handful of suspect LRUs to a specific one or two suspect LRUs since the time required for the tests may cause the upcoming flight to be delayed or cancelled. As will be apparent, however, the replacement of LRUs that are actually functioning properly increases the costs to maintain the aircraft, both in terms of the cost of the parts and the labor. Additionally, the replacement of LRUs that are functioning properly may cause an excessive number of LRUs to be maintained in inventory, thereby also increasing inventory costs associated with the maintenance of the aircraft. As such, it would be desirable to improve the diagnosis ability and knowledge of ground crew mechanics and their ability to make decisions about how to streamline the diagnostic decision making process, i.e., by having a good understanding of the most cost effective and informative actions to take under various different circumstances.
A mechanic may be notified of a problem with an aircraft either while the aircraft is still en route or once the aircraft has landed. If a mechanic is notified while the aircraft is en route, the mechanic is provided with a description of the problems and other observations or symptoms noted by the flight or cabin crew, i.e., flight deck effects, so that the mechanic can begin the troubleshooting process prior to the arrival of the aircraft at the gate, thereby somewhat reducing any delays associated with the repair. More commonly, however, a mechanic is notified once the aircraft arrives at the gate that a problem has been identified by the flight, cabin or maintenance crew and is provided with a list of any observations or symptoms noted by the crew. In some instances, the mechanic may be able to obtain additional information related to the problem from various onboard computers, sensors or the like.
In a few instances involving common or repeated problems, an experienced mechanic may be able to immediately identify the suspect LRU based only upon the problem and the accompanying symptoms. Normally, however, the mechanic must work through a fairly complicated troubleshooting procedure which attempts to identify the suspect LRU(s) based upon the problem and the accompanying symptoms and, in many instances, based upon the results of one or more additional tests that are performed in an attempt to isolate the suspect LRU.
Since the aircraft includes a large number of interconnected systems containing LRUs, and since the propagation of any fault through the system is equally complex, fault isolation manuals (FIMs) have been developed for a number of different aircraft models to guide a mechanic through the troubleshooting process. Similarly, aircraft maintenance manuals (AMMs) have been developed that guide a mechanic through troubleshooting processes. Unfortunately, these manuals are voluminous and oftentimes include a number of supplements or updates that must be cross-referenced in order to appropriately troubleshoot the aircraft. Thus, it is a daunting task to browse through the documents and locate the most relevant information in a timely manner. Further, these manuals are oftentimes maintained in a central repository or technical library at the airport and are not immediately available to a mechanic who is repairing an aircraft at the gate. As such, a mechanic must sometimes copy the pages of the manual that seem to be most relevant and proceed to the gate to repair the aircraft. If, however, the troubleshooting process proceeds in a manner not anticipated by the mechanic, the mechanic may have to return to the library to reference or copy additional pages of the manuals, thereby further slowing the troubleshooting process. As such, portable electronic maintenance aids have been developed in order to maintain a portable library of maintenance documents for mechanics. However, it would be desirable to provide still additional information to the mechanic at the gate to facilitate the troubleshooting process.
Even with the appropriate manuals to guide the troubleshooting process, a mechanic may have difficulty troubleshooting a problem and may need to contact a representative of the aircraft manufacturer for more assistance or information regarding the latest updates, thereby further delaying the troubleshooting process. In addition, experienced mechanics oftentimes know tricks of the trade or other unwritten rules which greatly expedite the troubleshooting process, especially in instances in which the faults are multiple, intermittent, repeating or cross-system in nature or in instances in which problems with one LRU are actually attributable to another faulty LRU that is connected, directly or indirectly, to the LRU experiencing the problem. As such, it would be desirable to provide all mechanics with the knowledge and information including the tricks-of-the-trade and the other unwritten rules that have been developed over the years by experienced mechanics to streamline the troubleshooting process.
Once the mechanic identifies one or more suspect LRUs, the mechanic determines if the LRUs are to be repaired or replaced. If the aircraft has completed its operations for the day, the mechanic typically determines if the LRU can be repaired or should be replaced. If the LRU is to be replaced, the mechanic determines if replacement LRUs are available in inventory. This determination generally involves the mechanic's review of a listing of the LRUs in inventory. If the LRUs that are to be replaced are in inventory, the mechanic obtains the necessary LRUs and replaces the suspect LRUs with LRUs from inventory. If, however, the aircraft has additional flights scheduled for later in the day, the mechanic generally determines if the suspect LRU(s) are necessary for continued operation of the aircraft by consulting a minimum equipment list (MEL). If the MEL indicates that the LRU is necessary for continued operation of the aircraft, the mechanic continues, as described above, by determining if the LRU can be repaired and, if not, by determining if the LRU is available in inventory and, if so, obtaining a replacement LRU and swapping the replacement LRU for the suspect LRU. However, if the MEL indicates that the LRU is not necessary for continued operation of the aircraft, the mechanic may defer replacement of the suspect LRU until completion of the operations of the aircraft for the day in order to prevent further delay or cancellation of the aircraft's remaining flights.
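The repair, replace or defer decision described above can be summarized in a short sketch. This is only an illustration of the narrative, not a procedure taken from any manual: the class name, field names and return strings are hypothetical, and the "await_part" outcome for an LRU that is neither repairable nor in inventory is an assumption, since the text does not say what happens in that case.

```python
from dataclasses import dataclass


@dataclass
class SuspectLRU:
    """Hypothetical record for one suspect LRU (all field names are illustrative)."""
    name: str
    on_mel: bool        # True if the MEL requires this LRU for continued operation
    repairable: bool    # True if the LRU can be repaired rather than replaced
    in_inventory: bool  # True if a replacement LRU is listed in inventory


def disposition(lru: SuspectLRU, more_flights_today: bool) -> str:
    """Return the action for a suspect LRU, following the narrative above."""
    # Deferral applies only when flights remain today and the MEL
    # does not require the LRU for continued operation.
    if more_flights_today and not lru.on_mel:
        return "defer"
    # Otherwise the mechanic first checks whether the LRU can be repaired.
    if lru.repairable:
        return "repair"
    # If not, a replacement is swapped in when one is available in inventory.
    if lru.in_inventory:
        return "replace"
    # Assumption: the source does not cover the no-stock case.
    return "await_part"
```

For example, under these assumptions a non-MEL relay on an aircraft with remaining flights would be deferred, while the same relay at the end of the day would be repaired or replaced.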
As will be apparent, aircraft maintenance is of critical importance for a number of reasons. Moreover, the performance of aircraft maintenance, especially unscheduled maintenance, in a reliable and timely fashion is desirable in order to minimize any delays or cancellations due to maintenance work. Additionally, it is desirable to fully troubleshoot a problem such that a minimum number of suspect LRUs is replaced in order to reduce the maintenance costs and to permit inventory to be more closely controlled. As described above, maintenance operations, especially unscheduled maintenance operations, include a very complicated troubleshooting process which oftentimes requires a mechanic to reference one or more manuals that outline the process and, even if performed correctly, may require an aircraft to be on the ground in repair for an undesirably long period of time.
In order to address at least some of these shortcomings with conventional maintenance operations, an improved diagnostic system and method for identifying the faulty components of an aircraft or any of a wide variety of other complex systems was developed as described by U.S. Pat. No. 6,574,537 which issued Jun. 3, 2003 to Oscar Kipersztok et al., the contents of which are incorporated in their entirety herein. The improved diagnostic system includes an interface for receiving inputs relating to observed symptoms indicative of one or more failed components. The diagnostic system also includes a processing element for correlating the inputs relating to the observed symptoms with at least one suspect component that is capable of causing the observed symptoms upon failure. The diagnostic system further includes a display for presenting information to the user relating to the suspect component(s). The processing element preferably correlates the input relating to the observed symptoms with the suspect component(s) in accordance with a diagnostic model, such as a Bayesian network model, that is constructed using systemic information relating to the components and their input-output relationships, experiential information relating to direct relationships between component failures and the observed symptoms, and factual information relating to component reliability. As such, the processing element includes a number of the tricks of the trade and unwritten rules known by the most experienced mechanics to permit the diagnostic system to perform the troubleshooting process in the most efficient manner.
A diagnostic model, such as a Bayesian network, includes a plurality of nodes interconnected by a number of arcs in a manner defined by the systemic information and the experiential information. The model may include nodes representing the components, the observed symptoms, and tests to be performed to further isolate the problem. Each node representing an LRU (usually a root, or parentless, node) has at least two mutually exclusive and collectively exhaustive states (e.g., normal or failed), and the diagnostic model assigns a probability to each state of the node based upon estimates that may be derived from component reliability data (factual information). If such data are not available, a subject matter expert (SME) may be able to make an educated guess for such an estimate based on his or her systemic or experiential understanding of the system. Other nodes in the model represent observations of symptoms or outcomes of tests, which are needed in the process of singling out the failed components causing the problem. Still other, intermediate nodes may represent functional parameters or quantities that describe the system and can help tie the causes (LRU nodes) to the effects (observation nodes) in the model. Every node that is not a root node also has states, and each such node is assigned a probability distribution over its states conditioned on every combination of the states of its parent nodes. These probabilities can be obtained from data (factual information) or inferred by an expert from systemic or experiential knowledge. Using such a diagnostic model, the processing element can determine the probability of failure for each suspect component, such as each LRU, that may have caused the observed symptoms.
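As a rough illustration of how such a model yields a probability of failure for each suspect, the following sketch builds a toy network with two root LRU nodes and one observation node and computes posteriors by brute-force enumeration. All names and numbers are hypothetical, and exhaustive enumeration is used only because the example is tiny; it is not presented as the inference method of the referenced system.

```python
from itertools import product

# Hypothetical prior probability that each root (LRU) node is in the "failed" state.
PRIOR_FAIL = {"pump": 0.02, "valve": 0.01}

# Hypothetical conditional probability table for the observation node:
# P(symptom present | pump_failed, valve_failed), one entry per
# combination of parent states, keyed in the order of PRIOR_FAIL.
P_SYMPTOM = {
    (False, False): 0.01,
    (True, False): 0.90,
    (False, True): 0.70,
    (True, True): 0.99,
}


def posterior_failure(symptom_observed: bool) -> dict:
    """Return P(LRU failed | observation) for each LRU by enumerating all states."""
    names = list(PRIOR_FAIL)
    total = 0.0
    failed_mass = {name: 0.0 for name in names}
    for states in product([False, True], repeat=len(names)):
        # Joint prior of this combination of LRU states (roots are independent).
        p = 1.0
        for name, failed in zip(names, states):
            p *= PRIOR_FAIL[name] if failed else 1.0 - PRIOR_FAIL[name]
        # Multiply by the likelihood of the actual observation.
        likelihood = P_SYMPTOM[states]
        p *= likelihood if symptom_observed else 1.0 - likelihood
        total += p
        for name, failed in zip(names, states):
            if failed:
                failed_mass[name] += p
    # Normalize so that each value is a conditional probability.
    return {name: failed_mass[name] / total for name in names}
```

With these made-up numbers, observing the symptom raises the failure probability of both LRUs well above their priors and ranks the pump above the valve, which is the kind of ordering a mechanic would be shown.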
According to U.S. Pat. No. 6,574,537, the processing element correlates the input relating to the observed symptoms with a plurality of components from one or more subsystems that are capable of causing the observed symptoms upon failure. The processing element prioritizes the suspect components based upon the relative likelihood of causing the failure, i.e., that the respective suspect components are causing the observed symptoms to occur. The processing element is also capable of identifying test(s) to be performed in order to refine the prioritization of the suspect components. The display can present a prioritized listing of the plurality of suspect components and a recommendation of the test(s) to be performed in order to refine the prioritization of the plurality of suspect components.
The processing element is preferably capable of prioritizing the tests based upon at least one predetermined criterion. The processing element is also preferably capable of reprioritizing the tests based upon revised criteria. For example, the tests may be prioritized based upon the time required to perform the tests or the amount of information obtained from the tests. Regardless of the manner in which the plurality of tests are prioritized, the processing element is capable of receiving and analyzing data from a test and reprioritizing the suspect components based upon the outcome of the test.
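One way to picture prioritizing tests by time or by information is sketched below. It assumes a single-fault belief over suspects and scores each test by its expected reduction in entropy; this scoring rule, the test names, the durations and the probabilities are all assumptions made for illustration and are not stated in the referenced patent.

```python
import math


def entropy(dist: dict) -> float:
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def update(belief: dict, p_pos: dict, positive: bool) -> dict:
    """Reprioritize suspects with Bayes' rule given one test outcome."""
    weights = {h: belief[h] * (p_pos[h] if positive else 1.0 - p_pos[h])
               for h in belief}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


def expected_information_gain(belief: dict, p_pos: dict) -> float:
    """Expected entropy reduction from running a test with a yes/no outcome."""
    prob_positive = sum(belief[h] * p_pos[h] for h in belief)
    gain = entropy(belief)
    for positive, prob in ((True, prob_positive), (False, 1.0 - prob_positive)):
        if prob > 0:
            gain -= prob * entropy(update(belief, p_pos, positive))
    return gain


def rank_tests(belief: dict, tests: dict, criterion: str) -> list:
    """Order test names by the chosen criterion: 'time' or 'information'."""
    if criterion == "time":
        return sorted(tests, key=lambda t: tests[t]["minutes"])
    return sorted(tests,
                  key=lambda t: expected_information_gain(belief, tests[t]["p_pos"]),
                  reverse=True)


# Hypothetical belief over which suspect failed, and two hypothetical tests.
BELIEF = {"pump": 0.5, "valve": 0.3, "relay": 0.2}
TESTS = {
    "pressure_check": {"minutes": 15,
                       "p_pos": {"pump": 0.95, "valve": 0.10, "relay": 0.10}},
    "continuity_check": {"minutes": 5,
                         "p_pos": {"pump": 0.50, "valve": 0.50, "relay": 0.60}},
}
```

Under these assumptions, `rank_tests(BELIEF, TESTS, "time")` puts the quick continuity check first, while `rank_tests(BELIEF, TESTS, "information")` puts the pressure check first because its outcome separates the pump from the other suspects. Calling `update` with the observed outcome then yields the reprioritized suspects.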
According to U.S. Pat. No. 6,574,537, the processing element may also be capable of identifying additional information relating to at least one suspect component. Typically, the additional information relates either to component availability, the time to repair or replace the suspect component, or the cost to repair or replace the suspect component. The display also preferably presents the additional information for review by the mechanic. This diagnostic system may therefore include at least one database for storing the additional information. The database can include schematic images of the suspect component that can be displayed during replacement or repair of the suspect component. While displaying the schematic images, the display may also indicate the relative likelihood of component failure. In addition, the database can include a minimum equipment list. The display will then be capable of indicating the respective suspect components that are on the minimum equipment list. With this information, a mechanic can quickly determine if the suspect component must be repaired or replaced or if the repairs can be deferred. The database may include an inventory of components. The display will then be capable of indicating the respective suspect components that are in inventory and therefore available to the mechanic. Similarly, the database can include text descriptions of the suspect components which can also be presented upon the display for review by the mechanic.
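The linkage between the ranked suspects and the stored additional information can be pictured as a simple lookup. The sketch below is a hypothetical illustration only: the function name, the row fields and the in-memory set and dictionary standing in for the database are assumptions.

```python
def annotate_suspects(posteriors: dict, mel: set, inventory: dict) -> list:
    """Return suspects ordered by failure probability, flagged with MEL and stock status."""
    rows = []
    # Most likely suspect first, as in the prioritized listing described above.
    for name, prob in sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True):
        rows.append({
            "component": name,
            "p_failed": prob,
            "on_mel": name in mel,                    # must be fixed before further flight
            "in_stock": inventory.get(name, 0) > 0,   # replacement available to the mechanic
        })
    return rows
```

A display could then render each row so that the mechanic sees at a glance which suspects cannot be deferred and which have replacements on hand.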
In correlating the observed symptoms with one or more suspect components, the processing element may identify suspect components of a plurality of different subsystems, i.e., suspect subsystems. As such, the display may be capable of listing the suspect subsystems and the interface may be capable of receiving input indicative of the respective suspect subsystem to be further analyzed. Based upon this input, the processing element can prioritize suspect components of the selected subsystem and the display can present the prioritized listing of the suspect components of the selected subsystem as described above.
Following completion or deferral of the repair or replacement of the suspect component, the diagnostic system permits the mechanic to record whatever remedial action was taken. In this regard, the interface is capable of receiving data relating to remedial actions undertaken with respect to at least one suspect component. If maintenance actions were deferred, this data may be used to notify others of the need to perform the deferred maintenance. This data can be included within the airplane's maintenance log.
According to U.S. Pat. No. 6,574,537, an improved diagnostic system and method are therefore provided to troubleshoot a complex system, such as an aircraft. The diagnostic system and method permit the suspect components to be reliably identified and to be ranked based upon the probability that a respective suspect component caused the problem. In addition, one or more tests that would be useful to refine the prioritization of the suspect components can be identified. As such, a mechanic can quickly identify components that must be repaired or replaced. By linking to additional information, the mechanic can also quickly determine if the suspect components are on the minimum equipment list and are in inventory, and can review further material such as schematics and textual descriptions of the suspect components. The diagnostic system and method of U.S. Pat. No. 6,574,537 should therefore permit troubleshooting to occur more quickly and accurately such that the overall time for repair is reduced, which correlates, in the aircraft industry, to a corresponding reduction in the number of flights that are delayed or cancelled due to unscheduled maintenance and a reduction in unnecessary component replacement. By reliably troubleshooting an aircraft, the diagnostic system and method may also ensure that a greater percentage of the components that are replaced are actually faulty, thereby decreasing maintenance costs and improving inventory control relative to conventional troubleshooting processes that oftentimes replace components that are still operational. In short, aircraft maintenance and troubleshooting involve complex knowledge and experience as well as complex processes including complex analysis and decision making. Robust and rapid diagnosis provides a clear economic advantage and thus increases an airline's competitiveness. A diagnostic system as described by U.S. Pat. No. 6,574,537 not only provides rapid access to the relevant information, but also provides the assistance that facilitates decision making for a proper diagnosis.
The diagnostic model or network utilized by the diagnostic system and method of U.S. Pat. No. 6,574,537 relies upon systemic information and experiential information that are provided by experts in the field. As noted above, the systemic information is typically related to the system components and the input-output relations of the system subcomponents that are connected. With respect to the aircraft industry, for example, the systemic information is typically gathered through interviews with system engineers or the like for the aircraft manufacturer who have significant experience in the design and development of the aircraft and its attendant systems and in the relationships among the various subsystems. Likewise, the experiential information that defines the direct relationships between component failures and the observed symptoms is typically provided by experienced mechanics or engineers who have extensive experience troubleshooting a particular model of aircraft and have a wealth of information relating to the typical types of failures and the symptoms exhibited by an aircraft having each type of failure, including those particularly troubling faults that are multiple, intermittent, repeating or cross-system in nature. This type of knowledge also includes other types of observations, such as distinct sounds, smells and visual cues pointing to the presence of a problem, which can be obtained only through years of experience.
Unfortunately, gathering the systemic, experiential and other information that is desired to construct the diagnostic model has been largely a manual process, which can be inconsistent, extremely time-consuming and, in some instances, impractical. First of all, the experts who could provide the systemic, experiential or other information are not always available. Moreover, the information provided by the experts may be somewhat inconsistent. For example, various experts may have had somewhat different experiences with the complex system that is being modeled. Additionally, different experts may express the same or similar concepts in somewhat different terms, leading to slightly inconsistent information. Accordingly, while the diagnostic system and method of U.S. Pat. No. 6,574,537 are a significant advance in the art with respect to troubleshooting complex systems to identify one or more components that have failed, the construction of the diagnostic model or network utilized by the diagnostic system and method can be quite time-consuming and expensive and may require the involvement of a number of experts in the field. It would therefore be desirable to develop a diagnostic system, method and computer program product that streamline and optimize the information gathering and model building process, and thus allow consistent and efficient model construction without requiring as much involvement by experts in the field, while still maintaining the accuracy and integrity of the diagnostic model.