Man has yet to invent a useful machine or a vehicle that can function throughout its designed useful life without some kind of maintenance or repair being performed. In fact, the lack of reasonable routine maintenance or repair will shorten the useful life of any asset, particularly for complex systems such as aircraft and manufacturing systems.
When a useful asset suffers a casualty in the field, there are a number of isolation tests that may be applied to disambiguate and isolate the failure mode (“FM”), and then to narrow the repair options down to a finite group of corrective actions (“CA”) or, conversely, to establish that the group of CAs will not fix the FM. A CA may include either an isolation procedure (“I”) or a repair procedure (“R”). Each isolation procedure and each related repair procedure has an estimated execution time cost and a material cost necessary to complete that procedure.
With complex systems, such as aircraft, a casualty may have a number of potential FMs that could be the underlying cause of the casualty. Each FM may have a particular probability of being the cause of the casualty. As a non-limiting example, an inoperative radio casualty may be caused by three probable FMs: a lack of electric power, a faulty circuit board or a faulty squelch switch. Each FM may have an expected or a historical probability of causing that particular casualty. These probabilities may be determined over time by testing or by historical performance and may be stored in a database for later use.
Further, it will be appreciated by those of ordinary skill in the art that some isolation procedures and repair procedures may be capable of identifying or correcting multiple FMs simultaneously, whether the FMs are related or otherwise. Therefore, each repair procedure and isolation procedure has a probability of correcting or identifying the failure mode. Because one of a set of FMs may have caused a casualty, the set of FMs is referred to as an ambiguity group.
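As a non-limiting illustration, an ambiguity group and its corrective actions might be represented as in the following minimal sketch. The class names, field names, probabilities and costs are hypothetical and are not taken from any disclosed system. The sketch assumes that exactly one FM in the group caused the casualty, so that the probability of a corrective action addressing the casualty is the sum of the probabilities of the FMs it covers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    probability: float  # chance that this FM caused the casualty


@dataclass(frozen=True)
class CorrectiveAction:
    name: str
    kind: str             # "I" for isolation procedure, "R" for repair procedure
    time_cost: float      # estimated execution time, in hours
    material_cost: float  # estimated material cost, in dollars
    covers: frozenset     # names of the FMs this action identifies or corrects


# Hypothetical ambiguity group for the inoperative radio example.
ambiguity_group = [
    FailureMode("no_electric_power", 0.5),
    FailureMode("faulty_circuit_board", 0.3),
    FailureMode("faulty_squelch_switch", 0.2),
]

# Hypothetical repair that corrects two FMs at once, for instance because
# the squelch switch is assumed to be mounted on the circuit board.
replace_board = CorrectiveAction(
    name="replace_circuit_board",
    kind="R",
    time_cost=1.0,
    material_cost=400.0,
    covers=frozenset({"faulty_circuit_board", "faulty_squelch_switch"}),
)


def success_probability(action, group):
    """Probability that the action addresses the FM that actually occurred."""
    return sum(fm.probability for fm in group if fm.name in action.covers)


p_success = success_probability(replace_board, ambiguity_group)
```

Under these assumed numbers, the board replacement addresses the casualty with a probability equal to the combined probability of the two FMs it covers.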
The traditional way of handling ambiguous failures in the field has been to order parts and tools for carrying out the FM isolation tests until the FM is isolated. Once the failure mode is isolated, the parts and tools needed for the repair procedure are then ordered. This would be an optimum solution when all the possible repair actions are expensive (i.e. high parts cost, high execution time, and long wait times) compared to the time and costs of executing all the associated isolation tests. As such, the conventional maintenance philosophy required a field maintenance facility to place at most two requisitions, a primary requisition for all of the isolation tools and a secondary requisition for the specific repair parts and tools as determined by the isolation procedures.
Therefore, in a conventional method, one isolation tool requisition is made for all probable isolation procedures. This means that when parts with a short wait time are mixed with parts that have a long wait time in the same order, the parts with the longer wait time will delay the parts that could be available earlier. There may also be a return penalty for parts that are ordered and not used. As such, the maintenance planning problem and the parts order scheduling problem are coupled and cannot be solved independently as in a traditional setting. For example, if it is assumed that all parts and tools are always on hand, there will be no need for any parts requisition and one could determine the optimal sequence of operations by applying a cost function to the expected repair procedures. However, the same sequence of operations may not be optimal if different parts have different wait times.
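The delay effect of a mixed requisition can be illustrated with the following minimal sketch. It assumes that a requisition is delivered only when its slowest item is available. The part names, the wait times in days and the two-day threshold are hypothetical.

```python
def requisition_wait(wait_times):
    """A single requisition arrives when its slowest item is available."""
    return max(wait_times.values())


# Hypothetical wait times, in days, for items needed by a maintenance plan.
parts = {"fuse": 1, "test_set": 2, "circuit_board": 14}

# One combined order: the short-wait items are held up by the circuit board.
combined_wait = requisition_wait(parts)

# Splitting the order lets the short-wait items arrive, and work begin, sooner.
fast_items = {name: days for name, days in parts.items() if days <= 2}
first_arrival_if_split = requisition_wait(fast_items)
```

With these assumed numbers, the combined order arrives after 14 days, whereas a separate order for the short-wait items arrives after 2 days.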
However, the conventional way is not always the best in practice. Some repair procedures could be done without first performing an isolation procedure if the repair is inexpensive. Such may be the case even if the probability of success is small. For example, if there is a 1% probability that a failure is caused by a faulty $0.25 light bulb and a 99% probability that the failure is caused by a defective $100,000 line replaceable unit (“LRU”), then one would opt to replace the bulb without conducting the associated isolation procedure for the bulb, since the probability-weighted cost of changing the bulb is de minimis. The small probability-weighted cost of the bulb being the cause of the FM makes it cost effective to skip the isolation test for the bulb. Regardless of whether the casualty was actually fixed by the replacement bulb, a maintenance technician would have already determined whether the bulb was the problem and may have repaired the problem at the same time.
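The light bulb example can be worked through numerically as in the following minimal sketch. The 1% probability and the $0.25 bulb cost come from the example above. The $20 isolation test cost is a hypothetical figure, as are the function names, and execution time and wait time are ignored for simplicity.

```python
def cost_isolate_then_repair(p_fault, repair_cost, isolation_cost):
    """Always pay for the test; pay for the repair only if the test implicates the part."""
    return isolation_cost + p_fault * repair_cost


def cost_repair_directly(repair_cost):
    """Skip the test and always pay for the repair."""
    return repair_cost


p_bulb = 0.01          # probability that the bulb is the cause (from the example)
bulb_cost = 0.25       # dollars (from the example)
isolation_cost = 20.0  # dollars, hypothetical

direct = cost_repair_directly(bulb_cost)
tested = cost_isolate_then_repair(p_bulb, bulb_cost, isolation_cost)

# Direct replacement is cheaper whenever the isolation test costs more than
# repair_cost * (1 - p_fault), obtained by comparing the two expressions above.
break_even_isolation_cost = bulb_cost * (1 - p_bulb)
```

Under these assumptions, replacing the bulb directly is the cheaper option for any isolation test costing more than roughly a quarter of a dollar, which is consistent with the reasoning in the example.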
As such, finding an optimal solution to the maintenance problem involves knowledge of the future choices. The only way to know all possible future choices is to do an exhaustive search for all possible combinations of corrective actions and parts ordering schedules and apply the results to a cost function. However, the time required for executing an exhaustive search increases exponentially with the number of FMs in the ambiguity group and makes it impractical for field execution. Such a computationally intensive process may take days to provide a solution for a casualty that may involve only a modest handful of probable corrective actions.
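The growth of the search space can be illustrated with the following minimal sketch. It assumes, hypothetically, that each FM contributes one isolation procedure and one repair procedure, and that a candidate plan is any ordered, non-repeating sequence drawn from those procedures. Parts ordering schedules, which would enlarge the space further, are not counted.

```python
from math import perm  # available in Python 3.8 and later


def count_candidate_sequences(num_failure_modes):
    """Count the ordered, non-repeating sequences of corrective actions."""
    num_actions = 2 * num_failure_modes  # one "I" and one "R" per FM (assumed)
    return sum(perm(num_actions, k) for k in range(1, num_actions + 1))


counts = {n: count_candidate_sequences(n) for n in range(1, 6)}
```

Under these assumptions, the count rises from 4 sequences for one FM to 64 for two and 1,956 for three, and it reaches several million for five.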
One global search process that determines all possible sequences of repair procedures and isolation procedures is the global search system algorithm (“GSA”) utilized by Honeywell International, Inc. The GSA determines every combination and permutation of the repair procedures and isolation procedures related to a set of probable FMs in an ambiguity group. The GSA then examines each of the possible combinations and permutations to determine the aggregate direct costs, wait time costs and probabilities of repair success of each sequence of CAs. The GSA then computes the total cost for each sequence that does not end with an isolation procedure and selects the lowest cost sequence as the optimal maintenance plan sequence. However, the computing power and time required for such an exhaustive optimization are high.
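The general idea of an exhaustive search over sequences can be illustrated with the following minimal sketch. It is not the GSA itself. It is a simplified stand-in that considers repair procedures only, ignores isolation procedures and wait times, and assumes that each repair is carried out only if the casualty has not already been fixed by an earlier repair. All names, probabilities and costs are hypothetical.

```python
from itertools import permutations

# Hypothetical FM probabilities for an ambiguity group (they sum to 1).
FM_PROBS = {"power": 0.5, "board": 0.3, "switch": 0.2}

# Hypothetical repairs: name -> (cost in dollars, set of FMs corrected).
REPAIRS = {
    "restore_power": (60.0, {"power"}),
    "replace_board": (400.0, {"board"}),
    "replace_switch": (20.0, {"switch"}),
}


def expected_cost(order, repairs, fm_probs):
    """Expected cost of trying repairs in order until the casualty is fixed."""
    remaining = dict(fm_probs)  # FMs not yet corrected by an earlier repair
    total = 0.0
    for name in order:
        cost, covers = repairs[name]
        # The repair is paid for only if the casualty is still present.
        total += cost * sum(remaining.values())
        for fm in covers:
            remaining.pop(fm, None)
    return total


def exhaustive_best(repairs, fm_probs):
    """Evaluate every permutation of the repairs and return the cheapest order."""
    return min(
        permutations(repairs),
        key=lambda order: expected_cost(order, repairs, fm_probs),
    )


best_order = exhaustive_best(REPAIRS, FM_PROBS)
best_cost = expected_cost(best_order, REPAIRS, FM_PROBS)
```

Even in this reduced form, every permutation is evaluated, so the work grows factorially with the number of repairs. This is why a search that also covers isolation procedures and parts ordering quickly becomes impractical.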
Alternative sequencing systems may include the system disclosed by Felke in U.S. Pat. No. 6,748,304. Felke uses a two-step process whereby only repair procedures are sequenced in their order of probabilistic cost in a first step. After sequencing, a total cost is evaluated for each sequence. In a second step, a single isolation test chosen from a list of all the associated isolation tests is added to the evaluation, in a round robin fashion, until all of the isolation procedures have been worked into a sequence. The lowest cost maintenance plan is then determined from among the “repair procedure only” sequence and all of those sequences that also include only one isolation procedure. To achieve a near optimal result with the Felke system, further sequencing requires manual intervention to rerun the procedure with those procedures not selected to be the initial repair procedure in the preceding sequence.
Accordingly, it is desirable to have a maintenance plan system that quickly determines a lowest cost maintenance plan sequence for a given set of failure modes, even if the resulting plan is not strictly optimal, provided that it is close to optimal. In addition, it is desirable to obtain a near-optimal maintenance plan that has high accuracy relative to an optimal solution. The obtained maintenance plan should be no worse than the traditional plan of isolating first and then repairing. Furthermore, other desirable features and characteristics of the present invention will become apparent from the subsequent detailed description of the invention and the appended claims, taken in conjunction with the accompanying drawings and this background of the invention.