Manufacturers in the United States are making extensive use of computerized machine control systems including programmable logic controllers (PLCs), personal computers, computer numerical control (CNC), and other systems utilizing digital input/output signals to control machinery and processes. These systems have permitted the application of complex logic to achieve automation and control objectives. The very power of the control systems themselves has led to a problem of considerable magnitude for American industry.
Although productive in their own right, these complex control systems have given rise to a significant problem in that the diagnosis of malfunction on machinery so controlled may be very difficult. This has created a manufacturing environment where expectations of uptime on expensive assets have dropped to very low values. It is not uncommon for manufacturers to get, and be satisfied with, asset utilization of only 50%. This low realization of asset utilization is prevalent even among the more capable manufacturers such as in the automotive industry.
The underlying problem with the extensive downtime of many machines controlled by control systems is that the logic in the control system is so extensive and so complex. Even experienced and capable electricians and technicians have great difficulty in finding the root cause of an operating malfunction. It is not uncommon for even a simple machine to have thousands of lines of code in its control system.
In a large and complex factory, such as an automotive transmission plant, there may be several hundred thousand lines of code in each of hundreds of individual control systems. Electricians and technicians cannot possibly become familiar with all of this program logic. When they are confronted with an operating problem on a given machine they may spend hours trying to understand enough of the control system logic to find the source of the problem. It has become common knowledge in industry that the largest single component of downtime on machinery is the diagnostic time. It is not unusual to experience hours of diagnostic time only to find that the actual repair takes just a few minutes.
Programs to educate electricians have not been fruitful because the body of information that they need to understand is so large and, even worse, is constantly changing. Engineering changes and new equipment keep the body of control system logic in any given plant in a state of flux. Furthermore, the electricians and technicians are frequently moved from one department to another, exacerbating the problem of education. It is estimated that the cost to the United States of downtime due to the diagnostic period on machinery operated by programmable controllers is more than one hundred billion dollars annually.
I. How PLCs Operate
Programmable logic controllers are devices that get input signals from the real world and use these inputs in the solution of Boolean logic. The results of the solution are then applied to one or more outputs that affect the behavior of some machinery or process. The inputs are most often in the form of discrete digital signals, but sometimes they may be analog in nature. The outputs are usually electric motors, electric valves, signals to other devices, or alarm devices.
The programmable logic controller is a special purpose computer that rapidly does this job of getting inputs and then turning outputs off or on according to some predetermined pattern of logic. The controllers often are capable of going through a cycle of getting inputs, solving logic, and updating outputs more than twenty times per second.
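The cycle described above may be sketched, by way of illustration only, as follows. The function names (`scan_once`, `clamp_logic`), the signal names, and the single rung of Boolean logic are hypothetical and are not taken from any particular controller; an actual controller performs this cycle in firmware rather than in a general purpose language.

```python
from typing import Callable, Dict

Signals = Dict[str, bool]


def clamp_logic(inputs: Signals) -> Signals:
    """Hypothetical single rung of Boolean logic: drive the clamp forward
    while a part is sensed and the clamp is not yet fully extended."""
    return {
        "clamp_forward": inputs["part_sensed"] and not inputs["clamp_extended"],
    }


def scan_once(
    read_inputs: Callable[[], Signals],
    solve_logic: Callable[[Signals], Signals],
    write_outputs: Callable[[Signals], None],
) -> Signals:
    """One controller cycle: get inputs, solve logic, update outputs."""
    inputs = read_inputs()          # snapshot of the field inputs
    outputs = solve_logic(inputs)   # evaluate the Boolean logic on the snapshot
    write_outputs(outputs)          # apply the results to the outputs
    return outputs


if __name__ == "__main__":
    written = []  # stands in for the physical output cards
    result = scan_once(
        lambda: {"part_sensed": True, "clamp_extended": False},
        clamp_logic,
        written.append,
    )
    print(result)
```

In this sketch the logic is solved against a snapshot of the inputs taken at the start of the cycle, which reflects the get inputs, solve logic, update outputs ordering described above.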
The most common use of the programmable logic controller is to take a given machine through some repetitive sequence of operations. An example is a simple drill press where the controller operates outputs to make the following sequence of operations:

    1. Load the part
    2. Clamp the part
    3. Start the drill spindle
    4. Feed the drill to the required length
    5. Retract the drill
    6. Stop the drill spindle
    7. Unclamp the part
    8. Remove the part

The above example is very simple and almost trivial, but it illustrates the sequential nature of most control applications.
A programmable controller operating the above drill press would need a series of inputs to tell it when to go to the next phase of the machine cycle. Once the part was loaded in step one for instance, the controller might get a signal from a proximity switch telling the controller that indeed there now is a new part in place ready to be clamped. Once the controller had the signal from the proximity switch it would proceed to turn on the output that operated the clamping device. Once the clamp had extended to the desired position to have the part securely clamped, a signal might be sent from another proximity switch to tell the controller to start the spindle.
A position transducer might send a continuous signal to the controller to tell it that the drill spindle is extending the drill. When the controller saw the correct value of the analog signal telling it that the drill had extended to the proper length it would turn on an output to start the retraction of the drill.
If the above drill press was operating in a full automatic cycle, making one part after another with no human intervention, the successful completion of operation number eight (part removal) would send a signal to the controller to initiate another cycle. This signal is usually called the cycle start signal.
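The sequential behavior described above may be sketched as a simple step sequencer. This is a minimal illustration under stated assumptions: the confirming signal names (for example `part_sensed` and `drill_at_depth`) are hypothetical, and `drill_at_depth` is assumed to be the result of comparing the analog position signal with the required length.

```python
from typing import Dict, List, Tuple

# (operation, input that confirms the operation is complete).
# The operation names follow the drill press sequence; the signal
# names are hypothetical and chosen only for this illustration.
SEQUENCE: List[Tuple[str, str]] = [
    ("Load the part", "part_sensed"),
    ("Clamp the part", "clamp_extended"),
    ("Start the drill spindle", "spindle_running"),
    ("Feed the drill to the required length", "drill_at_depth"),
    ("Retract the drill", "drill_retracted"),
    ("Stop the drill spindle", "spindle_stopped"),
    ("Unclamp the part", "clamp_retracted"),
    ("Remove the part", "part_removed"),
]


def advance(step: int, inputs: Dict[str, bool]) -> int:
    """Return the index of the next operation if the confirming input
    for the current operation is on; otherwise remain at the same step."""
    _, confirming_input = SEQUENCE[step]
    if inputs.get(confirming_input, False):
        # Wrapping to zero models the cycle start after part removal.
        return (step + 1) % len(SEQUENCE)
    return step


if __name__ == "__main__":
    step = 0
    step = advance(step, {"part_sensed": True})
    print(SEQUENCE[step][0])
```

If a confirming input never comes on, `advance` keeps returning the same step, which corresponds to a machine that has stopped sequencing.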
II. How Problems Occur
When one of the input devices in the above example fails, the machine will stop sequencing. It is then said that the machine has dropped out of automatic cycle. In a very large percentage of the automatic cycle interruptions an input has failed to come on at the desired time, or an output has failed to operate and has not sent its signal to the controller. The problem the diagnostician is faced with is that many machines have hundreds of inputs and outputs operated on by several thousand lines of program code.
For example, in the simple drill press described above, if the part sensing switch did not sense a part at the end of operation one, no signal would be sent to the controller and operation two would not begin. This could be caused by a faulty proximity switch or by a mechanical malfunction whereby no part was actually loaded.
Let us suppose that the part is actually present, but the proximity switch that senses the part in place has failed. The machine operator observes that the machine is no longer cycling and tries to get the machine into automatic cycle by manually removing the part and pressing the manual cycle start button. The machine will again fail to make a cycle so the operator informs the supervisor that the machine is not working. The supervisor then puts in an order to have an electrician look at the machine and repair it.
Sometime later an electrician arrives at the machine. Often the operator has already been assigned to another machine that is operating and is no longer around. The electrician finds the machine in an idle mode and maybe there is a part in the machine fixture, or maybe not. Some machines may have an error message screen that may help define the problem but most do not. Even the machines that have better error messaging may have several messages on screen by the time the electrician arrives.
The electrician may see a part in the clamping fixture but it is not possible to determine where the automatic cycle actually stopped. It could have been that the part had been clamped and the drill spindle did not start. It could have been that the spindle started but it could not advance. The usual case is that there are many possibilities, and in the typical mode of operation the person responsible for the diagnostics has no way of determining which is the most probable source of the problem and which is the least probable.
Programmable logic controllers do not keep any history of their operations so the electrician has little information to tell why the machine stopped. Even further, in most factories the electrician may not even be sure of the nature of the sequence of operation of the machine. The electrician probably does not know the address of the inputs and outputs so it is not possible to do a simple inspection of the machine to tell why it has stopped operating.
Even if the machine has an error messaging system, it often has hundreds of possible messages many of which do not give a clear description of the nature of the error. The diagnostic process is often trial and error. This explains the long diagnostic time actually experienced in almost all factories using programmable logic controllers.
Frequently the diagnostician begins the search for the problem by examining the current state of the logic in the controller. This approach has two significant shortcomings. First, the logic is usually very complicated, much more complicated than the simple example above, and second, the state of the inputs and outputs in the controller at the time of diagnostic examination is not the same as the state that existed at the exact time when the machine stopped. The latter point is exemplified by looking at the input and output table of the drill press example described above.
Most controllers keep the current status of the inputs and outputs in a table in the controller's memory, usually in a bit fashion. If an input is on, the bit assigned to that input is a one and if an input is off, the bit is zero. A similar table is kept for the outputs. Physically, the input cards for the controllers usually come in multiples of eight. An input table for the simple drill press above might look as follows:
Input Number   0  1  2  3  4  5  6  7
Status         0  0  0  0  0  0  0  0

Where the inputs are defined as follows:

    0. Cycle start
    1. Part sensed in fixture
    2. Clamp fully extended
    3. Spindle operating
    4. Clamp fully retracted
    5. Drill fully retracted
    6. Coolant flow sensor
    7. Unused

In addition to the above discrete inputs, there would be one analog input from the drill position transducer.
The output table for the simple example would look like:
Output Number  0  1  2  3  4  5  6  7
Status         0  0  0  0  0  0  0  0

Where the outputs are defined as follows:

    0. Signal to robot to place part
    1. Part clamp forward
    2. Part clamp retract
    3. Spindle motor
    4. Drill feed motor forward
    5. Drill feed motor retract
    6. Coolant valve open
    7. Signal to robot to remove part
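By way of illustration, a table of eight status bits of the kind shown above may be held in a single integer. The following is a minimal sketch which assumes that point number n is stored in bit n; actual bit ordering and addressing vary among controllers, and the helper names `pack` and `is_on` are hypothetical.

```python
def pack(status_row: str) -> int:
    """Pack a status row written with point 0 first (e.g. '00000000')
    into an integer in which point n occupies bit n (an assumption)."""
    value = 0
    for n, ch in enumerate(status_row):
        if ch not in "01":
            raise ValueError("status row must contain only 0 and 1")
        if ch == "1":
            value |= 1 << n
    return value


def is_on(table: int, n: int) -> bool:
    """Report whether point number n is on in the packed table."""
    return bool((table >> n) & 1)


if __name__ == "__main__":
    idle = pack("00000000")      # every point off, as in the tables above
    print(is_on(idle, 1))        # False: point one is off
```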
When the machine is operating normally the status of the input table at the exact time the part was successfully placed would look like:
Input Number   0  1  2  3  4  5  6  7
Status         0  1  0  0  0  1  0  0

In this part of the sequence, when input one came on, the controller would turn on output one to move the part clamp forward. When the part sensor proximity switch failed the input table would look like:
Input Number   0  1  2  3  4  5  6  7
Status         0  0  0  0  0  1  0  0

Note that in normal operation input one is on at this part of the cycle with a probability of one. If the electrician knew exactly where the machine had dropped out of automatic cycle, and if the electrician knew that at the same time input one was a zero and not a one, the cause of the problem would be clearly defined.
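The comparison described above may be illustrated as follows. This is a sketch only, assuming that both the expected input table for a given point in the cycle and the input table at the moment of the stop are available as status rows; as noted elsewhere herein, conventional controllers do not retain such history. The function name `differing_inputs` is hypothetical.

```python
from typing import List, Tuple

# Input definitions for the simple drill press example.
INPUT_NAMES = [
    "Cycle start",
    "Part sensed in fixture",
    "Clamp fully extended",
    "Spindle operating",
    "Clamp fully retracted",
    "Drill fully retracted",
    "Coolant flow sensor",
    "Unused",
]


def differing_inputs(expected: str, actual: str) -> List[Tuple[int, str, str, str]]:
    """Return (input number, name, expected bit, actual bit) for every
    input whose status differs between the two status rows."""
    if len(expected) != len(actual):
        raise ValueError("status rows must have the same length")
    return [
        (n, INPUT_NAMES[n], e, a)
        for n, (e, a) in enumerate(zip(expected, actual))
        if e != a
    ]


if __name__ == "__main__":
    # Normal table when the part is placed versus the table with the
    # part sensor proximity switch failed.
    for n, name, e, a in differing_inputs("01000100", "00000100"):
        print(f"Input {n} ({name}): expected {e}, found {a}")
```

With the two tables of the example, the only difference reported is input one, the part sensed in fixture signal.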
There have been various attempts to provide a diagnostic tool to help diagnosticians determine where an error or failure within a system has occurred. Most of these require that additional error detecting logic be added to existing controller code or that a user enter and establish relationships/rules between inputs and outputs. For example, U.S. Pat. No. 5,953,226 (hereby incorporated by reference) describes a diagnostic system that is implemented within the main program, i.e., ladder logic control program, itself. Specifically, the diagnostic system is a special control module that is added to the application program in the form of additional ladder logic within the main program or as a subroutine. Either method requires that the logic within the main program that is to be monitored be marked thereby allowing monitoring and allocating memory to save diagnostic results. Marking of ladder logic is performed by inserting an instruction mark that is comprised of two inputs and two outputs, which indicate the occurrence of certain events.
U.S. Pat. No. 5,949,676 (hereby incorporated by reference) describes a diagnostic system wherein a diagnostic engine detects timing patterns comprised of a trigger event, result event, and time duration between trigger and result, wherein a relationship between the trigger event and result event has been previously defined. Statistical analysis is performed on the timing patterns to produce a diagnostic rule that is updated per the timing patterns detected.
U.S. Pat. No. 5,870,693 (hereby incorporated by reference) describes a diagnostic system wherein a diagnostic computer is connected to a PLC running a ladder logic program. While running, the PLC generates at least three types of data: (1) change of time information CNT, whose value is incremented in response to a change of any one of the relays corresponding to a plurality of outputs monitored by the PLC; (2) output data OUT, which is comprised of 1 bit that represents the relevant output corresponding to each relay; and (3) channel change frequency CM, which is the number of times each output has changed and is incremented in accordance with the output data OUT. In use with the diagnostic computer, the PLC communicates the CNT, CM and OUT data for one complete cycle. Upon the machine being controlled reaching a halt, the user may press a key to download the CNT, CM and OUT data at the time of halt. The diagnostic computer then performs various analysis steps. First, it compares the stored CM data with the halt CM data to see if they are the same. If they are the same, it means that the PLC program has failed to move to the next step and then, by comparing the halt CM data with the stored CM data at the next data change time point, it can be determined which output has failed to change state. If the stored CM data is different from the halt CM data, an output failed to change state. To determine which output, the halt CM data is compared to the stored preceding change time CM data. See Cols. 9 and 10 for a detailed explanation of diagnosis operations.
While each of the above-described patents has been somewhat useful in determining the cause of an error or failure within a machine or process, they present the user with a complex system that is, in itself, difficult to learn and implement.
As such, there is a need for a diagnostic tool that will provide the diagnostician a clear view of a machine or process at precisely the time that a failure occurs as well as provide a clear description of the normal sequence of events for the current mode of operations so that an error may be easily detected. The diagnostic tool will preferably provide these features without requiring the addition of logic to an existing control program and without requiring the establishment of multiple relationships between inputs and outputs.