The present invention relates generally to an error check and correction (ECC) system and method for memory systems associated with industrial controller applications. In particular, the present invention provides ECC in conjunction with standard memory devices which do not normally support ECC.
Memory integrity is a critical factor that distinguishes industrial control systems from general purpose computer systems. Memory errors, which affect memory integrity, are a significant concern in control system applications because they may affect an operation being controlled. For example, control programs and input/output (I/O) decisions are typically stored in RAM and precisely direct physical operations of the system. If an input bit were to suddenly change due to a memory error, the control program may react to the changed input by turning on or off a key output in response thereto. Depending on the nature of the output change, undesirable consequences may occur. Likewise, if a control program bit were to change unpredictably, the industrial controller may execute a random and/or unpredictable control sequence; this again may lead to undesirable control results. Thus, for robust control systems design, memory error detecting systems are generally necessary to ensure memory integrity.
In general, industrial controllers (e.g., Programmable Logic Controllers (PLCs) and Small Logic Controllers (SLCs)) provide parity and/or error check and correction (ECC) systems to help ensure reliability of memory systems which control industrial processes. Parity bits allow for detection of inadvertent changes in stored data; a single parity bit detects any odd number of changed bits. Parity may be provided as an extra bit of storage per byte of data written to memory, for example. Thus, for a controller employing eight-bit memory devices, nine bits of storage are required for each memory address.
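By way of illustration only, the per-byte parity scheme described above may be sketched as follows. The function names are hypothetical, and even parity is an assumption; the description does not state which parity sense a given controller uses.

```python
def parity_bit(byte: int) -> int:
    """Even-parity bit for an 8-bit value: 1 when the byte has an odd number of 1s."""
    return bin(byte & 0xFF).count("1") & 1


def store_with_parity(byte: int) -> int:
    """Pack the byte and its parity bit into the 9-bit word written to memory."""
    return (parity_bit(byte) << 8) | (byte & 0xFF)


def parity_ok(word9: int) -> bool:
    """A stored word is consistent when its nine bits hold an even number of 1s."""
    return bin(word9 & 0x1FF).count("1") % 2 == 0


word = store_with_parity(0b10110010)
print(parity_ok(word))                # intact word
print(parity_ok(word ^ 0b000000100))  # one flipped bit
print(parity_ok(word ^ 0b000000110))  # two flipped bits
```

The intact word passes the check and the single flipped bit is detected, whereas the two flipped bits cancel and pass unnoticed. Parity also cannot indicate which bit changed, which is one reason ECC codes are employed where correction is desired.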
In an ECC based system, codes (e.g., multiple bits) are computed and stored in conjunction with desired data. If an error is detected when memory is read, correcting algorithms are applied to the faulty data in conjunction with stored ECC codes, and in some cases, data may be restored. One such class of ECC algorithms utilizes "Hamming codes", which are employed to detect and correct errors that may have occurred.
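As a minimal sketch of how a Hamming code locates and corrects a single faulty bit, the following example uses the small textbook Hamming(7,4) code. This code size is chosen only for brevity and is an assumption; it is not the code size of any particular memory system, and the function names are hypothetical.

```python
def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit codeword (positions 1..7, parity at 1, 2, 4)."""
    data_bits = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0 is unused so that list indices match bit positions
    code[3], code[5], code[6], code[7] = data_bits
    for p in (1, 2, 4):
        # Parity bit p covers every position whose index has bit p set.
        for pos in range(1, 8):
            if pos != p and (pos & p):
                code[p] ^= code[pos]
    return sum(code[pos] << (pos - 1) for pos in range(1, 8))


def hamming74_decode(word: int):
    """Return (nibble, error_position); error_position is 0 when no error is seen."""
    syndrome = 0
    for pos in range(1, 8):
        if (word >> (pos - 1)) & 1:
            syndrome ^= pos  # XOR of the positions of all set bits
    if syndrome:
        word ^= 1 << (syndrome - 1)  # the syndrome is the position of the bad bit
    nibble = sum(((word >> (pos - 1)) & 1) << i
                 for i, pos in enumerate((3, 5, 6, 7)))
    return nibble, syndrome


codeword = hamming74_encode(0b1011)
damaged = codeword ^ (1 << 4)  # flip the bit at position 5
print(hamming74_decode(damaged))
```

For a valid codeword the syndrome is zero; after a single bit flip the syndrome equals the position of the flipped bit, so the decoder can restore the original data without knowing in advance where the fault occurred.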
Traditionally, static random access memory (SRAM) systems have been employed by industrial controllers due in part to ease of parity implementation and because 9-bit and 18-bit devices were readily available. Industrial controllers, however, would benefit greatly if commercial memory devices, such as synchronous dynamic random access memory (SDRAM) devices, could be utilized. This benefit is due in part to the higher densities and lower costs of SDRAM as compared to conventional SRAM systems. Unfortunately, SDRAMs do not support parity due to cost pressures related to commercial PC markets. Additionally, SDRAMs do not readily support ECC due to the synchronous nature of the devices. In particular, synchronous memory devices which have been started on a sequential stream of accesses do not lend themselves to stopping and correcting errors on the fly.
Consequently, there is a strong need in the art for a system and/or method for employing SDRAM technology in conjunction with industrial control systems. Moreover, there is a strong need for an ECC system which operates with SDRAMs and/or other memory systems to alleviate the aforementioned problems associated with conventional systems and/or methods.
The present invention provides a system and method for applying ECC to SDRAM and/or other memory systems employed in industrial control applications. An ECC interface system provides memory integrity by detecting and alerting an external processor of errors associated with a commercially available SDRAM system while seamlessly enabling the processor to communicate and correct the errors. The unique architecture of the present invention utilizes the ECC interface in conjunction with a methodology for error correction to provide a low cost and high performance memory system as compared to conventional ECC systems.
In particular, the ECC interface of the present invention, when data is written to an SDRAM array configured for a 32 bit data bus for example, interfaces to at least one additional SDRAM to store ECC codes. The ECC codes (7 bits) are generated on writes for 32 bits of data to the memory array. The ECC codes are checked on reads by the ECC interface, and single bit errors are corrected as they are read from the memory array before being communicated to the processor. However, a single bit error is not corrected immediately within the SDRAM memory array itself when the corrected data is communicated to the processing system. Instead, a single bit error flag is provided to the processor and a faulty data address is captured or latched. Additionally, if earlier errors have been detected, a second status bit may be set indicating multiple locations in the memory array containing errors. If multiple bit errors occur, which are not correctable within a single address location, a third status bit is set, which may be configured to initiate an immediate shutdown of the industrial control system.
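The read and write behavior described above may be modeled, by way of a hedged sketch, as follows. The description specifies 7 ECC bits per 32 data bits but not the code construction; an extended Hamming arrangement (6 check bits plus one overall parity bit) is assumed here because it fits those sizes. The class, attribute, and function names are hypothetical, Python dictionaries stand in for the two SDRAMs, and keeping the first latched address when a second location fails is likewise an assumption.

```python
# The 32 data bits occupy the non-power-of-two positions 3, 5, 6, 7, 9, ... 38.
DATA_POS = [p for p in range(1, 39) if p & (p - 1)]


def _parity(x: int) -> int:
    return bin(x).count("1") & 1


def _hamming6(data: int) -> int:
    """XOR of the codeword positions of all set data bits (6 check bits)."""
    h = 0
    for i, pos in enumerate(DATA_POS):
        if (data >> i) & 1:
            h ^= pos
    return h


def ecc7(data: int) -> int:
    """7-bit code: 6 Hamming check bits plus an overall parity bit (bit 6)."""
    h = _hamming6(data)
    return h | ((_parity(data) ^ _parity(h)) << 6)


def check_and_correct(data: int, ecc: int):
    """Return (data, status) where status is "ok", "single", or "multi"."""
    stored_h, stored_p = ecc & 0x3F, (ecc >> 6) & 1
    syndrome = _hamming6(data) ^ stored_h
    parity_fail = _parity(data) ^ _parity(stored_h) ^ stored_p
    if syndrome == 0 and not parity_fail:
        return data, "ok"
    if not parity_fail:
        return data, "multi"  # even number of flips: detected, not correctable
    if syndrome in DATA_POS:
        return data ^ (1 << DATA_POS.index(syndrome)), "single"
    if syndrome & (syndrome - 1) == 0:
        return data, "single"  # a stored check bit flipped; the data is intact
    return data, "multi"  # syndrome points outside the codeword


class EccInterface:
    def __init__(self):
        self.data_mem = {}  # stands in for the 32-bit SDRAM data array
        self.ecc_mem = {}   # stands in for the additional SDRAM holding ECC codes
        self.single_bit_error = False    # first status bit
        self.multiple_locations = False  # second status bit
        self.multi_bit_error = False     # third status bit
        self.error_address = None        # latched faulty address

    def write(self, addr: int, data: int) -> None:
        data &= 0xFFFFFFFF
        self.data_mem[addr] = data
        self.ecc_mem[addr] = ecc7(data)

    def read(self, addr: int) -> int:
        data, status = check_and_correct(self.data_mem[addr], self.ecc_mem[addr])
        if status == "single":
            if self.single_bit_error and addr != self.error_address:
                self.multiple_locations = True
            else:
                self.single_bit_error = True
                self.error_address = addr
        elif status == "multi":
            self.multi_bit_error = True
        return data  # corrected on the fly; the stored word is left as it was
```

In this model, a read of a location holding a single flipped bit returns the corrected word, sets the single bit error flag, and latches the address, while the stored word remains faulty until the processor later re-writes it.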
A routine initiated by the processor is employed to periodically test the status of the single bit error flag. If a single bit error is detected, data is read from the latched error address and corrected data is re-written by the processor during background operations. The processor may then re-read the location to determine whether the problem was a hard (e.g., stuck bit) or soft (e.g., noise induced) error. If the error persists after the re-write, a hard error may be determined; otherwise, a soft error may be logged.
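The background routine described above may be sketched as follows. The `FakeEccMemory` class is a hypothetical stand-in for the ECC interface and memory array, included only so the example is self-contained; it models a hard fault as a bit stuck at zero, and all names are assumptions rather than part of any actual interface.

```python
class FakeEccMemory:
    """Hypothetical model: reads return corrected data and raise a flag on mismatch."""

    def __init__(self):
        self.cells = {}      # what the array physically holds
        self.expected = {}   # what the ECC code would reconstruct
        self.stuck_low = {}  # addr -> mask of bits that cannot hold a 1 (hard fault)
        self.single_bit_error = False
        self.error_address = None

    def write(self, addr, value):
        self.expected[addr] = value
        self.cells[addr] = value & ~self.stuck_low.get(addr, 0)

    def read(self, addr):
        if self.cells[addr] != self.expected[addr]:
            self.single_bit_error = True
            self.error_address = addr
        return self.expected[addr]

    def clear_error(self):
        self.single_bit_error = False
        self.error_address = None


def scrub_once(mem, error_log):
    """Poll the flag; if set, re-write corrected data and classify the error."""
    if not mem.single_bit_error:
        return None
    addr = mem.error_address
    corrected = mem.read(addr)   # the interface supplies corrected data
    mem.clear_error()
    mem.write(addr, corrected)   # re-write the corrected word to the array
    mem.read(addr)               # re-read to see whether the error persists
    kind = "hard" if mem.single_bit_error else "soft"
    mem.clear_error()
    error_log.append((addr, kind))
    return kind


mem, log = FakeEccMemory(), []
mem.write(4, 0xFF)
mem.cells[4] ^= 0x01  # inject a noise-induced flip
mem.read(4)
print(scrub_once(mem, log))
```

Because the routine runs only when the processor polls the flag, the re-write occurs outside the time-critical read path, and the accumulated log could be used to estimate an error rate for the array.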
The present invention also provides a read-modify-write (RMW) operation for detecting byte (8 bits) or word (16 bits) writes and updating the SDRAM accordingly. This is accomplished by reading 32 bits from the SDRAM array, modifying the 32 bits with byte or word data intended for the SDRAM array, and re-writing the modified 32 bits back to memory with a new ECC code for the modified data.
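A minimal sketch of the read-modify-write sequence follows. The function names are hypothetical, dictionaries stand in for the memories, byte lanes are assumed to be little-endian, 16-bit writes are assumed to be aligned, and `placeholder_ecc` is a stand-in for whatever 7-bit code generator the interface uses (it is not a real ECC).

```python
MASK32 = 0xFFFFFFFF


def merge_partial(old_word: int, value: int, byte_offset: int, size: int) -> int:
    """Overlay a 1-byte or 2-byte value onto a 32-bit word at the given byte lane."""
    if size not in (1, 2) or byte_offset < 0 or byte_offset + size > 4:
        raise ValueError("partial write must be 1 or 2 bytes inside one 32-bit word")
    if size == 2 and byte_offset % 2:
        raise ValueError("16-bit writes are assumed to be aligned")
    shift = 8 * byte_offset
    lane_mask = ((1 << (8 * size)) - 1) << shift
    return (old_word & ~lane_mask & MASK32) | ((value << shift) & lane_mask)


def read_modify_write(data_mem, ecc_mem, addr, value, byte_offset, size, compute_ecc):
    """Read 32 bits, merge the partial data, and write back with a fresh ECC code."""
    old_word = data_mem[addr]                               # read
    new_word = merge_partial(old_word, value, byte_offset, size)  # modify
    data_mem[addr] = new_word                               # write data
    ecc_mem[addr] = compute_ecc(new_word)                   # write new ECC code
    return new_word


def placeholder_ecc(word: int) -> int:
    """Stand-in only: a 7-bit value derived from the word, NOT an actual ECC."""
    return bin(word & MASK32).count("1") & 0x7F


data_mem = {0: 0x11223344}
ecc_mem = {0: placeholder_ecc(0x11223344)}
print(hex(read_modify_write(data_mem, ecc_mem, 0, 0xAB, 1, 1, placeholder_ecc)))
```

The full-word read is needed because the ECC code covers all 32 bits; a byte or word written alone would leave the stored code inconsistent with the data it protects.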
The combined methodology and system approach as described above provides many benefits over conventional systems. For example, single bit errors are corrected by the ECC interface when read from the memory array by the processor. Overall system performance is increased since the processor is not required to wait while actual memory locations are corrected; memory locations are instead corrected at a later time, during lower priority operations of the processing system. Secondly, the need for complex external memory correction circuits is mitigated by capturing faulty memory addresses and allowing the processor to correct the faulty memory location. Additionally, processor monitoring of error status bits enables the system to determine hard versus soft errors during the correction process and to determine, if so desired, an error rate for a particular memory array. If multiple bit errors are detected as described above, a fatal flag may be set and the system may be safely shut down if desired.
In accordance with an aspect of the present invention, an error check and correction (ECC) system is provided. An ECC interface stores ECC codes in a first memory system and stores data in a second memory system. The ECC interface corrects errors in the data received from the second memory system utilizing the ECC codes received from the first memory system. The ECC interface asserts at least one error flag upon detecting errors in the data. A processor monitors the error flag and corrects the data stored in the second memory system.
In accordance with another aspect of the present invention, an error check and correction (ECC) system employing standard synchronous dynamic random access memory (SDRAM) is provided. The system includes a means for storing ECC codes in a first SDRAM and a means for storing data in a second SDRAM. The system also includes a means for correcting errors in the data received from the second SDRAM utilizing the ECC codes received from the first SDRAM. The system provides a means for asserting at least one error flag upon detecting errors in the data and a means for monitoring the error flag and correcting the data stored in the second SDRAM.
In accordance with yet another aspect of the present invention, a methodology for error check and correction (ECC) is provided. The methodology includes the steps of storing ECC codes in a first synchronous dynamic random access memory (SDRAM); storing data in a second SDRAM; correcting errors in the data received from the second SDRAM utilizing the ECC codes received from the first SDRAM; asserting at least one error flag upon detecting errors in the data; and monitoring the error flag and correcting the data stored in the second SDRAM.
In accordance with still yet another aspect of the present invention, an industrial controller is provided. The industrial controller provides an ECC interface for storing ECC codes in at least one synchronous dynamic random access memory (SDRAM) system and storing data in at least one other SDRAM system. The ECC interface corrects errors in the data received from the at least one other SDRAM system utilizing the ECC codes received from the at least one SDRAM system. The ECC interface asserts at least one error flag upon detecting errors in the data. A processor monitors the error flag and corrects the data stored in the at least one other SDRAM system.
In accordance with still further yet another aspect of the present invention, an error check and correction (ECC) system in an industrial controller is provided. The system includes an ECC interface for storing ECC codes in a first synchronous dynamic random access memory (SDRAM) system and storing data in a second SDRAM system. The ECC interface corrects errors in the data received from the second SDRAM system utilizing the ECC codes received from the first SDRAM system. The ECC interface asserts a first error flag upon detecting errors in the data, and the ECC interface asserts a second error flag upon detecting errors associated with multiple memory locations. The ECC interface asserts a third error flag upon detecting a multiple bit error. A processor monitors the first error flag and corrects the data stored in the second SDRAM system, and the processor faults the industrial controller if the third error flag is asserted.