1. Field of the Invention
The present invention generally relates to a computer memory interface and architecture and, more particularly, to a simple yet powerful and flexible memory interface and architecture for a memory card which can be used in different system configurations, with different array types, and different memory sizes. The invention contemplates a "smart" memory card having improved performance and function including fast access without sacrifice of reliability enhancement functions and a full range of direct and partial store operations in a manner transparent to the system.
2. Description of the Prior Art
The basic architecture of a typical computer system includes a central processor unit (CPU) connected by a bus to a random access memory (RAM). Devices other than the central processor, such as input/output (I/O) devices, access the RAM for the purpose of reading and writing data. Early computer systems required these devices to raise an interrupt level to request the CPU to allow access to the RAM. This caused delays in processing time by the computer system, and therefore direct memory access (DMA) controllers were developed to allow access to RAM without interrupting the CPU.
These computer systems, whether micros, minis or mainframes, are generally characterized by a modular construction wherein the RAM is composed of a plurality of memory printed circuit cards which plug into the bus. These memory cards typically comprise an array of integrated circuit (IC) chips and little else, all the control being exercised off the card by the CPU and/or DMA controller. Such cards for the purposes of this disclosure may be referred to as "non-intelligent" memory cards. The subject invention belongs to a class of memory cards which are referred to herein as "intelligent" memory cards; that is, these memory cards include various data flow and control logic which perform functions that are performed off the card in non-intelligent memory cards.
In order to put the invention in context, reference is first made to FIG. 1A which shows in high level block diagram form the prior art non-intelligent memory card and processor interface. In this system, the processor 10 communicates via a parity bus 11 to a memory controller 12 that includes error correction code (ECC) functions. The controller in turn communicates via an external memory bus 13 with a plurality of memory cards, only one of which is shown at reference numeral 14. Each of these memory cards includes drivers and receivers 15, which provide buffering between the external memory bus 13 and the internal memory bus 17. The internal memory bus is used to address the arrays 16 of random access memory (RAM) chips on the card.
The architecture of an intelligent memory card is shown by way of contrast in FIG. 1B, wherein like reference numerals indicate the same or corresponding circuits. It will be noted that the controller 12 and the external memory bus 13 have been eliminated. The functions of the controller have been incorporated into the buffer logic 18. As will be described in more detail, incorporation of these functions into the on-card circuits provides a significant enhancement to machine performance and is the basis for characterizing the memory card as an intelligent or a "smart" memory card.
The subject invention is specifically an improvement on the memory architecture and processor interface used in the IBM 9370 series computers. These memory cards belong to the class of intelligent memory cards. FIG. 2 is a high level block diagram of the IBM 9370 series of smart memory cards. These cards comprise a memory array 21 composed of eighty 512K × 2 arrays of dynamic random access memory (DRAM) chips configured in two banks of forty. This particular memory card has an 8-byte wide internal memory data bus 27 and 28 which interconnects the on-card logic 23 to the two memory array banks 21 and two 2-byte wide external memory buses 24 and 25. Uni-directional control buses 22a, 22b and 22c respectively connect array control redrive logic 26a, 26b and 26c to the memory arrays. These buses supply the array addresses, array selects (RAS and CAS), read/write control, and data input/output controls. A central controller 29 is divided into two parts, an array control communicating with the array control redrive logic 26a, 26b and 26c and a data flow control communicating with bidirectional data flow logic 30 and 31. The bidirectional data flow logic 30 and 31 each include two 4-byte ECC halves.
The use of a "smart" memory card eliminates the need for an extra bus between the memory arrays and the point of use of the array data, thus eliminating the associated bus delay on this performance critical path. Normally, the bus for I/O and cache is a parity bus such that the ECC logic does not have to be replicated on each cache and I/O interface. A controller with ECC logic connects this parity bus to one or more memory cards without ECC on them. However, logic on the memory cards would still have to buffer data between the array and off-card memory bus. By including the ECC in the memory card buffer logic, the memory card is permitted to sit directly on the parity bus, thus permitting faster memory access. In a synchronous environment, this provides at least one clock savings on each fetch or store transfer by eliminating one or more latch states.
Just as on-card logic for ECC enhances performance by eliminating some bus delay, self-contained logic for operations like extended ECC, soft fail scrubbing, and read-modify-write (RMW) enhances performance by reducing cycle times for these operations. Cycle times are reduced since there is no need for multiple off-card bus crossings during these operations. The automatic nature of these operations eliminates system control overhead and the associated time that would be required. One example is extended ECC (XECC). The system only initiates one or a series of array fetches. If extended ECC needs to be performed, then the memory card holds the BUSY line active while XECC occurs internally. This consists of an inversion of the initially fetched data containing the detectable but uncorrectable (by ECC alone) errors, a store back to the same memory location, another data fetch to that location and a subsequent inversion of that data before processing by the ECC logic. If the XECC operation is successful, the corrected data is restored to that memory location. If unsuccessful, the initial data with the detectable error is restored. The successful XECC operations are then followed by any remaining subsequent fetch transfer(s) to the system bus. Bus crossing delays for these three internal operations would increase the net cycle time for the operation if they were done externally.
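The inversion sequence described above can be illustrated with a minimal sketch. It assumes a stuck-at fault model, which this section does not itself state, and it omits the final ECC pass entirely. The class FaultyWord, the function xecc_inversion_pass, the 8-bit word width and the chosen fault positions are all hypothetical names and values used only for illustration.

```python
class FaultyWord:
    """One memory word in which some bit positions are stuck (hypothetical model)."""

    def __init__(self, width, stuck):
        self.mask = (1 << width) - 1
        self.stuck = dict(stuck)  # bit position -> stuck level (0 or 1)
        self.value = 0

    def store(self, word):
        word &= self.mask
        for pos, level in self.stuck.items():
            # A stuck cell ignores the written bit and keeps its own level.
            word = (word & ~(1 << pos)) | (level << pos)
        self.value = word

    def fetch(self):
        return self.value


def xecc_inversion_pass(cell):
    """Invert the fetched data, store it back, refetch, and invert again.

    Returns the word that would next be handed to the ECC logic
    (the ECC step itself is not modeled here).
    """
    first = cell.fetch()
    cell.store(~first & cell.mask)
    second = cell.fetch()
    return ~second & cell.mask


intended = 0xA5                      # data the system originally stored
cell = FaultyWord(8, {0: 0, 7: 0})   # assumed faults: bits 0 and 7 stuck at 0
cell.store(intended)
damaged = cell.fetch()               # two bits now differ from the intended data
candidate = xecc_inversion_pass(cell)
```

In this model both stuck bits disagreed with the intended data, so the second inversion restores them. A stuck bit that happened to agree with the intended data would instead come back flipped, which is one reason the result is still passed through the ECC logic before it is accepted.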
Another example is read-modify-write (RMW). A read-modify-write operation is a partial store, partial in that one or more of the 4-byte ECC words will have one or more, but not all, of its bytes overwritten. To ensure that the ECC word(s) have correct data and correct associated check bits, the data stored in memory that is to be partially overwritten must first be fetched from memory, referred to as a prefetch, and run through the ECC to correct any single bit errors. Multiple bit errors could invoke the XECC operation. The system sends a store command with associated field length and starting address, and the memory card determines if a direct write or RMW is required. If a RMW is required, then for certain cases, the card stores all system transfers in a buffer while initiating a prefetch to memory. Thus, the system store and card prefetch run concurrently. Once the prefetch occurs and memory errors are corrected, the card overwrites the appropriate prefetched bytes and stores the result back to memory. Again, operation cycles and system logic overhead are minimized.
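The choice between a direct write and a RMW, and the subsequent byte merge, might be sketched as follows. The decision rule shown (a store needs RMW whenever its start or end does not fall on a 4-byte ECC word boundary) is an assumption inferred from the definition of a partial store; the card's actual logic is not given in this section. The names needs_rmw and merge_store are hypothetical.

```python
ECC_WORD_BYTES = 4  # the text describes 4-byte ECC words


def needs_rmw(start, length):
    """Return True if the store covers some ECC word only partially (assumed rule)."""
    end = start + length
    return start % ECC_WORD_BYTES != 0 or end % ECC_WORD_BYTES != 0


def merge_store(prefetched, base, start, data):
    """Overwrite the stored bytes inside prefetched data that ECC has already corrected.

    prefetched: bytes fetched from memory, beginning at address `base`
    start:      address of the first byte the system is storing
    data:       the bytes supplied by the system
    """
    merged = bytearray(prefetched)
    offset = start - base
    merged[offset:offset + len(data)] = data
    return bytes(merged)
```

For example, needs_rmw(0, 8) is false because both ECC words are fully overwritten, while needs_rmw(2, 2) is true and would lead to a prefetch followed by merge_store before the result is written back with fresh check bits.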
Card memory refresh and associated soft error scrubbing are also somewhat transparent to the system. For normal refresh operations, the system activates the refresh line and waits for BUSY to go away. The memory card handles the refresh address count and the array controls. If active and triggered, soft error scrubbing occurs at active refresh time by replacing the appropriate refreshes. The scrub operations are basically a zero byte RMW. Data is fetched from memory, on-card ECC corrects any single bit errors, and then data is restored to memory. If the single bit error was related to a soft fail, then the restore puts good data in place of the bad data bit that was the soft fail. Refresh is also accomplished during this operation.
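A scrub pass as a zero byte RMW (fetch, correct, restore) can be sketched in a few lines. Because this section does not describe the card's actual ECC code, the sketch substitutes a textbook Hamming(7,4) single-error-correcting code as a stand-in; the word size, the functions encode, syndrome, correct and scrub, and the list used as memory are all assumptions made for illustration.

```python
DATA_POSITIONS = (3, 5, 6, 7)    # Hamming(7,4): data bits sit at these positions
PARITY_POSITIONS = (1, 2, 4)     # parity bits sit at the power-of-two positions


def syndrome(code):
    """XOR of the positions of all set bits; zero means no detected error."""
    s = 0
    for pos in range(1, 8):
        if (code >> pos) & 1:
            s ^= pos
    return s


def encode(nibble):
    """Place 4 data bits and set parity bits so that the syndrome is zero."""
    code = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (nibble >> i) & 1:
            code |= 1 << pos
    s = syndrome(code)
    for p in PARITY_POSITIONS:
        if s & p:
            code |= 1 << p
    return code


def correct(code):
    """Flip the single bit named by a nonzero syndrome."""
    s = syndrome(code)
    return code ^ (1 << s) if s else code


def scrub(memory, addr):
    """Zero byte RMW: fetch, correct, and restore. Returns True if a bit was fixed."""
    fetched = memory[addr]
    fixed = correct(fetched)
    memory[addr] = fixed          # the restore also rewrites (refreshes) the location
    return fixed != fetched


memory = [encode(0b1011)]
memory[0] ^= 1 << 5               # simulate a soft fail in one bit
was_fixed = scrub(memory, 0)
```

After the scrub the stored word again equals encode(0b1011), and a second scrub of the same location finds nothing to fix.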
The IBM 9370 processor has evolved to incorporate high performance functions and this, in turn, required an improved design for the memory cards for the next generation system. The improved design and function are the basis for this application. The next generation design objective included the ability to obtain up to eight 8-byte data transfers from the memory card every 27 ns after the initial access. This 27 ns transfer rate had to include the time required for Error Correction Code (ECC), parity generation, and other reliability enhancement functions.
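The transfer-rate objective can be put in bandwidth terms with a short calculation. It assumes the objective means one 8-byte transfer every 27 ns once the initial access has completed; the initial access time is not given in this section and is left out. The variable names are illustrative only.

```python
BYTES_PER_TRANSFER = 8
TRANSFER_PERIOD_S = 27e-9        # 27 ns between successive transfers
TRANSFERS_PER_BURST = 8          # "up to eight" transfers per access

# Peak rate during the burst, excluding the initial access time.
peak_bytes_per_s = BYTES_PER_TRANSFER / TRANSFER_PERIOD_S
peak_mb_per_s = peak_bytes_per_s / 1e6

# Maximum amount of data delivered by one access.
burst_bytes = BYTES_PER_TRANSFER * TRANSFERS_PER_BURST
```

Under these assumptions the peak rate is roughly 296 million bytes per second, and a full burst delivers 64 bytes, with ECC and parity generation having to fit inside each 27 ns slot.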
In the process of designing the new memory card, it was another design objective to pack all the function required into a minimum number of integrated circuits (ICs). The problem here was to provide all the needed function and still stay within the chip/module I/O limitations set by cost constraints and the capabilities of current technology.
A third design objective was to define an interface and architecture that is both simple and highly flexible, allowing the memory card to be used with a broad range of hardware technologies and system uses.