This invention relates generally to the field of synchronous dynamic random access memory subsystems and more particularly provides a command to activate and open multiple memory banks.
The endeavor for faster and faster computers has reached remarkable milestones since the inception of computers and their coming of age during the past sixty years. The beginning of the computer age was characterized by connecting vacuum tubes with large coaxial cables for wiring analog logic. If a new problem was to be solved, the cables were reconfigured. Today, coaxial cables have been replaced with high speed data buses; vacuum tubes have been replaced with high speed logic having transistors of new semiconductor materials and designs, all of which are limited only by the laws of physics. Initially, the slowest subsystem of computers was the processor subsystem. As processors became more efficient, the limiting function of the computer became the time required to transfer data to and from sources outside the computer.
To improve the performance of memory subsystems, memory was brought closer to the processor in the form of cache hierarchies. Cache hierarchies, which are limited-volume high-speed memories, were incorporated into the same integrated circuit as the processor. Thus, data would be available immediately to the processor, but the bulk of the data and operating programs was still stored in a larger memory within the computer, referred to as main memory. Efforts were directed toward accessing this main memory subsystem faster and more efficiently. New faster semiconductor materials were developed and used in the RAM (random access memory). More efficient circuits and methods of row and column addressing of the RAM were developed. Memory was connected to the processor and other I/O devices through more efficient buses and sophisticated bus command logic. Soon memory control logic became almost as complicated as the logic within the central processing unit having the processor and the cache hierarchy. Memory refresh circuits were developed to maintain the "freshness" and hence the accuracy of the data within memory. Compression/decompression engines were developed to efficiently rearrange stored data in memory banks.
Still other techniques to improve memory subsystem performance include overlapping and interleaving commands to the memory devices. Interleaving to route commands on different memory buses or different memory cards was improved by providing additional memory in the form of multiple devices and multiple memory banks within each device. Increasing the amount of data processed with each access to memory connected across a memory bus improved memory bus bandwidth.
Improvements in the speed of dynamic RAMs (DRAMs) have historically come from process and photolithography advances. More recent improvements in DRAM performance, however, have resulted from making changes to the base DRAM architecture that require little or no increase in die size. A bursting feature to allow multiple bits to be accessed from memory was developed. To implement the bursting feature, the starting address, the burst length and the burst type are defined to the DRAM so the internal address counter can properly generate the next memory location to be accessed. The burst type defines whether the address counter will provide sequential ascending page addresses or interleaved page addresses within the defined burst length. Common processing in microprocessors accesses DRAM data in a four-bit burst length in either a sequential or an interleaved fashion. Optimum system performance can be achieved when data is fed to the processor at a rate as close as possible to the system clock, but typically the processor runs at a higher clock rate. Data accessed from an L1 cache comes close to meeting this requirement. When the requested data is accessed directly from the DRAM rather than the L1 or L2 cache, however, the burst rate is significantly degraded. Assuming the burst is from a DRAM memory page that is already open, the achievable burst rate at 66 megahertz using sixty nanosecond DRAMs is 5/3/3/3 cycles for fast page mode DRAMs, which means that five clock cycles are required to access the first bit and three clock cycles are required to access each of the remaining three bits. The achievable burst rate is 5/2/2/2 for extended data output DRAMs. Clearly, there is a DRAM bandwidth bottleneck, and the bursting feature alleviates the bottleneck in this and other scenarios, such as graphics applications, which typically burst long streams of data. Once the first page address is accessed, the DRAM itself provides the address of the next memory location to be accessed. This address prediction eliminates the delay associated with detecting and latching an address provided externally to the DRAM.
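The address generation and the burst rates described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the wrap-within-an-aligned-block ordering is an assumption based on common SDRAM practice rather than something stated in this text.

```python
def burst_addresses(start, burst_length=4, burst_type="sequential"):
    """Return the addresses an internal burst counter would visit.

    Assumption: the burst stays inside the aligned block of
    `burst_length` locations that contains `start`.
    """
    if burst_length <= 0 or burst_length & (burst_length - 1):
        raise ValueError("burst_length must be a power of two")
    mask = burst_length - 1
    base = start & ~mask      # first address of the aligned block
    offset = start & mask     # position of the start address in the block
    if burst_type == "sequential":
        # Ascending order, wrapping at the end of the block.
        return [base | ((offset + i) & mask) for i in range(burst_length)]
    if burst_type == "interleaved":
        # Interleaved order: offset XOR the burst counter value.
        return [base | (offset ^ i) for i in range(burst_length)]
    raise ValueError("burst_type must be 'sequential' or 'interleaved'")


def burst_cycles(pattern):
    """Total clock cycles for a burst rate such as (5, 3, 3, 3)."""
    return sum(pattern)
```

Under these assumptions, a four-location burst takes 14 clock cycles at 5/3/3/3 and 11 clock cycles at 5/2/2/2, which is the difference between the fast page mode and extended data output figures quoted above.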
Synchronous DRAMs (SDRAMs) and burst extended data output (BEDO) memory chips are two memory chip architectures that implement the bursting feature. A state of the art memory chip is the double data rate II synchronous DRAM (DDR-II), which comprises high speed CMOS memory devices that can be organized as, for example, 256 megabytes, 512 megabytes, 1 gigabyte or more, into varying rows, columns, and banks, and which has its own memory clock. The DDR-II is fast; its memory control clock may run at speeds up to 400 megahertz or higher, but it also has a set burst length of four words, which means that for bursts longer than four words a new column address strobe (CAS) command must issue every other clock cycle. Typically designers issue commands every other clock cycle on some slower memory subsystems so the voltages on the address and control lines are allowed to drop before a point-to-point chip select (CS) signal establishes the command. Under these circumstances, as illustrated in the timing diagram of FIG. 1, a seamless burst cannot be achieved between multiple memory banks. For simplicity, only the clock signal, CLK, of the memory control bus is shown as the uppermost signal; the complementary CLK signal, although present, is not illustrated. Immediately below is the command of the action to be performed in memory. At time T0, a signal to activate bank B0 and an address of row m are given, but according to the DDR-II specification an empty or dead cycle must occur at time T1 to allow the voltages on the control lines to settle. Thus at time T2, an address for column 1020 and the command to read data from bank B0 are issued. There is then another undesirable dead cycle at time T3. At time T4 a command is issued to activate the next memory bank. The pattern repeats itself so that there is always a dead cycle between an activate command and a read command. Given a read latency of four clock cycles, reading data from row m, columns 1020-1023 of bank B0 has completed at time T8, but because of the required dead cycle at time T3 to open up another column, there exists a latency when reading bursts across boundaries of different banks of the same row.
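The command sequence described for FIG. 1 can be modeled with a short Python sketch. The constant and function names are hypothetical; the read latency and the one dead cycle after each command come from the text, while the assumption that a four-word burst occupies two clock cycles follows from the double data rate transfer and is not stated explicitly.

```python
READ_LATENCY = 4     # clocks from a READ command to the first data (from the text)
COMMAND_SPACING = 2  # each command is followed by one dead cycle (from the text)
BURST_CLOCKS = 2     # assumption: four words at two words per clock


def conventional_schedule(num_banks):
    """Model the activate/read pattern: ACT, dead, READ, dead, ACT, ..."""
    commands = {}  # clock -> command issued on that clock
    data = {}      # clock -> bank whose data is on the bus
    t = 0
    for bank in range(num_banks):
        commands[t] = f"ACT B{bank}"
        t += COMMAND_SPACING
        commands[t] = f"READ B{bank}"
        first = t + READ_LATENCY
        for clock in range(first, first + BURST_CLOCKS):
            data[clock] = f"B{bank}"
        t += COMMAND_SPACING
    return commands, data


def idle_data_clocks(data):
    """Clocks between the first and last data beat with no data on the bus."""
    busy = sorted(data)
    return [c for c in range(busy[0], busy[-1] + 1) if c not in data]
```

For two banks this model places the read of bank B0 at T2 with data on T6 and T7, and the read of the next bank at T6 with data on T10 and T11, so the data bus sits idle on T8 and T9. That gap is the latency across the bank boundary described above.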
There is a need to eliminate latencies that currently exist when reading data bursts across memory bank boundaries of synchronous DRAM subsystems at high speeds without additional cost and additional hardware while still maintaining high bandwidth and high memory bus utilization.
These needs and others that will become apparent to one skilled in the art are satisfied by a method to access one or more memory banks in a synchronous dynamic random access memory system which comprises the steps of reading a single command to open one of a plurality of synchronous dynamic random access memory banks, opening one of the memory banks, and then providing an option to open more than one of the memory banks. A key feature of the invention is to provide the option to open one or more memory banks with a single command.
The method may further comprise the steps of determining that more than one of the memory banks is to be opened and then opening another of the memory banks. All of the banks to be opened may be opened at the same time; alternatively, all of the banks may be opened in a sequential, staggered manner. If the memory banks are to be opened sequentially, then the banks could be opened according to a deterministic time delay which could be either synchronous or asynchronous to the memory device clock. A nop command or a next chip deselect command may execute during that time delay.
The method may further comprise incrementing/decrementing a bank address before opening another memory bank. The method may also comprise decrementing a bank counter while opening another memory bank. The method, moreover, provides that a row may be incremented/decremented before opening another memory bank. The method further provides that the same or a different column may be accessed if multiple memory banks are opened.
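One way to picture the bank address, the bank counter and the staggered opening is the following Python sketch. The function name, its parameters, and the wrap-around of the bank address at the number of banks are all assumptions made for illustration; the text does not fix an encoding for the command.

```python
def bank_open_plan(start_bank, bank_count, num_banks=4, stagger=0, step=1):
    """Return a list of (clock_offset, bank) pairs for one open command.

    stagger=0 models opening all banks at the same time; a positive
    stagger models a deterministic, clock-synchronous delay between
    banks. step=+1 increments the bank address, step=-1 decrements it.
    """
    if not 1 <= bank_count <= num_banks:
        raise ValueError("bank_count must be between 1 and num_banks")
    plan = []
    bank = start_bank
    remaining = bank_count      # bank counter, decremented per opened bank
    offset = 0
    while remaining > 0:
        plan.append((offset, bank))
        remaining -= 1
        bank = (bank + step) % num_banks  # assumed wrap-around
        offset += stagger
    return plan
```

For example, `bank_open_plan(2, 3, stagger=2)` opens banks 2, 3 and 0 at clock offsets 0, 2 and 4, while `bank_open_plan(0, 1)` reduces to an ordinary single-bank open.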
The method may further comprise bursting data to/from the open memory banks. The burst data may be to/from the same row of the open memory banks. Similarly, the burst data may be to/from the same column of the open memory banks. Each subsequent bank can be opened during the step of bursting data to/from the open memory banks.
The invention is further envisioned as a method to open one or more data banks of a memory system, comprising the steps of reading a command to open one of a plurality of the memory banks; opening one of the plurality of memory banks; determining that more than one of said plurality of memory banks is to be opened and generating a single command to open more than one of the memory banks; incrementing a bank address; opening another of the memory banks; and bursting data to/from the same row of the open memory banks.
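The steps of this method can be walked through with a toy Python model. The class `ToyMemory` and its methods are hypothetical and ignore all timing; the sketch only shows a single activate command opening several banks on the same row, with the bank address incremented between banks, followed by bursts from the open banks.

```python
class ToyMemory:
    """Functional toy model of a multi-bank memory (no timing)."""

    def __init__(self, num_banks=4, rows=4, cols=8):
        # Each cell holds its own (bank, row, column) so reads are easy to check.
        self.cells = [[[(b, r, c) for c in range(cols)]
                       for r in range(rows)]
                      for b in range(num_banks)]
        self.open_rows = {}  # bank -> currently open row

    def activate(self, bank, row, count=1):
        """Single command: open `count` banks on `row`, starting at `bank`."""
        if not 1 <= count <= len(self.cells):
            raise ValueError("count must be between 1 and the number of banks")
        opened = []
        for _ in range(count):
            self.open_rows[bank] = row
            opened.append(bank)
            bank = (bank + 1) % len(self.cells)  # increment the bank address
        return opened

    def burst_read(self, bank, col, length=4):
        """Burst `length` words from the open row of `bank`."""
        row = self.open_rows[bank]  # raises KeyError if the bank is not open
        return self.cells[bank][row][col:col + length]


mem = ToyMemory()
opened = mem.activate(0, row=1, count=2)            # one command, two banks
data = mem.burst_read(0, 4) + mem.burst_read(1, 0)  # burst across the bank boundary
```

Here `opened` is `[0, 1]`, and `data` holds eight words that all come from row 1: columns 4 to 7 of bank 0 followed by columns 0 to 3 of bank 1.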
The invention may also be considered a computer system, comprising a computer processor, a memory connected on a bus to the processor, the memory comprising a memory controller connected on a memory bus to a plurality of synchronous dynamic memory banks, the memory controller to generate a single command having the option to open one or more of the memory banks; and a plurality of bus units connected to the processor and/or said memory via an external bus, the processor and/or one of the bus units to request access to an address in the memory banks. The computer processor may be an I/O processor/adapter and the memory controller may be a storage controller, and one of said bus units might then be a second computer processor connected to the storage controller and further connected to an external bus, which might be a peripheral component interconnect (PCI) bus. The storage controller may further be connected across a small computer system interface (SCSI) bus to a larger memory device.
The invention is also a memory system apparatus for the storage of and retrieval of binary data, comprising a means to access one or more synchronous dynamic random access memory banks; a means to decode a plurality of commands on a memory command bus to access the plurality of synchronous dynamic random access memory banks; a means to provide the option to open one or more of the synchronous dynamic random access memory banks; and a means to input/output data continuously from the opened synchronous dynamic random access memory banks.
Further scope of applicability of the present invention will become apparent from the detailed description given herein. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only. Various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art upon review of the detailed description.