1. Field of the Invention
The present invention generally relates to data processor memory architectures and, more particularly, to a queue structure which provides the performance of individual bank queues at the approximate cost of a single basic storage module queue.
2. Description of the Prior Art
In most large data processing systems, the processor cycle time is many times shorter than the memory cycle time. To decrease the average memory latency, the memory is typically partitioned into basic storage modules (BSMs), which are further subdivided into banks. The effective number of BSMs limits the number of storage accesses allowed per processor cycle. The degree of banking within a BSM is determined by the ratio of memory cycle time to processor cycle time. As one example, there might be 64 BSMs with 128 banks each.
In most high-performance designs, the address bits are divided into the following fields:
______________________________________
chip select and      | bank select | BSM select | byte index
address within chip  |             |            |
______________________________________
The first three of these fields select a memory word, while the lowest-order bits select bytes within the word. In an effort to achieve a uniform distribution of storage requests across BSM units, the BSM selection field usually occupies the lowest-order bits of the word address. The bank selection bits are the next higher-order set of bits. The remaining bits select locations within a chip and chips within a bank. For purposes of illustration, a base memory design with four-byte words, 64 BSMs, and 128 banks in each BSM is assumed.
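As an illustration of this field layout, the following is a minimal sketch of how an address might be split under the base design assumed above. The field widths (2 byte-index bits, 6 BSM-select bits, 7 bank-select bits) follow from four-byte words, 64 BSMs, and 128 banks; the function name `decode_address` and the dictionary keys are hypothetical and chosen only for this example.

```python
# Field widths assumed from the illustrative base design:
# four-byte words -> 2 bits, 64 BSMs -> 6 bits, 128 banks -> 7 bits.
BYTE_BITS = 2
BSM_BITS = 6
BANK_BITS = 7


def decode_address(addr: int) -> dict:
    """Split a byte address into the four fields, lowest-order field first."""
    byte_index = addr & ((1 << BYTE_BITS) - 1)
    word = addr >> BYTE_BITS                      # word address
    bsm = word & ((1 << BSM_BITS) - 1)            # lowest-order word bits
    bank = (word >> BSM_BITS) & ((1 << BANK_BITS) - 1)
    chip = word >> (BSM_BITS + BANK_BITS)         # chip select and address within chip
    return {"chip": chip, "bank": bank, "bsm": bsm, "byte": byte_index}


# Consecutive words (byte addresses 0, 4, 8) fall in BSMs 0, 1, 2.
for a in (0, 4, 8):
    print(a, decode_address(a))
```

Because the BSM field sits in the lowest-order word bits, sequential word references are spread across successive BSMs before the bank field changes.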
Since many distinct banks exist, a given request is quite likely to be destined for a "ready" bank. Queues are maintained to buffer memory requests for banks which are still busy (or active) from previous requests. IBM S/370 architectural requirements regarding the sequence of storage references could be satisfied by placing an individual First-In-First-Out (FIFO) queue in front of each bank. In systems with a large number of banks (e.g., 128 banks in each of 64 BSMs), however, hardware costs make individual bank queues impractical. Therefore, a single FIFO queue per BSM is usually implemented, greatly reducing hardware cost at the expense of performance. When the head entry has to wait because its required bank is active, all subsequent memory requests in the queue are also forced to wait. In a system which has many banks per BSM, most of the waiting requests are to banks which are ready. Moreover, because memory access times are increasing when measured in processor cycles, a single blocked entry at the head of the queue may delay many requests for many cycles.
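The blocking effect described above can be illustrated with a small simulation. This is a simplified sketch, not a model of any actual hardware: it assumes all requests for one BSM are already queued at cycle 0, that at most one request is issued per cycle, and that a bank stays busy for a fixed number of cycles after each access. The function names and the `busy_cycles` parameter are hypothetical.

```python
from collections import deque


def single_fifo_cycles(banks, busy_cycles):
    """Cycles to issue all requests when only the queue head may issue."""
    queue = deque(banks)
    ready_at = {}  # bank -> first cycle at which it is ready again
    cycle = 0
    while queue:
        head = queue[0]
        if ready_at.get(head, 0) <= cycle:
            queue.popleft()
            ready_at[head] = cycle + busy_cycles
        cycle += 1  # a blocked head stalls every request behind it
    return cycle


def per_bank_fifo_cycles(banks, busy_cycles):
    """Cycles to issue all requests with an individual FIFO per bank.

    The oldest pending request whose bank is ready is issued each cycle;
    requests to the same bank therefore still issue in arrival order.
    """
    pending = list(banks)
    ready_at = {}
    cycle = 0
    while pending:
        for i, bank in enumerate(pending):
            if ready_at.get(bank, 0) <= cycle:
                del pending[i]
                ready_at[bank] = cycle + busy_cycles
                break
        cycle += 1
    return cycle


# Two back-to-back requests to bank 0, then requests to three idle banks.
requests = [0, 0, 1, 2, 3]
print(single_fifo_cycles(requests, busy_cycles=4))    # 8
print(per_bank_fifo_cycles(requests, busy_cycles=4))  # 5
```

In this example the second request to bank 0 holds up the single queue for three cycles even though banks 1, 2, and 3 are ready, whereas per-bank queues let those requests proceed while bank 0 is busy.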