Heretofore, computers used a memory architecture optimized to move large amounts of data between an external memory and an internal cache. For example, conventional Dual In-Line Memory Modules (“DIMMs”) may have a data width of 64 or 72 data bits. Memory chips used in these DIMMs may have a minimum burst size of 4 or 8 words. Burst size may generally be considered as a minimum number of memory reads or writes that may be done in one operation. This combination of data width and minimum burst size generally means that each memory burst access reads or writes 32 to 72 bytes of data. This may be efficient when moving a large quantity of data between processor external memory, such as for example an external Dynamic Random Access Memory (“DRAM”), and processor internal cache memory. However, this may be very inefficient when reading or writing a few bytes of data at random locations in such external memory.
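By way of illustration only, the arithmetic behind the 32 to 72 byte range above may be sketched as follows. The function names `burst_bytes` and `useful_fraction` are hypothetical and introduced here for illustration; the sketch assumes every bit of the stated data width is counted and that a request is served by a single minimum-length burst.

```python
def burst_bytes(data_width_bits: int, burst_length: int) -> int:
    """Bytes moved by one minimum-length burst: (width in bytes) x (words per burst)."""
    return (data_width_bits // 8) * burst_length


def useful_fraction(requested_bytes: int, data_width_bits: int, burst_length: int) -> float:
    """Fraction of a single burst that the requester actually wanted (illustrative metric)."""
    total = burst_bytes(data_width_bits, burst_length)
    return min(requested_bytes, total) / total


# Smallest and largest cases named in the text above.
smallest = burst_bytes(64, 4)   # 64-bit width, burst of 4 words
largest = burst_bytes(72, 8)    # 72-bit width, burst of 8 words

# A 4-byte random read on a 64-bit module with a burst of 8 still moves 64 bytes.
fraction = useful_fraction(4, 64, 8)
```

Under these assumptions, a 4-byte read served by a 64-byte burst makes use of only one sixteenth of the data transferred, which is the inefficiency for small random accesses noted above.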
Cache memory relies on the fact that many computer programs access data from a predictable and restricted range of addresses, so information may be fetched from an external memory ahead of when it is to be used and stored temporarily in cache memory (“cache”). Generally, reading or writing data that is in cache is fast in comparison to reading or writing data that is in system DRAM. Along those lines, data to be written to external DRAM for example may be stored temporarily in cache and then written to such DRAM some time later, for example during what would otherwise be an idle time or some other more convenient time. This data handling sequence may allow a processor, such as a Central Processing Unit (“CPU”), to continue to operate at high speed, such as a CPU rated speed, without waiting and/or having to slow down for reading or writing to external DRAM.
Some algorithms, however, do not have predictable and localized memory access. Search index preparation and relational database processing are examples of algorithms that may not have predictable and localized memory access for use of cache as previously described. However, there are many more examples of algorithms that may have processes, in whole or in part, having random reads and/or writes which are not suitable for, or do not derive significant performance improvement from, caching of information. For example, executing these types of algorithms on a conventional computer system may involve a CPU spending significant amounts of time waiting for external DRAM, and thus such computer system may generally operate slowly and inefficiently with respect to execution of such types of algorithms.
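The contrast between localized and random access may be illustrated with a toy model. The following is a minimal sketch only, assuming a hypothetical direct-mapped cache of 512 lines of 64 bytes each, 4-byte accesses, and no prefetching; the function `hit_rate` and all parameter values are illustrative assumptions and do not describe any particular processor.

```python
import random


def hit_rate(addresses, line_bytes=64, num_lines=512):
    """Fraction of accesses that hit in a toy direct-mapped cache (no prefetch)."""
    tags = [None] * num_lines      # one resident line number per cache slot
    hits = 0
    total = 0
    for addr in addresses:
        line = addr // line_bytes  # memory line containing this byte address
        slot = line % num_lines    # direct-mapped placement
        if tags[slot] == line:
            hits += 1
        else:
            tags[slot] = line      # miss: fetch the line, evicting the old one
        total += 1
    return hits / total if total else 0.0


N = 100_000

# Localized access: consecutive 4-byte words, so 16 accesses share each line.
sequential = [4 * i for i in range(N)]

# Random access: 4-byte words scattered over a 1 GiB region (fixed seed).
rng = random.Random(0)
scattered = [4 * rng.randrange(1 << 28) for _ in range(N)]

seq_rate = hit_rate(sequential)
rand_rate = hit_rate(scattered)
```

In this model the sequential stream misses only once per line, while the scattered stream almost never finds its data resident, so nearly every access would have to wait on external memory.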
Serial memory interfaces such as Fully Buffered DIMM (“FB-DIMM”) and Serial Port Memory Technology (“SPMT”) are directed at reducing the number of signals between memory and an associated memory controller. However, such serial memory interfaces may not provide significant improvement for execution of the above-described algorithms having random reads and/or writes.
Hence, it would be desirable and useful to provide a memory architecture that overcomes one or more limitations of conventional memory architectures with respect to random access, including without limitation random access involving small quantities of information.