1. Field of the Invention
The present invention relates, in general, to a method and system to be utilized in data processing systems. In particular, the present invention relates to a method and system to be utilized in data processing systems wherein, for non-limiting example, a memory controller is utilized.
2. Description of the Related Art
Data processing systems are systems that manipulate, process, and store data and are well known within the art. Personal computer systems, and their associated subsystems, constitute well known species of data processing systems. Personal computer systems in general and IBM compatible personal computer systems in particular have attained widespread use for providing computer power to many segments of today's modern society. A personal computer system can usually be defined as a desk top, floor standing, or portable microcomputer that includes a system unit including but not limited to a system processor and associated volatile and non-volatile memory, a display device, a keyboard, one or more diskette drives, one or more fixed disk storage devices, and one or more data buses for communications between devices. One of the distinguishing characteristics of these systems is the use of a system board to electrically connect these components together. These personal computer systems are information handling systems which are designed primarily to give independent computing power to a single user (or a relatively small group of users in the case of personal computers which serve as computer server systems) and are inexpensively priced for purchase by individuals or small businesses.
A computer system or data-processing system typically includes a system bus. Attached to the system bus are various devices that may communicate locally with each other over the system bus. For example, a typical computer system includes a system bus to which a central processing unit (CPU) is attached and over which the CPU communicates directly with a system memory that is also attached to the system bus.
In addition, the computer system may include a peripheral bus for connecting certain highly integrated peripheral components to the CPU. One such peripheral bus is known as the Peripheral Component Interconnect (PCI) bus. Under the PCI bus standard, peripheral components can directly connect to a PCI bus without the need for glue logic. Thus, PCI is designed to provide a bus standard on which high-performance peripheral devices, such as graphics devices and hard disk drives, can be coupled to the CPU, thereby permitting these high-performance peripheral devices to avoid the general access latency and the bandwidth constraints that would have occurred if these peripheral devices were connected to a low speed peripheral bus. Details on the PCI local bus standard can be obtained from the PCI Bus Specification, Revision 2.1, available from the PCI Special Interest Group, which is hereby incorporated by reference in its entirety.
Two relatively high-bandwidth types of traffic that are communicated to and from system memory over the PCI bus are 1394 device traffic and networking traffic. The 1394 device traffic originates within a high speed serial device which communicates with a PCI bus through and over a Southbridge. The networking traffic originates within a network card which is reading network traffic information, regarding one or more networks of which the data processing system is a part, from a network buffer.
Relatively recently, techniques for rendering three-dimensional (3D) continuous-animation graphics have been implemented within PCs which have exposed limitations in the originally high performance of the PCI bus. The AGP interface standard has been developed to both (1) reduce the load on the PCI bus systems, and (2) extend the capabilities of systems to include the ability to provide 3D continuous-animation graphics with a level of quality previously found only on high-end computer workstations. The AGP interface standard adds an additional bus to data processing systems: the AGP Interconnect. The AGP interface standard is defined by the following document: Intel Corporation, Accelerated Graphics Port Interface Specification, Revision 1.0 (Jul. 31, 1996).
The AGP interface standard reduces the load on PCI bus systems and extends the capabilities of systems to include the ability to provide 3D continuous-animation graphics via a rather indirect process. Under the AGP interface standard, a CPU independently processes the geometric and texturing data (geometric and texturing data are data necessary to properly define an object to be displayed) associated with each object to be displayed in a scene. Subsequent to processing the geometric and texturing data, the CPU writes the geometric and texturing data back into system memory. Thereafter, the CPU informs a graphics processor that the information is ready, and the graphics processor retrieves the information from the system memory.
In current industry architectures, each of the previously discussed buses (e.g., the system bus, the AGP interconnect, and the PCI bus) independently communicates with the system memory through a device known as the Northbridge. The various communications with, or accesses of, system memory are generally controlled by a device within the Northbridge known as a "memory controller".
A memory controller controls system memory, which is typically a collection of Dynamic Random Access Memory chips (DRAMs). The computer system memory, composed of DRAMs, can store data, but there is conventionally no intelligence in the system memory. The intelligence concerning how data is going to be stored, where the data is going to be stored, how the data is going to be read or written, etc., is provided by the "memory controller".
The memory controller controls access to system memory, which as has been noted is typically composed of DRAMs. A DRAM can be thought of as a collection of cells, or storage locations, wherein data is stored. For simplicity it will be assumed here that each cell stores a byte, but those skilled in the art will recognize that other storage sizes are possible.
When a memory access, such as a read cycle, is engaged in, the memory controller is given an address by another device, such as a graphics controller. That address needs to correctly specify one of the cells where data is actually stored. Ordinarily, cells within DRAMs are arranged in row and column format (i.e., the cells are arranged like a matrix).
Consequently, an address, which for sake of illustration will be assumed to be 16 bits long, customarily is conceived of as being composed of two parts: a first 8-bit portion of the address which is associated with a row address, and a second 8-bit portion which is associated with a column address (again, the bit lengths are hypothetical and merely utilized here for illustrative purposes). This fragmentation of the address into row and column portions allows the address to correctly specify a storage location, or cell, by its row and column.
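The row and column decomposition described above can be illustrated with a minimal sketch. It uses the hypothetical 16-bit address and 8-bit field widths from the text, and it assumes that the "first" portion is the high-order byte; the function name `split_address` and the constants are illustrative only and do not describe any particular DRAM.

```python
# Illustrative field widths taken from the hypothetical 16-bit example.
ROW_BITS = 8
COL_BITS = 8


def split_address(addr):
    """Split an address into (row, column).

    Assumption: the row occupies the high-order bits and the column
    the low-order bits. Real devices may map address bits differently.
    """
    if not 0 <= addr < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address does not fit in ROW_BITS + COL_BITS")
    row = addr >> COL_BITS               # upper 8 bits select the row
    col = addr & ((1 << COL_BITS) - 1)   # lower 8 bits select the column
    return row, col


row, col = split_address(0x12AB)
print(hex(row), hex(col))
```

Under this assumed mapping, consecutive addresses share a row and differ only in the column until the column field wraps around.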
Conventionally, a DRAM has at least two buses, or at least hypothetically what can be treated as two buses: a data bus, and an address bus. To minimize DRAM hardware, it is customary that the address bus be only eight bits wide, in order to minimize the number of pins on the DRAM, which those skilled in the art will recognize is a major constraint or limiting factor on how small one can make a DRAM chip. Due to this limitation on the width of the address bus, memory access is typically achieved by first placing the row portion of the address on the address bus, which will select the appropriate row, and second, a short time later, placing the column portion of the address on the address bus, which will select the appropriate column. This then correctly specifies the row and column location of the storage location that is desired. At some time after the row and column information have both been specified, the data from the memory location specified by the row and column address appears on the DRAM data bus.
From the foregoing, it can be seen that in order to make a single memory access there are three phases: a row address phase, a column address phase, and a data retrieval phase. In the past, it was noticed that typical programs tend to operate sequentially, so if there is a memory address accessed, it is likely that the next memory address accessed will be the very next cell, which means that the column address is likely to change, while the row address is not likely to change. Consequently, typical DRAMs are structured such that once the row address has been driven, thereafter the DRAM responds to new addresses on the address bus as if those addresses are column indicators, and thus will use such addresses as column addresses within a current row until the DRAM is notified that a new row address will be appearing on the address bus, or the extent of the columns within the row is exceeded and a page fault occurs. DRAM devices using this scheme (driving the row once and then operating upon columns within the row) are known in the art as "page mode" DRAMs.
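The benefit of page mode can be made concrete with a simple counting model. The sketch below is an assumption-laden illustration, not a timing model: it treats each of the three phases as one unit of cost, takes the row to be the high-order bits of a 16-bit address, and uses the hypothetical function name `count_phases`.

```python
def count_phases(addresses, col_bits=8, page_mode=True):
    """Count address/data phases needed to service a list of addresses.

    Assumptions (illustrative only): every phase costs one unit, and the
    row is the portion of the address above the low `col_bits` bits.
    """
    open_row = None
    phases = 0
    for addr in addresses:
        row = addr >> col_bits
        # Without page mode the row is driven for every access; with page
        # mode it is driven only when the access leaves the open row.
        if not page_mode or row != open_row:
            phases += 1          # row address phase
            open_row = row
        phases += 2              # column address phase + data retrieval phase
    return phases


sequential = [0x1200, 0x1201, 0x1202, 0x1203]
print(count_phases(sequential, page_mode=True))   # 1 row phase + 4 * 2 = 9
print(count_phases(sequential, page_mode=False))  # 4 * 3 = 12
```

In this model, four sequential accesses within one row need one row phase instead of four, which is the saving that page mode DRAMs exploit.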
In light of the foregoing, in the event that a memory controller has several memory accesses to be done sequentially, then once a page is open it would make sense (but it is not currently done in the art) from an efficiency standpoint to examine pending as well as current memory accesses in order to determine which of those pending memory accesses will be to memory locations that are within a currently open page (that is, the row of the request is the row from which a memory controller is currently reading within a DRAM). In other words, assuming a page X is open, if there are four memory accesses A, B, C, and D, waiting to be performed, and assuming the first access A is to page Z, the second access B is to page X, the third access C is to page Y, and the fourth access D is to page W, it is preferable from a memory efficiency standpoint that the data access (i.e., access B) appropriate to the page that is open (i.e., page X) be made first.
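The A, B, C, D example above can be sketched as a small selection routine. This is a minimal illustration of the look-ahead idea only; the function name `pick_next`, the tuple representation of a request, and the fall-back to the oldest request are assumptions made for the sketch.

```python
def pick_next(pending, open_pages):
    """Remove and return the next access to perform.

    `pending` is a list of (name, page) tuples in arrival order.
    The oldest access that targets an open page is preferred; if none
    does, the oldest pending access is returned (assumed fall-back).
    """
    for i, (_name, page) in enumerate(pending):
        if page in open_pages:
            return pending.pop(i)
    return pending.pop(0)


# Accesses A..D target pages Z, X, Y, W; page X is currently open.
pending = [("A", "Z"), ("B", "X"), ("C", "Y"), ("D", "W")]
print(pick_next(pending, open_pages={"X"}))  # ('B', 'X')
```

With page X open, access B is selected ahead of the older access A, because A would require a new row to be opened.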
Current memory controllers do not typically "look ahead" to see if certain pending memory accesses are destined for currently open pages. Furthermore, at any given time, more than one page of memory is typically open, and in future systems this will become more likely. For example, under the Direct RDRAM scheme (not currently available, but expected to be available in the near future), it is expected that up to 8 pages per RDRAM chip will be open simultaneously. Thus, if a system has eight RDRAM chips (a reasonable assumption), it will be possible to have up to 64 pages open simultaneously.
Controlling memory access via the use of "look ahead" would be advantageous. Furthermore, as the foregoing has shown, the prospective ability of memory controllers to schedule memory access on the basis of look ahead is likely to become even more important in that future system memories are likely to be able to provide a very large number of open pages of memory simultaneously. It is therefore apparent that a need exists in the art for a method and system which will provide data processing systems, having memory controllers, with the ability to look ahead and intelligently schedule accesses to system memory utilizing information gained from such looking ahead.
In addition to the foregoing, it has been noted that multiple devices (e.g., one or more CPUs, PCI bus devices, 1394 devices, and network devices) communicate over various different buses in order to access data processing system memory through a memory controller. Different types of devices have different types of memory access needs as do different data buses. At present, current data processing system memory controllers do not recognize and/or utilize the differing memory access requirements of the various devices, or the different access requirements of the buses over which they communicate, in order to efficiently schedule data processing system memory access. It is therefore apparent that a need exists for a method and system which will provide data processing systems, having memory controllers, with the ability to recognize and take advantage of the varying needs of differing devices and/or the needs of the various data buses through which such devices communicate with data processing system memory.
It has been discovered that a method and system can be produced which will, among other things, provide data processing systems having memory controllers with the ability to intelligently schedule accesses of system memory. The method and system provide a memory controller having a destination-sensitive memory request reordering device. The destination-sensitive memory request reordering device includes a centralized state machine operably connected to one or more memory devices and one or more reorder and bank select engines. The centralized state machine is structured such that control information can be received from at least one of the one or more reorder and bank select engines over the one or more control lines. The centralized state machine is structured such that memory status information can be received from at least one of the one or more reorder and bank select engines over the one or more memory status lines, or such that memory status information can be determined by tracking past memory related activity. Additionally, the centralized state machine is structured to accept memory access requests having associated origin information. The centralized state machine executes the memory access requests based upon the associated origin information and the memory status information. Other embodiments function analogously, with the addition that the centralized state machine incorporates one or more device arbiter and state engines which function as autonomous units generally dedicated to one specific system memory device. The device arbiter and state engines receive inputs similar to those discussed for the centralized state machine, except that typically at least one device arbiter and state engine is dedicated to one particular memory device, and thus generally receives memory status from the memory device with which it is associated via its dedicated memory status line.
In one embodiment, the destination-sensitive memory request reordering device includes a centralized state machine operably connected to one or more memory devices and one or more reorder and bank select engines. In another embodiment, the operable connection is achieved via one or more control lines connecting the centralized state machine to at least one of the one or more reorder and bank select engines. In another embodiment, the centralized state machine is structured such that control information can be received from at least one of the one or more reorder and bank select engines over the one or more control lines connecting the centralized state machine to the at least one of the one or more reorder and bank select engines. In another embodiment, the centralized state machine is structured to accept memory access requests having associated origin information. In another embodiment, the structure of the centralized state machine can be such that one or more specific inputs to the centralized state machine are associated with one or more specific origins of the one or more specific memory access requests. In another embodiment, the structure of the centralized state machine can also be such that the one or more specific inputs to the centralized state machine are associated with one or more best-choice registers where the one or more best choice registers are associated with one or more specific origins of one or more specific memory requests. In another embodiment, the origins can be one or more buses over which specific memory access requests traveled, or specific sources of the memory access requests. In another embodiment, the centralized state machine can be further structured such that it accepts memory requests having source information, such as the initiator, or source, of a request, the ordinal number of the request, the priority of the request, etc. 
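The relationship between origin-specific inputs, best-choice registers, and the centralized state machine can be pictured with a toy software model. The sketch below is purely illustrative and is not the claimed hardware: the class and method names, the single tracked open page, and the tie-breaking rule (open-page hit first, then a fixed origin order) are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    origin: str  # bus or source the request arrived on, e.g. "CPU", "AGP", "PCI"
    page: int    # target page (row) in system memory


class CentralizedStateMachine:
    """Toy model: one best-choice register per origin, one arbiter."""

    def __init__(self, origin_order):
        # Hypothetical fixed preference among origins, used only as a tie-break.
        self.origin_order = list(origin_order)
        self.best_choice = {origin: None for origin in self.origin_order}
        self.open_page = None  # memory status, tracked from past activity

    def post(self, request):
        """A reorder engine presents its current best choice for its origin."""
        self.best_choice[request.origin] = request

    def execute_next(self):
        """Pick among the best-choice registers and 'execute' the winner."""
        candidates = [r for r in self.best_choice.values() if r is not None]
        if not candidates:
            return None
        chosen = min(
            candidates,
            key=lambda r: (r.page != self.open_page,
                           self.origin_order.index(r.origin)),
        )
        self.best_choice[chosen.origin] = None
        self.open_page = chosen.page  # executing the request opens its page
        return chosen


sm = CentralizedStateMachine(["CPU", "AGP", "PCI"])
sm.open_page = 7
sm.post(Request("CPU", 3))
sm.post(Request("PCI", 7))
print(sm.execute_next())  # the PCI request wins because page 7 is already open
```

The model uses both kinds of information named in the text: the origin of each request (which register it occupies) and the memory status (which page is open).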
In another embodiment, the centralized state machine is connected to the one or more system memory devices via memory status lines such that the state of the system memory device may be obtained. Another set of embodiments functions analogously to those described above, with the addition that the centralized state machine incorporates one or more device arbiter and state engines which typically function as autonomous units generally dedicated to one specific system memory device. In another embodiment, the device arbiter and state engines receive inputs similar to those discussed for the centralized state machine, except that typically at least one device arbiter and state engine is dedicated to one particular memory device, and thus generally receives memory status from the memory device with which it is associated via its dedicated memory status line.
In an embodiment of the method and system, one or more origin-related memory access requests are received, and the one or more origin-related memory access requests are executed. In one embodiment, the one or more memory access requests received are associated with one or more specific origins. In another embodiment, the one or more memory access requests are received from one or more reorder buffers associated with the one or more specific origins. In another embodiment, the one or more memory access requests are received from one or more reorder buffers associated with one or more specific buses over which the one or more memory access requests traveled. In another embodiment, the one or more memory access requests are received from one or more reorder buffers associated with one or more specific sources from which the one or more memory access requests originated. In one embodiment, at least one of the one or more origin-related memory access requests is executed on the basis of the one or more specific origins. In another embodiment, at least one of the one or more origin-related memory access requests is executed on the basis of origin-related information. In another embodiment, the at least one of the one or more origin-related memory access requests is executed on the basis of at least one source-related informant selected from the group including at least one source indicator associated with the one or more origin-related access requests, at least one ordinal indicator associated with the one or more origin-related access requests, and at least one tag associated with the one or more origin-related access requests wherein the at least one tag includes at least one tag selected from the group including a tag indicative of the priority of the one or more origin-related memory access requests and a tag indicative of a speculative nature of the one or more origin-related memory access requests. 
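The source-related information listed above (source indicator, ordinal indicator, priority tag, speculative tag) can be pictured as fields attached to each request. The following sketch is one hypothetical way to order requests using those fields; the field names and the particular ordering rule (non-speculative first, then higher priority, then lower ordinal) are assumptions and are not prescribed by the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedRequest:
    source: str                # source indicator, e.g. "CPU" or "PCI"
    ordinal: int               # ordinal indicator: position in arrival order
    priority: int = 0          # priority tag (larger means more urgent; assumed)
    speculative: bool = False  # tag marking a speculative request


def execution_order(requests):
    """Return requests in one plausible execution order.

    Assumed policy: non-speculative requests first, then by descending
    priority, then by ascending ordinal.
    """
    return sorted(requests, key=lambda r: (r.speculative, -r.priority, r.ordinal))


requests = [
    TaggedRequest("CPU", 0, speculative=True),
    TaggedRequest("PCI", 1, priority=1),
    TaggedRequest("AGP", 2, priority=5),
    TaggedRequest("CPU", 3),
]
print([r.ordinal for r in execution_order(requests)])  # [2, 1, 3, 0]
```

Other policies are equally consistent with the text; the point of the sketch is only that the tags travel with the request and are available when the execution decision is made.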
In one embodiment, the at least one of the one or more memory access requests executed on the basis of origin-related information further entails executing at least one of the one or more speculative memory access requests in response to the status information from one or more memory devices. In another embodiment, the status information from the one or more memory devices is received from one or more DRAMs. In another embodiment, the status information from the one or more memory devices is received from one or more banks of memory. In one embodiment, executing the one or more speculative memory access requests in response to the status information from one or more memory devices further entails executing the at least one of the one or more speculative memory access requests in response to the status information from one or more memory devices and the contents of the one or more memory device buffers.
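As a rough illustration of gating a speculative request on memory status and buffer contents, consider the sketch below. The specific policy shown (run the speculative request only when the device is not busy and the requested data is not already held in a memory device buffer) is an assumption chosen for the example, as are the function and parameter names.

```python
def should_run_speculative(request_addr, device_busy, buffered_addrs):
    """Decide whether to execute a speculative request now.

    Assumed policy (illustrative only):
      * do not compete with demand traffic while the device is busy;
      * do not fetch data that a memory device buffer already holds.
    """
    if device_busy:
        return False
    if request_addr in buffered_addrs:
        return False
    return True


print(should_run_speculative(0x1200, device_busy=False, buffered_addrs={0x1300}))
print(should_run_speculative(0x1200, device_busy=True, buffered_addrs=set()))
```

The first call returns True and the second returns False under the assumed policy.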
The foregoing is a summary and thus contains, by necessity, simplifications, generalizations and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. Other aspects, inventive features, and advantages of the present invention, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.