The present invention relates generally to memory subsystems in electronic devices. More particularly, the present invention relates to reducing latency in memory subsystems.
Computer systems typically comprise at least one processor, a memory subsystem, at least one system controller and one or more peripherals (such as PCI devices) operably connected by various buses, including a host bus operably connected between the processor and the system controller. The processor may include an internal level one (L1) cache. The memory subsystem typically comprises system or main memory external to both the processor and the system controller and a level two (L2) cache internal to the system controller. Together, the L1 cache and the memory subsystem (L2 cache and main memory) comprise a memory hierarchy.
The system controller includes logic for, in conjunction with the processor and peripheral devices, controlling the transfer of data and information between the processor and peripheral devices and the memory subsystem. For example, if a processor issues a read transaction, the processor will determine whether the requested data is stored in the L1 cache. If the read request is a "miss" in the L1 cache, during a subsequent clock cycle, the system controller will determine whether the requested data is stored in the L2 cache. If the read request is a miss in the L2 cache, during yet another subsequent clock cycle, the system controller will attempt to access the requested data in the main memory. At this point, given the relatively larger size of main memory, the slower speed of main memory, and the distance of main memory from the CPU, a number of clock cycles may be required to decode the address of the read request and access the requested data in the main memory.
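The read sequence described above may be illustrated by the following minimal sketch, in which a read probes the L1 cache, then the L2 cache, then main memory, accumulating clock cycles at each level. The class name `MemoryHierarchy` and the cycle counts are hypothetical and chosen only for illustration; the description gives no specific figures.

```python
from dataclasses import dataclass, field

# Hypothetical cycle costs (assumptions, not taken from the description).
L1_CYCLES = 1
L2_CYCLES = 1
MAIN_MEMORY_CYCLES = 10  # address decode plus access in main memory


@dataclass
class MemoryHierarchy:
    """Toy model: each level maps an address to the data stored there."""
    l1: dict = field(default_factory=dict)
    l2: dict = field(default_factory=dict)
    main: dict = field(default_factory=dict)

    def read(self, address):
        """Return (data, cycles) for a read that probes L1, then L2, then main memory."""
        cycles = L1_CYCLES
        if address in self.l1:           # L1 hit
            return self.l1[address], cycles
        cycles += L2_CYCLES              # L1 miss: probe L2 in a later cycle
        if address in self.l2:           # L2 hit
            return self.l2[address], cycles
        cycles += MAIN_MEMORY_CYCLES     # L2 miss: fall through to main memory
        return self.main[address], cycles
```

In this sketch a read that misses in both caches pays the cost of every level it passed through, which is the latency the remainder of this section is concerned with.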
Thus, when accessing main memory (after L1 and L2 cache misses), the computer system experiences a relative degree of latency. This latency may be increased in multi-processor/multi-controller systems, wherein each processor and each system controller may have a respective L1 and L2 cache. In order to preserve coherency between the respective L1 and L2 caches and the main memory, respective L1 and L2 cache controllers must monitor buses within the computer system (typically the host bus) to determine if another processor or peripheral device has modified data in an L1 cache, L2 cache or main memory. If modifications have been made, the caches and main memory must be updated accordingly. Monitoring the memory hierarchy in this manner may be referred to as snooping. A snoop operation requires at least one clock cycle to perform, thus adding to the relative degree of latency within these types of computer systems.
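A snoop operation of this kind may be sketched as follows. The sketch assumes a simple write-invalidate policy, which is one possible way of keeping the caches coherent and is not stated in the description; the names `SnoopingCache` and `HostBus` are likewise hypothetical.

```python
class SnoopingCache:
    """Toy cache that watches the host bus for writes by other agents."""

    def __init__(self):
        self.lines = {}  # address -> cached data

    def snoop_write(self, address):
        """Invalidate a stale copy; return True if a line was dropped."""
        if address in self.lines:
            del self.lines[address]
            return True
        return False


class HostBus:
    """Toy host bus that broadcasts each write to every other attached cache."""

    def __init__(self, main_memory):
        self.main_memory = main_memory
        self.caches = []
        self.snoop_cycles = 0

    def attach(self, cache):
        self.caches.append(cache)

    def write(self, source, address, value):
        # Update main memory and the writer's own cache.
        self.main_memory[address] = value
        source.lines[address] = value
        # Every other cache snoops the write and drops any stale copy.
        for cache in self.caches:
            if cache is not source:
                cache.snoop_write(address)
        # Assumption: each snoop operation costs one clock cycle (the minimum).
        self.snoop_cycles += 1
```

Each write in this model adds at least one cycle of snoop overhead, regardless of whether any other cache actually held the modified data.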
To deal with the latency (i.e., to prevent transactions that may "interfere" with the memory access request until the memory access request has been completed), the computer system may interrupt, stall or insert a number of wait states into various operations and transactions. This results in a relatively slower computer system with relatively slower processing and reduced computer system throughput. Operating such a computer system is relatively time consuming and costly.
Thus, there exists a need in the art for apparatus and methods for reducing the inherent latency in accessing a memory subsystem.
In still other computer systems, a system controller may have an internal or "embedded" peripheral. In these computer systems, the embedded peripheral is an integral component of the system controller. The embedded peripheral may be a "secondary" processor (i.e., a processor without the power, capabilities and intelligence of the main or external processor) and may be utilized to relieve the computational burden on the main processor. Because these embedded peripherals lack the sophistication of the main processor (or, for that matter, most external peripherals), in current computer systems, the embedded peripheral cannot access the memory subsystem. As such, in current computer systems, the embedded peripheral must be provided with a dedicated memory exclusively utilized by the embedded peripheral. In current computer systems, this embedded peripheral dedicated memory is external to the system controller or "off chip". Providing this dedicated memory "off chip" adds latency to the embedded peripheral's memory accesses and consumes valuable space within the computer system. Additionally, the exclusivity of the dedicated memory decreases the versatility of the computer system.
Thus, there exists a need in the art for apparatus and methods for reducing latency in embedded peripheral dedicated memory accesses and for increasing the versatility of embedded peripheral dedicated memory.
The present invention relates to a method, in a computer system, for configuring a memory subsystem, comprising selecting a subset of main memory, integrating the subset of main memory within the computer system such that the subset is physically distinct from the main memory, and configuring the subset of main memory as noncacheable memory.
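By way of illustration only, the three steps of the method may be modeled as an address map in which the selected subset is marked as physically distinct and noncacheable. The following is a minimal sketch under stated assumptions: the names `Region`, `configure_subsystem` and `lookup`, and the representation of memory as address ranges, are hypothetical and are not drawn from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One contiguous range of the main memory address space (hypothetical model)."""
    name: str
    base: int
    size: int
    cacheable: bool
    physically_distinct: bool  # True for the subset held apart from main memory

    def contains(self, address):
        return self.base <= address < self.base + self.size


def configure_subsystem(total_size, subset_base, subset_size):
    """Select a subset of main memory and mark it distinct and noncacheable."""
    end = subset_base + subset_size
    if subset_base < 0 or subset_size <= 0 or end > total_size:
        raise ValueError("subset must lie within main memory")
    regions = []
    if subset_base > 0:
        regions.append(Region("main_low", 0, subset_base, True, False))
    # The selected subset: physically distinct and configured as noncacheable.
    regions.append(Region("subset", subset_base, subset_size, False, True))
    if end < total_size:
        regions.append(Region("main_high", end, total_size - end, True, False))
    return regions


def lookup(regions, address):
    """Return the region that holds the given address."""
    for region in regions:
        if region.contains(address):
            return region
    raise ValueError("address outside main memory")
```

For example, `configure_subsystem(0x1000, 0x400, 0x100)` yields three regions, and an access to address `0x450` resolves to the noncacheable subset, so that in this model it would bypass the cache lookups entirely.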