Many computer systems, such as personal computers, often need to communicate and otherwise interact with external devices or systems, such as networks, printers, and other I/O devices. Peripheral systems, such as network interface controllers, I/O controllers, etc., are accordingly provided within the computer system to provide an interface between the host computer processor and the external devices. Such peripheral systems and other components are typically situated to transfer data via one or more buses, where the peripheral systems are typically cards or circuit boards in the host computer system. The host processor and the peripheral systems exchange data in order to facilitate interfacing software programs running in the host processor with external I/O, including networks.
Host-computing systems, such as personal computers, are often operated as nodes on a communications network, where each node is capable of receiving data from the network and transmitting data to the network. Data is transferred over a network in groups or segments, wherein the organization and segmentation of data are dictated by a network operating system protocol, and many different protocols exist. In fact, data segments that correspond to different protocols can co-exist on the same communications network. In order for a node to receive and transmit information packets, the node is equipped with a peripheral network interface controller, which is responsible for transferring information between the communications network and the host system. For transmission, the host processor constructs data or information packets in accordance with a network operating system protocol and passes them to the network peripheral. In reception, the host processor retrieves and decodes packets received by the network peripheral. The host processor performs many of its transmission and reception functions in response to instructions from an interrupt service routine associated with the network peripheral. When a received packet requires processing, an interrupt may be issued to the host system by the network peripheral. The interrupt has traditionally been issued after either all of the bytes in a packet or some fixed number of bytes in the packet have been received by the network peripheral.
Many computer systems include a peripheral bus, such as a peripheral component interconnect (PCI or PCI-X) bus, for exchanging data between the host processor and high throughput peripheral and other devices, such as memory, network interfaces, display, and disk drives. The host processor and memory can be directly or indirectly connected to the PCI bus along with other devices, such as graphic display adapters, disk controllers, sound cards, etc., where such devices may be coupled directly or indirectly (e.g., through a host bridge) to the PCI or PCI-X bus. In other configurations, the peripheral systems and the main host system memory are connected to the PCI-X bus, wherein a peripheral system may operate as a PCI-X bus master capable of direct memory access (DMA) operations to transfer data to and from the host memory. The host processor interacts with the PCI-X bus and main host system memory via a memory controller, and the host system may further include a cache memory for use by the host processor.
Direct transfer of data between a host processor and a peripheral across the host system bus is generally costly in terms of host processor utilization or efficiency and cache management. For example, during processor I/O read operations, the host processor must wait idle for the read result. Likewise, where the peripheral interrupts the host processor, the host must read interrupt information from the interrupting peripheral via an I/O read operation across the system bus in order to service the interrupt. In many computer systems, some data is passed between the host processor and the peripherals using a shared memory in the host system, such as a main host memory connected to the system bus. In transferring certain data, the host processor and the peripheral access predetermined locations in the shared memory. In this manner, the host processor and the peripheral need not communicate directly for all information exchange therebetween.
However, in conventional computer systems, the processor and peripheral still use direct communications across the system bus to exchange a certain amount of information, such as data relating to the locations of shared memory data buffers and control and status information. In addition, in conventional shared memory designs, updating control information used to facilitate the transfer of data through the shared memory causes excessive cache data line transfers. A network controller or other peripheral system typically needs to write status or other information to the shared host system memory for consumption by the host processor. For example, a network controller peripheral may write information to a receive status location in the shared memory, which indicates the status of frame data received from a network by the peripheral.
Even where a descriptor system allows the network peripheral to write the receive status information to the system memory without host CPU intervention (e.g., using direct memory access (DMA) techniques), the status information for a received frame is generally much shorter than the length of a cache line. For instance, the receive status entry for a particular frame may be only 8 bytes long, whereas a cache line can be 64 bytes long or more. In such a case, each cache line will hold multiple status entries, and writing a single status entry results in a partial cache line write operation. In a partial cache line write, a memory/cache controller has to read the entire cache line (e.g., 64 bytes) out of the system memory, merge the new data in (e.g., merge in the 8-byte status entry while preserving the non-updated portions of the cache line), and write the entire updated cache line (e.g., 64 bytes) back into the system memory.
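The read, merge, and write-back sequence described above can be illustrated with a minimal software sketch. The sketch is only a model for illustration and does not represent any particular memory/cache controller; the function name partial_line_write, the constants, and the returned byte counts are hypothetical, and the 64-byte line and 8-byte entry are simply the example sizes given above.

```python
CACHE_LINE_BYTES = 64    # example cache line size from the text
STATUS_ENTRY_BYTES = 8   # example receive status entry size from the text


def partial_line_write(memory: bytearray, offset: int, entry: bytes) -> dict:
    """Model a partial cache line write as a read/merge/write sequence.

    `memory` stands in for system memory. The returned dictionary is a
    hypothetical accounting of the bytes moved in each direction.
    """
    line_start = (offset // CACHE_LINE_BYTES) * CACHE_LINE_BYTES
    line_end = line_start + CACHE_LINE_BYTES
    if offset + len(entry) > line_end:
        raise ValueError("entry must not straddle a cache line boundary")

    # 1. Read the entire cache line out of system memory.
    line = bytearray(memory[line_start:line_end])

    # 2. Merge the new entry, preserving the non-updated bytes of the line.
    rel = offset - line_start
    line[rel:rel + len(entry)] = entry

    # 3. Write the entire updated cache line back into system memory.
    memory[line_start:line_end] = line

    return {"bytes_read": CACHE_LINE_BYTES, "bytes_written": CACHE_LINE_BYTES}


if __name__ == "__main__":
    system_memory = bytearray(128)               # two cache lines of zeros
    status = bytes([0xAB]) * STATUS_ENTRY_BYTES  # illustrative status entry
    print(partial_line_write(system_memory, 72, status))
```

In this model, delivering an 8-byte entry moves a full 64-byte line in each direction, which is the cost the passage above attributes to partial cache line writes.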
Because a memory read operation, a data merge operation, and a memory write operation are required, partial cache line writes are particularly costly in terms of overall system efficiency and bus bandwidth. Partial cache line writes also disturb the normal pattern of memory access operations. In normal operation of the host system, the host processor reads and writes full cache lines. For a CPU read, if the desired data is not in the cache, then the memory/cache controller reads it from the system memory and puts it in the cache. The CPU writes data into the cache, and the cache data may be subsequently copied to the system memory when the cache is flushed to make room for other data. This type of CPU/memory interaction typically results in a stream of memory reads and a stream of memory writes. However, the read/modify/write operations associated with a partial cache line write from a peripheral disturb the normal pattern of read and write streams. This interruption in the normal data flow between the host processor and the system memory adversely impacts the system bandwidth. Therefore, it is desirable to minimize or avoid partial cache line write operations from the peripheral to the system memory.
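The cost can be put in rough numbers with a short calculation. The following is a sketch under stated assumptions: each status entry is written separately as its own partial cache line write, the controller does not coalesce writes, and each partial write moves one full line in each direction. The function name partial_write_traffic and its defaults (8-byte entries, 64-byte lines, taken from the example sizes above) are hypothetical.

```python
def partial_write_traffic(num_entries: int,
                          entry_bytes: int = 8,
                          line_bytes: int = 64) -> tuple:
    """Estimate memory traffic when every entry is a partial line write.

    Returns (payload_bytes, bytes_moved, amplification), where
    bytes_moved counts one full-line read plus one full-line write
    per entry (an assumed, simplified cost model).
    """
    payload_bytes = num_entries * entry_bytes
    bytes_moved = num_entries * 2 * line_bytes
    return payload_bytes, bytes_moved, bytes_moved / payload_bytes


if __name__ == "__main__":
    # Eight 8-byte entries fill exactly one 64-byte cache line.
    print(partial_write_traffic(8))
```

Under these assumptions, filling a single 64-byte line with eight separately written entries moves 1024 bytes to and from system memory for 64 bytes of status information.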
One possible solution to avoid partial cache line writes is to make each status entry as large as the cache line. However, this approach results in large amounts of wasted memory. Accordingly, there remains a need for improved data transfer methods and systems to facilitate improved bus bandwidth utilization and to improve system efficiency in transferring data from peripherals to a host memory in computer systems.
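The memory wasted by padding each status entry to a full cache line can be estimated with a short calculation. This is an illustrative sketch only; the function name padding_waste and the entry count used in the example are hypothetical, and the 8-byte and 64-byte sizes are the example values given earlier.

```python
def padding_waste(entry_bytes: int, line_bytes: int, num_entries: int) -> tuple:
    """Return (wasted_bytes, wasted_fraction) when each entry is padded
    out to occupy one full cache line."""
    if entry_bytes > line_bytes:
        raise ValueError("entry is larger than a cache line")
    wasted_per_entry = line_bytes - entry_bytes
    wasted_bytes = wasted_per_entry * num_entries
    wasted_fraction = wasted_per_entry / line_bytes
    return wasted_bytes, wasted_fraction


if __name__ == "__main__":
    # Hypothetical queue of 1024 status entries.
    print(padding_waste(8, 64, 1024))
```

With 8-byte entries padded to 64-byte lines, 56 of every 64 bytes hold no status information, so seven eighths of the status area is unused.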