This invention relates to read prefetch buffers in PCI bus systems, which contain blocks of prefetched data to be read, and, more particularly, to flushing unread stale data from the PCI bus system prefetch buffer to prevent the reading of the stale data.
The Peripheral Component Interconnect (PCI) bus system is a high-performance expansion bus architecture which offers a low latency path employing PCI bridges through which a host processor may directly access PCI devices. In a multiple host environment, a PCI bus system may include such functions as data buffering and PCI central functions such as arbitration over usage of the bus system.
The incorporated ""610 application describes an example of a complex PCI bus system for providing a connection path between a secondary PCI bus, to which are attached a plurality of hosts or host channel adapters, and at least one primary PCI bus, to which are attached a plurality of peripheral devices, and allows the prefetching of data for read transactions of multiple channel adapters in parallel. The incorporated ""610 application additionally defines many of the terms employed herein, and such definitions are also available from publications provided by the PCI Special Interest Group, and will not be repeated here.
Major computer systems may employ PCI bus systems to provide fast data storage from hosts, such as network servers, via channel adapters and the PCI bus system, to attached data storage servers having storage devices, cache storage, or non-volatile cache storage.
A channel adapter (an adapter coupling a host system to a secondary PCI bus) attempts to read large amounts of data at once from a data storage or memory device (a non-volatile store or a data storage device adapter or controller processor coupled to a primary PCI bus), such as a transaction for a contiguous string of a number of blocks of data. The PCI bus architecture does not define the total amount of data to be accessed in an operation, but does require that read operations be broken up to allow access by other agents on the bus. This is because read operations require time for the command to pass through the bus system, time to access the data at its source, and time for passage of the data back through the bus system. Allowing a read operation to monopolize the bus for a sequence of reads until the total amount of data had been accessed would be very inefficient and would substantially reduce the effective bandwidth of the PCI bus system.
Hence, the PCI bus architecture, e.g., of the incorporated '610 application, limits access to the PCI bus. Typically, a read operation requires a delay before the data is accessed from the source device and supplied to the prefetch buffer. Additionally, not all of the data may be accessed in a single transaction, so that only a portion of the data is loaded into the prefetch buffer, leading to a further delay. Hence, the requesting agent is disconnected after the delay exceeds a predetermined time, and another agent is allowed access, even though only part of the data is loaded in the prefetch buffer.
As discussed in the incorporated ""610 application, a problem was that, while an original agent was disconnected to allow access by another agent, a read operation by the other agent would be for data having a different address, and the PCI bus system would flush any as yet unread prefetched original data as having an incorrect address, so as to make room for the desired data of the other agent. Thus, in the incorporated ""610 application, read transactions of multiple hosts (via channel adapters) are allowed to be processed in parallel by allowing prefetched data blocks for different requesting agents to be stored in different parallel FIFO buffers without being flushed. The prefetched data remains buffered for later retry of a read request by the original requesting agent even if another agent issues an intervening read request. After being disconnected, the original agent then retries the read access at its allocated parallel buffer, and will be granted access after all the other requesting agents coupled to the secondary bus have been granted access. The prefetch buffer allocated to the original agent will retain the data that had been incompletely loaded and will continue to be loaded as the data is accessed. Upon the grant of one of the retries by the original agent, and completion of building the block of data, the block is transferred to the requesting agent.
As discussed above, the PCI bus architecture does not define the total amount of data to be accessed in an operation, but does require that read operations be broken up to allow access by other agents on the bus. Thus, the number of blocks of the complete transaction remaining to be prefetched and read must be continuously tracked. In order to track the blocks of data that are accessed to build the full amount of requested data, a prefetch count is established by a prefetch counter and stored in a storage memory. The prefetch counter decrements a remainder count as blocks of data are accessed, updating the remainder count stored in the storage memory. As one example, the count comprises the number of 512-byte blocks remaining to be read for the complete transaction, and is decremented by one as each 512-byte block is read. Upon completion of the complete read transaction, the remainder count is "zero", and the FIFO is empty.
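The remainder-count bookkeeping described above can be sketched as follows. The type and function names are illustrative, not drawn from the '610 application, and a 512-byte block size is assumed:

```c
/* Sketch of the prefetch remainder-count tracking; names are
   illustrative, not from the '610 application. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SIZE 512u  /* assumed bytes per prefetched block */

typedef struct {
    uint32_t remainder;  /* blocks still to be read for the transaction */
} prefetch_count_t;

/* Posted to the storage memory when the read command establishes
   the prefetch count. */
static void post_count(prefetch_count_t *c, uint32_t total_blocks) {
    c->remainder = total_blocks;
}

/* Decrement the remainder as each block is read from the FIFO;
   returns true when the complete transaction has been read. */
static bool block_read(prefetch_count_t *c) {
    assert(c->remainder > 0);
    c->remainder--;
    return c->remainder == 0;
}
```

When `block_read` returns true, the remainder count is zero and the FIFO is empty, matching the completion condition described above.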
However, the requesting host is often working with and updating the data as it is being read. Further, in PCI bus systems, writes are processed much more quickly than reads, in view of the wait time for reads. Thus, a first prefetch may load data into the FIFO, and a parallel write operation by the requesting agent may then update that data at the source before the now stale copy has been read from the FIFO. It is thus possible that the requesting agent will issue a new read request for the updated data using the same beginning address as the stale prefetched data in the FIFO. Believing that the prefetched data is the updated data, the requesting agent will read the stale data instead.
Further, the desired data may not end on the same boundary as the prefetched data, and the agent will only read the desired data, leaving the unread prefetched data still in the prefetch buffer. The requesting agents typically process data in contiguous sequence, and the original agent may subsequently request data starting at the address of the data remaining in the prefetch buffer. This data will also be stale and may have been updated by another host or by the same host via another channel adapter.
One approach for flushing stale data from a prefetch buffer was to perform an extra read before the actual read. The extra read would have to be at an unrelated address to throw out any stale data that might have the desired address. However, as discussed above, read operations are notoriously slow, and, as a result, the actual read is slowed by an extra read operation, hurting the performance of the PCI bus system.
An object of the present invention is to flush unread stale data from a PCI bus system prefetch buffer efficiently, avoiding performance degradation resulting from extra reads.
A system and method are disclosed for flushing stale data from a read prefetch buffer of a PCI bus system which transfers data in the form of data streams comprising a plurality of contiguous blocks. The PCI bus system comprises a plurality of PCI busses, at least one PCI data destination (such as a channel adapter) coupled to a first of the plurality of PCI busses, and at least one PCI data source (such as a data storage device) coupled to a second of the plurality of PCI busses. A prefetch buffer stores the blocks of data read from the PCI data source at a prefetch location, the blocks associated as one of the data streams in response to a read command. The prefetch buffer location may comprise one of a plurality of parallel buffers that was assigned at initialization to the requesting data destination (channel adapter). A prefetch counter posts the number of blocks of the data stream to be read in response to the read command and transferred from the PCI data source to the PCI data destination. The prefetch counter posts the prefetch count at a storage location of a storage memory, the storage location being mapped, also at initialization, to the prefetch location in the prefetch buffer.
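The initialization-time mapping between storage-memory locations (where prefetch counts are posted) and the parallel prefetch buffers can be sketched as follows. All names, the buffer count, and the address layout are illustrative assumptions:

```c
/* Sketch of the mapping from prefetch-count storage locations to
   parallel prefetch buffers; names and sizes are illustrative. */
#include <assert.h>
#include <stdint.h>

#define NUM_BUFFERS 4u  /* assumed parallel FIFOs, one per channel adapter */

typedef struct {
    uint32_t count_addr[NUM_BUFFERS];  /* storage location per buffer */
} count_map_t;

/* At initialization, each prefetch buffer is mapped to the storage
   location where its prefetch count will be posted. */
static void map_init(count_map_t *m, uint32_t base) {
    for (uint32_t i = 0; i < NUM_BUFFERS; i++)
        m->count_addr[i] = base + i * (uint32_t)sizeof(uint32_t);
}

/* Given the storage address of a prefetch count write, recover the
   prefetch buffer it is mapped to; NUM_BUFFERS means "not mapped". */
static uint32_t buffer_for_addr(const count_map_t *m, uint32_t addr) {
    for (uint32_t i = 0; i < NUM_BUFFERS; i++)
        if (m->count_addr[i] == addr)
            return i;
    return NUM_BUFFERS;
}
```

This lookup is what lets a write of a new prefetch count identify, and hence flush, the buffer holding any stale data for that destination.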
The system for flushing stale data from the prefetch buffer comprises a transaction control key detector coupled to one of the PCI busses for sensing a unique identifier of a prefetch count write command. The transaction control loads the desired prefetch count into an addressed storage location in the storage memory. The key detector identifies the prefetch count write command by the unique identifier. Data path logic coupled to the key detector responds to the sensed unique identifier of the prefetch count write command, determining the prefetch location of the prefetch buffer as mapped from the addressed storage location of the prefetch count write command, and flushing any prefetch data at the determined prefetch location of the prefetch buffer.
In one embodiment, the prefetch count write command unique identifier comprises at least one bit of a PCI address of the prefetch count write command. The bit is outside the decode range of addresses employed for addressing the target storage memory in the PCI address, and the key detector senses the bit of the PCI address to identify the prefetch count write command.
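The key-detection idea of this embodiment can be sketched as follows: one address bit outside the storage memory's decode range tags a write as a prefetch count write command. The particular bit position and decode mask below are illustrative assumptions, not values from the specification:

```c
/* Sketch of key detection via an address bit outside the decode
   range; KEY_BIT and DECODE_MASK are illustrative values. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DECODE_MASK 0x0000FFFFu  /* bits that address the storage memory */
#define KEY_BIT     (1u << 24)   /* bit outside the decode range, used as key */

/* True when the PCI address carries the prefetch-count-write key. */
static bool is_count_write(uint32_t pci_addr) {
    return (pci_addr & KEY_BIT) != 0;
}

/* The storage location actually addressed, with the key stripped;
   the key bit does not disturb decoding of the target address. */
static uint32_t storage_addr(uint32_t pci_addr) {
    return pci_addr & DECODE_MASK;
}
```

Because the key bit lies outside the decode range, the same write both loads the count at the intended storage location and signals the data path logic to flush the mapped prefetch buffer, with no extra bus operation.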
For a fuller understanding of the present invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings.