Commonly assigned U.S. Pat. application Ser. No. 09/275,610 is incorporated for its showing of a PCI bus bridge system for processing requests from multiple attached agents.
This invention relates to read prefetch buffers in PCI bus systems which transfer blocks of prefetched data to be read, and, more particularly, to the use of a prefetch counter in tracking and controlling the data at the PCI bus system prefetch buffers.
The Peripheral Component Interconnect (PCI) bus system is a high-performance expansion bus architecture which offers a low latency path employing PCI bridges through which a host processor may directly access PCI devices. In a multiple host environment, a PCI bus system may include such functions as data buffering and PCI central functions such as arbitration over usage of the bus system.
The incorporated '610 application describes an example of a complex PCI bus system for providing a connection path between a secondary PCI bus, to which are attached a plurality of hosts or host channel adapters, and at least one primary PCI bus, to which are attached a plurality of peripheral devices, and allows the prefetching of data for read transactions of multiple channel adapters in parallel. The incorporated '610 application additionally defines many of the terms employed herein, and such definitions are also available from publications provided by the PCI Special Interest Group, and will not be repeated here.
Major computer systems may employ PCI bus systems to provide fast data storage from hosts, such as network servers, via channel adapters and the PCI bus system, to attached data storage servers having storage devices, cache storage, or non-volatile cache storage.
A channel adapter (an adapter coupling a host system to a secondary PCI bus) attempts to read large amounts of data at once from a data storage or memory device (a non-volatile store or a data storage device adapter or controller processor coupled to a primary PCI bus), such as transactions of a contiguous string of a number of blocks of data. The PCI bus architecture does not define the total amount of data to be accessed in an operation, and balances the need for high throughput for a given channel adapter with the need for low latency between access by other agents by requiring that read operations be broken up. This is because read operations require time for the command to pass through the bus, time to access the data at the source of the data, and time for passage of the data back through the bus system. To allow a read operation to monopolize the bus to complete access to the total amount of the data at one time would be very inefficient and would substantially reduce the effective bandwidth of the PCI bus system.
One method for reducing latency is the prefetch operation in which data is read from a data source in anticipation that a requesting data destination will need the data. The data is read from the data source and stored in a prefetch buffer and then is read from the prefetch buffer by the requesting data destination device. Typically, the original agent requesting data is disconnected from the PCI bus after the data request and during the prefetch operation, and another agent is granted access to the PCI bus.
As discussed in the incorporated '610 application, a problem was that, while an original agent was disconnected to allow access by another agent, a read operation by the other agent would be for data having a different address, and the PCI bus system would flush any as yet unread prefetched original data as having an incorrect address, so as to make room for the desired data of the other agent. In the incorporated '610 application, therefore, a plurality of parallel FIFO buffers are provided, so that read transactions of multiple agents, such as channel adapters, may be processed in parallel, with prefetched data blocks for different requesting agents stored in different parallel FIFO buffers without being flushed. The prefetched data remains buffered for later retry of a read request by the original requesting agent even if another agent issues an intervening read request. After being disconnected, the original agent then retries the read access at its allocated parallel buffer, and will be granted access after all the other requesting agents coupled to the secondary bus have been granted access. The prefetch buffer allocated to the original agent will retain the data that had been incompletely loaded and will continue to be loaded as the data is accessed. Upon the grant of one of the retries by the original agent, and completion of the major block of data, the major block is transferred to the requesting agent.
However, a prefetch operation for a lengthy stream of data may still tend to monopolize the secondary PCI bus. Hence, the PCI bus architecture, e.g., of the incorporated '610 application, may limit single read data transfers to major blocks of data, such as eight prefetch requests, each of small blocks or groups of 512 bytes, or a total of 4K bytes.
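The major block arithmetic above may be illustrated by a minimal sketch, which assumes the example sizes given (eight blocks of 512 bytes each). The constant and function names are hypothetical and do not appear in the '610 application.

```python
# Example sizes taken from the text; the names are illustrative assumptions.
SMALL_BLOCK_BYTES = 512
BLOCKS_PER_MAJOR_BLOCK = 8
MAJOR_BLOCK_BYTES = SMALL_BLOCK_BYTES * BLOCKS_PER_MAJOR_BLOCK  # 4096 bytes (4K)


def split_into_major_blocks(total_blocks, blocks_per_major=BLOCKS_PER_MAJOR_BLOCK):
    """Return the number of small blocks carried by each single read transfer."""
    transfers = []
    remaining = total_blocks
    while remaining > 0:
        # A single transfer carries at most one major block.
        size = min(blocks_per_major, remaining)
        transfers.append(size)
        remaining -= size
    return transfers
```

Under these assumptions, a stream of 20 small blocks would be carried as two full major blocks followed by a partial one of four blocks.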
In the prior art, the major block limit on a read prefetch may be implemented by the channel adapter, which counts the data received and stops the read request, or may be implemented by the PCI bus arbiter, which counts the transferred data and terminates the grant. In either case, a prefetch counter is required to track the amount of data remaining to be prefetched and read.
A problem is that the channel adapter typically queues a series of operations for the host system, and, if one operation is interrupted, the channel adapter sequences to the next operation, only later returning to the original operation. The incorporated '610 application handles this problem by assigning address windows to the sources of data attached to the bus system. Thus, a separate prefetch counter may be provided for each read operation of a channel adapter, as based on the address window of that operation. Typically, during the prefetch operation, the original requesting agent will access the prefetch buffer and begin reading the contents of the prefetch buffer. If the prefetch and the read operations are conducted smoothly and continuously, after a 512 byte increment is read, a new 512 byte prefetch is begun, maintaining the queued 2K bytes of prefetched data. This will continue until all the data that has been set up in a prefetch counter has been prefetched, or the read operation is interrupted.
However, typically, the prefetch has loaded data into the prefetch buffer when the read operation is interrupted, leaving part of the data loaded in the allocated prefetch buffer.
Also, the read operation requires a delay before the data is accessed from the source device and supplied to the prefetch buffer. Additionally, not all of the data comprising a major block of data may be accessed in a single transaction, and only a portion of the data is loaded into the prefetch buffer, leading to a delay. Hence, the requesting agent is disconnected after the delay exceeds a predetermined time, and another agent is allowed access, even though part of the data is loaded in the prefetch buffer.
The problem this creates is that the channel adapter may start another read operation at the same prefetch buffer. This will cause the prefetch logic to flush all the queued and unread data and start prefetching data for the new read operation. The adapter will get prefetched data for the new read operation, and then may continue with the previous read. Again, the prefetch logic will flush all the queued data and start prefetching data that was prefetched previously. As can be seen, this thrashing causes unwanted read prefetch operations and results in reduced efficiency.
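The thrashing described above may be illustrated by a simplified model, offered only as a sketch. The class `SingleBuffer`, its fields, and the stream labels "A" and "B" are hypothetical, and the model assumes that a buffer holds data for one read operation at a time and flushes all queued, unread blocks whenever a different read operation arrives.

```python
class SingleBuffer:
    """Toy model of one prefetch buffer shared by interleaved read operations."""

    def __init__(self):
        self.stream = None   # read operation whose data is currently queued
        self.queued = 0      # blocks prefetched but not yet read
        self.flushed = 0     # blocks prefetched and then discarded unread

    def start_read(self, stream, prefetched_blocks):
        """Begin or resume a read that is interrupted after some prefetching."""
        if self.stream is not None and self.stream != stream:
            # Different read operation: the queued data has the wrong address.
            self.flushed += self.queued
            self.queued = 0
        self.stream = stream
        self.queued += prefetched_blocks


def wasted_blocks(sequence, blocks_per_attempt=3):
    """Count blocks flushed unread for a sequence of interrupted reads."""
    buf = SingleBuffer()
    for stream in sequence:
        buf.start_read(stream, blocks_per_attempt)
    return buf.flushed
```

In this model, alternating between two read operations discards every block queued by the preceding attempt, whereas repeated attempts at the same read operation discard nothing.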
As an alternative, each read could be set for only a major block of data and a new prefetch count established for each new read operation. However, establishing a prefetch count is itself an added transaction in which the prefetch count is written to a storage location of a prefetch counter. As can be seen, the unwanted transactions of establishing a new prefetch count for each major block also result in reduced efficiency.
An object of the present invention is to increase the efficiency of read operations relating to data streams of a plurality of contiguous blocks by reducing unwanted read prefetches.
A system and method are disclosed for tracking and controlling the prefetching of blocks of a data stream stored at a prefetch buffer in a PCI bus system. The PCI bus system has a plurality of PCI busses, and at least one PCI data destination which requests the read operation is coupled to a first of the plurality of PCI busses. Additionally, at least one PCI data source is coupled to a second of the plurality of PCI busses, and a prefetch buffer is provided for storing blocks of data prefetched from the PCI data source in response to a read command by the requester. The blocks of the data stream are stored at the prefetch buffer for transfer to the data destination, and the data comprises a data stream that is capable of being grouped into major blocks comprising a fixed plurality of contiguous blocks.
First and second associated prefetch count storage locations are provided for storing, respectively, first and second counts. Prefetch initialization logic initializes the first count which represents the number of blocks of data comprising up to a major block of the data, and no more than the total number of the blocks of the data stream. The prefetch initialization logic sets the second count as representing the total number of the blocks of the data stream to be prefetched and stored in the prefetch buffer, less the initialized number of blocks of the first count.
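The initialization of the two counts may be sketched as follows. The function name `init_counts` and the default major block size of eight blocks are assumptions made for illustration, the latter following the example sizes given in the background.

```python
def init_counts(total_blocks, major_block=8):
    """Return (first_count, second_count) for a stream of total_blocks blocks.

    first_count:  blocks in the major block about to be prefetched, capped at
                  both the major block size and the total stream length.
    second_count: blocks of the stream remaining after that major block.
    """
    first_count = min(major_block, total_blocks)
    second_count = total_blocks - first_count
    return first_count, second_count
```

For example, under these assumptions a stream of 20 blocks would be initialized with a first count of 8 and a second count of 12, while a stream of 5 blocks would be initialized with a first count of 5 and a second count of 0.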
A prefetch counter is coupled to the first and the second prefetch count storage locations. As each block of data is prefetched and stored at the prefetch buffer, the prefetch counter decrements the first count by a number representing the block of data. Prefetch count logic, coupled to the prefetch counter, responds to the prefetch counter decrementing the first count to zero by stopping the prefetch, allowing completion of the transfer of the prefetched stored data to the data destination. Thus, the second count represents the next remaining number of the blocks of the data stream to be prefetched, stored and transferred.
The requester can thus maintain its request for the data at the end of one major block, knowing that the next major block will be prefetched. The requester will therefore avoid requesting a different read operation and will avoid unnecessary prefetches.
Additionally, the prefetch counter, upon the completion of the transfer of the prefetched stored data to the data destination, refreshes the first count representing the number of the blocks of data comprising up to a major block of the data, and no more than the remaining second count. Then, the prefetch counter decrements the second count by the first count, the second count thereby representing the next remaining number of the blocks to be prefetched and stored. Then, again, upon the PCI bus system prefetching and storing each block of data at the prefetch buffer, the prefetch counter decrements the first count by a number representing the block of data. The prefetch count logic, again, upon the prefetch counter decrementing the first count to zero, stops the prefetch, allowing completion of the transfer of the prefetched stored data to the data destination.
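The decrement, stop, and refresh sequence described above may be modeled in software as in the following sketch. It is not the disclosed hardware; the class `PrefetchCounter`, its method names, and the helper `run_stream` are hypothetical, and a major block of eight blocks is assumed, with each prefetched block decrementing the first count by one.

```python
class PrefetchCounter:
    """Software model of the first and second prefetch counts."""

    def __init__(self, total_blocks, major_block=8):
        self.major_block = major_block
        # Initialization: first count is up to one major block,
        # second count is the remainder of the stream.
        self.first = min(major_block, total_blocks)
        self.second = total_blocks - self.first

    def block_prefetched(self):
        """Decrement the first count; return True when the prefetch must stop."""
        if self.first == 0:
            raise RuntimeError("prefetch attempted with first count at zero")
        self.first -= 1
        return self.first == 0

    def transfer_complete(self):
        """Refresh the first count from the second; return True if data remains."""
        self.first = min(self.major_block, self.second)
        self.second -= self.first
        return self.first > 0


def run_stream(total_blocks, major_block=8):
    """Return the number of blocks prefetched in each major block transfer."""
    counter = PrefetchCounter(total_blocks, major_block)
    bursts = []
    more = counter.first > 0
    while more:
        fetched = 0
        while True:
            fetched += 1
            if counter.block_prefetched():
                break  # first count reached zero: stop the prefetch
        bursts.append(fetched)
        more = counter.transfer_complete()
    return bursts
```

In this model the prefetch count is written once for the whole stream, and each later major block is set up by the refresh step rather than by a new count-establishing transaction.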
For a fuller understanding of the present invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings.