The invention relates to an I/O (Input/Output) controller and to a method for operating an I/O controller. The I/O controller is coupled to a processing unit, e.g., a CPU, and to a memory. The I/O controller includes an I/O link interface, an address translation unit and an I/O packet processing unit.
Following the trend toward virtualization in processor cores, virtualization is increasingly being adopted in the I/O space as well. Network adapters increasingly provide user-level-like, queue-based interfaces to their consumers, mainly so that each virtual machine running on the system has at least one private queue for interacting with the network device. Together, these trends make I/O virtualization support in the I/O root complex, which is usually a Peripheral Component Interconnect (PCI) Express root complex, increasingly important. This requires the PCI Express Host Bridge (PHB) to provide address translation capabilities, such that different physical or virtual functions of a device can each access their own virtual address space safely. Address translation becomes more challenging as PCI Express line speeds increase and as I/O devices use high parallelism, which creates little spatial locality in the requests from the device and thus increases the pressure on the root complex address translation unit.
At the same time, the translation caches of the root complex need to be small so that multiple root complexes can fit on a processor to support a large number of links with different link configurations. The caches also cannot easily be shared between PHBs, as the attached devices usually do not share the same virtual domains and therefore require their own translations and caches. In addition, as mentioned above, virtualized devices in general show little of the spatial and temporal locality that would improve the efficiency of the translation unit cache.
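The pressure that interleaved requests place on a small translation cache can be illustrated with a toy model. The following Python sketch is not taken from the described system: the class TranslationCache, the function hit_rate, the LRU replacement policy, the 4 KiB page size, and the round-robin request pattern are all assumptions chosen only for illustration.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # assumption: 4 KiB pages


class TranslationCache:
    """Hypothetical LRU cache keyed by (function id, virtual page number)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, function_id, address):
        key = (function_id, address >> PAGE_SHIFT)
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self.entries[key] = True  # stands in for a translation fetched from memory
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return False


def hit_rate(num_functions, requests_per_function, capacity, request_size=256):
    """Interleave sequential requests from several functions, round-robin."""
    cache = TranslationCache(capacity)
    for i in range(requests_per_function):
        for function_id in range(num_functions):
            cache.lookup(function_id, i * request_size)
    return cache.hits / (cache.hits + cache.misses)


if __name__ == "__main__":
    print(hit_rate(num_functions=1, requests_per_function=64, capacity=4))
    print(hit_rate(num_functions=8, requests_per_function=64, capacity=4))
    print(hit_rate(num_functions=8, requests_per_function=64, capacity=8))
```

In this model a single function streaming through its buffer hits the cache on most requests, whereas eight interleaved functions sharing a four-entry cache evict each other's translations before they can be reused. Real devices and translation units behave in more complex ways; the sketch only shows why interleaved virtual domains reduce the locality seen by the cache.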
U.S. Pat. No. 7,487,297 B2 describes a method and an apparatus for performing just-in-time data prefetching within a data processing system comprising a processor, a cache or prefetch buffer, and at least one memory storage device. The apparatus comprises a prefetch engine having means for issuing a data prefetch request for prefetching a data cache line from the memory storage device for utilization by the processor. The apparatus further comprises logic/utility for dynamically adjusting a prefetch distance between issuance by the prefetch engine of the data prefetch request and issuance by the processor of a demand (load request) targeting the data/cache line being returned by the data prefetch request, so that a next data prefetch request for a subsequent cache line completes the return of the data/cache line at effectively the same time that a demand for that subsequent data/cache line is issued by the processor.
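The general idea of adjusting a prefetch distance so that prefetched data arrives at about the time it is demanded can be sketched as follows. This Python sketch is not the method of the cited patent: the functions adjust_distance and simulate, the step-by-one adjustment policy, the slack parameter, and the simplified timing model (fixed memory latency, one demand per fixed interval, no processor stalls) are assumptions made purely for illustration.

```python
def adjust_distance(distance, prefetch_ready_time, demand_time,
                    slack=0, min_distance=1, max_distance=64):
    """Return a new prefetch distance (in cache lines) after one observation.

    Hypothetical policy: grow the distance when the prefetch returned after
    the demand was issued, shrink it when the data arrived more than `slack`
    cycles early, and otherwise leave it unchanged.
    """
    if prefetch_ready_time > demand_time:
        return min(distance + 1, max_distance)  # prefetch was late
    if demand_time - prefetch_ready_time > slack:
        return max(distance - 1, min_distance)  # prefetch was too early
    return distance


def simulate(latency, consume_interval, lines=100, start_distance=1):
    """Run the policy over a stream of sequential cache-line demands."""
    distance = start_distance
    for n in range(lines):
        demand_time = n * consume_interval
        # The prefetch for line n is issued when line (n - distance) is demanded.
        issue_time = max(0, (n - distance) * consume_interval)
        ready_time = issue_time + latency
        distance = adjust_distance(distance, ready_time, demand_time,
                                   slack=consume_interval - 1)
    return distance


if __name__ == "__main__":
    # Assumed values: 100-cycle memory latency, one line consumed every 10 cycles.
    print(simulate(latency=100, consume_interval=10))
```

Under these assumptions the distance settles at the smallest number of lines whose consumption time covers the memory latency, which is the condition under which a prefetch completes just as the corresponding demand is issued.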
JP 2010-217992 shows a cache controller, a cache control method and a cache control program.
Further, timing local streams for improving timeliness in data prefetching are described in the reference "Timing Local Streams: Improving Timeliness in Data Prefetching" by Huaiyu Zhu, Yong Chen and Xian-He Sun, Department of Computer Science, Illinois Institute of Technology, Chicago, Ill. 60616.