In typical complex data processing systems, multiple sub-units of the system perform many different processing tasks or operations at any one time. Usually, all operations of the system are under the control of a host processor, or central processing unit (CPU), that runs a particular operating system (OS). The CPU communicates with various input/output (IO) units of the system through one or more buses, such as a peripheral component interconnect (PCI) bus or a PCI Express (PCIE) bus.
The host processor manages operations in the system, including the execution of various tasks by various IO units. The IO units must exchange a considerable amount of information with the host processor in a manner dictated by the OS. For example, when a task executing on an IO unit becomes stalled or interrupted, perhaps because particular data is unavailable, the IO unit generates an interrupt to the host processor in such a way that the host processor can identify the IO unit and the cause of the interrupt, “fix” the cause of the interrupt, and signal the IO unit that the problem is fixed and that it can proceed with the task. Traditionally, an IO unit signals an interrupt by writing a value to an interrupt status register. According to one scheme, an interrupt status register may contain multiple bits, any one of which signals an interrupt when it holds a particular value, such as a logic “1” (also referred to as a logic high). The values in the interrupt status register are OR'd, and if any of the values is high, an interrupt signal is sent to the host processor, which must perform a register read to determine which bit or bits are high. The host processor must also perform a register write to clear interrupts after fixing their causes. This interrupt scheme is sometimes referred to as level-based, because the interrupt bit and line remain at the level set by the IO unit until the host processor clears the interrupt.
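The level-based scheme described above can be illustrated with a minimal Python sketch. The class name, the 32-bit width, and the handler table are hypothetical choices made for illustration only; they do not describe any particular hardware or driver interface.

```python
class LevelInterruptRegister:
    """Toy model of a level-based interrupt status register (assumed 32 bits wide)."""

    WIDTH = 32

    def __init__(self):
        self.status = 0

    def raise_interrupt(self, bit):
        """IO-unit side: set one status bit to logic high."""
        if not 0 <= bit < self.WIDTH:
            raise ValueError("bit index out of range")
        self.status |= 1 << bit

    def line_asserted(self):
        """The interrupt line is the OR of all status bits."""
        return self.status != 0

    def read(self):
        """Host side: register read to find which bits are high."""
        return self.status

    def clear(self, mask):
        """Host side: register write that clears the bits selected by mask."""
        self.status &= ~mask & ((1 << self.WIDTH) - 1)


def service(reg, handlers):
    """Hypothetical ISR: read the register, fix each cause, then clear."""
    pending = reg.read()
    serviced = []
    for bit in range(reg.WIDTH):
        if pending & (1 << bit):
            handlers.get(bit, lambda: None)()  # "fix" the cause, if a handler exists
            serviced.append(bit)
    reg.clear(pending)
    return serviced


reg = LevelInterruptRegister()
reg.raise_interrupt(3)
reg.raise_interrupt(17)
serviced_bits = service(reg, {})
```

Note that the line stays asserted from the first `raise_interrupt` call until `clear` runs, which is the property that gives the scheme its name.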
IO units are often sophisticated special-purpose processors such as graphics processing units (GPUs). As greater processing speeds are demanded for complex functions, such as graphics processing or video processing, techniques for maximizing the use of system resources constantly evolve. System resources include processing units (such as GPUs) and memory. One common technique for maximizing the use of processing system resources is multi-threaded processing, which requires context switching between tasks. When a resource capable of context switching encounters an event that pauses or stalls one task, it stores the current state of the stalled task (also referred to as switching out the current context) and proceeds with another task (or switches in another context) while waiting for the cause of the stall to be cured, or fixed. The host processor must be informed of the stall, for example by the resource issuing an interrupt to the host. The host is responsible for fixing the cause of the interrupt and for informing the resource that it can switch the previous context back in and continue with the operation that had been interrupted. Such interrupts are not level-based, but are typically signaled by a pulse. In addition, such interrupts do not require storage of interrupt status information in a register, or register reads and writes by the host processor.
In a context-switching environment, the traditional register-based signal-clear mechanism no longer works, because the interrupt status register becomes context-sensitive. Consider the following scenario: an IO unit client under a particular context generates an interrupt; a moment later the context is switched away. When an interrupt service routine (ISR) of the host comes to service the interrupt, it reads the interrupt status register, which now belongs to a different context. Such a scenario gives rise to race conditions. An interrupt delivery mechanism that is functional in context-switching systems is thus required. One mechanism for supporting context-based interrupts over a PCIE bus is called message signaled interrupt, or MSI. Another such mechanism is MSI-X, which is an extended version of MSI. MSI allows up to 32 different interrupt sources to have their own unique messages without the need for the ISR to read hardware registers. MSI-X expands the number of interrupt vectors to up to 2048.
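The race in the scenario above can be sketched as follows. This is a simplified model under stated assumptions: the class, the context names "A" and "B", and the `(context, bit)` message tuple are all hypothetical, and the message queue is only loosely inspired by message-style delivery; it is not the MSI or MSI-X wire format.

```python
from collections import deque


class ContextSwitchingUnit:
    """Toy IO unit whose visible status register follows the active context."""

    def __init__(self):
        self.saved_status = {}   # context id -> register value saved at switch-out
        self.current = None      # context currently switched in
        self.status = 0          # register as seen by the host ISR
        self.messages = deque()  # hypothetical message-style delivery path

    def switch_in(self, ctx):
        """Switch out the current context (saving its status) and switch in ctx."""
        if self.current is not None:
            self.saved_status[self.current] = self.status
        self.current = ctx
        self.status = self.saved_status.get(ctx, 0)

    def raise_interrupt(self, bit):
        """Record the interrupt in the register and also as a self-describing message."""
        self.status |= 1 << bit
        self.messages.append((self.current, bit))


unit = ContextSwitchingUnit()
unit.switch_in("A")
unit.raise_interrupt(2)      # context A raises an interrupt
unit.switch_in("B")          # context is switched away before the ISR runs

register_view = unit.status              # ISR reads context B's register
message_view = unit.messages.popleft()   # message still identifies context A
```

In this sketch the register read returns context B's state, so the interrupt raised under context A is invisible to the ISR, whereas the message carries the originating context with it.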
However, traditional level-based interrupts should still be supported for transitional systems or system components that support only level-based interrupts and not MSI or MSI-X.
To support context-switchable environments, one cannot associate a fixed vector with a given interrupt source. One reason is that an interrupt source can be associated with different contexts. All interrupt events must be preserved, yet if an interrupt source is assigned to a fixed vector, a second interrupt can overwrite that vector before the ISR processes the first interrupt, which is associated with the first context.
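The overwrite hazard can be made concrete with a short sketch. The source name `dma_done`, the context labels, and both delivery functions are hypothetical and serve only to contrast a fixed per-source slot with a delivery path that keeps every event.

```python
def deliver_fixed(events):
    """One fixed slot per source: a later event on the same source overwrites the earlier one."""
    slots = {}
    for source, context in events:
        slots[source] = context
    return slots


def deliver_queued(events):
    """Every event is kept, in arrival order, until the ISR consumes it."""
    return list(events)


# The same source fires under two different contexts before the ISR runs.
events = [("dma_done", "ctx0"), ("dma_done", "ctx1")]
fixed = deliver_fixed(events)
queued = deliver_queued(events)
```

With the fixed slot, only the `ctx1` event survives and the `ctx0` event is lost; the queued form retains both.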
Another reason the traditional register-based mechanism is impractical for complex IO units, regardless of whether context switching occurs, is that the number of interrupt sources keeps growing. For example, the number of different interrupt sources for a complex GPU exceeds what a single traditional 32-bit interrupt status register can accommodate. If the ISR has to read the interrupt status register, it may need to read more than one register to locate the interrupt source. Register reads from software (the OS) are slow by nature, and the ISR may have hundreds of interrupts per second to service. This creates performance concerns.
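A rough sense of the cost can be had from the following sketch, which counts how many 32-bit status registers a given number of sources requires and how many register reads a simple linear scan performs before it finds a pending source. The function names and the source count of 100 are illustrative assumptions, not figures from any specific device.

```python
import math

REGISTER_WIDTH = 32  # bits per interrupt status register


def registers_needed(num_sources, width=REGISTER_WIDTH):
    """Number of status registers required to give each source its own bit."""
    return math.ceil(num_sources / width)


def locate_source(registers, width=REGISTER_WIDTH):
    """Scan registers in order; return (source index, number of register reads)."""
    reads = 0
    for index, value in enumerate(registers):
        reads += 1                                 # each iteration models one slow register read
        if value:
            bit = (value & -value).bit_length() - 1  # position of the lowest set bit
            return index * width + bit, reads
    return None, reads


count = registers_needed(100)                        # hypothetical unit with 100 sources
source, reads = locate_source([0, 0, 1 << 5, 0])     # pending bit sits in the third register
```

Here a unit with 100 sources needs four registers, and locating the pending source in the third register costs three reads.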
In the drawings, the same reference numbers identify identical or substantially similar elements or acts. To easily identify the discussion of any particular element or act, the most significant digit or digits in a reference number refer to the Figure number in which that element is first introduced (e.g., element 102 is first introduced and discussed with respect to FIG. 1).