The present invention relates in general to processing systems, and in particular to servicing of interrupts for a multiprocessor subsystem.
Many computer systems include co-processors that support various computationally intensive features, such as graphics processing. Co-processors generally operate as slaves that receive and execute commands from a driver program executing on a central processing unit (CPU) or other master processor of the computer system. Co-processors are often operated asynchronously; that is, the CPU issues a command to the co-processor and proceeds with other operations without waiting for the co-processor to execute the command. In the course of executing commands, a co-processor may require additional services from the system (e.g., from driver programs or other programs executing on the CPU). In that event, the co-processor sends an interrupt signal to the CPU to request the needed services.
Upon detecting an interrupt, the CPU invokes a critical-priority procedure to identify the source. For example, the CPU may call interrupt servicing routines (ISRs) of various device driver programs. An ISR, as is known in the art, may be implemented as a driver program function call that tests the hardware device for which the driver is responsible to detect an interrupt setting. The CPU may invoke the ISRs of various driver programs sequentially until one of the ISRs returns a signal indicating that the source has been identified or until all ISRs have been executed. (If all ISRs execute without detecting a source, the CPU may simply reset the interrupt and resume normal processing.) Typically, the CPU masks or disables all interrupts from all system components while the ISRs are executing. This effectively stalls any system component that generates an interrupt before the source of a previous interrupt has been identified.
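By way of illustration only, the sequential invocation of ISRs described above might be sketched as follows. This is a simplified software model, not an actual operating system mechanism; the names Device, make_isr, and dispatch_interrupt are hypothetical and are introduced solely for this example.

```python
from typing import Callable, List, Optional


class Device:
    """Hypothetical hardware device exposing an interrupt-pending setting."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.interrupt_pending = False


def make_isr(device: Device) -> Callable[[], bool]:
    """Build an ISR that tests the one device its driver is responsible for."""

    def isr() -> bool:
        # Return True to signal that this device is the interrupt source.
        return device.interrupt_pending

    return isr


def dispatch_interrupt(isrs: List[Callable[[], bool]]) -> Optional[int]:
    """Invoke ISRs in order until one identifies the source.

    Returns the index of the ISR that claimed the interrupt, or None if
    every ISR ran without detecting a source.
    """
    for index, isr in enumerate(isrs):
        if isr():
            return index
    return None
```

In this model, the time spent inside dispatch_interrupt corresponds to the period during which other interrupts would typically be masked, which is why each ISR does nothing beyond testing its own device.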
To minimize adverse effects on system performance, the ISRs provided in hardware device driver programs are usually designed to have minimal functionality. For example, an ISR may simply identify the source of the interrupt and instruct the operating system to schedule an appropriate procedure (known in the art as a deferred procedure call, or DPC) for servicing the interrupt, then exit. The DPC, which runs in accordance with operating system scheduling rules, services the interrupt without disabling interrupts from other system components.
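The division of labor between a minimal ISR and a DPC might be sketched as follows. This is an illustrative model under stated assumptions: DeferredQueue stands in for the operating system's scheduling of deferred procedure calls, and minimal_isr and service_device are hypothetical names, none of which correspond to a real operating system interface.

```python
from collections import deque
from typing import Any, Callable, Deque, List, Tuple


class DeferredQueue:
    """Hypothetical stand-in for the operating system's DPC scheduling."""

    def __init__(self) -> None:
        self._pending: Deque[Tuple[Callable[..., Any], tuple]] = deque()

    def schedule(self, dpc: Callable[..., Any], *args: Any) -> None:
        # Record the DPC for later execution; no servicing happens here.
        self._pending.append((dpc, args))

    def run_all(self) -> List[Any]:
        # Run scheduled DPCs in order, outside the critical-priority path.
        results = []
        while self._pending:
            dpc, args = self._pending.popleft()
            results.append(dpc(*args))
        return results


def service_device(device_name: str) -> str:
    """Hypothetical DPC that performs the actual interrupt servicing."""
    return f"serviced {device_name}"


def minimal_isr(device_name: str, interrupt_pending: bool,
                queue: DeferredQueue) -> bool:
    """Identify the source, schedule a DPC for it, then exit."""
    if not interrupt_pending:
        return False
    queue.schedule(service_device, device_name)
    return True
```

The ISR in this sketch returns as soon as the DPC has been scheduled, so the interval in which interrupts are disabled does not depend on how long the servicing itself takes.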
Recently, there has been increased interest in developing subsystems with multiple co-processors. For example, in the field of graphics processing, continually increasing demands for higher resolution and enhanced realism (e.g., for video games) have led to the development of graphics processing cards that incorporate multiple graphics processing units (GPUs). These GPUs operate in parallel to render an image.
In a multi-processor graphics subsystem, each GPU typically generates interrupts independently, which tends to increase the rate at which interrupts occur. For example, two GPUs will usually generate approximately twice as many interrupts per frame as one GPU. Further, when the GPUs are performing similar operations in parallel on different data, they tend to generate simultaneous, overlapping, or duplicate interrupts. Interrupts “overlap” when a second interrupt is generated before the first interrupt is serviced by a DPC. Interrupts are “duplicates” when two GPUs generate the interrupt for the same reason (e.g., both GPUs require the same executable code). As the number of GPUs increases, so does the number of interrupts and the likelihood of simultaneous, overlapping, or duplicate interrupts.
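The definitions of "overlap" and "duplicate" given above might be expressed as simple predicates, as in the following sketch. The Interrupt record and its fields (gpu, reason, raised_at, serviced_at) are assumptions made for this example; in particular, the use of numeric timestamps and a string reason code is illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interrupt:
    """Hypothetical record of one interrupt (all fields are assumptions)."""
    gpu: int            # index of the GPU that generated the interrupt
    reason: str         # why the interrupt was generated
    raised_at: float    # time at which the interrupt was generated
    serviced_at: float  # time at which a DPC serviced the interrupt


def overlaps(first: Interrupt, second: Interrupt) -> bool:
    """True if `second` is generated before `first` is serviced by a DPC."""
    return first.raised_at <= second.raised_at < first.serviced_at


def is_duplicate(a: Interrupt, b: Interrupt) -> bool:
    """True if two different GPUs generated the interrupt for the same reason."""
    return a.gpu != b.gpu and a.reason == b.reason
```

For instance, if two GPUs each request the same executable code and the second request arrives while the first is still awaiting its DPC, the pair satisfies both predicates.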
Conventional ISRs are not scalable to multi-GPU systems. For example, the same ISR is generally invoked regardless of which GPU generated a particular interrupt, so the ISR is required to identify which GPU generated the interrupt and schedule an appropriate DPC for that GPU. This increases the complexity, and therefore the execution time, of the ISR. Longer ISR execution time can have an adverse effect on overall system performance because it generally increases the likelihood that other system components will generate interrupts while the ISR is executing and be stalled. In addition, GPUs operating in parallel may tend to issue interrupts at around the same time; by the time the ISR finishes handling the first interrupt, it may immediately be needed again to handle another interrupt from another GPU. Since the ISR is invoked and executes at critical priority, delays in other (normal-priority) processing functions can be compounded, and system performance can deteriorate significantly.
Therefore, it would be desirable to provide an improved technology for handling interrupts from multiple co-processors in a more efficient, scalable manner.