Many modern consumer electronic devices are actually miniaturized computers. The hand-held smart phone, music player, tablet, video game system or other consumer electronic device that you may be carrying in your pocket or purse has more computing power than the so-called “mainframe” computers of a bygone era that occupied entire rooms. The trend of packing more computing power into smaller and smaller packages is likely to continue as more electronic components are miniaturized and placed onto the same integrated circuit.
In response to user demand, many computing devices now have advanced computer graphics generation and display capabilities. Such graphics capabilities permit the computing device to generate complex images in real time. From video streaming to video games, from geographical mapping to photography, graphics capabilities have become essential.
While it is theoretically possible for a general purpose central processing unit (“CPU”) to generate complex graphics given enough time, real time graphics imposes timeliness constraints that have caused system designers to include additional graphics processing units (“GPUs”) specialized for graphics functions. Such GPUs typically include specialized hardware designed specifically to perform, very rapidly, the complex and computationally intensive operations required to generate graphics images.
FIG. 1 shows a typical system architecture including a CPU 10 and a GPU 11B. For example, a typical GPU 11B may include a transformation and lighting unit, shading hardware, texture mapping hardware, pixel processing hardware and much more. Some systems may include several CPUs 10 and/or several GPUs 11B.
To control GPU 11B, the CPU 10 typically issues or writes a “display list” to the GPU. The display list is typically a list or series of graphics commands or instructions that tells the GPU 11B what to do. This is a little like the front office of a company issuing instructions to the manufacturing department asking for a certain number of widgets of a particular type to be manufactured. In this case, the GPU 11B is manufacturing a single image frame for display as specified by the display list. The CPU 10 can deposit working materials (data) into shared memory 11e and use the display list to tell the GPU 11B where to find them and what to do with them. The GPU 11B will then dutifully do what the CPU 10 tells it to do by following those instructions to the letter and manufacture a frame of video based on the instructions.
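The relationship between the display list and the shared memory can be illustrated with a small software model. This is only a sketch under stated assumptions: the command names (`CLEAR`, `DRAW_TRIANGLES`, `END_FRAME`), the `Command` fields and the `render_frame` function are hypothetical and do not correspond to any actual GPU command set.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Op(Enum):
    # Hypothetical command set, chosen only for illustration.
    CLEAR = auto()
    DRAW_TRIANGLES = auto()
    END_FRAME = auto()


@dataclass(frozen=True)
class Command:
    op: Op
    data_offset: int = 0  # index in shared memory where the vertices start
    count: int = 0        # number of vertices to read


def render_frame(display_list, shared_memory):
    """Model of the GPU side: follow each command in order, exactly as written."""
    log = []
    for cmd in display_list:
        if cmd.op is Op.CLEAR:
            log.append("clear")
        elif cmd.op is Op.DRAW_TRIANGLES:
            # The display list only says where the data is; the data itself
            # was deposited in shared memory by the CPU beforehand.
            vertices = shared_memory[cmd.data_offset:cmd.data_offset + cmd.count]
            log.append(f"draw {len(vertices) // 3} triangle(s)")
        elif cmd.op is Op.END_FRAME:
            log.append("frame done")
            break
    return log


# CPU side: deposit working data, then describe the frame in a display list.
shared_memory = [(0, 0), (1, 0), (0, 1)]
display_list = [
    Command(Op.CLEAR),
    Command(Op.DRAW_TRIANGLES, data_offset=0, count=3),
    Command(Op.END_FRAME),
]
frame_log = render_frame(display_list, shared_memory)
```

Note that the model has no path for questioning a command; it simply executes whatever the list contains, which is the property that makes bad display lists dangerous.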
In the world of computer graphics, the GPU 11B is typically held to a very strict time schedule called the frame or update rate. Generally speaking, in interactive computer graphics, the display must be updated quickly enough so that the human eye does not see any delays but instead interpolates between frames to create the illusion of continuous motion. If the GPU redraws or renders new images quickly enough (e.g., 30 or 60 times every second), then the user sees no discontinuities and the illusion of continuous animated motion is preserved.
In part because of the relatively loose coupling between CPU 10 and GPU 11B, it is sometimes possible for the GPU to fall behind or otherwise be unable to maintain its tight schedule. In particular, suppose CPU 10 sends the GPU 11B bad or corrupt display list instructions. The GPU 11B may try to diligently do precisely what the CPU 10 has instructed the GPU to do, but if the instructions are bad, the GPU can get confused or even completely stuck. Often, because the GPU 11B is optimized to do precisely what it is instructed to do very quickly, it has no capability to deviate from the instructions, seek clarification from the CPU 10 or otherwise compensate for bad instructions.
This would be like the front office asking the manufacturing department to make something that is impossible to make or which requires parts that are not on hand. In a corporate environment, manufacturing could report back to the front office that the instructions cannot be followed and instead request clarification and/or new instructions. However, in the computing world, it can sometimes happen that a GPU 11B given poor instructions can simply lock up or “hang.” Meanwhile, if the CPU 10 assumes the GPU 11B is performing as requested while the GPU is actually stumped and has gotten stuck, the CPU needs to do something to help the GPU out of its impossible situation.
As a general matter, it is known to automatically reset a GPU when a fault is detected. For example, in certain Nintendo video game systems such as the GameCube, some developers in the past wrote application programs that would check for proper GPU activity and initiate a GPU hardware reset if a fault was detected. However, since such techniques were generally left up to the developer, there was no automatic way to reset the GPU if the application was poorly designed or did not take GPU fault situations into account.
It is known to use a so-called “watchdog” circuit or function to monitor and remedy system faults. For example, suppose a circuit such as a microprocessor is programmed or designed to periodically emit a signal. A so-called “watchdog timer” can be used to monitor the signal and determine whether it is present. If the “watchdog timer” does not see the periodic signal when it is supposed to be produced, the “watchdog timer” can take an appropriate action (e.g., initiating a hardware reset of the CPU or other circuit).
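The general watchdog timer idea can be sketched in a few lines. The class below is a minimal software model, not any particular hardware design: the names `WatchdogTimer`, `signal` and `check`, and the use of explicit timestamps in place of a hardware clock, are assumptions made for illustration.

```python
class WatchdogTimer:
    """Takes a corrective action if the monitored signal stops arriving."""

    def __init__(self, timeout, on_expire):
        self.timeout = timeout      # longest allowed gap between signals
        self.on_expire = on_expire  # corrective action, e.g. a hardware reset
        self.last_signal = 0.0

    def signal(self, now):
        # Called by the monitored circuit each time it emits its periodic signal.
        self.last_signal = now

    def check(self, now):
        # Called periodically by the watchdog; returns True if it had to act.
        if now - self.last_signal > self.timeout:
            self.on_expire()
            self.last_signal = now  # restart timing after the corrective action
            return True
        return False


resets = []
watchdog = WatchdogTimer(timeout=1.0, on_expire=lambda: resets.append("reset"))
watchdog.signal(now=0.5)                # monitored circuit is alive
healthy = watchdog.check(now=1.0)       # gap of 0.5: within the timeout
timed_out = watchdog.check(now=2.0)     # gap of 1.5: signal is overdue
```

Passing the current time in as an argument keeps the model deterministic; a real watchdog would read a hardware counter instead.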
Thus, while much work has been done in the past to monitor hardware for faults, further improvements are desirable. In particular, it would be desirable to accommodate fault testing performed by a well-designed or conservatively designed application, while also providing a fallback operation that does not depend on such application testing but nevertheless remains compatible with it.
The example non-limiting technology herein provides an advantageous watchdog function that deduces faulty operation of a graphics processing unit from history-indicating parameters while also accommodating more active testing performed by an application. When a GPU fault is detected, the example non-limiting technology rapidly resets the GPU during an interframe time so that the GPU is ready to process new frame instructions or display lists, thereby avoiding missed or skipped frames.
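One way such a watchdog could be arranged is sketched below. The text above does not say which history-indicating parameters are examined, so everything concrete here is an assumption: the sketch supposes a GPU progress counter sampled once per interframe interval, a flag indicating that work is still pending, and a hypothetical threshold of two stalled frames. The names `FrameWatchdog`, `app_reports_fault` and `on_interframe` are likewise illustrative only.

```python
class FrameWatchdog:
    """Sketch of a per-frame GPU watchdog (all parameters are assumptions)."""

    def __init__(self, reset_gpu, stalled_frames_allowed=2):
        self.reset_gpu = reset_gpu
        self.stalled_frames_allowed = stalled_frames_allowed
        self.last_progress = None  # progress counter seen at the previous frame
        self.stalled = 0           # consecutive frames without progress
        self.app_fault = False     # set when the application's own test fails

    def app_reports_fault(self):
        # Active testing by the application is accommodated: it can flag a
        # fault itself, and the reset still happens at the next frame boundary.
        self.app_fault = True

    def on_interframe(self, progress, work_pending):
        """Call once per frame during the interframe interval.

        Returns True if the GPU was reset.
        """
        if work_pending and progress == self.last_progress:
            self.stalled += 1      # no progress since last frame: suspicious
        else:
            self.stalled = 0
        self.last_progress = progress

        if self.app_fault or self.stalled >= self.stalled_frames_allowed:
            self.reset_gpu()
            self.stalled = 0
            self.app_fault = False
            self.last_progress = None
            return True
        return False


resets = []
watchdog = FrameWatchdog(reset_gpu=lambda: resets.append("reset"))
# The progress counter stays at 10 for three frames while work is pending.
outcomes = [watchdog.on_interframe(10, work_pending=True) for _ in range(3)]
```

Because the check runs only between frames, a reset triggered here completes before the next display list is due, and the fallback path works even if the application never calls `app_reports_fault`.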