Computing devices often have one or more processors (e.g., CPUs, GPUs, APUs, DSPs, etc.). Such processors sometimes experience periods of low use or non-use. In order to conserve power, such as battery power, it is desirable to reduce or eliminate power being provided to such processors during periods of low use and/or non-use. Subsequent periods of increased use require that the processor receive increased power, relative to the low/no power state, and be reinitialized and prepared to operate as part of the larger computing device. Such re-initialization and preparation takes time. In some instances, the time needed for such re-initialization and preparation creates a noticeable and undesirable waiting period between when processor operation is called for and when processor operation can be provided.
One such example is shown in FIG. 1. When power is to be conserved, system controller 100 removes power from processor 110, such as a graphics processor on a graphics card, by opening switch 120. Removal of power from processor 110 removes power from system management unit 130 within processor 110. Removal of power from processor 110 further removes power from memory PHY 140 of processor 110. Removal of power from processor 110 still further removes power from volatile memory 150 supported by processor 110, such that the data contents of volatile memory 150 are lost. Loss of power by processor 110 also results in loss of training data that enables increased efficiency in communication between processor 110 and volatile memory 150. Similarly, loss of power by processor 110 causes the loss of training data that enables increased efficiency in communication between processor 110 and the rest of the system (such as via PCIe bus 160).
Upon subsequent startup, re-initialization and preparation of processor 110 are conducted. An exemplary and simplified process of re-initialization and preparation is shown in FIG. 2. It should be appreciated that the times shown in FIG. 2 are intended to be exemplary and not exact. Power is initially supplied to processor 110 via closing of switch 120. Processor 110 receives power, 200, within a few milliseconds and starts a reset process. The reset process, 210, loads onboard instructions and then begins to establish trained communication over the PCIe bus 160, at 220.
Processor 110 then proceeds to train the memory bus between processor 110 and memory 150, at 240. Once the memory bus is trained, context data and processor state data are obtained via the trained PCIe bus 160 and stored in memory 150. The context data and processor state data describe the desired state of the processor that enables use thereof. The context data and processor state data allow processor 110 to perform restoration and initialization, 250.
Once the processor state is restored, an I/O memory management unit (IOMMU) is updated, 230. One such I/O memory management unit is a graphics address remapping table (GART) (although other IOMMUs are known). The data for such an update is obtained via the trained PCIe bus 160.
Overall, as shown in FIG. 2, the various events that occur to provide a functional processor upon wake-up or power-up are performed serially. Thus, the overall time needed to achieve startup is the sum of the times needed to complete each of the sub-parts. It should be appreciated that additional events and processes are expected to be required to provide a functional processor upon wake-up. At least a portion of these additional events are expected to also be performed serially and thus contribute to the overall time needed to achieve startup.
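The additive relationship described above can be illustrated with a minimal sketch. The step labels follow the reference numerals of FIG. 2 in the order in which they are described. The durations are hypothetical placeholders chosen only for illustration, since the times shown in FIG. 2 are exemplary and not exact, and the names WakeStep, STEPS, and serial_wake_time are likewise assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WakeStep:
    label: str          # reference numeral and short description from FIG. 2
    duration_ms: float  # hypothetical placeholder, not a value taken from FIG. 2


def serial_wake_time(steps):
    """Return the total latency when each step starts only after the previous one ends."""
    return sum(step.duration_ms for step in steps)


# Steps in the order described for FIG. 2; every duration is an assumed value.
STEPS = [
    WakeStep("200 power received", 2.0),
    WakeStep("210 reset, load onboard instructions", 5.0),
    WakeStep("220 PCIe bus training", 20.0),
    WakeStep("240 memory bus training", 30.0),
    WakeStep("250 state restoration and initialization", 10.0),
    WakeStep("230 IOMMU (e.g., GART) update", 3.0),
]

total_ms = serial_wake_time(STEPS)
print(f"Serial wake-up latency: {total_ms} ms")
```

Because the steps are performed serially, adding any further step to the list increases the total by that step's full duration, which is the compounding of time at issue here.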
Such compounding of time can produce a delay in achieving readiness at processor 110. Alternatively, always maintaining processor 110 in a fully powered mode leaves processor 110 always at the ready without power-up delays. However, this readiness comes at the price of the power consumption needed to achieve this readiness. Accordingly, there exists a need for power savings for processors with decreased startup times when exiting a power saving mode.