1. Technical Field
The invention relates to the field of multiprocessor systems, and more specifically to the field of aligning timing signals among processors to achieve time-synchronous operation.
2. Description of Related Art
Time is important for managing the available processing resources. The number of active tasks or programs in a large SMP system may exceed the total number of hardware threads across all the processors in the system, which means not all of the programs can execute at the same time. The operating system may allocate portions of time to different sets of tasks or programs, with different durations (time slices) allocated to different tasks depending on the priority and resources required for each task. A hypervisor may partition various processor resources such that the operating system may only directly control or be affected by certain processors and memory of an SMP system. Thus the hypervisor may assist in allocating time resources as well as providing various error correcting routines.
A timebase (TB) register is used to represent time in a processor or core. The TB register is a free-running 64-bit register that increments at a constant rate so that its value can be converted to time. The TB registers are synchronized across all processors in an SMP system so that all processors in the system have the same representation of time. The TB register is a shared resource across all threads in a multi-threaded processor, and the constant rate at which it increments is known to software executing on each thread. Software calculates time by multiplying the TB register value by the known period of one increment (the reciprocal of the incrementing rate), and adding the result to a known time offset.
In prior designs, the processor clock frequency was a known constant, so the TB register could simply increment every ‘n’ processor cycles, where ‘n’ is set depending on the desired granularity of time increment. For example, at a processor frequency of 1.0 GHz, with n=8, a TB register value of ‘000000001234ABCD’x represents 2.4435 seconds.
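The arithmetic in the example above can be checked with a short sketch. The function name tb_to_seconds and its parameters are hypothetical and chosen only for illustration; the sketch assumes a fixed processor frequency, as in the prior designs described, and a time offset of zero.

```python
def tb_to_seconds(tb_value: int, cpu_hz: float, cycles_per_tick: int,
                  offset_s: float = 0.0) -> float:
    """Convert a timebase (TB) register value to seconds.

    Hypothetical helper: assumes the TB register increments once every
    `cycles_per_tick` processor cycles at a constant `cpu_hz`.
    """
    tick_period_s = cycles_per_tick / cpu_hz   # seconds per TB increment
    return offset_s + tb_value * tick_period_s


# Values taken from the example: 1.0 GHz processor clock, n = 8.
TB_EXAMPLE = 0x000000001234ABCD                # 305,441,741 increments
elapsed = tb_to_seconds(TB_EXAMPLE, cpu_hz=1.0e9, cycles_per_tick=8)
print(f"{elapsed:.4f} s")                      # prints "2.4435 s"
```

With n=8 at 1.0 GHz, each increment represents 8 ns, so 305,441,741 increments correspond to roughly 2.4435 seconds, matching the figure given above.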
It is desirable to construct a multiprocessor from smaller processing building blocks, i.e. nodes, each of which is able to act as a complete stand-alone computer, including a time-of-day clock.
However, it becomes increasingly unmanageable to extend a single clock as architectures continue to scale to larger collections of processing nodes. More importantly, a centralized clock or oscillator represents a single point of failure that can idle a very expensive multiprocessor environment.
A possible solution is to have separate timing clocks for disparate collections of nodes. Such an architecture allows scalability in increments of one node, and avoids the overhead of infrastructure costs in small configurations. Separate nodes can be interconnected via a board or cables, as well as by coherency protocols on the system bus fabric between nodes, as may be carried by various boards or cables. The nodes may each be asynchronous to allow slight differences in frequency between the clocks that run each of them. However, a means to correct for an oscillator that is one cycle ahead of the others is necessary to meet a requirement that all processors see the same time, and that time is always increasing.
One significant limitation in known redundant clock distribution signaling is that the redundant clock often travels along the same distribution network as the primary clock signal, or at least along conductors of similar length, where the same distribution network is common to all chips in the system. Such is evident from prior art methods wherein a register in each of four time of day (TOD) clock sources is incremented by a high frequency signal to achieve a TOD value resolution, which comes from a frequency-multiplied lower reference frequency signal for synchronization of the clock sources.
Unfortunately, the current technique lacks a reconfigurable distribution network to address the needs of modern highly scalable microprocessor networks. Further, because the current technique has no distribution network of varying conductor lengths and paths, the current technique fails to compensate for timing delays that occur at end-nodes of a timing distribution network. More specifically, since the current technique requires a common precise time reference to be distributed to all TOD register logic in the system, with the amount of skew small relative to the reference period, it does not allow combining of multiple separate processing building blocks (nodes) which can act as stand-alone computers into a single symmetric multi-processor (SMP).
Therefore, it would be advantageous to have a network to select from among two oscillators available in a large, multi-node multiprocessor configuration, where each node may also be capable of operating as a stand-alone computer with its own reference oscillator. It is also advantageous to support multiple different configurations with variable propagation delays between processing nodes and large skew relative to the reference period, while detecting any oscillator that is outside specified limits. Moreover, it is advantageous to have redundancy in said network to avoid any single points of failure and to recover from disconnects of conductors that interrupt oscillator or other timing signals originating on another board or chip, by switching over to, for example, a redundant local oscillator or a redundant inter-chip or inter-drawer connection.
In addition, it would be advantageous to provide for recovery from parity errors that may occur in registers that store configuration information or store a time-stamp indicating a system time.
It is also desirable to dynamically re-assign the selection of the topology to allow concurrent repair of any processing node in a multi-node configuration.
Less notably, it is desirable to minimize analog circuitry such as phase-locked-loops as used in U.S. Pat. No. 5,146,585.