1. Field of the Invention
The present invention relates to a data processing apparatus and method for translating a signal between a first clock domain and a second clock domain.
2. Description of the Prior Art
Within a data processing apparatus, it is common for certain components to be clocked at different clock frequencies to other components. For example, considering a system-on-chip (SoC), a component such as a central processing unit (CPU), which requires high performance, may be clocked at a higher frequency than certain other components of the system, for example a bus infrastructure, a memory system, a peripheral, etc, which either do not require such high performance, or are incapable of operating at the higher clock frequency.
As a result, a number of different clock domains may exist within the data processing apparatus, each clock domain having an associated clock period. Typically the various different clock periods are generated from a single clock generator that uses divider circuitry or the like to generate a number of different clock periods that are synchronously related to each other. For the purpose of the present application, when describing two different clock periods as being synchronous with each other, this means that at periodic intervals the sampling edges (typically the rising clock edges) of the two clocks are coincident (within the tolerances of the system).
In such systems, a mechanism needs to be provided for translating signals between different clock domains, such that a signal issued by one component operating in one clock domain can be received by another component operating in a different clock domain. In systems where two different clock domains have clock periods that are synchronous with each other, a clock enable signal is often generated in the faster of the two clock domains to provide a timing indication about the sampling edge in the slower clock domain, so that in the faster clock domain it can be determined when an occurrence of the sampling edge of the faster clock is coincident with the sampling edge of the slower clock. If a register is provided in the faster of the two clock domains to register a signal issued from the slower clock domain, then the above-mentioned clock enable signal can be used as an enable signal to that register to ensure that the signal issued from the slower clock domain to the faster clock domain is sampled only on the sampling edge of the faster clock period that is coincident with the sampling edge of the slower clock period. As a result, the output delay associated with the signal originating in the slower clock domain can be constrained against the slower clock period. During this output delay time, the output from the component in the slower clock domain cannot be guaranteed to be stable, and accordingly should not be sampled in the fast clock domain. By use of the clock enable signal, it is ensured that the signal is in fact not sampled at a time when the output is still unstable, since the earliest it is sampled in the fast clock domain is one slow clock period after it is issued by the component in the slow clock domain.
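The behaviour of such an enabled register can be illustrated with a minimal behavioural sketch. It assumes an integer ratio between the fast and slow clock frequencies, with fast edge 0 coincident with a slow edge; the function names, the ratio of 4 and the sample values are hypothetical and chosen only for illustration.

```python
def clock_enable(ratio, n_fast_edges):
    """For each fast edge k, report whether it coincides with a slow sampling edge.

    Assumption: the slow clock is the fast clock divided by an integer `ratio`,
    and fast edge 0 is coincident with a slow edge.
    """
    return [k % ratio == 0 for k in range(n_fast_edges)]


def enabled_register(d_values, enables, reset=None):
    """Fast-domain register that captures its D input only when enabled.

    d_values[k] is the value on the D input at fast edge k; the returned list
    holds the Q output after each fast edge.
    """
    q = reset
    q_after_edge = []
    for d, en in zip(d_values, enables):
        if en:
            q = d
        q_after_edge.append(q)
    return q_after_edge


ratio = 4
# Output of the slow-domain component as seen at each fast edge. It is stable
# at the coincident edges (0 and 4) but may still be settling in between.
slow_output = ["v0", "unstable", "unstable", "v1", "v1", "unstable", "unstable", "v2"]

enables = clock_enable(ratio, len(slow_output))
q_values = enabled_register(slow_output, enables)
print(q_values)  # ['v0', 'v0', 'v0', 'v0', 'v1', 'v1', 'v1', 'v1']
```

In this sketch the register never captures the value marked "unstable", because the enable is asserted only at the coincident edges.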
Similarly, a signal originating in the faster clock domain that must be translated to the slower clock domain can be arranged so that its value is only permitted to change on the sampling edge of the fast clock period that is coincident with the sampling edge of the slow clock period. Again, such a property can be ensured by using a clock enable signal as discussed above. By such an approach, it can be ensured that the input delays of the signal translated to the slower clock domain can be constrained against the slow clock period. Such input delays result from propagation delays over the path in the slow clock domain, combinatorial delays associated with combinatorial logic in the path, and setup delays associated with the component in the slow clock domain that is to receive the signal.
Whilst the above described mechanism is an effective mechanism for translating signals between two different clock domains, the mechanism adds significant latency to the transfer of the signal, because a transition of the signal in the fast clock domain that occurs following a sampling edge of the fast clock must be delayed until the next sampling edge of the slow clock, as illustrated schematically in FIG. 1.
In particular, FIG. 1 shows the transfer of a valid signal from a fast clock domain to a slow clock domain. The valid signal is asserted at the point indicated by the numeral 5 in FIG. 1, and if it were to be sampled by another component in the fast clock domain, it would be sampled at time 7 shown in FIG. 1 (i.e. at the next rising edge of the fast clock). However, the transition at point 5 is only allowed to be output into the slow clock domain as the asserted valid slow signal at point 30, namely at the time where the rising edge of the fast clock is coincident with the rising edge of the slow clock. As described earlier, this is indicated by the set clock enable signal 10, and the presence of the set clock enable signal 10 whilst the valid fast signal 20 is asserted causes the valid slow signal to be asserted at point 30. However, it is only on the next rising edge of the slow clock, i.e. at point 45 in FIG. 1, that the asserted valid slow signal can be sampled in the slow clock domain. Further, when the clock enable signal is again asserted at point 40 in the presence of an asserted valid slow signal, this causes both the valid fast and the valid slow signals to be de-asserted shortly after point 45 as indicated by transitions 47, 50, respectively, given that the valid slow signal was sampled at point 45. The timing between points 7 and 45 in FIG. 1 represents the worst case for the additional latency introduced as a result of the fast to slow clock domain transition, since the valid signal was asserted at a time when the fast and slow clocks were coincident.
In the right-hand side of FIG. 1, the best case for the additional latency added as a result of the fast to slow clock domain transition is indicated. Here, the valid fast signal is asserted one fast clock period before a time at which the fast and slow clock periods are coincident. Accordingly, at time 75, the presence of the asserted clock enable signal 60 and the asserted valid fast signal 70 causes the valid slow signal to be asserted at point 80. Then, one slow clock period later, at point 97, the valid slow signal is sampled in the slow clock domain and the presence of the next asserted clock enable signal 90 in the presence of an asserted valid slow signal causes the valid fast and valid slow signals to be de-asserted at points 93, 95, respectively.
As is apparent from FIG. 1, even in the best case, the additional latency added as a result of the fast to slow clock domain transition equates to one full slow clock period.
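The best and worst cases of FIG. 1 can be reproduced with a small counting sketch. It assumes an integer clock ratio, numbers the fast clock edges so that every edge that is a multiple of the ratio is coincident with a slow edge, and measures the added latency in fast clock periods from the edge at which a fast-domain component would have sampled the signal. The function name and the ratio of 4 are hypothetical.

```python
def fig1_added_latency(assert_after_edge, ratio):
    """Added latency (in fast clock periods) for the scheme of FIG. 1.

    assert_after_edge: index of the fast edge just after which the valid fast
                       signal is asserted.
    ratio:             assumed integer number of fast periods per slow period.
    """
    # Edge at which a fast-domain component would have sampled the signal.
    first_fast_sample = assert_after_edge + 1
    # The signal may only be passed to the slow domain on a coincident edge,
    # so round up to the next multiple of the ratio.
    launch_edge = -(-first_fast_sample // ratio) * ratio
    # The slow domain samples it one slow period after it was launched.
    slow_sample_edge = launch_edge + ratio
    return slow_sample_edge - first_fast_sample


ratio = 4
latencies = [fig1_added_latency(k, ratio) for k in range(ratio)]
print(latencies)  # [7, 6, 5, 4]
```

With a ratio of 4, the smallest value in this sketch is 4 fast periods, i.e. one full slow period, and the largest is 7, i.e. one fast period short of two slow periods.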
FIG. 2 shows an alternative scheme that has been developed with the aim of seeking to reduce the latency when translating signals between two different clock domains. In accordance with this scheme, a signal originating in the faster clock domain that must be translated to the slower clock domain is allowed to change on any sampling edge of the fast clock, regardless of whether that sampling edge is coincident with a sampling edge of the slow clock. Hence, as shown in FIG. 2, if at point 105, the valid fast signal is asserted as indicated by the line 120, then the rising edge 110 of the fast clock will cause the valid slow signal to be asserted at point 130, this occurring one fast clock period before the next sampling edge of the slow clock period. Accordingly, the valid slow signal will be sampled in the slow domain at point 145. Further, the presence of the asserted clock enable 140 in the presence of an asserted valid slow signal will cause the valid fast and valid slow signals to be de-asserted at points 147, 150, respectively. As a result, it can be seen that for this best case, the additional latency resulting from the fast to slow transition is merely a single fast clock cycle.
Considering the worst case as shown in the right-hand side of FIG. 2, if the valid fast signal 170 is asserted one fast clock period before the point where the sampling edges of the fast and slow clocks are coincident, then on the next sampling edge 160 of the fast clock, the valid slow signal 180 can be asserted, as indicated by the point 155 in FIG. 2. However, it is not until the next rising edge of the slow clock at point 185 that the valid slow signal will actually be sampled in the slow clock domain and the presence of the next asserted clock enable signal 190 in the presence of an asserted valid slow signal will cause both the valid fast and valid slow signals to be de-asserted at points 193, 195, respectively. Hence, in this worst case scenario, the additional latency equates to one slow clock period. A similar mechanism can also be adopted for signals being sent from the slow clock domain to the fast clock domain, to again reduce the latency associated with the transition.
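The latency of the scheme of FIG. 2 can be counted in the same way. The following is a minimal sketch under the assumptions of an integer clock ratio and fast clock edges numbered so that multiples of the ratio are coincident with slow edges; the function name and the ratio of 4 are hypothetical.

```python
def fig2_added_latency(assert_after_edge, ratio):
    """Added latency (in fast clock periods) for the scheme of FIG. 2.

    assert_after_edge: index of the fast edge just after which the valid fast
                       signal is asserted.
    ratio:             assumed integer number of fast periods per slow period.
    """
    # The next fast edge both samples the signal in the fast domain and
    # causes the valid slow signal to be asserted.
    first_fast_sample = assert_after_edge + 1
    # The slow domain samples on the first slow edge strictly after that edge.
    slow_sample_edge = (first_fast_sample // ratio + 1) * ratio
    return slow_sample_edge - first_fast_sample


ratio = 4
latencies = [fig2_added_latency(k, ratio) for k in range(ratio)]
print(latencies)  # [3, 2, 1, 4]
```

In this sketch the added latency ranges from a single fast period (best case) to one slow period (worst case, when the valid slow signal is asserted on a coincident edge).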
Whilst such a mechanism undoubtedly reduces the latency associated with the transition between the clock domains, a problem with such an approach is that the input delays of the signal translated to the slower clock domain (or the output delays of a signal translated to the fast clock domain) must be constrained against the fast clock period since in the case of the smallest latency there is only one fast clock period between the signal being asserted and the signal being sampled. This imposes a significant number of constraints in the design of the components in the slow clock domain, and in many situations it is difficult to meet the required timing in the slow clock domain. Indeed, such an approach is feasible only in a limited set of cases, for example where the inputs/outputs in the slow clock domain component are registered and are physically close to the fast clock domain outputs/inputs.
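The effect of constraining the same path against the fast clock period rather than the slow clock period can be shown with a simple slack calculation. All figures below (a 1000 ps fast period, a clock ratio of 4, and the individual delay values) are hypothetical and serve only to illustrate the arithmetic; the three delay components are the propagation, combinatorial and setup delays mentioned earlier.

```python
def setup_slack_ps(available_ps, propagation_ps, combinatorial_ps, setup_ps):
    """Time left over after the path delays, in picoseconds.

    A negative result means the path does not meet timing in the time available.
    """
    return available_ps - (propagation_ps + combinatorial_ps + setup_ps)


# Hypothetical figures, for illustration only.
fast_period_ps = 1000
ratio = 4
slow_period_ps = fast_period_ps * ratio

path = {"propagation_ps": 600, "combinatorial_ps": 900, "setup_ps": 100}

slack_vs_slow = setup_slack_ps(slow_period_ps, **path)  # constrained as in FIG. 1
slack_vs_fast = setup_slack_ps(fast_period_ps, **path)  # constrained as in FIG. 2
print(slack_vs_slow, slack_vs_fast)  # 2400 -600
```

With these assumed numbers the path has ample margin when constrained against the slow clock period but fails when constrained against the fast clock period.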
In practice, timing closure may not be achievable when using such a scheme, and it then becomes necessary to add additional registering and associated logic in the slow clock domain to seek to address this problem. Such additional registering and logic increases area, power consumption, and design iteration, and further can cause latency to be increased to such a level that it can negate the potential latency reductions associated with the technique of FIG. 2.
Accordingly, it would be desirable to provide an improved technique for translating a signal between two clock domains which enables a reduction in the latency when compared with the standard prior art approach of FIG. 1, and which does not suffer from the problems associated with the technique as illustrated in FIG. 2.
U.S. Pat. No. 5,809,336 describes a microprocessor system having a CPU clocked by a ring oscillator variable speed clock. An input/output interface is independently clocked by a second clock connected thereto.