This invention relates to computationally efficient clock data recovery and synchronization in systems with interconnected devices having high rates of data transfer between devices.
Several trends in system design are increasing the demand for data transfer in interconnected devices. The volume of data requiring transfer between digital devices is constantly increasing, driving a requirement for higher data transfer rates. As an example, more digital signal processing applications are being performed by field programmable gate arrays (FPGAs) instead of conventional digital signal processors. System designs require more interconnections between FPGA devices and between FPGA devices and other devices. In the past, interface architectures among multiple devices have typically used parallel connections. The major drawback of parallel connections among multiple processing devices is a proliferation of input/output (I/O) pins. To avoid this, the trend in current designs is to use high speed serial links. Currently, device connections within a system are migrating from parallel backplanes to serial links.
Digital devices inherently require a clock for timing internal and external operations. For serial links, synchronization of the clocks at the transmitting device and the receiving device is critical for successful data transfer. A loss of synchronization may jeopardize the integrity of the data. The clock provides the time base used to control the transfer of digital information. Reliable link design for two chips on a board includes source synchronous design, where the transmitting device, or source, provides the data and a clock signal. The receiving device then synchronizes to the received clock signal. For communication between boards, there is a need to minimize the number of wires, so a separate clock signal on a separate wire is not used. Instead, the transmitting device embeds the clock signal in the data. The receiving device recovers the clock signal embedded in the received signal.
Recovering the clock signal is referred to as clock data recovery (CDR) or clock recovery. CDR is required for two basic purposes: first, to establish a timing signal to sample the incoming pulses or signal and second, to transmit outgoing pulses or signal at the same rate as that of the incoming signal. A CDR module includes a phase lock loop (PLL). The PLL locks to the frequency and phase of an input signal and generates an output signal that is synchronized to the input signal. This output signal can be used as a clock signal. For this discussion, “clock” and “clock signal” both refer to a timing signal. Also for this discussion, a “clock source” generates a clock signal that is independent of any other clock signal in the system.
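The locking behavior of a PLL described above can be illustrated with a minimal phase-domain sketch. This is not the circuit of any particular CDR module; it assumes an idealized phase detector, a proportional-plus-integral loop filter, and a numerically controlled oscillator, and the names run_pll, kp, ki, and f_nominal are hypothetical, chosen only for illustration. Frequencies are expressed in cycles per sample.

```python
import math


def wrap(phase):
    """Wrap a phase in radians into the interval [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def run_pll(f_in, theta0, n_steps, f_nominal=0.1, kp=0.1, ki=0.005):
    """Idealized phase-domain PLL (illustrative assumption, not a real design).

    f_in      : input frequency in cycles per sample
    theta0    : initial input phase in radians
    f_nominal : starting frequency estimate of the local oscillator
    kp, ki    : assumed proportional and integral loop gains
    Returns (estimated frequency in cycles per sample, final phase error).
    """
    phase_in = theta0
    phase_out = 0.0
    freq = 2.0 * math.pi * f_nominal  # oscillator frequency, radians per sample
    err = 0.0
    for _ in range(n_steps):
        # Phase detector: difference between input and local oscillator phase.
        err = wrap(phase_in - phase_out)
        # Loop filter: the integral path tracks frequency offset,
        # the proportional path corrects phase directly.
        freq += ki * err
        phase_out = wrap(phase_out + freq + kp * err)
        # Advance the input signal by one sample.
        phase_in = wrap(phase_in + 2.0 * math.pi * f_in)
    return freq / (2.0 * math.pi), err


if __name__ == "__main__":
    f_est, final_err = run_pll(f_in=0.101, theta0=1.0, n_steps=2000)
    print("estimated frequency:", f_est)
    print("final phase error:", final_err)
```

With these assumed gains the loop settles on both the frequency and the phase of the input, so the local oscillator output could then serve as the recovered clock signal. The per-sample work in the loop is one reason a PLL is a comparatively costly resource.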
A system that includes an analog to digital converter (ADC) connected to a digital processor is a preferred implementation for many applications. Typically, an ADC is on the same board as the digital processor. This is a disadvantage for some applications. For example, in a digital radio system, transferring the received analog signal from an antenna or analog front end (AFE) to the ADC requires expensive cables for radio frequency (RF) signals. Positioning the ADC near the AFE improves signal reception and would allow the received analog signal to be digitized and transmitted over a lower cost digital link. However, the digital signal processing portion would require a computationally expensive CDR module for the digital link. It would be advantageous to avoid consuming the resources of the digital signal processing portion of the system with CDR operations. For applications that transfer data between a remote ADC and a digital signal processing device, efficient clock recovery would conserve system resources.
Clock data recovery is an important component of communication systems and data networks. There are two major strategies for clock synchronization, one used in telecommunication systems, described herein as the telecom model, and the other used in data networks, described herein as the datacom model. Both models include CDR modules for synchronization on both sides of the communications link.
In the telecom model, a typical arrangement for a communication system includes a master station and one or more slave stations, each station including a transmitter (TX) and a receiver (RX). The master station and the slave station each include a CDR module for the received signal. The master includes a clock source for its TX. The master transmits the clock signal from the clock source in addition to data embedded in the transmitted signal. The slave station includes a CDR module to synchronize to the received clock signal. The output of the CDR module is the recovered RX clock signal. The slave station uses the recovered RX clock signal to synchronize its TX clock signal. The synchronized TX clock signal is used for timing the slave's transmit signal. At this point, the slave's RX clock frequency matches the TX clock frequency. However, when the master station receives the signal transmitted from the slave, the master still has to synchronize to the phase of the received data, even though the clock frequency is matched. The phase offset is due to propagation delay that is a function of connection length and other distortions. The telecom model is prevalent in the digital telephone network and wide area network (WAN) architectures.
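The dependence of the phase offset on connection length can be illustrated with a short calculation. The sketch below considers propagation delay only and ignores the other distortions mentioned above. The function name phase_offset_fraction, the bit rate, the cable length, and the propagation velocity of 2e8 meters per second are assumptions for illustration, not values from this description.

```python
def phase_offset_fraction(length_m, bit_rate_hz, velocity_m_per_s=2.0e8):
    """Fraction of one clock period by which received data is shifted.

    Models propagation delay only (an illustrative simplification):
    delay = length / velocity, and the offset is the delay modulo one
    clock period, expressed as a fraction of that period.
    """
    periods_of_delay = length_m * bit_rate_hz / velocity_m_per_s
    return periods_of_delay % 1.0


if __name__ == "__main__":
    # Hypothetical link: 1.5 m connection carrying a 1 GHz clock rate.
    fraction = phase_offset_fraction(length_m=1.5, bit_rate_hz=1.0e9)
    print("phase offset (fraction of a period):", fraction)
    print("phase offset (degrees):", 360.0 * fraction)
```

Under these assumptions, a change in connection length of only a fraction of a meter moves the data edges by a large part of a clock period, which is why the master must still recover the phase even when the frequency is already matched.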
In the datacom model, such as in Ethernet networks, the TX and RX of each station have independent timing control. Each station includes a clock source that produces a clock signal that is embedded in the transmitted signal. Each station also includes a CDR module that locks to the embedded clock signal of the signal received from the other station.
In current architectures using the telecom model or the datacom model, a digital device requires CDR for every communication channel with any remote device. Each CDR module includes a computationally expensive PLL. For example, current commercially available FPGAs include various numbers of PLL resources, with low end devices having 0 to 2 PLLs and high end devices having 4 to 8 PLLs. In addition, not all FPGA PLLs have the same capability. In specialized high end FPGAs, some of the PLLs are capable of supporting CDR functionality. Other FPGA devices have PLLs with lesser functionality that do not support CDR operations. For systems with multiple interconnected devices, the cost and complexity for communication among devices increases for every channel. This produces scalability problems, where the overhead for communication becomes prohibitive. The implementation of a PLL on an FPGA is physically large on the die. Therefore, the PLL is the most expensive resource on the FPGA. There is a need to reduce the number of PLLs required for communication in systems having interconnected FPGAs and other devices. Reducing the requirements for PLLs conserves the most precious resource of the FPGA, thus reducing the cost and complexity of the system. Similarly, for device implementations using an application specific integrated circuit (ASIC), digital signal processor, microcontroller or microprocessor, reducing the number of PLLs conserves resources for other application tasks. The present invention addresses this need and others as described below.