1. Field of the Invention
The present invention relates to an apparatus and method for transferring a data signal propagated along a bidirectional communication path within a data processing apparatus.
2. Description of the Prior Art
Bidirectional communication paths are often used in data processing systems. For example, such bidirectional communication paths may be used within interconnect structures that form an integral part of global communication networks in multiprocessor chips. Such bidirectional communication paths facilitate high bandwidth with low silicon overhead by eliminating the need for replicating unidirectional signal wires. Examples of interconnects using such bidirectional communication paths are described in the article by C. Park et al. entitled, “A 1.2 TB/s on-chip ring interconnect for 45 nm 8-core enterprise Xeon® processor,” ISSCC, pp. 180-181, 2010, and the article by S. Satpathy et al., entitled “SWIFT: A 2.1 Tb/s 32×32 Self-Arbitrating Manycore Interconnect Fabric,” SoVC, pp. 180-181, 2011.
Conventional bidirectional communication paths typically include a series of repeaters distributed along the communication path in order to amplify the data as it is propagated along the communication path. Typically the repeater structures are based on duplication of unidirectional repeaters, one of which is selectively activated for signal propagation. An example of such a conventional repeater structure is illustrated in FIG. 1. As shown in FIG. 1, a bidirectional communication path is provided between a first processor core 10 and a second processor core 12, that communication path being separated into a series of bidirectional communication path portions 15 by the inclusion of a plurality of repeater circuits along the bidirectional communication path. In this example, each repeater circuit comprises a pair of inverters 20, 25, one of which is activated at any point in time dependent on the contents of an associated flip-flop 30. In particular, each flip-flop is controlled by a clock signal, and on the rising edge of the clock signal samples the enable signal presented to its input, that enable signal identifying which one of the inverters 20, 25 should be activated. Accordingly, if the first processor core 10 is to send a data signal to the second processor core 12, the enable signals will be set in order to cause the inverters 25 to be activated. Conversely, if the second processor core 12 is to send a data signal to the first processor core 10, the enable signals will be set in order to cause the inverters 20 to be activated.
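By way of illustration only, the behaviour of the conventional repeater chain of FIG. 1 can be sketched as a small behavioural model. The sketch is not taken from the cited art. The class and function names, and the encoding of the enable signal (0 selecting the inverters 25, 1 selecting the inverters 20), are assumptions introduced purely for the example. It shows that every flip-flop must be reloaded on a clock edge before the direction of propagation can be reversed.

```python
# Behavioural sketch of the conventional repeater chain of FIG. 1.
# Hypothetical encoding of the enable signal (not specified in the text):
#   FORWARD = inverters 25 active (first core 10 -> second core 12)
#   REVERSE = inverters 20 active (second core 12 -> first core 10)
FORWARD, REVERSE = 0, 1


class Repeater:
    """One repeater circuit: a pair of inverters selected by a flip-flop."""

    def __init__(self):
        self.direction = FORWARD  # state held by the flip-flop

    def rising_edge(self, enable):
        # The flip-flop samples the enable signal on the rising clock edge.
        self.direction = enable

    def pass_bit(self, bit, travelling):
        # Only the inverter selected by the flip-flop is active, so a signal
        # arriving from the other side is not propagated.
        if self.direction != travelling:
            raise RuntimeError("repeater is configured for the opposite direction")
        return 1 - bit  # the active inverter inverts the signal


def configure(repeaters, enable):
    """Present the enable signal to every flip-flop and apply a clock edge."""
    for repeater in repeaters:
        repeater.rising_edge(enable)


def send(repeaters, bit, travelling):
    """Propagate one bit along the whole path in the given direction."""
    chain = repeaters if travelling == FORWARD else reversed(repeaters)
    for repeater in chain:
        bit = repeater.pass_bit(bit, travelling)
    return bit
```

In this model, a path of four repeaters configured with `configure(path, FORWARD)` passes a bit from the first core to the second core, whereas calling `send(path, bit, REVERSE)` without first reconfiguring every repeater raises an error. That error corresponds to the configuration step which the conventional approach must complete before each change of direction.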
Whilst such repeater circuits can be formed as standard cells, and hence can be incorporated into a wide variety of different designs of interconnect, such an approach incurs logic and interconnect overhead in order to configure the repeaters, due to the need to provide enable signals to the associated flip-flops 30 in order to control the operation of the repeaters. This can significantly degrade performance and energy efficiency. Additionally, a synchronising signal in the form of a clock signal is needed to eliminate contention when reversing signal propagation direction.
Furthermore, as interconnect structures increase in complexity, the number of locations from which a bidirectional communication link can be driven is increasing, making the handling of the control signals required to configure the repeaters a significant design challenge. For example, the article by B. Stackhouse et al., entitled “A 65 nm 2-Billion Transistor Quad-Core Itanium Processor,” JSSC, pp. 18-31, Vol. 44, No. 1, January 2009, describes a complex interconnect structure employing snoop-based signalling schemes, where information regarding the direction of a data transfer is not available until approximately the same time as the data needs to be transferred. Accordingly, the need to issue control signals to configure the repeaters, having regard to the direction of the data transfer, before the data transfer can take place significantly impacts performance in such complex interconnect structures.
Recently, there has been a significant amount of research into the development of repeater-less signalling techniques. Examples of articles describing such repeater-less signalling techniques are:
B. Kim et al., “A 4 Gb/s/ch 356 fJ/b 10 mm Equalized On-chip Interconnect with Nonlinear Charge-Injecting Transmit Filter and Transimpedance Receiver in 90 nm CMOS,” ISSCC, pp. 66-67, 2009;
J. Seo et al., “High Bandwidth and Low Energy On-Chip Signaling with Adaptive Pre-Emphasis in 90 nm CMOS,” ISSCC, pp. 182-183, 2010;
R. Ho et al., “High-Speed and Low-Energy Capacitively-Driven On-Chip Wires,” ISSCC, pp. 412-413, 2007; and
E. Mensink et al., “A 0.28 pJ/b 2 Gb/s/ch Transceiver in 90 nm CMOS for 10 mm On-chip interconnects,” ISSCC, pp. 414-415, 2007.
These techniques generally involve the use of pulse generation circuitry at one end of the bidirectional communication path to generate a pulse, with pulse detection circuitry at the other end of the bidirectional communication path then being arranged to detect that pulse. However, whilst such techniques can achieve high-speed communication with low energy dissipation based on reduced voltage swing, the pulse generation and detection circuitry needs to be carefully custom-designed having regard to each specific interconnect situation, and requires precise device matching, additional voltage supplies, and increased wire thickness/pitch. Accordingly, such techniques cannot be easily used in synthesis-based design flows or reused in different interconnect situations in the same design.
Accordingly, it would be desirable to provide an improved technique for transferring a data signal propagated along a bidirectional communication path within a data processing apparatus, which can be used without re-design within a wide variety of implementations, but with improved performance and reduced energy consumption when compared with the traditional approach described earlier with reference to FIG. 1.