Circuit designers of multi-Gigabit systems face a number of challenges as advances in technology mandate increased performance in high-speed components. At a basic level, data transmission between high-speed components within a single semiconductor device or between two devices on a printed circuit board may be represented by the system 10 shown in FIG. 1. In FIG. 1, a transmitter 12 at a transmitting device 8 (e.g., a microprocessor) sends data over a transmission channel 14 (e.g., a copper trace on a printed circuit board or “on-chip” in a semiconductor device) to a receiver 16 at a receiving device 9 (e.g., another processor or memory). When data is sent from an ideal transmitter 12 to a receiver 16 across an ideal (lossless) channel, all of the energy in a transmitted pulse will be contained within a single time cell called a unit interval (UI).
However, real transmitters and real transmission channels do not exhibit ideal characteristics, and the effects of transmission channels are becoming increasingly important in high-speed circuit design. Due to a number of factors, including, for example, the limited conductivity of copper traces, the dielectric medium of the printed circuit board (PCB), and the discontinuities introduced by vias, the initially well-defined digital pulse will tend to spread or disperse as it passes along the channel 14. This is shown in FIG. 2. As shown, a single ideal positive pulse 20 is sent by the transmitter 12 during a given UI (e.g., UI0). However, because of the effect of the channel 14, this data pulse 20 becomes spread 21 over multiple UIs at the receiver 16, i.e., some portion of the energy of the pulse is observed outside of the UI in which the pulse was sent (e.g., in UI−1 and UI1). This residual energy outside of the UI of interest may perturb a pulse otherwise occupying either of the neighboring UIs in a phenomenon referred to as intersymbol interference (ISI).
Due to several factors associated with the complexity in designing, building, and testing such circuitry, it is a common practice in the art of integrated circuit design to simulate the operation of a circuit using a computer system. Simulation software allows the circuit designer to verify the operation and margins of a circuit design before incurring the expense of actually building and testing the circuit. Simulation is particularly important in the semiconductor industry, where it is generally very expensive to design and produce a given integrated circuit. Through the use of simulations, design errors or risks are hopefully identified early in the design process, and resolved prior to fabrication of the integrated circuit.
A traditional approach used to simulate the transmission of a signal down a channel required the designer to first produce an input waveform to the channel to be simulated. Such an input waveform would typically comprise a number of UIs, representing a bit stream. To ensure that the simulation would assess a significant amount of variation in the bit stream, the produced bit stream would typically comprise a random, or at least pseudo-random, sequence of logic levels (e.g., 00100101101011 . . . ). Typically, the goal of channel simulation is to assess whether the system 10 in question can reliably transmit and receive bits at a suitable Bit Error Ratio (BER); permissible error ratios in modern-day systems might be 10^-12 (i.e., one bit error in a trillion) or less.
To be able to resolve BERs in a statistically significant fashion, the number of UIs in the produced input waveform would have to be even higher than the inverse of the BER, for example, at least 10^13 cycles or so. Moreover, a realistic simulation would preferably not assume that the logic states in the input waveform were at perfect voltage levels, nor would it assume that transitions between logic states would always occur with perfect timing or with a uniform slew rate. Thus, the designer, using various methods or computerized tools, might additionally seek to add amplitude or timing variation to the input waveform. When one considers the large number of UIs required in the input waveform in light of the BER, and the desirability of adding variation to the input waveform, production of the input waveform using traditional techniques is difficult and very computationally intensive. Memory in the computer system used by the designer could easily be exhausted, and computer simulation times for an input signal of such great length could easily be prohibitive.
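The scale of such a brute-force waveform can be illustrated with rough arithmetic. The sketch below assumes 16 stored samples per UI and 8 bytes per sample; both figures are illustrative assumptions and are not taken from the discussion above.

```python
# Rough storage estimate for a brute-force input waveform.
# SAMPLES_PER_UI and BYTES_PER_SAMPLE are illustrative assumptions.
NUM_UIS = 10 ** 13        # number of UIs suggested above for a BER of 10^-12
SAMPLES_PER_UI = 16       # assumed time resolution within each UI
BYTES_PER_SAMPLE = 8      # assumed double-precision voltage sample

total_bytes = NUM_UIS * SAMPLES_PER_UI * BYTES_PER_SAMPLE
print(f"{total_bytes / 1e15:.2f} PB")  # prints "1.28 PB"
```

Under these assumptions the raw waveform alone would occupy on the order of a petabyte, before any simulation is run.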
Because of the impracticality of production and simulation of an input waveform in this manner, the industry has turned to various forms of statistical signaling analysis (SSA). An example of SSA is disclosed in B. Casper et al., “An Accurate and Efficient Analysis Method for Multi-Gb/s Chip-to-Chip Signaling Schemes,” 2002 Symposium on VLSI Circuits Digest of Technical Papers, pp. 54-57 (2002), which is submitted in the Information Disclosure Statement accompanying the filing of this disclosure, and which technique is summarized in FIGS. 3A-3E.
Casper's technique assumes a particular transfer function, H(s)chan, for the channel 14, which transfer function models the capacitance, resistance, and other parameters of the channel. By entering such transfer function information and other modeling information into a computer system, as is typical, the effects of the channel 14 on an idealized positive pulse 20 are simulated, resulting in a positive pulse response 21. An example positive pulse response 21 is seen in further detail in FIG. 3A, and is described as a function X. As was the case in FIG. 2, the majority of the energy of the distorted positive pulse 21 occurs in UI0, which corresponds to the UI of the ideal positive pulse 20, and which may be referred to as the cursor UI for short. Some energy also occurs before UI0, e.g., in unit intervals UI−1 and UI−2, which may be referred to as pre-cursor UIs. Likewise, some energy occurs after UI0, e.g., in unit intervals UI1 and UI2, which may be referred to as post-cursor UIs.
The positive pulse response 21, X, may be described as a series of discrete points, each referenced to a particular time ‘i’ in the unit intervals. Index ‘i’ is shown in FIG. 3A such that the points are roughly in the middle of each UI, but this is merely illustrative. These points may be modeled as a series of delta functions occurring at each of the UIs, as shown in the equation at the top of FIG. 3A, with each delta function being scaled by the magnitude of the positive pulse response 21 at that UI. Such delta function scaling is commonly utilized in digital signal processing sampling theory. Viewed more simply, and as is more convenient for simulation in a computer system, the positive pulse response 21 may also be characterized as a vector containing each of the magnitude components (e.g., [ . . . X(i)−2, X(i)−1, X(i)0, X(i)1, X(i)2 . . . ] or [ . . . −0.025, 0.15, 0.75, 0.2, −0.15 . . . ] to use the voltage values actually illustrated). How many magnitude terms are used, or how long the vector will be, is a matter of preference, but would logically incorporate the bulk of the positive pulse response 21. More terms will improve the accuracy of the analysis to follow, but will require additional computing resources.
Also shown in FIG. 3A is a zero response 22, Z, which characterizes the transmission of a logical ‘0’ across the channel. As can be seen, this zero response 22 assumes that the channel 14 has no effect, and as such the resulting magnitude values Z(i) are all set to zero. Although seemingly uninteresting, the zero response 22 is used in Casper's technique along with the positive pulse response 21 to generate statistics regarding receipt of data at the receiver 16, as will be seen below.
From the positive pulse response 21 and the zero response 22, i.e., from vectors X(i) and Z(i), Casper's technique derives a probability distribution function (PDF) at time T as shown in FIGS. 3B and 3C, which PDF(i) is meant to simulate where the receiver 16 could statistically expect to see signal voltage values occurring at the end of the channel 14 assuming repeated sampling at a fixed time interval. Casper's technique uses convolution to derive the PDF(i), as illustrated in some detail in FIG. 3B, and more specifically involves a recursive convolution of various pairs of corresponding terms X(i) and Z(i) in the positive pulse response 21 and the zero response 22. Take for example the terms corresponding to the cursor UI, X(i)0 and Z(i)0. Because these terms both occur within the same UI, UI0, they are written in FIG. 3B as a pair (X(i)0, Z(i)0) or (0.75, 0) to use the actual illustrated values. This pair recognizes that the receiver could expect to see a value of 0.75 if a logic ‘1’ was transmitted, or a value of zero if a logic ‘0’ was transmitted, and assumes that only one sample is taken during the UI and that in a random data stream reception of either of these values is equally probable. Thus, this pair can be represented as a PDF having two delta functions, one each at values 0.75 and 0, and each having a magnitude of 0.5 (50%). Likewise, and working with the pre-cursor interval pairs first, the next pair (X(i)−1, Z(i)−1) or (0.15, 0) can also be represented as a PDF having two delta functions. These two pairs can then be convolved as shown, resulting in yet another PDF illustrating the now four possibilities for the received voltages (0, 0.15, 0.75, and 0.9), each with a probability of 0.25 (25%). Convolution (represented herein using an asterisk symbol ‘*’) is a well-known mathematical technique for combining two functions (applied to two PDFs, it yields the distribution of the sum of two independent random variables), and is assumed familiar to the reader.
Convolution is a linear operation, and therefore relies on the mathematical assumption that the system under analysis is linear, time-invariant (LTI), a well-known and common assumption. Introduction of system nonlinearities introduces errors during the calculation process. It should be understood that the PDF resulting from the convolution is appropriately scaled to achieve a sum total probability of 1.
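The convolution of the cursor pair (0.75, 0) with the first pre-cursor pair (0.15, 0) can be sketched as follows. This is a minimal illustration that stores each PDF as a dictionary mapping voltage to probability; the function name `convolve_pdfs` and the dictionary representation are assumptions made for illustration and are not part of Casper's disclosure.

```python
def convolve_pdfs(pdf_a, pdf_b, ndigits=6):
    """Convolve two discrete PDFs stored as {voltage: probability} dicts.

    Each output voltage is the sum of one voltage from each input, and its
    probability is the product of the two input probabilities.
    """
    out = {}
    for va, pa in pdf_a.items():
        for vb, pb in pdf_b.items():
            v = round(va + vb, ndigits)  # merge sums that differ only by float error
            out[v] = out.get(v, 0.0) + pa * pb
    return out

cursor = {0.75: 0.5, 0.0: 0.5}   # (X(i)0, Z(i)0)
pre1 = {0.15: 0.5, 0.0: 0.5}     # (X(i)-1, Z(i)-1)

combined = convolve_pdfs(cursor, pre1)
for voltage in sorted(combined):
    print(voltage, combined[voltage])
```

The result contains the four voltages 0, 0.15, 0.75, and 0.9, each with probability 0.25, and the probabilities sum to 1.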
This resulting PDF can then be convolved with a third pair of terms (X(i)−2, Z(i)−2) or (−0.025, 0), resulting in a new PDF with eight values, each with a probability of 0.125 (12.5%), and so on until all of the pre-cursor pairs have been convolved. Thereafter, and as shown in the formula in FIG. 3B, the post-cursor pairs are similarly recursively convolved, until all pairs of interest have been treated. (It bears noting here that convolution is commutative and associative, and therefore it does not matter in which order the various pairs are convolved.) Eventually, when all of the pairs of terms have been recursively convolved, the result is a final PDF at time ‘i,’ as illustrated in FIG. 3C. Because an actual PDF, as calculated this way in a computer system, will likely have discrete values, curve fitting can be used to arrive at a PDF which is smooth, as shown in FIG. 3C. As would be expected, the resulting PDF is bi-modal, comprising two lobes corresponding to the received voltages for the transmission of a logic ‘1’ or a logic ‘0’ across the channel 14, which again are assumed to be transmitted with equal probabilities, such that each lobe encompasses an area of 0.5 (50%). Although the PDF lobes, as illustrated in FIG. 3C, appear Gaussian, the actual resulting shape will depend on the particulars of the channel 14 being simulated.
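The recursive convolution over all pairs can be sketched as a fold over the list of pulse response magnitudes. The sketch below uses the five illustrated values and an assumed dictionary representation of each PDF; the helper names `pair_pdf` and `convolve_pdfs`, and the 0.45 V split used to separate the two lobes, are hypothetical choices made for illustration.

```python
from functools import reduce

def convolve_pdfs(pdf_a, pdf_b, ndigits=6):
    """Convolve two discrete PDFs stored as {voltage: probability} dicts."""
    out = {}
    for va, pa in pdf_a.items():
        for vb, pb in pdf_b.items():
            v = round(va + vb, ndigits)
            out[v] = out.get(v, 0.0) + pa * pb
    return out

def pair_pdf(x):
    """Two-point PDF for one UI: x volts (logic '1') or 0 volts (logic '0')."""
    pdf = {0.0: 0.5}
    pdf[x] = pdf.get(x, 0.0) + 0.5
    return pdf

# Cursor term first, then the two pre-cursor terms, then the two post-cursor terms.
terms = [0.75, 0.15, -0.025, 0.2, -0.15]

after_three = reduce(convolve_pdfs, (pair_pdf(x) for x in terms[:3]))
final_pdf = reduce(convolve_pdfs, (pair_pdf(x) for x in terms))

print(len(after_three))                                    # 8 values of 0.125 each
print(sum(final_pdf.values()))                             # total probability
print(sum(p for v, p in final_pdf.items() if v > 0.45))    # area of the upper lobe
```

With these particular values the terms 0.15 and −0.15 cancel, so some outcomes coincide and their probabilities add. The total probability remains 1, and each lobe carries an area of 0.5.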
(It should be noted that the lengths of vectors X(i) and Z(i) factor into the computation time because they define how many convolution operations are carried out. But they do not explicitly determine the length of the vectors being convolved. For example, when convolving (X(i)0, Z(i)0) or (0.75, 0) with (X(i)−1, Z(i)−1) or (0.15, 0), the actual vectors being convolved are composed of delta functions at locations 0 and 0.75 and at locations 0 and 0.15, with zeros inserted between the delta functions. These zeros act as placeholders for possible convolution data outputs, and their number is determined by the location of the non-zero delta function divided by the desired voltage resolution. For example, assuming a voltage resolution of 5 mV, the representation of (X(i)0, Z(i)0) or (0.75, 0) would span 0.75/0.005=150 resolution steps between the delta functions, for a total vector length of 151.)
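The zero-padded vector representation can be sketched as follows, using the 5 mV resolution from the example. The function names are hypothetical, and the sketch assumes non-negative magnitudes; a negative magnitude such as −0.025 would need an index offset, which is omitted here.

```python
def grid_pdf(x, resolution):
    """Two-point PDF for the pair (x, 0) on a uniform voltage grid.

    Index k of the returned list represents k * resolution volts.
    Assumes x >= 0 (illustrative simplification).
    """
    steps = round(x / resolution)
    pdf = [0.0] * (steps + 1)   # zeros act as placeholders
    pdf[0] += 0.5               # logic '0' received: 0 V
    pdf[steps] += 0.5           # logic '1' received: x V
    return pdf

def convolve(a, b):
    """Direct linear convolution of two lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        if pa == 0.0:
            continue            # skip placeholder zeros
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out

RESOLUTION = 0.005              # 5 mV
cursor = grid_pdf(0.75, RESOLUTION)
pre1 = grid_pdf(0.15, RESOLUTION)
combined = convolve(cursor, pre1)

print(len(cursor), len(pre1), len(combined))   # prints "151 31 181"
```

The non-zero entries of `combined` fall at indices 0, 30, 150, and 180, corresponding to 0, 0.15, 0.75, and 0.9 V, each with probability 0.25. The vector length grows with each convolution even though only a few entries are non-zero.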
Once the PDF is determined for a particular time ‘i’, ‘i’ can be changed, allowing for new terms X(i) and Z(i) to be determined from responses 21 and 22, and for a new PDF to be determined. The cumulative effect is illustrated in FIG. 3D, which shows the PDFs as determined for different values of ‘i’ across the cursor UI. As would be expected, the lobes of the PDFs are sharper and better separated near the center of the UI, signifying that the resolution at the receiver 16 between logic ‘1’ and ‘0’ is statistically easier in such areas. Toward the edges of the UI, the lobes are closer and broader, indicating that the resolution at the receiver 16 between logic ‘1’ and ‘0’ is statistically more difficult.
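Selecting the terms X(i) for a given time index can be sketched as taking one sample per UI at a fixed offset into a sampled pulse response. The waveform values below are made up for illustration (three UIs, four samples each) and do not correspond to the figures; `sample_terms` is a hypothetical helper name.

```python
def sample_terms(waveform, samples_per_ui, i):
    """Return one sample per UI, taken at offset i within each UI."""
    if not 0 <= i < samples_per_ui:
        raise ValueError("i must lie within one UI")
    return waveform[i::samples_per_ui]

# Hypothetical sampled pulse response: 3 UIs, 4 samples per UI.
waveform = [0.00, 0.05, 0.15, 0.10,
            0.30, 0.60, 0.75, 0.50,
            0.30, 0.20, 0.12, 0.05]

for i in range(4):
    print(i, sample_terms(waveform, 4, i))
```

Each value of `i` yields a different set of magnitude terms, and hence a different PDF once those terms are recursively convolved.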
These PDFs in sum allow the reliability with which data is received at the receiver 16 to be analyzed. Such data also allows sensing margins 25 to be set, and Bit Error Ratios to be deduced. For example, on the basis of the PDFs illustrated in FIG. 3D, it may be decided that the receiver 16 should sample received data anywhere between t1=45 ps and t2=55 ps within the UI, and use a reference voltage between Vref1=0.34 V and Vref2=0.41 V to discern between logic ‘0’s and ‘1’s, because the statistics of the PDFs indicate an acceptable Bit Error Ratio (e.g., no more than 1 error in 10^12 bits) within these margins 25. As such, Casper's technique is similar in nature to “eye diagrams” (FIG. 3E) also used to assess data reception reliability, and to set appropriate sensing margins. See, e.g., U.S. Patent Application Publication 2009/0110116, discussing eye diagrams in further detail. In an eye diagram, successive UIs of a simulated or measured received signal (usually, a random bit stream) are overlaid to see where the signal occurs, and where a clear “eye” exists within the margins. To generate an eye diagram prior to fabrication, the designer must simulate the data transmission over millions-to-trillions of cycles to arrive at statistically significant Bit Error Ratios. Casper's technique, by contrast, does not require randomizing the input data, and thus provides a simpler method to, in effect, generate an “eye” to characterize a channel without the need for simulation of an actual randomized bit stream of data. Instead, only simulation of the transmission of a single ideal positive pulse 20, and analysis of the resulting positive pulse response 21, is needed.
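How an error probability might be read off such a PDF for a candidate reference voltage can be sketched as follows. This is an illustration only: it considers ISI from the five illustrated magnitude terms and ignores noise, jitter, and curve fitting. The helper names and the reference voltages 0.45 V and 0.59 V are assumptions chosen for the example, not the margins shown in the figures.

```python
from functools import reduce

def convolve_pdfs(pdf_a, pdf_b, ndigits=6):
    """Convolve two discrete PDFs stored as {voltage: probability} dicts."""
    out = {}
    for va, pa in pdf_a.items():
        for vb, pb in pdf_b.items():
            v = round(va + vb, ndigits)
            out[v] = out.get(v, 0.0) + pa * pb
    return out

def pair_pdf(x):
    """Two-point PDF for one UI: x volts (logic '1') or 0 volts (logic '0')."""
    pdf = {0.0: 0.5}
    pdf[x] = pdf.get(x, 0.0) + 0.5
    return pdf

def error_probability(cursor, isi_terms, vref):
    """Probability of misreading the cursor bit at reference voltage vref."""
    # Distribution of interference contributed by the neighboring UIs.
    isi = reduce(convolve_pdfs, (pair_pdf(x) for x in isi_terms), {0.0: 1.0})
    one_read_as_zero = sum(p for v, p in isi.items() if cursor + v < vref)
    zero_read_as_one = sum(p for v, p in isi.items() if v > vref)
    # Logic '1' and logic '0' are assumed equally likely.
    return 0.5 * one_read_as_zero + 0.5 * zero_read_as_one

isi_terms = [0.15, -0.025, 0.2, -0.15]   # pre- and post-cursor magnitudes
print(error_probability(0.75, isi_terms, 0.45))   # prints 0.0
print(error_probability(0.75, isi_terms, 0.59))   # prints 0.03125
```

With the reference voltage placed between the two lobes, no ISI combination causes an error. Moving it to 0.59 V lets the worst-case logic ‘1’ (0.575 V) fall below the reference, which is how a sensing margin becomes visible in the statistics.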
Thus, Casper's technique requires the simulation of only a single ideal pulse, and otherwise extracts the necessary statistical information needed to generate an eye diagram (and hence a BER) from the received response. This differs greatly from the traditional approach described above, which required the generation and simulation of large, computationally difficult, input waveforms. Extensions of Casper's basic technique are also disclosed in U.S. patent application Ser. Nos. 12/838,144 and 12/838,120, both filed Jul. 16, 2010, both of which are owned by the present assignee and incorporated herein by reference.
However, Casper's technique can also be computationally difficult because of the recursive convolution involved. Assume that recursive convolution is to occur with respect to a vector having N terms (or, more precisely, N (X(i), 0) terms, where N equals the number of unit intervals of interest in the positive pulse response, for example), and a total length of K (where zero placeholder values have been inserted into the vector as discussed above). A Fast Fourier Transform (FFT) algorithm can be used to perform the recursive convolution calculation, which requires 2M*log2(M)+M mathematical operations, where M is the next power of two greater than K. This technique is known, and is therefore only briefly described: the vector is converted to the frequency domain; an element-by-element multiplication of the resulting terms in the spectra is performed; and the resulting spectrum is converted back into the time domain using an inverse FFT. Because this must be repeated until all N terms are accounted for, the total number of computations can be approximated as N*(2M*log2(M)+M).
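The three FFT steps (transform, element-by-element multiplication, inverse transform) can be sketched with a textbook recursive radix-2 FFT. This is a minimal illustration, not an optimized routine or the one used by any particular tool. As an implementation choice, the sketch pads both inputs to a power of two at least as long as the linear convolution output so that the circular convolution does not wrap around.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two.

    The inverse transform is returned unscaled (caller divides by len(x)).
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def fft_convolve(a, b):
    """Linear convolution of two real lists via the frequency domain."""
    size = len(a) + len(b) - 1
    m = 1
    while m < size:
        m *= 2
    spec_a = fft(list(a) + [0.0] * (m - len(a)))      # to frequency domain
    spec_b = fft(list(b) + [0.0] * (m - len(b)))
    product = [p * q for p, q in zip(spec_a, spec_b)]  # element-by-element
    result = fft(product, inverse=True)                # back to the original domain
    return [value.real / m for value in result[:size]]

# Two small two-point PDFs on a coarse grid (illustrative values).
print([round(v, 6) for v in fft_convolve([0.5, 0.0, 0.0, 0.5], [0.5, 0.5])])
```

The printed result matches direct convolution of the two vectors: four outcomes of probability 0.25 with a zero placeholder between them.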
Even with the benefit of an FFT, the number of calculations required in Casper's technique can still be extensive. Significant channel distortion effects such as ISI can often be felt in UIs distant from the cursor UI, and therefore the vector involved in the recursive convolution can have many terms. Moreover, and as pointed out above, the recursive convolution needs to be performed for different values of the time index ‘i’ across a UI. As a result, the total computation time further scales by the number of locations ‘i’ across the UI. There can be 100 or more such calculations for different values of ‘i’ to render a smooth and statistically-significant eye diagram, although the exact number used can vary in accordance with designer preferences or in accordance with the desired resolution needed to arrive at statistically-significant BERs. In any event, the point here is that the computation involved in SSA techniques can require significant computing resources. Thus, like traditional techniques, SSA techniques can also be limited by the processing speed and memory in the designer's computer system.
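The scaling described above can be made concrete with a small estimate based on the approximation N*(2M*log2(M)+M), multiplied by the number of time indices. The values N=5 and K=151 below are illustrative assumptions; the function names are hypothetical.

```python
def next_pow2_above(k):
    """Smallest power of two strictly greater than k (for k >= 1)."""
    return 1 << k.bit_length()

def estimated_ops(n_terms, k_len, n_time_points=1):
    """Approximate operation count: n_time_points * N * (2M*log2(M) + M)."""
    m = next_pow2_above(k_len)
    log2_m = m.bit_length() - 1          # exact, since m is a power of two
    per_convolution = 2 * m * log2_m + m
    return n_time_points * n_terms * per_convolution

print(next_pow2_above(151))              # prints 256
print(estimated_ops(5, 151))             # prints 21760
print(estimated_ops(5, 151, 100))        # prints 2176000
```

Even for this small example, sweeping 100 time indices multiplies the count a hundredfold; realistic vectors with many more terms and finer voltage resolution grow correspondingly.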
The inventors have discovered methods for improving the speed and memory efficiency of SSA techniques involving recursive convolution, and such details are discussed herein.