The last stage of the repeater exhibits large gain during the transition period, which amplifies both its own internal noise sources and the noise of the preceding stages. When the transition completes, the gain drops and the output noise drops drastically with it. The average noise is therefore low compared to the noise during the transition. Jitter at the output of the repeater is thus given by $\sigma_t = V_{n,rms} / |dV_{out}/dt|_{t=t_0}$, where Vn,rms is the root mean square noise at t0 and the denominator is the slope of the output clock at t0. Because the clock edges sample the noise once every clock cycle, the noise only needs to be integrated from 0 to fclk/2.

Building a simulation model for the cable link shown in Fig. 1.2 is a necessary step to estimate clock jitter accumulation in the synchronous link, and thus predict its performance. Given the strongly non-linear and time-variant nature of the clock path, the most accurate way to perform this simulation is a transient noise analysis, in which the noise sources inside the SPICE model are internally replaced by transient random sources that match the power density and bandwidth of the original noise sources. For a white noise source, the transient simulation step must be small enough to sample the highest frequency components of the noise, and the simulation time must be long enough to cover at least one full cycle of its lowest frequency components. These requirements are well-known obstacles for circuit designers and limit the practicality of transient noise simulation for cable links. On top of that, a long simulation time is needed just to propagate the clock across the cable section. For instance, for CAT 7 cables, the delay is almost 4ns/meter.
For a 100 meter cable, this amounts to 400ns of simulation time that contains no information. Such simulations would typically take hours to a day. Alternatively, steady-state analysis could be used for this purpose, where the noise analysis is solved as a small-signal analysis on top of a linear time-variant solution of the circuit, as explained in the previous section. However, a circuit that contains a cable model represented by S-parameters is very difficult to solve with state-of-the-art steady-state simulators, in particular when the model exhibits excessive delay, as is the case with the cable model.

Instead, we propose using a fast linear time-variant model to estimate jitter accumulation and power of the clock-forwarded cable link. Fig. 3.6 shows a block diagram of the clock-forwarded link expressed as frequency-domain transfer functions. The cable is a linear time-invariant element, so it is expressed as Hc, a function of frequency and of the cable section length l. As explained in section 3.2.1, the clock repeater transfer function can be expressed in the linear time-variant model as Hr, and its output noise power spectral density as Srn, both evaluated at t0, where t0 is the observation time, set at the zero-crossing point of the output clock. For most practical considerations we can assume that the clock driver output has a 50% duty cycle and fast edges compared to the clock period. The Fourier expansion of such a clock contains only odd harmonics, with harmonic 2n+1 scaled by 1/(2n+1). The harmonics are filtered by the cable transfer function before being applied to the next clock repeater. A single cable and repeater section can thus be expressed as shown in Fig. 3.6, with a voltage source VS representing the filtered clock signal.

For most practical cases, the clock signal at the output of the clock repeater has fast rise and fall times, i.e. they occupy a negligible fraction of the clock cycle. It is only during this rise and fall window that the repeater has non-zero gain for noise and signal. For the rest of the clock cycle, the gain is almost zero.
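As a rough illustration of the harmonic picture above, the following Python sketch builds the odd harmonics of an ideal 50% duty-cycle clock, attenuates each one with a cable gain, and sums their slopes at the zero crossing. The square-root-of-frequency loss law, the neglect of cable phase dispersion, and all function names and default values are assumptions made for illustration only; they are not the link model of this work. The default loss of 2.2dB per meter at 6GHz corresponds to the CAT7 figure quoted in this work for the Nyquist frequency of 12Gbps. Dividing an RMS noise voltage by the slope is the usual first-order noise-to-jitter conversion.

```python
import math


def harmonic_amplitudes(swing, n_harmonics):
    """Fourier amplitudes of an ideal 50% duty-cycle square clock.

    Returns (k, amplitude) pairs for k = 1, 3, 5, ...; for a peak-to-peak
    swing, the k-th amplitude is (2 * swing / pi) / k.
    """
    return [(2 * n + 1, 2.0 * swing / (math.pi * (2 * n + 1)))
            for n in range(n_harmonics)]


def cable_gain(freq_hz, length_m, loss_db_per_m=2.2, f_ref_hz=6e9):
    """Linear voltage gain of a cable section.

    Assumption: loss in dB scales with sqrt(frequency) (skin-effect-like)
    and linearly with length; phase dispersion is ignored.
    """
    loss_db = loss_db_per_m * math.sqrt(freq_hz / f_ref_hz) * length_m
    return 10.0 ** (-loss_db / 20.0)


def zero_crossing_slope(swing, f_clk_hz, length_m, n_harmonics=25):
    """Slope (V/s) at the zero crossing of the cable-filtered clock.

    Each harmonic A_k * sin(2*pi*k*f*t) contributes A_k * 2*pi*k*f at t = 0.
    """
    slope = 0.0
    for k, amp in harmonic_amplitudes(swing, n_harmonics):
        f_k = k * f_clk_hz
        slope += amp * cable_gain(f_k, length_m) * 2.0 * math.pi * f_k
    return slope


def rms_jitter(vn_rms, slope):
    """First-order conversion of RMS noise (V) to RMS jitter (s)."""
    return vn_rms / slope
```

Under these assumptions, a longer section or a higher clock frequency attenuates the harmonics more, lowers the zero-crossing slope, and therefore raises the jitter produced by a given noise voltage.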
As a result, the linear time-variant impulse response can be expressed as an almost ideal train of narrow impulses in the time domain. The corresponding frequency-domain transfer function consists of identical sidebands over a frequency range much wider than the repeater bandwidth. Fig. 3.7 shows the frequency-domain transfer function of the clock repeater, Hr, with 5 sidebands, obtained by a steady-state simulation of the clock repeater. As can be seen from the figure, the sidebands have the same magnitude and are spaced by twice the clock frequency. This is because the train of linear time-variant impulses repeats at both the rising and falling edges of the clock, i.e. the Fourier fundamental frequency is twice the clock frequency.

The aforementioned analysis provides a fast and accurate model to estimate jitter accumulation in repeater-based synchronous links. The cable used in the link design is a CAT7 cable with the channel response shown in Fig. 3.9. The cable has 2.2dB of attenuation per meter at the Nyquist frequency of 12Gbps. Only a single repeater needs to be simulated to obtain the LTV transfer function and output noise for a given driver amplitude and cable section length. Fig. 3.10 shows the jitter accumulation profile at the end of a 100m cable for a 500mV clock driver swing, for different cable section lengths and clock frequencies. As the frequency increases, the loss of clock amplitude and slope inside the cable increases, which degrades the SNR. A smaller clock amplitude also implies that the earlier stages inside the repeater have more gain, as depicted in Fig. 3.3, so those stages contribute more noise. Consequently, the SNR degrades further and jitter increases as the clock frequency increases. A similar effect occurs with increasing cable section length. Fig. 3.10 shows that 40ps RMS jitter is observed at the end of 100m for a 19m cable section length and an 800MHz clock frequency.
This amount of jitter is practically intolerable for a receiver at the other end of the cable. Increasing the clock amplitude increases the SNR and reduces jitter at the expense of total clocking power. Fig. 3.11 shows the total clock repeating power, excluding clock multiplication, needed to meet a 4ps RMS jitter requirement at the end of the 100m cable.
As expected, power increases when cable section length and clock frequency increase, to compensate for SNR loss and jitter accumulation. Figures 3.10 and 3.11 suggest that a shorter cable section length and a lower clock frequency are favorable for lower jitter accumulation along the entire cable. A shorter cable section means that more sections are needed to cover the required distance. Thus, more connectors are needed to connect the cables to the repeaters, which adds cost to the link and introduces more mechanical weak points. Detailed analysis of this issue is beyond the scope of this work, but in general it is preferable to reach the required length with fewer cable sections. On the other hand, lowering the clock frequency reduces jitter accumulation because of less SNR degradation inside the cable, but this does not come without a price. The lower the clock frequency, the larger the multiplication ratio needed inside the CMU in Fig. 1.2 to multiply the clock up to the data rate. To understand the impact of a large CMU multiplication ratio on the performance of the link, we need to take a closer look at jitter accumulation inside the CMU. Fig. 3.12 shows the RMS jitter observed as time elapses from some reference edge inside a typical ring oscillator. In an open-loop oscillator, jitter accumulates indefinitely, growing with the square root of the observation time. When the VCO is used inside a PLL CMU, the jitter accumulation plateaus at an observation time approximately equal to the CMU time constant. For over-damped PLLs, which are commonly used in repeater and jitter filtering applications, the PLL time constant is inversely proportional to the loop bandwidth.

Uncorrelated jitter is amplified at frequencies inversely proportional to the clock-data delay. This suggests that filtering of high-frequency jitter is advantageous in clock-forwarded systems to mitigate uncorrelated jitter accumulation.
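The dependence of uncorrelated jitter on the clock-data delay can be sketched with a simple model. Assume, for illustration only, that the data path sees the same jitter as the clock but shifted by a pure delay T. The relative jitter then has the transfer function 1 - exp(-j2πfT), whose magnitude is 2|sin(πfT)|. The function names and the pure-delay model are assumptions of this sketch, not the link model used in this work.

```python
import math


def uncorrelated_jitter_gain(freq_hz, delay_s):
    """Magnitude of 1 - exp(-j*2*pi*f*T), i.e. 2*|sin(pi*f*T)|.

    Assumption: clock and data carry the same jitter, offset by a pure
    delay T, so only the difference between them is seen by the sampler.
    """
    return 2.0 * abs(math.sin(math.pi * freq_hz * delay_s))


def first_peak_frequency_hz(delay_s):
    """Lowest jitter frequency at which the relative jitter is doubled."""
    return 1.0 / (2.0 * delay_s)
```

In this model, jitter well below 1/(2T) is tracked and largely cancels, while jitter near 1/(2T) is amplified by up to a factor of two. Doubling the delay halves the frequency at which this amplification first peaks.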
Because the jitter filtering element is inserted only in the clock path, the jitter filtering bandwidth should be chosen so that correlated jitter is tracked and passed unfiltered while uncorrelated high-frequency jitter is rejected. There are several candidates for jitter filtering. For instance, a tuned clock buffer, in which a differential inductor loads the clock amplifier, filters the phase noise around the center frequency. While effective, its main disadvantage is the large silicon area of the inductor needed at the propagated clock frequency. A 5nH differential inductor that resonates with a 5pF capacitor at 1GHz can easily consume 300×300µm². Additionally, an LC-based filter does not easily accommodate a wide range of frequencies without large varactors, which can compromise the filter performance. Another widely used jitter filtering circuit is a cleanup PLL with the appropriate bandwidth. A PLL is already needed for the CMU and hence, with proper design, may serve both purposes. In the example above with a filter bandwidth of 75MHz, a cleanup PLL with similar bandwidth can be difficult to over-damp due to the delay within the loop. Furthermore, a cascade of PLLs in the clock repeaters results in accumulated peaking of the PLL transfer function. Sufficient damping of the transfer function is very challenging with wide tracking bandwidths, and insufficient damping results in jitter amplification near the PLL bandwidth.
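The accumulation of peaking in a PLL cascade can be illustrated with a textbook loop model. The sketch below assumes a second-order type-II PLL with jitter transfer H(s) = (2ζωn s + ωn²)/(s² + 2ζωn s + ωn²). The loop type, the damping values, and the function names are hypothetical choices for illustration, and loop delay is ignored. Because identical stages multiply their transfer functions, the peaking in dB of a cascade is the single-stage peaking times the number of stages.

```python
import math


def pll_jitter_gain(x, zeta):
    """|H| of a second-order type-II PLL at normalized frequency x = f / f_n.

    Assumed model: H = (1 + j*2*zeta*x) / (1 - x**2 + j*2*zeta*x).
    """
    num = complex(1.0, 2.0 * zeta * x)
    den = complex(1.0 - x * x, 2.0 * zeta * x)
    return abs(num / den)


def peaking_db(zeta, n_stages=1, points=2001):
    """Peak jitter gain in dB of n_stages identical PLLs in cascade.

    The peak is located with a logarithmic sweep from 0.01*f_n to 100*f_n.
    """
    peak = 0.0
    for i in range(points):
        x = 10.0 ** (-2.0 + 4.0 * i / (points - 1))
        peak = max(peak, pll_jitter_gain(x, zeta))
    return n_stages * 20.0 * math.log10(peak)
```

With this model, a loop with ζ = 1 still shows roughly 1.25dB of peaking per stage, so even a modest number of cascaded repeaters accumulates several dB of jitter amplification near the loop bandwidth.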
This work proposes a third option: using delay elements to implement a finite impulse response (FIR) phase filter that performs the high-frequency jitter filtering. As shown in Fig. 3.15, a first-order phase FIR needs a delay and a summation. The summation can be implemented as a phase interpolator, as described in the next section. The filter resembles the phase averaging used in DLL implementations, where the phases of the delay cells are added to average out timing mismatches. Fig. 3.16 compares different filtering approaches for uncorrelated jitter and absolute jitter. The analysis assumes a CAT7 cable link with a 13m clock repeating distance and a 250mV clock amplitude, with jitter observed at the end of the 100m cable. The plot shows the impact of filtering with clock forwarding at clock frequencies ranging from 200MHz to 800MHz. The uncorrelated jitter is a filtered version of the absolute jitter. The reference case has no filtering; there, the decorrelation between clock and data stems from the 1 clock cycle delay inside the clock multiplication unit and from the noise of the clock repeaters. Additional filtering can reduce the high-frequency noise, but at the expense of further decorrelating the clocks, and hence the filter is designed for high bandwidth. The LC-tuned amplifier design uses an inductor with a quality factor of 4 at 1GHz. The PLL design assumes a bandwidth of 1/10 of the input frequency and a 60° phase margin. The FIR filter design is a first-order 1+αD filter at each repeater, with the delay set to one clock cycle. The FIR zero falls at half the clock frequency. As shown in both figures, a PLL has superior performance at lower frequencies due to its higher filter order but suffers at high frequencies due to the peaking in its transfer function. The FIR and LC filtering have very similar performance, making the FIR an attractive option for a low-area implementation. Fig. 3.17 compares FIR and PLL filtering with different noise sources.
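The location of the FIR zero can be checked with a few lines of Python. The sketch evaluates the magnitude response of the first-order filter 1+αD with D equal to one clock period. The normalization by 1+α, which keeps the low-frequency gain at unity, and the function name are assumptions of this example.

```python
import cmath
import math


def fir_phase_gain(freq_hz, f_clk_hz, alpha=1.0):
    """Magnitude of (1 + alpha*D) / (1 + alpha), with D a one-cycle delay.

    D = exp(-j*2*pi*f/f_clk). Dividing by (1 + alpha) is an assumed
    normalization so that slow jitter passes with unity gain.
    """
    delay_s = 1.0 / f_clk_hz
    d = cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return abs((1.0 + alpha * d) / (1.0 + alpha))
```

For α = 1 the response falls from unity at low jitter frequencies to a null at half the clock frequency, which is the upper edge of the band over which edge-sampled noise is integrated. For α < 1 the null becomes a finite attenuation of (1-α)/(1+α).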
In mixed-signal environments, supply noise from on-chip switching activity and external noise coupled into the chip can be a dominant component of the total output noise. This noise generally has a high-pass or band-pass characteristic, due to high-frequency capacitive and inductive coupling or to the behavior of the PLL. The FIR filter approach matches the PLL filtering performance at low frequency but outperforms the PLL at high frequency.

As shown in Fig. 3.15, a simple first-order FIR has a delay of 1 clock cycle. We observe that, with the proper architecture, the CMU that multiplies the frequency and generates the sampling clock can also produce this delay. Because a multiplying delay-locked loop (MDLL) injects the reference clock edge into the VCO, it does not accumulate jitter the way a VCO-based PLL does when used for data sampling. The divided output of the MDLL has an intrinsic delay of 1 clock cycle between the input and the feedback clocks and has an all-pass transfer function. To implement the FIR, we insert a phase interpolator at the output of the MDLL that takes the incoming reference clock and the feedback clock of the CMU as inputs.
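A time-domain view of this arrangement can be sketched as follows. The example treats the feedback clock as an ideal copy of the reference delayed by one cycle and models the phase interpolator as a weighted average of the two edge positions. The ideal delay, the absence of MDLL noise, the interpolation weight, and all names are assumptions made for illustration; a weight w corresponds to α = w/(1-w) in the 1+αD notation.

```python
import math
import random


def interpolate_phases(ref_jitter, weight=0.5):
    """Weighted average of each reference edge and the previous one.

    out[n] = (1 - weight) * ref[n] + weight * ref[n - 1], where ref[n - 1]
    stands in for the feedback clock (assumed: ideal one-cycle delay).
    The first output reuses ref[0] as its own predecessor.
    """
    out = []
    prev = ref_jitter[0]
    for edge in ref_jitter:
        out.append((1.0 - weight) * edge + weight * prev)
        prev = edge
    return out


def rms(values):
    """Root mean square of a sequence of timing errors."""
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    random.seed(1)
    # Hypothetical cycle-to-cycle independent jitter, arbitrary units.
    white = [random.gauss(0.0, 1.0) for _ in range(20000)]
    print(rms(interpolate_phases(white)) / rms(white))
```

With equal weights, independent edge-to-edge jitter is reduced by about a factor of 1/√2, while a constant or slowly varying timing offset passes through unchanged, which is the behavior wanted for tracking correlated jitter.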