200M-4Gbps Wide-range Clock and Data Recovery circuit

KISANG JUNG1, KANGJIK KIM1, GUIHAN KO1, WONKI PARK2, SUNGCHUL LEE2, SEONGIK CHO1

1The Department of Electronics Engineering
Chonbuk National University
664-14 1GA DUCKJIN-DONG JEONJU JEONBUK
SOUTH KOREA
2Korea Electronics Technology Institute

kszexes@jbnu.ac.kr, kkangjik@jbnu.ac.kr, rhlngks1@jbnu.ac.kr
wkpark74@keti.re.kr, leesc@keti.re.kr, sicho@jbnu.ac.kr

Abstract: - A dual-loop wide-range clock and data recovery (CDR) circuit for multi-channel transmitter-receivers is proposed. This paper presents a delay line that prevents the jitter of the recovered data from increasing when the input data rate varies over a wide range, a condition that is common in multi-channel transmitter-receivers. The proposed CDR operates from 200Mbps to 4Gbps, and the delay step of the proposed delay line is 7ps~12ps. The designed CDR is tested in a 0.18um CMOS process.

Key-Words: - CDR(Clock and Data Recovery), Wide-range, Wireline Transceiver, RX, Dual-Loop CDR

1 Introduction

As CMOS processes have developed rapidly, the demand for high-speed, high-capacity chip-to-chip communication has increased. High-speed serial communication is therefore used in many interfaces, such as optical communications, backplanes, and chip-to-chip links [1, 2]. To satisfy this demand, methods to reduce size and power consumption have been proposed, and this remains one of the most active topics today [3, 4]. To reduce the power consumption of the transceiver, varying the supply voltage has been proposed as well. The operating frequency can also be lowered when only a minuscule amount of data needs to be transmitted. Such methods and transceivers need a wide-range interface that supports both high-frequency and low-frequency operation.

Numerous wide-range Clock and Data Recovery (CDR) circuits have been proposed [5~8]. The dual-loop CDR, based on a Delay-Locked Loop (DLL), is widely used in multi-channel transceivers [5~7]. A dual-loop CDR consists of two loops: one generates the clock, and the other shifts the clock phase to match the skew. Because the dual-loop CDR is a one-pole system, it is stable, has low system complexity, and does not accumulate jitter. A dual-loop CDR that uses a phase interpolator (PI), however, generates cycle-to-cycle jitter caused by the phase step, and this jitter grows as the input clock frequency is lowered. To minimize the cycle-to-cycle jitter, phase averaging [5] and delta-sigma dithering [6] have been proposed. Building a wide-range dual-loop CDR with Vernier Oversampling and Alignment (VOSA) instead of a PI has been proposed as well [8].

This paper proposes a CDR that uses the phase interpolation method but whose phase step is not affected when the input clock frequency changes over a wide range. Section 2 describes the structure and operation of the proposed wide-range CDR and its circuit blocks, Section 3 presents the simulation of the wide-range CDR, and Section 4 gives the conclusion.

2 Wide-range CDR Architecture

Fig. 1 is a block diagram of the proposed wide-range CDR. The proposed CDR receives a multi-phase clock from the wide-range clock generator and recovers the data by aligning the phase of the clock with the phase of the data. A CDR that uses a conventional PI shows degraded linearity and larger phase steps when the input frequency is low. To prevent this, the proposed design uses a coarse delay stage that divides the phase delay time, as shown in Fig. 2. The divided delays are passed to the phase selector, and the selected clock then passes through the fine delay stage, which shifts the phase in small steps.

Fig. 2. Wide operation of coarse and fine Delay stage

2.1 Delay Stage

Fig. 3 is a block diagram of the delay stage. The clock is delayed as shown in the diagram, and one of the delayed clocks is selected by the phase selector. The selected clock is then passed to the fine delay stage.

The operating frequency range is determined by the number of delay cells in the coarse delay stage. Increasing the number of delay cells widens the operating frequency range, but it also enlarges the chip area. Conversely, using fewer delay cells narrows the operating frequency range but reduces the chip area. Furthermore, if the delay time of each delay cell is larger, a wider operating frequency range can be achieved with fewer delay cells. The proposed delay stage consists of 16 delay cells with a unit delay time of 300ps.
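
The trade-off between cell count, unit delay, and covered delay range can be checked with simple arithmetic. The following is a minimal sketch that uses only the two values given in the text (16 cells, 300ps). The function names are hypothetical, and the comparison with the unit interval is only an illustration, since the exact relation between the coarse span and the lowest supported data rate depends on clocking details that are not given here.

```python
# Values taken from the text: 16 coarse delay cells, 300 ps unit delay.
NUM_CELLS = 16
UNIT_DELAY_PS = 300.0


def coarse_span_ps(num_cells: int = NUM_CELLS,
                   unit_delay_ps: float = UNIT_DELAY_PS) -> float:
    """Total delay covered by the coarse delay line, in picoseconds."""
    return num_cells * unit_delay_ps


def unit_interval_ps(data_rate_bps: float) -> float:
    """Duration of one data bit (unit interval), in picoseconds."""
    return 1e12 / data_rate_bps


def taps_per_ui(data_rate_bps: float,
                unit_delay_ps: float = UNIT_DELAY_PS) -> float:
    """Number of coarse taps that fit into one unit interval (fractional)."""
    return unit_interval_ps(data_rate_bps) / unit_delay_ps


if __name__ == "__main__":
    print("coarse span [ps]:", coarse_span_ps())
    for rate in (200e6, 4e9):  # the two ends of the stated operating range
        print(rate, "bps -> UI [ps]:", unit_interval_ps(rate),
              "taps per UI:", round(taps_per_ui(rate), 2))
```

Doubling either `num_cells` or `unit_delay_ps` doubles the span, which is the area-versus-range trade-off described above.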

Fig. 4. Timing diagram of Delay stage

The fine delay stage generates a clock whose phase lies between S0 and S1: it receives a control code from the control logic, and a DAC converts the code into a current. If the clock phase moves outside the range between S0 and S1, the inputs are switched to the next pair of signals so that the phase falls within the next range, as shown in Fig. 4. The architecture prevents jitter at the output by controlling the fine current so that this switching is seamless.
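
The hand-off between neighbouring coarse ranges can be illustrated with a small behavioural model. This is a sketch under stated assumptions, not the circuit itself: the number of fine codes per coarse interval (`FINE_STEPS = 32`) is assumed, chosen so that the ideal step of 300ps / 32 ≈ 9.4ps is close to the reported average resolution; the saturation at both ends of the delay line and the class name `DelayCode` are also assumptions, and the fine stage is treated as perfectly linear.

```python
from dataclasses import dataclass

UNIT_DELAY_PS = 300.0       # coarse cell delay (from the text)
NUM_CELLS = 16              # coarse delay cells (from the text)
FINE_STEPS = 32             # ASSUMPTION: fine codes per coarse interval
MAX_COARSE = NUM_CELLS - 2  # ASSUMPTION: the last range spans taps 14 and 15


@dataclass
class DelayCode:
    coarse: int = 0  # index of the coarse tap acting as S0
    fine: int = 0    # fine code inside the range between S0 and S1

    def delay_ps(self) -> float:
        """Ideal delay of the selected phase, in picoseconds."""
        return (self.coarse + self.fine / FINE_STEPS) * UNIT_DELAY_PS

    def step(self, direction: int) -> None:
        """Move one fine step up (+1) or down (-1).

        When the fine code leaves the current range, the coarse index
        moves to the neighbouring range and the fine code restarts at the
        matching edge, so the delay changes by exactly one fine step.
        """
        fine = self.fine + direction
        coarse = self.coarse
        if fine >= FINE_STEPS:      # left the range through S1
            if coarse == MAX_COARSE:
                return              # ASSUMPTION: saturate at the last range
            coarse, fine = coarse + 1, 0
        elif fine < 0:              # left the range through S0
            if coarse == 0:
                return              # ASSUMPTION: saturate at the first range
            coarse, fine = coarse - 1, FINE_STEPS - 1
        self.coarse, self.fine = coarse, fine
```

In this model the delay increment is the same inside a range and across a range boundary, which is the property that seamless switching is meant to provide.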

The fine delay stage is designed with a PI circuit. It receives two clocks from the nearest taps of the coarse delay stage as inputs and, by weighting them in voltage, outputs a clock whose phase lies between the two input clocks.

2.2 Sampler and Phase Error Detector

The wide-range CDR adjusts the phase of the received multi-phase clock to keep the data sampling point at its optimum. Fig. 5 shows the sampling conditions of the 1/5-rate sampler. The data are sampled by the controlled clocks from the delay stage as shown in Fig. 5, and the data sampled at each phase are sent to the phase error detector. As shown in Fig. 5, the sampled data are grouped in fives and sent to the phase error detector, which generates the UP or DN signal. For example, if the data changes between Clk0 and Clk1, the phase error detector outputs a DN signal. If the data changes between Clk1 and Clk2, the phase error detector outputs an UP signal.

Fig. 5. Sampling condition of 1/5-rate sampler

Table 1 is the truth table of the phase error detector operation. If the data changes between Clk0 and Clk1, the sample at Clk1 is one when the sample at Clk0 is zero, and zero when the sample at Clk0 is one. Under these conditions the phase error detector outputs a DN signal. Likewise, if the data changes between Clk9 and Clk0, the sample at Clk0 is one when the sample at Clk9 is zero, and vice versa. Under these conditions the phase error detector outputs an UP signal. The detector outputs neither UP nor DN if the data changes neither between Clk9 and Clk0 nor between Clk0 and Clk1. When the data changes both between Clk9 and Clk0 and between Clk0 and Clk1, the data frequency and the clock frequency differ, so no signal is output either.

<table>
<thead>
<tr>
<th>Phase Error Detector Operation</th>
<th>1/5-rate sampler output</th>
</tr>
</thead>
<tbody>
<tr>
<td>Clk9</td>
<td>Clk0</td>
</tr>
<tr>
<td>Skip</td>
<td>0</td>
</tr>
<tr>
<td>DN</td>
<td>1</td>
</tr>
<tr>
<td>Skip</td>
<td>1</td>
</tr>
<tr>
<td>UP</td>
<td>0</td>
</tr>
<tr>
<td>UP</td>
<td>1</td>
</tr>
</tbody>
</table>
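
The decision rule described for the three samples Clk9, Clk0 and Clk1 can be written as a small function. This is a minimal sketch, not the gate-level detector: the function name `phase_error` and the string return values are hypothetical, and the polarity for a transition between Clk9 and Clk0 is taken here as UP, the opposite of the Clk0/Clk1 case, which is an assumption.

```python
def phase_error(clk9: int, clk0: int, clk1: int) -> str:
    """Return "UP", "DN" or "SKIP" for one group of three samples.

    Each argument is the data bit (0 or 1) sampled at that clock phase.
    """
    change_before = clk9 != clk0  # data changed between Clk9 and Clk0
    change_after = clk0 != clk1   # data changed between Clk0 and Clk1

    if change_before == change_after:
        # Either no transition at all, or transitions on both sides
        # (data and clock frequencies differ): no correction is issued.
        return "SKIP"
    # Exactly one transition; ASSUMPTION: Clk0/Clk1 maps to DN and
    # Clk9/Clk0 maps to UP.
    return "DN" if change_after else "UP"


if __name__ == "__main__":
    for bits in [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
                 (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]:
        print(bits, phase_error(*bits))
```

Because only the presence of a transition matters, the rule is symmetric with respect to the data polarity: inverting all three samples gives the same result.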

3 Simulation

Fig. 6. JTOL simulation via Verilog-A

Fig. 6 shows the jitter tolerance (JTOL) of the proposed CDR when the input data rate is 4Gbps and the input clock jitter is 30ps. The JTOL simulation uses a $2^7-1$ PRBS pattern and is run in Verilog-A.
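
For reference, a $2^7-1$ PRBS can be generated with a 7-bit linear feedback shift register. The sketch below is in Python rather than Verilog-A and is not the test bench used for Fig. 6. The text only states the pattern length, so the generator polynomial $x^7 + x^6 + 1$ (a common choice for this length) and the seed are assumptions.

```python
from itertools import islice


def prbs7(seed: int = 0x7F):
    """Yield the bits of a 2^7-1 PRBS.

    ASSUMPTION: polynomial x^7 + x^6 + 1, i.e. the new bit is the XOR of
    the two most significant bits of the 7-bit shift register.
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the LFSR locks up")
    while True:
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit


if __name__ == "__main__":
    bits = list(islice(prbs7(), 254))
    # The pattern repeats after 127 bits.
    print("period 127:", bits[:127] == bits[127:])
    print("ones per period:", sum(bits[:127]))
```

A maximal-length sequence of this kind contains 64 ones and 63 zeros per period, which gives the data stream enough transitions for the phase error detector to act on.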

Fig. 7. (a) Input/output property and (b) resolution of fine Delay shifter

Fig. 7 shows the simulated output of the fine delay stage as a function of the control code. The average resolution is 9.3ps and the maximum resolution is 13ps.

Table 2 summarizes the characteristics of the proposed CDR.

<table>
<thead>
<tr>
<th>Process</th>
<th>0.18-um 1P6M CMOS</th>
</tr>
</thead>
<tbody>
<tr>
<td>Supply Voltage</td>
<td>1.8V</td>
</tr>
<tr>
<td>Operating Frequency Range</td>
<td>200Mbps ~ 4Gbps</td>
</tr>
<tr>
<td>Maximum Delay Step</td>
<td>13ps @FF</td>
</tr>
<tr>
<td>Minimum Delay Step</td>
<td>7ps @FF</td>
</tr>
</tbody>
</table>

4 Conclusion

A wide-range dual-loop CDR is designed to meet the needs of multi-channel transceivers. The output jitter of the recovered data of the proposed CDR does not increase when the operating frequency is lowered, because the phase step is not affected by the input frequency. Moreover, the data recovery loop does not require any additional loops, which keeps the system complexity low. The designed CDR was tested in a 0.18um CMOS process, and its operating range is 200Mbps to 4Gbps. The highest delay step recorded was 12ps and the lowest was 7ps.

References:


