# A 10 GBPS 4-PAM CMOS SERIAL LINK TRANSMITTER WITH PRE-EMPHASIS

by

Minghai Li

B.Eng, North University of China, China, 1996

A thesis presented to Ryerson University in partial fulfillment of the requirement for the degree of Master of Applied Science in the Program of Electrical and Computer Engineering.

Toronto, Ontario, Canada, 2006

© Minghai Li, 2006

UMI Number: EC53600

### **Author's Declaration**

I hereby declare that I am the sole author of this thesis.

I authorize Ryerson University to lend this thesis to other institutions or individuals for the purpose of scholarly research.

I further authorize Ryerson University to reproduce this thesis by photocopying or by other means, in total or in part, at the request of other institutions or individuals for the purpose of scholarly research.
## Abstract

This thesis presents the design of 10 Gbps 4-PAM CMOS serial link transmitters. A new area-power efficient fully differential CMOS current-mode serial link transmitter with a proposed 2/4-PAM signaling configuration and a new pre-emphasis scheme is presented. Realizing the pre-emphasis in the analog domain and using a de-emphasis approach decrease the pre-emphasis power and chip area. The high-speed operation of the transmitter is achieved through the small voltage swing of the critical nodes of the transmitter, shunt peaking with active inductors, the multiplexing-at-input approach, the distributed multiplexing nodes, and the low characteristic impedance of the channels. The fully differential and bidirectional current-mode signaling minimizes the noise injected into the power and ground rails and the electromagnetic interference exerted by the channels on neighboring devices. A PLL containing a proposed five-stage VCO is implemented to generate multi-phase on-chip clocks. The proposed VCO minimizes the phase noise by keeping constant rise and fall times. Simulation results demonstrate that the current received at the far end of a 10 cm FR-4 microstrip has a 4-PAM current eye width of 185 ps and an eye height of 1.21 mA. The transmitter consumes 57.6 mW with the differential delay block, or 19.2 mW with the inverter buffer chain. The total transistor area of the transmitter is 26.845 $\mu m^2$ excluding the delay block.

## Acknowledgments

I would like to express my gratitude to Dr. Fei Yuan for being an outstanding advisor and excellent professor. His constant encouragement, support, and valuable suggestions made this work possible and successful. He has been everything one could want in an advisor.

I am deeply indebted to my defense committee members Dr. Alagan Anpalagan, Dr. Xavier Fernando, and Dr.
Gul Khan for their time and effort in reviewing this work. My thanks also go to Mr. Jason Naughton and Mr. Daniel Giannitelli for their support and help with Cadence and computer systems. I have enjoyed a warm atmosphere in the MCS group and valuable discussions with my colleagues in the research group. Special thanks go to Jean Jiang and Bendong Sun, who helped me a lot in the early stage of my research.

I am deeply and forever indebted to my parents for their love, support, and encouragement throughout my entire life. It is certainly hard to find words to express my gratitude to my wife Elyn, who provides me with constant support and trust.

# Contents

- 1 Introduction
  - 1.1 Motivation
  - 1.2 Contributions
  - 1.3 Organization
- 2 An Overview of Serial Links
  - 2.1 A Typical Serial Link Architecture
  - 2.2 Signaling
    - 2.2.1 Voltage-Mode Signaling
    - 2.2.2 Current-Mode Signaling
  - 2.3 Limitations
    - 2.3.1 Electronic Limitations
    - 2.3.2 Transmission Medium Limitations
    - 2.3.3 Multi-Phase Clock Generation
  - 2.4 Design Techniques
    - 2.4.1 Voltage Mode and Current Mode
    - 2.4.2 Multi-Level Pulse Amplitude Modulation
    - 2.4.3 Transmitter Pre-emphasis and Receiver Equalization
    - 2.4.4 Active Inductors
  - 2.5 Summary
- 3 High-speed Serial Link Transmitter Design
  - 3.1 An Overview of Serial Link Transmitters
    - 3.1.1 Transmitter with Inverter Driver
    - 3.1.2 Transmitter with LVDS Driver
    - 3.1.3 Transmitter with Open-Drain Driver
    - 3.1.4 Transmitter with Class AB Driver
  - 3.2 10 Gbps Current-Mode Transmitters
    - 3.2.1 $V_{DD}$-Insensitive Transmitter
    - 3.2.2 4-PAM Transmitter with Current-Mirror Driver
    - 3.2.3 2/4-PAM Transmitter with Class AB Driver
  - 3.3 Summary
- 4 Transmitter Pre-emphasis
  - 4.1 Pre-emphasis - A State-of-the-Art Review
    - 4.1.1 Pre-emphasis in Digital Domain
    - 4.1.2 Pre-emphasis in Analog Domain
    - 4.1.3 Pre-emphasis Using Pseudo-nMOS Multiplexer
  - 4.2 Power-Area Efficient Pre-emphasis in Analog Domain
    - 4.2.1 Power-Area Efficient Current-Mode Pre-Emphasis Transmitter
    - 4.2.2 The Multiplexer
    - 4.2.3 The Pre-amplifier and Driver
    - 4.2.4 The Pre-emphasis
    - 4.2.5 4-PAM Current-Mode Pre-Emphasis Transmitter
    - 4.2.6 Simulation Results
    - 4.2.7 Transmitter Layout
  - 4.3 Summary
- 5 Multi-phase Clock Generation
  - 5.1 Phase-Locked Loops
    - 5.1.1 Loop Dynamics
  - 5.2 Building Blocks
    - 5.2.1 Phase-Frequency Detector
    - 5.2.2 Charge Pump
    - 5.2.3 Voltage-Controlled Oscillator
    - 5.2.4 Frequency Divider
    - 5.2.5 Simulation Result
  - 5.3 Summary
- 6 Conclusions
  - 6.1 Conclusions
  - 6.2 Future Work
- Bibliography
- Glossary

# List of Tables

- 2.1 Skin depth of some interconnects at 100 MHz and 5 GHz
- 3.1 Output impedance of inverter drivers
- 3.2 Comparison of non-saturated, saturated class AB, and open-drain transmitters
- 3.3 Output of 2-bit DACs and transmitter
- 3.4 Output of 4-PAM transmitter
- 4.1 4-PAM current-mode transmitter with pre-emphasis
- 4.2 Circuit parameters ($L = 0.13\ \mu\text{m}$ is used for all transistors)
- 4.3 Performance of transmitter

# List of Figures

- 1.1 Backplane system cross-section indicating different sections of the signaling path.
- 1.2 The response of an FR4 cable to a 100 ps wide pulse.
- 1.3 The effect of inter-symbol interference.
- 1.4 Pulse response from a pre-emphasized transmitter.
- 1.5 An example of 4-PAM signaling symbols.
- 2.1 Structure of serial links.
- 2.2 Voltage-mode signaling scheme.
- 2.3 Voltage at the far end of a 4-ns line when a logic level "1" is applied at the near end of the cable.
- 2.4 Current-mode signaling scheme.
- 2.5 (a) Reduction of EMI with a bi-directional signaling scheme, (b) EMI in a single-ended signaling scheme.
- 2.6 Eye diagram showing limitations of bit time.
- 2.7 The noise sources in a typical PLL.
- 2.8 Lumped LCRG model of transmission line.
- 2.9 Skin effect of transmission lines.
- 2.10 Frequency dependence of cable loss.
- 2.11 Clock generation using a 3-stage ring oscillator.
- 2.12 EMI cancellation in LVDS current-mode driver.
- 2.13 Transmitter pre-emphasis and receiver equalization.
- 2.14 Effect of 3-tap pre-emphasis; $a_1$, $a_2$, and $a_3$ are the pre-emphasis coefficients.
- 2.15 4-PAM eye diagrams: (a) slow transition, (b) sharp transition.
- 2.16 Active inductor and its small-signal equivalent circuit. $g_m$, $C_{gs}$, and $C_{gd}$ are the transistor's transconductance, gate-source capacitance, and gate-drain capacitance, respectively; $C_L$ is the total capacitance associated with the driving node.
- 2.17 (a) Active inductor equivalent circuit, (b) active inductor Bode plot.
- 3.1 (a) Voltage-mode transmitter. (b) Current-mode transmitter.
- 3.2 Transmitter with inverter driver.
- 3.3 Transmitter with LVDS driver.
- 3.4 Transmitter with open-drain driver.
- 3.5 Transmitter with class AB driver.
- 3.6 Full rail-to-rail N-to-1 multiplexer with inductive shunt peaking. $W_S = \overline{W}_S = 10\ \mu\text{m}$, $W_{LP} = \overline{W}_{LP} = 3\ \mu\text{m}$, $W_{LN} = \overline{W}_{LN} = 1.5\ \mu\text{m}$, $W_{(N-1)} = W_{(N-2)} = 1.5\ \mu\text{m}$, $W_{(N-3)} = 2\ \mu\text{m}$, $R_{in} = 7\ \text{k}\Omega$; $L = 0.13\ \mu\text{m}$ is used for all transistors.
- 3.7 Class AB driver. Circuit parameters: $W_{a1} = W_{a2} = 5\ \mu\text{m}$, $W_{a3} = W_{a4} = 10\ \mu\text{m}$, $W_N = 5\ \mu\text{m}$, $W_P = 10\ \mu\text{m}$, $W_{DN} = 10\ \mu\text{m}$, $W_{DP} = 10\ \mu\text{m}$, $R_{a1} = R_{a1} = 4\ \text{k}\Omega$, $R_{a3} = R_{a4} = 1\ \text{k}\Omega$, $J_P = 0.8$ mA, $J_N = 1$ mA; $L = 0.13\ \mu\text{m}$ for all transistors.
- 3.8 Voltages of nodes $A, \overline{A}, B, \overline{B}, C, \overline{C}$ of the proposed transmitter. $V_{DD}$ is varied from 1.1 V to 1.3 V with step 0.1 V.
- 3.9 Voltage at nodes $B$ and $\overline{B}$ of the proposed transmitter. $V_{DD}$ is varied from 1.1 V to 1.3 V with step 0.1 V.
- 3.10 Output current of the class AB serial link transmitter in [6]. $V_{DD}$ is varied from 1.1 V to 1.3 V with step 0.1 V.
- 3.11 Output current of the proposed transmitter. $V_{DD}$ is varied from 1.1 V to 1.3 V with step 0.1 V.
- 3.12 Output current of the open-drain transmitter. $V_{DD}$ is varied from 1.1 V to 1.3 V with step 0.1 V.
- 3.13 Received eye diagram after 5 cm FR4 cable.
- 3.14 N-to-1 fully differential 4-PAM current-mode transmitter with inductive shunt peaking. $W_1 = \overline{W}_1 = 6\ \mu\text{m}$, $W_2 = \overline{W}_2 = 9\ \mu\text{m}$, $W_3 = \overline{W}_3 = 18\ \mu\text{m}$, $W_4 = \overline{W}_4 = 15\ \mu\text{m}$, $W_5 = \overline{W}_5 = 30\ \mu\text{m}$, $K_1 = 3$, $K_2 = 1.5$, $K_3 = 2$, $R_s = 500\ \Omega$; $L = 0.13\ \mu\text{m}$ is used for all transistors.
- 3.15 Fully differential 2-bit digital-to-analog converter (DAC). $W_d = \overline{W}_d = 0.8\ \mu\text{m}$, $W_{d+1} = 3\ \mu\text{m}$, $W_{A1} = W_{A2} = W_{A3} = W_{A4} = 0.6\ \mu\text{m}$, $W_{A5} = 1.5\ \mu\text{m}$; $L = 0.13\ \mu\text{m}$ is used for all transistors.
- 3.16 Selection pulse generation.
- 3.17 Voltage at the multiplexing node $A$ with passive peaking inductors. The inductance of the shunt-peaking inductors is varied from 0 nH to 20 nH with step 5 nH.
- 3.18 Output current of the transmitter with passive peaking inductors. The inductance of the shunt-peaking inductors is varied from 0 nH to 20 nH with step 5 nH.
- 3.19 Voltage at the multiplexing node $A$ with active peaking inductors. The width of the transistor forming the active inductor is varied from 5 $\mu$m to 30 $\mu$m with step 5 $\mu$m.
- 3.20 Output current of the transmitter with active peaking inductors. The width of the transistor forming the active inductor is varied from 5 $\mu$m to 30 $\mu$m with step 5 $\mu$m.
- 3.21 Voltage at the multiplexing node $A$ with active peaking inductors. The width of transistors $M_5$ and $\overline{M}_5$ is 60 $\mu$m. The width of the transistor forming the active inductors is varied from 5 $\mu$m to 30 $\mu$m with step 5 $\mu$m.
- 3.22 Output current of the transmitter with active peaking inductors. The width of transistors $M_5$ and $\overline{M}_5$ is 60 $\mu$m. The width of the transistor forming the active inductors is varied from 5 $\mu$m to 30 $\mu$m with step 5 $\mu$m.
- 3.23 Total current $i_{dd}$ drawn from $V_{dd}$ and output current of the 4-PAM current-mode transmitter with active peaking inductors.
- 3.24 Configuration of the 4-PAM current-mode transmitter with active peaking inductors.
- 3.25 (Left) Current output of driver J, (Right) current output of driver 2J.
- 3.26 4-PAM current output of transmitter.
- 4.1 Serial link transmitter with digital pre-emphasis.
- 4.2 Serial link transmitter with analog pre-emphasis.
- 4.3 Serial link transmitter with pseudo-NMOS multiplexer pre-emphasis.
- 4.4 Architecture of the proposed area-power efficient pre-emphasis serial link transmitter.
- 4.5 Fully differential multiplexer.
- 4.6 2-PAM serial link transmitter driver with pre-emphasis.
- 4.7 2-PAM signaling pre-emphasis analysis.
- 4.8 4-PAM signaling pre-emphasis analysis.
- 4.9 Output current of 2-PAM serial link transmitter with pre-emphasis.
- 4.10 Architecture of 4-PAM serial link transmitter.
- 4.11 Voltage of multiplexing node of fully differential multiplexer.
- 4.12 Voltage of the critical nodes of pre-amplifier and driver. $V_{PN}, \overline{V}_{PN}, V_{PP}, \overline{V}_{PP}$ lag $V_N, \overline{V}_N, V_P, \overline{V}_P$ by $T_{sym} = 200$ ps.
- 4.13 4-PAM transmitter output current and current drawn from $V_{DD}$. Left: without pre-emphasis. Right: with pre-emphasis and inverter buffer chains.
- 4.14 4-PAM transmitter output current and current drawn from $V_{DD}$ with pre-emphasis and differential pair delay block.
- 4.15 Output current of 4-PAM transmitter. Left: with pre-emphasis ($V_{ctrl,n} = 1.0$ V and $V_{ctrl,p} = 0.2$ V). Right: $V_{ctrl,n}$ is varied from 0.6 V to 1.0 V, and $V_{ctrl,p}$ is varied from 0.6 V to 0.2 V, with step 0.2 V.
- 4.16 Eye diagram of the received current after 10 cm FR-4 cable. Left: without pre-emphasis. Right: with pre-emphasis ($V_{ctrl,n} = 0.8$ V, $V_{ctrl,p} = 0.4$ V).
- 4.17 Layout of the proposed 4-PAM transmitter with pre-emphasis.
- 5.1 (a) XPR/LPF type PLL, (b) charge-pump type PLL.
- 5.2 PLL configuration.
- 5.3 Linear model of charge-pump type PLL.
- 5.4 Phase-frequency detector diagram.
- 5.5 D flip-flop PFD clock diagram and characteristic.
- 5.6 Simulation result of DFF phase-frequency detector.
- 5.7 Block diagram of charge pump.
- 5.8 Charge pump implementation. $\overline{M}_{1,2} = 8\ \mu\text{m}$, $\overline{M}_{3,4} = 15\ \mu\text{m}$, $\overline{M}_{5,6} = 40\ \mu\text{m}$; $L = 0.18\ \mu\text{m}$ is used for all transistors.
- 5.9 Proposed ring VCO delay cell. $\overline{M}_{1,2} = 10\ \mu\text{m}$, $\overline{M}_{3,4} = 15\ \mu\text{m}$, $\overline{M}_{5,6} = 5\ \mu\text{m}$, $\overline{M}_{7,8} = 25\ \mu\text{m}$, $\overline{M}_{9,10} = 15\ \mu\text{m}$, $\overline{M}_{11,12} = 30\ \mu\text{m}$; $L = 0.18\ \mu\text{m}$ is used for all transistors.
- 5.10 Cross-coupled ring VCO with active inductor loads in [40, 41]. $\overline{M}_{1,2} = 10\ \mu\text{m}$, $\overline{M}_{3,4} = 5\ \mu\text{m}$, $\overline{M}_{5,6} = 15\ \mu\text{m}$; $L = 0.18\ \mu\text{m}$ is used for all transistors.
- 5.11 Comparison of the rise and fall times of the proposed ring VCO and the cross-coupled ring VCO with active load [40, 41].
- 5.12 Output voltage waveform of proposed VCO with a single delay loop.
- 5.13 Frequency tuning range of 5-stage proposed ring VCO; $W = 10\ \mu\text{m}$ for the control pMOS transistor.
- 5.14 Divide-by-four frequency divider: (a) block diagram, (b) timing diagram. The widths of the NMOS and PMOS transistors in the inverters are 2.5 $\mu$m and 5 $\mu$m, respectively; the width of all transmission-gate transistors is 4 $\mu$m; a length of 0.18 $\mu$m is used for all transistors.
- 5.15 Simulation result of PFD.
- 5.16 PLL control voltage, reference clock, and VCO output clock.

# Chapter 1

# Introduction

## 1.1 Motivation

Over the past decade, we have witnessed the integration of computer systems into the broader context of information technology systems. Following Moore's law, the performance of each new generation of CMOS devices increases at an exponential rate, allowing more computation to occur within chips. However, the continued scaling of integrated circuit technology not only increases data processing capability but also raises challenges in data communication, which does not improve with Moore's law. Thus, chip-to-chip interconnection has become one of the leading bottlenecks in improving computer system performance. The challenges and demands associated with eliminating these bottlenecks and providing efficient data communication at high speed, with low power consumption and a small chip area, have driven a boom in the development of new techniques.

Traditionally, high-speed serial links in the gigabit-per-second range were implemented in GaAs or bipolar technologies.
An advantage of these technologies over CMOS is their high intrinsic device speed. However, CMOS has become the target technology for serial links mainly because of its high degree of integration. Recent advances in CMOS technology have reduced the feature size of MOS devices to the nanometer range. As a result, the intrinsic cutoff frequency of MOS devices has increased to several tens of gigahertz, while the switching power consumption remains very low, on the order of 0.1 mW/GHz/gate.

Data transmission over copper channels at gigabit-per-second rates is fundamental to systems such as the backplanes of complex microelectronic systems, multi-processor systems, gigabit Ethernet for LANs and WANs, data communication between computers and peripheral devices, and global interconnects linking subsystems integrated on the same silicon substrate. A sample signaling path is illustrated in Fig. 1.1.

Figure 1.1: Backplane system cross-section indicating different sections of the signaling path.

Channels of serial links are band-limited. A narrow pulse at the near end of a channel is significantly attenuated and becomes much wider at the far end, as shown in Fig. 1.2 [1]. If two neighboring narrow pulses are sent into the channel, as shown in Fig. 1.3, inter-symbol interference (ISI) is observed at the far end of the channel [1, 2].

Figure 1.2: The response of an FR4 cable to a 100 ps wide pulse.

Figure 1.3: The effect of inter-symbol interference.

There are two solutions at the near end of channels to overcome the limitation mentioned above:

#### 1. Transmitter pre-emphasis

Transmitter pre-emphasis is a finite impulse response (FIR) filter integrated into the line driver, specified by the following equation:

$$V_o(n) = V_i(n) - \sum_{k=1}^{M} a_k V_i(n-k), \tag{1.1}$$

where $V_i(n)$ and $V_i(n-k)$ are the present input and the $k$-th past input of the pre-emphasis filter, respectively, $V_o(n)$ is the output of the filter, $a_k$ is the weighting factor of the $k$-th past input, and $M$ is the number of taps. The pre-emphasis filter effectively suppresses the power of low-frequency components by reducing the amplitude of continuous strings of same-value data on the line, while keeping the power of high-frequency components the same. Fig. 1.4 shows the pre-emphasized pulse and the resulting pulse at the end of the cable compared with the original pulse [1].

Figure 1.4: Pulse response from a pre-emphasized transmitter.

#### 2. Multi-level pulse amplitude modulation (M-PAM)

M-PAM signaling is another way to overcome the finite bandwidth of transmission channels. It transmits multiple bits in each symbol time. As a result, the channel bandwidth required for a given bit rate decreases. An example of 4-PAM signaling symbols is shown in Fig. 1.5.

Figure 1.5: An example of 4-PAM signaling symbols.

The pre-emphasis proposed in [3], which pre-distorts the signal based on the algorithm specified by Eq. (1.1), has been proven effective in overcoming inter-symbol interference (ISI). M-PAM signaling reduces the bandwidth of the signal while keeping the data rate constant by using a simple coding. However, these approaches share a common drawback: increased hardware cost and power consumption.

## 1.2 Contributions

This thesis contains the following original contributions:

1. A new current-mode class AB transmitter with a low supply voltage sensitivity. The rail-to-rail swing mode ensures that the output current is insensitive to supply voltage fluctuation. The full push-pull operation of the driver minimizes the static power consumption of the transmitter.
2. A new fully differential current-mode 4-PAM transmitter with a current mirror driver is proposed.
This transmitter utilizes the advantages of the multiplexing-at-input approach and active inductors to maximize the bandwidth and minimize the area and power consumption.
3. A new fully differential current-mode 4-PAM transmitter with a class AB driver is proposed. The transmitter provides both 5 Gbps (2-PAM) and 10 Gbps (4-PAM) signaling. The proposed transmitter inherits all the advantages of current-mode transmitters while providing an attractive feature: a tunable signal amplitude.
4. A novel pre-emphasis approach is proposed. It avoids the reconstruction of past symbols needed for pre-emphasis in the digital domain, which is used in most reported pre-emphasis transmitters and requires a large chip area and a high level of power consumption. Instead, the pre-emphasis of the proposed transmitter is realized in the analog domain by employing a delay block for each pre-emphasis tap, such that the degree of each pre-emphasis tap can be tuned individually and independently of the current symbol.
5. A new VCO delay cell is proposed to overcome the effects of unequal rise and fall times. Equal rise and fall times lead to lower timing jitter of the clock. The proposed VCO also provides good linearity with a very large frequency tuning range.
6. A phase-locked loop (PLL) with the proposed VCO delay cell is implemented in UMC's 0.13 $\mu$m CMOS technology. The PLL generates a multi-phase clock with a large frequency tuning range and a symmetric waveform at 1 GHz.

## 1.3 Organization

This thesis is organized into the following six chapters:

- Chapter 1 introduces the background and motivation of this work.
- Chapter 2 describes a typical high-speed serial link architecture and investigates limitations, channel properties, and trade-offs between different modulation and equalization schemes.
- Chapter 3 introduces two different signaling schemes and investigates different serial link transmitter designs to overcome the limitations of device speed, noise, and transmission medium. Three serial link transmitters are proposed.
- Chapter 4 reviews existing pre-emphasis approaches. A current-mode transmitter with a novel power-area efficient pre-emphasis scheme is proposed.
- Chapter 5 presents a comparative study of the architecture of low-voltage CMOS ring VCOs. A phase-locked loop (PLL) that employs the proposed delay cell is designed in this chapter.
- Chapter 6 concludes the thesis and discusses the directions of future work.

# Chapter 2

# An Overview of Serial Links

In this chapter, a basic serial link architecture is examined to provide a framework for understanding the trade-offs and limitations of serial link transmitters. Factors that affect the bandwidth of serial links are discussed in detail. Two signaling schemes and techniques to overcome the limitations are introduced.

## 2.1 A Typical Serial Link Architecture

The basic configuration of a serial link is shown in Fig. 2.1.

Figure 2.1: Structure of serial links.

It consists of the following blocks:

- Registers: Registers located at both the transmitter and the receiver ends synchronize the input/output data.
- Multiplexer (MUX): Parallel input data streams are serialized by the multiplexer after synchronization. The on-chip clock frequency of the digital circuitry that supplies the parallel data is $\frac{1}{N}$ that of the serial link, where $N$ is the degree of serialization of the multiplexer [4, 5].
- Phase-locked Loop (PLL) and Clock and Data Recovery (CDR): A PLL acts as a timing generator in a serial link. It generates a high-frequency clock by multiplying a low-frequency reference clock. The CDR block at the receiver end incorporates a PLL and the additional circuits needed to synchronize the receiver with the incoming data stream.
- Driver: A driver delivers voltages or currents to the channel that are sufficiently large for the data at the receiver end to be recovered with a low bit-error rate (BER).
- Sampler and De-multiplexer (De-MUX): A sampler and a de-multiplexer sample a bit stream using evenly spaced clock phases to de-multiplex the data directly. The clock and data recovery unit adjusts the phase of the sampling clock such that data are sampled at the center of the data eyes.

## 2.2 Signaling

Two signaling techniques for high-speed digital systems are introduced in this section. The first is voltage-mode signaling, which has been used in most computers in the past several years. The second is current-mode signaling.

### 2.2.1 Voltage-Mode Signaling

CMOS inverters are typically used at both the transmitter and receiver ends. As shown in Fig. 2.2, the transmission medium, typically a cable or a PCB trace, has a characteristic impedance of about 50 to 100 $\Omega$ and is unterminated at the receiver end.

Figure 2.2: Voltage-mode signaling scheme.

Voltage-mode signaling is not suitable for high-speed data transmission for the following reasons:

1. **Unterminated receiver:** Receivers with a high input impedance limit the signaling speed. The high-impedance driver is unable to switch the line voltage completely [3]. Instead, the driver must charge up the line as the signal propagates over several round trips. Fig. 2.3 shows the voltage at the far end of the line as a function of time.
2. **High power consumption:** Voltage-mode signaling is power hungry because it uses rail-to-rail signal swings for transmission.
3. **Low speed:** The large rail-to-rail swing of voltage-mode signaling increases the time needed to charge or discharge the capacitance of critical nodes. This increased time constrains the speed of voltage-mode signaling.

Figure 2.3: Voltage at the far end of a 4-ns line when a logic level "1" is applied at the near end of the cable.

### 2.2.2 Current-Mode Signaling

A signaling scheme that overcomes the limitations of voltage-mode signaling is shown in Fig. 2.4. The transmitter (Tx) behaves as a current source and draws a current from the channel. The voltage swing is in the range of 100 mV to 1 V. The serial link is terminated at both ends with its characteristic impedance.

Figure 2.4: Current-mode signaling scheme.

Taking advantage of improved receiver detection with a low offset voltage and high sensitivity, this signaling scheme can operate reliably with very small voltage swings. Another advantage of the small voltage swing is a considerable reduction in power consumption. Combined with a differential configuration, the bi-directional current-mode signaling scheme shown in Fig. 2.5 can minimize the EMI to neighboring devices.

Figure 2.5: (a) Reduction of EMI with a bi-directional signaling scheme, (b) EMI in a single-ended signaling scheme.

## 2.3 Limitations

Many factors limit the performance of a high-speed serial link, including device speed, transmission medium, high-speed on-chip clock generation, and noise. The following sections examine these limitations in detail.

### 2.3.1 Electronic Limitations

#### 1. Speed

The maximum data rate of a serial link is limited by the electronics used to generate and receive the signal. As shown in Fig. 2.6, the limits on signaling speed arise from the rise time, the sampling time, and the timing jitter [3]. The symbol time $T_{symbol}$ must exceed the sum of $T_{rising}$ (the time required for a signal to change from one logic level to another), $T_{sampling}$ (the time for which the signal must remain stable while the receiver samples it), and $2T_{jitter}$ (the timing uncertainty).

Figure 2.6: Eye diagram showing limitations of bit time.
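The timing budget behind Fig. 2.6 can be sketched numerically. The snippet below is only an illustration, assuming the constraint $T_{symbol} \geq T_{rising} + T_{sampling} + 2T_{jitter}$ described above. The function names and the example values (60 ps rise time, 30 ps sampling window, 5 ps jitter) are hypothetical and are not taken from the thesis.

```python
def min_symbol_time(t_rising, t_sampling, t_jitter):
    """Smallest symbol time (s) allowed by the eye-diagram budget.

    Implements T_symbol >= T_rising + T_sampling + 2*T_jitter.
    """
    return t_rising + t_sampling + 2.0 * t_jitter


def max_bit_rate(t_rising, t_sampling, t_jitter, bits_per_symbol=1):
    """Upper bound on the bit rate (bit/s) for the given timing budget.

    bits_per_symbol is 1 for 2-PAM and 2 for 4-PAM; the same timing
    budget is assumed for both, which is a simplification.
    """
    return bits_per_symbol / min_symbol_time(t_rising, t_sampling, t_jitter)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    t_r, t_s, t_j = 60e-12, 30e-12, 5e-12
    t_sym = min_symbol_time(t_r, t_s, t_j)
    print(f"minimum symbol time: {t_sym * 1e12:.0f} ps")
    print(f"2-PAM limit: {max_bit_rate(t_r, t_s, t_j, 1) / 1e9:.1f} Gbps")
    print(f"4-PAM limit: {max_bit_rate(t_r, t_s, t_j, 2) / 1e9:.1f} Gbps")
```

Under these assumed numbers the budget is dominated by the rise time, which is one reason small voltage swings at critical nodes help raise the achievable symbol rate.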
A metric for bit rate that is independent of technology and operation conditions, such as process technology, temperature and supply voltage, is needed to compare the performance between different systems. FO4 is a figure of merit qualifying the average propagation delay of a complementary static inverter of the minimum size with the load of four identical inverters. For a typical 0.18 $\mu$ m and 0.35 $\mu$ m CMOS technology, the approximately value is 80 ps and 175 ps respectively. #### 2. Noise As CMOS technology and supply voltage are continuously scaled down, the signal represented by the voltage in most systems is getting lower. However, the noise existing in electronic systems remains at the same level. The noise of the transmitted signal or the received sampling clock is qualified by the timing jitter. Some other noises include limited sampling resolution, thermal device noise and supply noise. (a) Timing noise: timing noise origination is from the source of timing generation on a chip, which is usually a PLL or a DLL. A typical PLL is shown in Fig. 2.7, the dominant noise source is the VCO (voltage-controlled oscillator), which generates the clock and distributes to the output through the clock buffer. The phase-frequency detector generates the phase error signal, which is then filtered by the low-pass filter to create control voltage for the VCO and steer its phase to align with the reference clock. Figure 2.7: The noise sources in a typical PLL. - (b) Other noise sources: Some other noise sources existing in a serial link include thermal device noise, supply noise and limited sampling resolution. - i. Thermal device noise: the source of thermal noise in serial links are the $50\Omega$ termination resistor at the receiver end [2]. In addition to the termination resistor, the device noise of the receiver circuits also adds several dB of noise figure to the termination noise level. - ii. 
Supply and substrate noise: supply and substrate noise does not add noise to the link directly. Instead, it degrades performance by inducing jitter in the transmit timing generation loops [6] and by modulating the input-referred receiver offset [7].
- iii. Sampling resolution: sampling resolution is the minimum voltage level that can be distinguished by the receiver comparator when other noise sources are not considered. It is determined by several factors, such as the receiver static offset voltage due to transistor mismatches, the input-referred supply noise, and the input voltage required for the comparator to reach a decision within a certain period of time.

# 2.3.2 Transmission Medium Limitations

Transmission lines in high-speed serial links include PCB traces, coaxial cables, and twisted-pair cables. They can be modelled by a series of lumped LCRG elements, as shown in Fig. 2.8. The signal loss in the transmission line is mainly caused by the series resistance, the parallel conductance, and radiation.

Figure 2.8: Lumped LCRG model of transmission line.

Frequency-dependent loss in transmission lines is mainly due to the skin-effect resistance. As shown in Fig. 2.9, a magnetic field is created when an AC current flows through a conductor. The resultant magnetic field exerts a force, called the **Lorentz force**, on moving electrons and pushes them toward the surface of the conductor, resulting in a higher resistance at the center and a lower resistance near the surface. As shown in Fig. 2.9, the effective conducting area is measured by the skin depth $\delta$ given by

$$\delta = \frac{1}{\sqrt{\pi\mu\sigma f}},\tag{2.1}$$

where $\mu$ is the permeability of the conductor, $\sigma$ is the conductivity of the conductor, and $f$ is the frequency of the current. With $R_{DC}$ the resistance when a DC signal is applied, $r$ the radius of the conductor, and $L$ the length of the transmission line, the induced resistance $R(f)$ is given by

$$R(f) = R_{DC}(\frac{r}{2\delta}).
\tag{2.2}$$

As seen from the above equation, the skin-induced resistance is proportional to the square root of frequency, because $\delta$ is inversely proportional to $\sqrt{f}$. The skin depths of some interconnect materials at 100 MHz and 5 GHz are given in Table 2.1.

Figure 2.9: Skin effect of transmission lines.

| Interconnect | Resistivity ($10^{-9}\ \Omega$m) | Skin depth at 100 MHz ($\mu$m) | Skin depth at 5 GHz ($\mu$m) |
|--------------|----------------------------------|--------------------------------|------------------------------|
| Silver       | 16.3                             | 6.4                            | 0.905                        |
| Copper       | 17.3                             | 6.6                            | 0.933                        |
| Gold         | 22.7                             | 7.6                            | 1.07                         |
| Aluminum     | 27.3                             | 8.3                            | 1.17                         |
| Silicon      | 100-300                          | 15.9-27.6                      | 2.25-3.9                     |

Table 2.1: Skin depth of some interconnects at 100 MHz and 5 GHz.

For some insulating materials, dielectric absorption also causes a frequency-dependent loss. It can be modelled as a conductance $G$ between the signal wire and the ground. This effect can be mitigated by using a low-loss dielectric material, although the choice is limited by restrictions on PCB materials. PCB traces exhibit a higher dielectric loss than cables. The loss is characterized by the loss tangent [3]:

$$\tan(\sigma_D) = \frac{G}{\omega C},\tag{2.3}$$

where $C$ is the capacitance per unit length, which is approximately constant. The dielectric loss therefore typically increases linearly with frequency. With this analysis, the total frequency-dependent loss is given by [3]

$$H(f,l)|_{dB} = -(h_s\sqrt{f} + h_d f)l,\tag{2.4}$$

where $l$ is the length of the transmission line, and $h_s$ and $h_d$ are the skin-effect and dielectric loss coefficients, respectively. Fig. 2.10 shows the frequency response of a copper wire, with the skin-effect and dielectric loss components shown separately. At low frequencies, the skin effect is the dominant loss, while the dielectric loss dominates at higher frequencies.

Figure 2.10: Frequency dependence of cable loss.
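Equations (2.1) and (2.4) are easy to evaluate numerically. The sketch below assumes a non-magnetic conductor (relative permeability of 1, so $\mu = \mu_0$) and takes the copper resistivity from Table 2.1; the function names are illustrative, and no values of $h_s$ or $h_d$ are assumed because none are given in this section.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space, H/m


def skin_depth(resistivity, freq, mu_r=1.0):
    """Skin depth in metres from Eq. (2.1), using sigma = 1 / resistivity.

    mu_r = 1.0 is an assumption that holds for non-magnetic conductors.
    """
    sigma = 1.0 / resistivity
    return 1.0 / math.sqrt(math.pi * mu_r * MU_0 * sigma * freq)


def channel_loss_db(freq, length, h_s, h_d):
    """Channel loss in dB from Eq. (2.4): skin-effect term plus dielectric term."""
    return -(h_s * math.sqrt(freq) + h_d * freq) * length


if __name__ == "__main__":
    copper_resistivity = 17.3e-9  # ohm-metre, from Table 2.1
    for f in (100e6, 5e9):
        depth_um = skin_depth(copper_resistivity, f) * 1e6
        print(f"f = {f:.0e} Hz: skin depth = {depth_um:.2f} um")
```

Under these assumptions the computed copper skin depths agree with the Table 2.1 entries to within about one percent. The $1/\sqrt{f}$ dependence also means that quadrupling the frequency halves the skin depth.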
### 2.3.3 Multi-Phase Clock Generation

Multi-phase clock generation is another challenge in serial link design. It is usually implemented by multiplying a reference clock through a PLL or a DLL. The smallest period determined by a ring VCO in a given technology is limited to no more than 2(N+1)FO4, where N is the number of stages. Consider the example of a 3-stage ring VCO shown in Fig. 2.11. The narrowest pulse width generated by this circuit is approximately 3FO4 if the stage delay $\tau$ is 1FO4. The highest frequency can be calculated from

$$f_{max} = \frac{1}{6\tau}.\tag{2.5}$$

Figure 2.11: Clock generation using a 3-stage ring oscillator.

Another way to generate the on-chip clock is to use an LC tank oscillator, which can provide frequencies above 10 GHz. However, it requires on-chip inductors that are bulky, noisy, and costly in chip area, which is not attractive for system-on-chip (SOC) design. Besides clock speed, another design difficulty in clock generation is clock jitter. Clock jitter usually comes from an unstable off-chip reference clock, thermal noise, and noisy power rails and substrate. This topic is discussed in detail in Chapter 5.

## 2.4 Design Techniques

### 2.4.1 Voltage Mode and Current Mode

Voltage-mode and current-mode signaling schemes have been used in high-speed serial links. Current-mode signaling offers the following advantages over its voltage-mode counterpart:

1. **Higher speed:** the speed of a serial link transmitter is determined by the charging and discharging times of the critical nodes in the transmitter. The capacitance of a node is fixed for a given technology, and its charging and discharging time is determined from

$$\Delta t = \frac{C\Delta V}{I},\tag{2.6}$$

where $\Delta V$ is the voltage swing of the node, $I$ is the average current charging or discharging the node, and $C$ is the capacitance of the node. A typical characteristic of a current-mode circuit is its small voltage swing.
It can be seen from the above equation that a smaller voltage swing at the node results in a smaller $\Delta t$.

2. **A constant current drawn from power rails:** with a well-defined differential configuration, a constant current is drawn from the power source, so that noise injection to the substrate and neighboring devices is minimized.
3. **Lower electromagnetic interference (EMI):** a typical example is the LVDS current-mode driver shown in Fig. 2.12. The EMI contributions of the differential currents in the channel cancel each other.

Figure 2.12: EMI cancellation in LVDS current-mode driver.

### 2.4.2 Multi-Level Pulse Amplitude Modulation

As discussed in Section 2.3, a signal is usually distorted by the transmission line because of its limited bandwidth. As a result, the signal cannot travel far, and the distortion causes inter-symbol interference (ISI), which limits the performance of serial links. This problem becomes more severe for high-frequency signals, which use narrow pulses to represent the signal information. Multi-level transmission schemes, in which each pulse conveys $\log_2(M)$ bits of information, have proven effective [4, 8]. For a given data rate, M-PAM modulation reduces the symbol rate by a factor of $\log_2(M)$ compared with a conventional 2-PAM signaling scheme. This symbol-rate reduction not only reduces ISI in the channel but also relaxes the on-chip clock frequency. The cost of an M-PAM serial link is its complex configuration: a 4-PAM signaling scheme usually doubles the hardware compared with its 2-PAM counterpart. Another reason for avoiding a large M is the limited signal resolution of the receiver and the limited transmitter swing budget. This work uses a fully differential 4-PAM signaling scheme to decrease the symbol rate by a factor of 2.
The differential configuration improves the performance by eliminating the effect of common-mode noise and by doubling the total signal swing compared with single-ended signaling.

### 2.4.3 Transmitter Pre-emphasis and Receiver Equalization

### 1. Transmitter pre-emphasis:

Transmitter pre-emphasis is a technique that alleviates ISI and compensates for the high-frequency loss of channels. As shown in Fig. 2.13, it uses a symbol-spaced finite impulse response (FIR) filter integrated into the line driver that performs the following computation:

$$V_o(n) = V_i(n) - a_1 V_i(n-1) - a_2 V_i(n-2) - \dots,\tag{2.7}$$

where $V_i(n-k)$, $k = 0, 1, 2, \dots$, are the present and past input symbols, $V_o$ is the output, and $a_1, a_2, \dots$ are the weight coefficients of the filter. The output no longer has the distinct signal levels of the unfiltered signal stream, as shown in Fig. 2.14.

Figure 2.13: Transmitter pre-emphasis and receiver equalization.

Figure 2.14: Effect of 3-tap pre-emphasis; $a_1$, $a_2$ and $a_3$ are the pre-emphasis coefficients.

Another observation from Fig. 2.14 is that the FIR filter suppresses the power of the low-frequency components while preserving that of the high-frequency components, by decreasing the amplitude of the former and keeping the amplitude of the latter. One drawback of this approach is that reducing the low-frequency components decreases the signal-to-noise ratio. In this work, we not only decrease the amplitude of the low-frequency components but also increase the amplitude of the high-frequency components. As a result, the current difference between the low-frequency and high-frequency components is doubled compared with the conventional approach, so that a more efficient pre-emphasis scheme is achieved. A detailed discussion of this topic is given in Chapter 4.

#### 2. Receiver equalization:

High-frequency components can be further accentuated by increasing the high-frequency gain at the receiver end.
An equalizer at the receiver end cannot be replaced by pre-emphasis at the transmitter end. A sharp transition at the transmitter end is desirable for eye opening, as shown in Fig. 2.15. Clearly, an eye diagram with a sharp slope has larger timing margins, which makes the system more tolerant of sampling phase errors. Sharp transitions become even more important for 4-PAM signaling schemes, where the widths of the top and bottom eyes are seriously affected by the signal transition slope.

Figure 2.15: 4-PAM eye diagrams, a) slow transition, b) sharp transition.

Although a fast transition is critical at the transmitter end, it is undesirable from the point of view of the transmission medium, because a sharp transition causes ringing and crosstalk between adjacent channels. Another reason is that pre-emphasizing the high-frequency components at the transmitter moves signal power into the more lossy regions of the channel. Therefore, the transition slope of the signal should be adjusted to the minimum required value. Ideally, a better approach to overcoming the loss of lossy channels is to perform equalization at the receiver end. Since no signal power is then spent on pulse pre-shaping at the transmitter end, filters can also be used to sharpen the transition edges of the signal at the far end of the channels.

#### 2.4.4 Active Inductors

Inductors have been widely used in transceiver design to increase bandwidth [14, 15]. This technique moves the -3 dB pole to a higher frequency and improves the bandwidth by as much as 70 percent. Conventional on-chip inductors are usually realized using planar spirals. They suffer from several design difficulties: 1) a large chip area, 2) a large parasitic capacitance to the substrate, 3) a small and non-tunable inductance, and 4) a low self-resonant frequency and a low quality factor.
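The bandwidth improvement quoted above for inductive peaking can be checked with a small numerical experiment. The sketch below is not the circuit used in this work: it assumes the textbook shunt-peaked load, an ideal inductor $L$ in series with a resistor $R$ driving a capacitor $C$, normalized so that $R = C = 1$. The parameter `k` stands for $L/(R^2C)$ and is a name introduced here for illustration.

```python
import math


def gain_sq(w, k):
    """|Z(jw)/R|^2 for Z(s) = (R + sL) / (1 + sRC + s^2 LC) with R = C = 1, L = k."""
    num = 1.0 + (k * w) ** 2
    den = (1.0 - k * w * w) ** 2 + w * w
    return num / den


def bandwidth_3db(k, w_hi=10.0, iterations=80):
    """-3 dB frequency (in units of 1/RC), found by bisection on |Z|^2 = 1/2.

    Assumes a single crossing between 0 and w_hi, which holds for 0 <= k <= 1.
    """
    lo, hi = 0.0, w_hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gain_sq(mid, k) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    cases = {
        "no inductor": 0.0,
        "maximally flat": 1.0 / (1.0 + math.sqrt(2.0)),
        "maximum bandwidth": 1.0 / math.sqrt(2.0),
    }
    for name, k in cases.items():
        print(f"{name:18s} k = {k:.3f}  bandwidth = {bandwidth_3db(k):.3f} / RC")
```

Under these assumptions, the maximally flat choice gives a bandwidth of roughly 1.7 times that of the plain RC load, which is consistent with the 70 percent figure, while a larger inductance gives slightly more bandwidth at the cost of peaking in the response.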
Compared with passive inductors, active inductors, which are synthesized using active devices, offer intrinsic advantages including: 1) a small area, 2) a large and variable inductance, and 3) a high and tunable quality factor. However, active inductors also suffer from the following drawbacks: 1) high noise due to the thermal noise of resistors and MOSFETs, 2) a smaller dynamic range because the MOSFETs must be in the saturation region, and 3) a limited frequency range over which an inductive characteristic exists. There are many ways to implement active inductors. The simple active inductor shown in Fig. 2.16 can be realized using an NMOS transistor and a resistor. The resistor can be implemented using a PMOS transistor. By applying a voltage source at the source terminal of the NMOS transistor, the input impedance is obtained as

$$Z_{in}(s) = \frac{1 + sC_{gs}R_g}{g_m + sC_{gs}}.\tag{2.8}$$

Figure 2.16: Active inductor and its small-signal equivalent circuit. $g_m$, $C_{gs}$ and $C_{gd}$ are the transistor's transconductance, gate-source capacitance and gate-drain capacitance, respectively; $C_L$ is the total capacitance associated with the driving node.

The equivalent inductance $L_{eq,s}(\omega)$ and resistance $R_{eq,s}(\omega)$ of the active inductor can be obtained by substituting $s$ with $j\omega$:

$$Z_{in}(\omega) = \frac{1 + j\omega C_{gs}R_g}{g_m + j\omega C_{gs}} = \frac{g_m + \omega^2 C_{gs}^2 R_g}{g_m^2 + \omega^2 C_{gs}^2} + j\omega \frac{C_{gs}(g_m R_g - 1)}{g_m^2 + \omega^2 C_{gs}^2}.\tag{2.9}$$

The equivalent impedance of the active inductor is as shown in Fig.
2.17(a), where the equivalent series resistance, inductance and Q-factor are given by the following equations:

$$R_{eq,s}(\omega) = \frac{g_m + \omega^2 C_{gs}^2 R_g}{g_m^2 + \omega^2 C_{gs}^2},\tag{2.10}$$

$$L_{eq,s}(\omega) = \frac{C_{gs}(g_m R_g - 1)}{g_m^2 + \omega^2 C_{gs}^2},\tag{2.11}$$

$$Q(\omega) = \frac{\omega L_{eq,s}(\omega)}{R_{eq,s}(\omega)} = \omega \frac{C_{gs}(g_m R_g - 1)}{g_m + \omega^2 C_{gs}^2 R_g}.\tag{2.12}$$

The dependence of $|Z_{in}|$ on $R_g$ and $g_m$ is shown in Fig. 2.17(b), where the pole frequency $\omega_p = \frac{g_m}{C_{gs}}$ and the zero frequency $\omega_z = \frac{1}{R_g C_{gs}}$ are determined from Eq. (2.8).

Figure 2.17: (a) Active inductor equivalent circuit, (b) active inductor Bode plot.

It is seen that:

1. An increase in $R_g$ lowers the lower bound of the frequency range over which the circuit exhibits an inductive characteristic and reduces the self-resonant frequency of the active inductor.
2. An increase in the width of the NMOS transistor lowers the impedance of the active inductor and reduces the self-resonant frequency.

## 2.5 Summary

The shortcomings and limitations of conventional CMOS serial links have been investigated. The bandwidth of channels is limited by their low-pass characteristics, which are caused by skin-effect resistance and dielectric loss. The finite bandwidth of the channels gives rise to inter-symbol interference. A number of techniques have been investigated in this chapter. M-PAM modulation enables systems to transmit $\log_2 M$ bits per symbol time. It decreases the symbol rate and relaxes the on-chip clock frequency by a factor of $\log_2(M)$. FIR filters are used at both the transmitter and receiver ends to increase the eye opening. Active inductors at critical nodes sharpen transition edges.

# Chapter 3

# High-speed Serial Link Transmitter Design

In this chapter, a state-of-the-art review of several serial link transmitters with inverter, LVDS, open-drain and class AB drivers is presented.
Three transmitters, including a 2-PAM transmitter with improved $V_{DD}$ insensitivity and two 4-PAM current-mode transmitters for 10 Gbps serial links, are proposed. This chapter is organized as follows: Section 3.1 provides an in-depth review of the design of serial link transmitters. The advantages and limitations of the four widely used serial link transmitter drivers are investigated. Section 3.2 proposes one 2-PAM and two 4-PAM current-mode 10 Gbps serial link transmitters. Simulation results are presented at the end of this section. Finally, the chapter is summarized in Section 3.3.

# 3.1 An overview of Serial Link Transmitters

A serial link transmitter converts parallel digital streams into a serial analog signal with a pre-defined waveform and amplitude to compensate for the high-frequency loss of channels. The parallel-to-serial function is realized by a multiplexer. To achieve a high speed, a small time constant is always preferred at the multiplexing node. Depending on the type of signal, transmitters are classified into the following categories, as shown in Fig. 3.1.

Voltage-mode transmitter: transmitters with a high output impedance that appear as a voltage source are referred to as voltage-mode transmitters.

Current-mode transmitter: transmitters with a low output impedance that appear as a current source are referred to as current-mode transmitters.

Figure 3.1: (a) Voltage-mode transmitter. (b) Current-mode transmitter.

Traditional voltage-mode transmitters become a bottleneck in serial link transmitter designs when the signaling speed reaches the gigabit-per-second range. The average rise and fall time of a node is determined from

$$\Delta t = \frac{C_{node} \Delta V_{node}}{I},\tag{3.1}$$

where $I$ is the average current charging or discharging the node, and $C_{node}$ and $\Delta V_{node}$ are the capacitance and the voltage variation of the node, respectively.
This equation reveals that a high speed can be achieved by 1) minimizing the voltage swing at the node, and 2) maximizing the current available for charging and discharging the capacitance of the node.

# 3.1.1 Transmitter with Inverter Driver

Transmitters with an inverter driver, as shown in Fig. 3.2, have traditionally been used for low-speed signaling.

Figure 3.2: Transmitter with inverter driver.

They are voltage-mode transmitters due to the high output impedance and full-swing voltage signaling. Impedance matching is normally realized with a series termination resistor at the near end. Due to the full-swing nature of inverters, the rise and fall times cannot be improved by reducing $\Delta V$ in Eq. (3.1). Alternatively, the resistance can be reduced by increasing the sizes of the PMOS and NMOS transistors so that $I$ increases. However, the capacitance at the node also increases with the transistor size. As a result, no net speed improvement is achieved. Another drawback of inverter drivers is the variation of the output impedance as the transistors change operating mode. As the drain-source voltage varies, the output impedance takes the values given in Table 3.1, where $R_{NMOS-triode}$ and $R_{PMOS-triode}$ are the output impedances of the NMOS and PMOS transistors in the triode region, respectively, and $R_{NMOS-saturation}$ and $R_{PMOS-saturation}$ are the NMOS and PMOS output impedances in saturation. Due to the large impedance of the inverter, a strong reflection is seen at the output node during transitions. Another strong reflection is caused by the large input impedance at the receiver end, which is also implemented with an inverter. This results in multiple reflections of the signal at both the near and far ends of the channel, thereby limiting the transceiver data rate.
Other drawbacks include: 1) a large dynamic power consumption, 2) the injection of impulse currents into the power rails, and 3) sensitivity of the output voltage to supply and ground fluctuations.

Table 3.1: Output impedance of inverter drivers.

| Output impedance $Z_{out}$                 | Operating condition             |
|--------------------------------------------|---------------------------------|
| $R_{NMOS-triode}$                          | When $V_{out}$ is low           |
| $R_{PMOS-triode}$                          | When $V_{out}$ is high          |
| $R_{NMOS-saturation}//R_{PMOS-saturation}$ | When $V_{out}$ is in transition |

#### 3.1.2 Transmitter with LVDS Driver

The low-voltage differential-signaling (LVDS) driver, as shown in Fig. 3.3, sources and sinks two well-defined currents to the channels [9]. LVDS drivers are current-mode transmitter drivers.

Figure 3.3: Transmitter with LVDS driver.

Because the total current drawn from the supply voltage and that injected into the ground rail are constant, the switching noise induced by LVDS drivers is minimal. Also, because the current conveyed to the channels is well defined and is independent of supply voltage fluctuation and ground bouncing, the effect of switching noise on the output current is minimal. Moreover, the differential output currents, which have the same amplitude and flow in opposite directions, minimize the electromagnetic interference exerted from the channels on neighboring devices.
The advantages of the LVDS driver are obvious. However, several drawbacks limit its application: 1) the need for four transistors stacked between the supply and ground rails limits the use of LVDS drivers where only a low supply voltage is available, 2) the input capacitance is large because both PMOS and NMOS switches must be driven at the same time, and 3) four noisy buffer chains are needed to drive the four switching transistors, which must be sufficiently large to drive the channels.

## 3.1.3 Transmitter with Open-Drain Driver

The open-drain driver is shown in Fig. 3.4. The two NMOS transistors behave as a pair of complementary switches, and the open-drain driver sinks a constant current from the channels by steering well-defined tail currents [4, 10].

Figure 3.4: Transmitter with open-drain driver.

An open-drain driver is a typical current-mode driver and offers the following intrinsic advantages: 1) the constant current drawn from the channel minimizes both the noise coupled from the supply and ground rails and the switching noise injected by the driver, 2) it consumes less power at multi-Gbps data rates despite its DC power consumption, and 3) the swing of the signal represented by current can be large with a low supply voltage, which is especially critical when the supply voltage is reduced. A number of drawbacks, however, exist with this configuration: 1) the multiplexing-at-output approach requires that the number of drivers be the same as the parallel-to-series ratio, as shown in Fig. 3.4.
Because these drivers must be large enough to provide sufficiently large output currents to the channels, they incur a large chip area, high power consumption, and a high level of switching noise, 2) the common-mode component of the output current is high, causing difficulties in the design of the receivers, and 3) the unipolar signaling characteristic of open-drain drivers also gives rise to a high level of electromagnetic interference exerted from the channels on neighboring devices.

#### 3.1.4 Transmitter with Class AB Driver

The class AB driver shown in Fig. 3.5 employs an N-type and a P-type differential pair in its pre-amplifier stage to achieve signal amplification. The pairs are driven by the differential voltage signals from the multiplexer output and generate two pairs of differential voltage signals at their output nodes. These signals then drive the output stage, which operates in push-pull mode. A key point of this transmitter is that all transistors are biased in saturation to avoid the delay of turning fully on and off.

Figure 3.5: Transmitter with class AB driver.

A high data rate, low power consumption, and a small chip area are achieved by multiplexing at low-impedance nodes, the multiplexing-at-input approach, active inductors at critical nodes, and the push-pull configuration of the driver. This configuration, however, has the following drawbacks: 1) although the voltage drop of one threshold voltage $V_T$ caused by the active inductors in both the multiplexer and the pre-amplifier ensures that all transistors, especially those in the push-pull output stage, remain in saturation at all times and thus avoid the speed penalty of complete turn-on/off, the small output impedance of deep sub-micron CMOS devices gives rise to a direct path from the supply voltage to the ground.
As a result, the output current is sensitive to supply voltage fluctuation and ground bouncing, and 2) the output transistors, which are large, are in saturation at all times, resulting in a high level of static power consumption.

# 3.2 10 Gbps Current-Mode Transmitters

Three 10 Gbps current-mode transmitters, including a 2-PAM $V_{DD}$-insensitive transmitter, a 4-PAM transmitter with a current-mirror driver, and a 2/4-PAM transmitter with a class AB driver, are proposed in this section.

## 3.2.1 $V_{DD}$-Insensitive Transmitter

In this section, we propose a new fully differential current-mode transmitter with a low supply voltage sensitivity for 10 Gbps serial links. The transmitter consists of a modified full-swing pseudo-NMOS multiplexer that provides a rail-to-rail output voltage swing to a class AB pre-amplification stage, and a push-pull driver stage that operates in a full push-pull mode. The current-mode architecture of the transmitter ensures that the nodes of the transmitter have a low impedance. The rail-to-rail output voltage of the multiplexer ensures that the downstream class AB pre-amplification and push-pull driver stages operate in a true class AB mode, so that both the static power consumption of the driver stage and the effect of supply voltage fluctuation and ground bouncing on the output current are minimized.

#### 1. Full-Swing Pseudo-NMOS Multiplexer

The schematic of the full-swing pseudo-NMOS multiplexer is shown in Fig. 3.6. The active shunt-peaking inductors are formed by $M_s$ and resistor $R_{in}$. Without the latches, the output voltage levels are

$$V_A = V_{DD} - V_{TN}\ \text{(Logic-1)}, \qquad \overline{V_A} = \frac{3R_o}{3R_o + R_s} V_{DD}\ \text{(Logic-0)},\tag{3.2}$$

where $R_s$ is the resistance seen from the source of $M_s$ and $\overline{M}_s$, given by $\frac{1}{g_{ms}}$ at low frequencies, and $R_o$ is the output impedance of the $M_{(N-1),(1,2,3)}$.
Due to the finite resistance of deep sub-micron MOSFETs, both $R_o$ and $R_s$ are small. As a result, the voltage swing at multiplexing nodes A and $\overline{A}$ is small.

Figure 3.6: Full rail-to-rail N-to-1 multiplexer with inductive shunt peaking. $W_S = \overline{W}_S = 10\ \mu\text{m}$, $W_{LP} = \overline{W}_{LP} = 3\ \mu\text{m}$, $W_{LN} = \overline{W}_{LN} = 1.5\ \mu\text{m}$, $W_{(N-1)} = W_{(N-2)} = W_{(N-3)} = 2\ \mu\text{m}$, $R_{in} = 7\ \text{k}\Omega$; $L = 0.13\ \mu\text{m}$ is used for all transistors.

To increase the swing of the output voltage of the multiplexer, an nMOS latch and a pMOS latch are added, as shown in Fig. 3.6. $M_{LP,LN}$ and $\overline{M}_{LP,LN}$ form the latches that provide a rail-to-rail output voltage. As seen in Fig. 3.6, when A = Logic-1 and $\overline{A} = \text{Logic-0}$, transistors $M_{LP}$ and $\overline{M}_{LN}$ turn on, while $M_{LN}$ and $\overline{M}_{LP}$ turn off. As a result, the multiplexing node A continues to charge from $V_{DD} - V_T$ to $V_{DD}$ via $M_{LP}$, while the node $\overline{A}$ continues to discharge from $\frac{3R_o}{3R_o + R_s}V_{DD}$ to 0 via $\overline{M}_{LN}$, so that a full rail-to-rail output voltage swing is obtained.

## 2. Fully Push-Pull Class AB Driver

As shown in Fig. 3.7, the fully push-pull class AB driver employs an nMOS and a pMOS differential pair with active inductors as its pre-amplification stage. The active inductors sharpen the voltage edges at the input nodes of the output drivers, which have a large capacitance. The rail-to-rail voltage from the preceding multiplexer ensures that the nMOS and pMOS pairs are fully switched, so that the tail and head currents of the differential pairs are steered between their two arms. To ensure that the transistors in the output stage operate in a full push-pull mode, two latches are employed to overcome the voltage loss caused by the active inductors.
The size of the latch transistors should be kept small to minimize their impact on the delay.

Figure 3.7: Class AB driver. Circuit parameters: $W_{a1} = W_{a2} = 5~\mu\text{m}$, $W_{a3} = W_{a4} = 10~\mu\text{m}$, $W_N = 5~\mu\text{m}$, $W_P = 10~\mu\text{m}$, $W_{DN} = 10~\mu\text{m}$, $W_{DP} = 10~\mu\text{m}$, $R_{a4} = 1~\text{k}\Omega$.

### 3. Supply Voltage Sensitivity

When $V_{DD}$ varies by $\Delta V_{DD} \ll V_{DD}$, the voltages at the output nodes A and $\overline{A}$ of the multiplexer are given by

$$V_A = V_{DD} + \Delta V_{DD}, \qquad \overline{V}_A = 0.\tag{3.3}$$

Because the downstream class AB driver operates in a full push-pull mode, the small variation of the input voltage of the differential pairs does not alter the operating conditions of the driver. As a result, $M_N$ and $\overline{M}_P$ are on (triode), and $\overline{M}_N$ and $M_P$ are off. The head current source $J_P$ is steered to $\overline{M}_P$ and the tail current source $J_N$ is steered to $M_N$. Because $J_P$ and $J_N$ are constant, $V_B$ follows $V_{DD}$ with the same voltage variation. The same conclusion can be drawn for $V_{\overline{C}}$. The latches ensure that $V_{\overline{B}}$ is high enough that $\overline{M}_{DP}$ is off and that $V_C$ is low enough that $M_{DN}$ is off, while $M_{DP}$ and $\overline{M}_{DN}$ are on. The turn-off of $\overline{M}_{DP}$ and $M_{DN}$ ensures that there is no direct DC path from $V_{DD}$ to the ground in the output stage. This differs fundamentally from the class AB transmitter proposed in [17], where all transistors in the output stage are in saturation. The removal of the direct path from the supply voltage to the ground ensures not only that the output current is independent of the supply voltage, but also that the static power consumption of the output stage is zero.

#### 4.
Simulation Results

The proposed transmitter is implemented in UMC's 1.2 V 0.13 $\mu$m CMOS technology and analyzed using Spectre from Cadence Design Systems with BSIM3v3 device models that account for both device parasitics and high-order effects. Fig. 3.8 plots the voltages of the critical nodes $A, \overline{A}, B, \overline{B}, C, \overline{C}$ of the transmitter. The voltages at nodes A and $\overline{A}$ confirm that the proposed multiplexer provides a rail-to-rail output voltage swing while meeting the timing constraints of a 10 Gbps data rate. The voltage swings of $V_B$ and $V_C$ ensure that the transistors can be turned on and off fully at a 10 Gbps data rate. Fig. 3.9 shows that the voltages at nodes B and $\overline{B}$ vary with the fluctuation of $V_{DD}$. The amount of variation is approximately the same as that of $V_{DD}$, resulting in no change in $V_{SG,M_{DP}}$. As a result, a constant output current is obtained in the presence of supply voltage variation. For the purpose of comparison, the class AB transmitter proposed in [17] is also implemented and analyzed. The simulation results are shown in Fig. 3.10. A variation of the output current is observed. Fig. 3.11 shows the output current of the proposed transmitter when $V_{DD}$ is varied from 1.1 V to 1.3 V. Compared with Fig. 3.10, one observes that the variation of the output current of the proposed serial link transmitter is approximately one-third of that of the serial link transmitter in [17], and is approximately the same as that of the open-drain driver plotted in Fig. 3.12. The current received at the far end of a 5 cm FR4 cable is shown in Fig. 3.13. The eye diagram shows a clear eye opening with an eye width of 80 ps and an eye height of 3.1 mA while transmitting data at 10 Gbps.

Figure 3.8: Voltages of nodes $A, \overline{A}, B, \overline{B}, C, \overline{C}$ of the proposed transmitter. $V_{DD}$ is varied from 1.1 V to 1.3 V with step 0.1 V.
Table 3.2 compares some key parameters of the proposed transmitter and those of the transmitter in [17].

Figure 3.9: Voltage at nodes B and $\overline{B}$ of the proposed transmitter. $V_{DD}$ is varied from 1.1 V to 1.3 V with step 0.1 V.

Figure 3.10: Output current of the class AB serial link transmitter in [17]. $V_{DD}$ is varied from 1.1 V to 1.3 V with step 0.1 V.

Figure 3.11: Output current of the proposed transmitter. $V_{DD}$ is varied from 1.1 V to 1.3 V with step 0.1 V.

Figure 3.12: Output current of the open-drain transmitter. $V_{DD}$ is varied from 1.1 V to 1.3 V with step 0.1 V.

Figure 3.13: Received eye diagram after 5 cm FR4 cable.

Table 3.2: Comparison of non-saturated, saturated class AB, and open-drain transmitters.

| Parameters                        | Transmitter in [17] | This work        |
|-----------------------------------|---------------------|------------------|
| Technology                        | 0.13 $\mu$m         | 0.13 $\mu$m      |
| $\frac{\Delta I}{\Delta V_{DD}}$  | 1.37 $\mu$A/mV      | 0.47 $\mu$A/mV   |
| Total current drawn from $V_{DD}$ | 9.2 mA              | 6.0 mA           |
| Total transistor width            | 335 $\mu$m          | 218 $\mu$m       |
| Output current swing              | 3.2 mA              | 3.2 mA           |
| Total power consumption           | 11.04 mW            | 7.2 mW           |

# 3.2.2 4-PAM Transmitter with Current-Mirror Driver

In this section, we present a new fully differential 4-PAM current-mode transmitter for 10 Gbps serial links. The current-mode architecture of the proposed transmitter ensures that the nodes of the transmitter have a low impedance. The low voltage swing at the multiplexing nodes minimizes the time for charging and discharging the nodes. The inductive shunt peaking employed at the multiplexing nodes with active inductors greatly sharpens the rising and falling edges of the voltage at the multiplexing nodes, and subsequently those of the output currents, with little chip area overhead.
The fully differentially configured 2-bit digital-to-analog converters (DACs) convert two digital bits to two 4-level differential current signals that are then amplified by a fully differential current amplifier.

#### 1. Fully Differential 4-PAM Current-Mode Transmitter

The schematic of the proposed fully differential 4-PAM current-mode transmitter is shown in Fig. 3.14. The multiplexing is implemented at nodes A and $\overline{A}$, while the two fully differential current signals generated by the DACs are drawn from these two nodes.

Figure 3.14: N-to-1 fully differential 4-PAM current-mode transmitter with inductive shunt peaking. $W_1=\overline{W}_1=6~\mu\text{m}$, $W_2=\overline{W}_2=9~\mu\text{m}$, $W_3=\overline{W}_3=18~\mu\text{m}$, $W_4=\overline{W}_4=15~\mu\text{m}$, $W_5=\overline{W}_5=30~\mu\text{m}$, $K_1=3$, $K_2=1.5$, $K_3=2$, $R_s=500~\Omega$. $L=0.13~\mu\text{m}$ is used for all transistors.

The 2-bit DACs are implemented using a fully differential current-mode approach, as shown in Fig. 3.15, such that they convey two differential currents to the downstream driver, as detailed in Table 3.3.

Table 3.3: Output of 2-bit DACs and transmitter.

| D1 | D0 | $i_{in}^+$ | $i_{in}^-$ | $i_{out}^+$ | $i_{out}^-$ |
|----|----|------------|------------|-------------|-------------|
| 0 | 0 | 0 | 3I | -9I | +9I |
| 0 | 1 | I | 2I | -3I | +3I |
| 1 | 0 | 2I | I | +3I | -3I |
| 1 | 1 | 3I | 0 | +9I | -9I |

Figure 3.15: Fully differential 2-bit digital-to-analog converter (DAC). $W_d=\overline{W}_d=0.8~\mu\text{m}$, $W_{d+1}=3~\mu\text{m}$, $W_{A1}=W_{A2}=W_{A3}=W_{A4}=0.6~\mu\text{m}$, $W_{A5}=1.5~\mu\text{m}$. $L=0.13~\mu\text{m}$ is used for all transistors.

A key advantage of the DACs is that the total current drawn by the DACs is constant, thereby minimizing the noise injection to the substrate. The selection pulse is generated from the rising and falling edges of two adjacent clock signals, as shown in Fig. 3.16, such that the narrow selection pulses can be realized by two adjacent clock phases of a low-frequency clock.

Figure 3.16: Selection pulse generation.

Another key advantage of this design is that the signal at the DACs is small compared with the signal at the final output. As a result, no buffer chain is needed in the DACs. For the same reason, the transistors can be kept very small, as shown in Fig. 3.15, which greatly reduces the chip area. The outputs of the DACs are fed to the fully differential current-mode driver, which amplifies the currents and conveys two differential currents with an amplitude of 3.5 mA to the channels.

Because of the large number of DACs required for 4-PAM operation, the total capacitances at nodes A and $\overline{A}$ are large. They are estimated from

$$C_A \approx \sum_{j=1}^{N/2} 3(C_{gd_d,j} + C_{db_d,j}) + C_{gs1} + C_{gs2} + C_{gs3} + C_{gd2} + C_{gd3}, \tag{3.4}$$

where N is the number of digits of the input, and $C_{gs}$, $C_{gd}$, and $C_{db}$ are the gate-source, gate-drain, and drain-substrate capacitances, respectively. The diode connection of the input transistors $M_1$ and $\overline{M_1}$ ensures that the input impedance, given by $\frac{1}{g_{m1}}$, is low, resulting in a small time constant at nodes A and $\overline{A}$. To further reduce the voltage rise and fall times of nodes A and $\overline{A}$, shunt-peaking inductors $L_s$ with a small series resistance $R_s$ are employed at nodes A and $\overline{A}$, as shown in Fig. 3.14, to form RLC networks that achieve small rise and fall times [18]. The DC operating point of $M_1$ in this case is determined from

$$R_s I_{D1} + \sqrt{\frac{2I_{D1}}{\mu_n C'_{ox}(\frac{W}{L})_1}} + V_T = V_{DD}, \tag{3.5}$$

where $C'_{ox}$ and $\mu_n$ are the gate capacitance per unit area and the surface mobility of free electrons, respectively.
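The DC operating point of the diode-connected input transistor can be found in closed form. The sketch below assumes the square-law relation $V_{GS} = V_T + \sqrt{2I_{D1}/(\mu_n C'_{ox}(W/L)_1)}$ in series with the resistive drop across the peaking network, and substitutes $x=\sqrt{I_{D1}}$ to obtain a quadratic. The function name and the numerical values of `v_t` and `beta` are hypothetical and chosen only for illustration; the 500 $\Omega$ resistance and the 1.2 V supply are the values quoted in the text.

```python
import math

def diode_bias_current(v_dd, v_t, r_s, beta):
    """Solve r_s*I + sqrt(2*I/beta) + v_t = v_dd for the drain current I (A).

    beta stands for mu_n * C'_ox * (W/L) in A/V^2 (square-law model assumed).
    With x = sqrt(I) the equation becomes r_s*x**2 + sqrt(2/beta)*x - (v_dd - v_t) = 0,
    and only the positive root is physically meaningful.
    """
    headroom = v_dd - v_t
    if headroom <= 0:
        return 0.0  # transistor is off in this simple model
    b = math.sqrt(2.0 / beta)
    if r_s == 0:
        x = headroom / b
    else:
        x = (-b + math.sqrt(b * b + 4.0 * r_s * headroom)) / (2.0 * r_s)
    return x * x

# Illustrative values only: v_t and beta are assumptions, not taken from the thesis.
V_DD, V_T, R_S, BETA = 1.2, 0.35, 500.0, 5e-3
i_d1 = diode_bias_current(V_DD, V_T, R_S, BETA)
v_a = V_DD - R_S * i_d1  # DC voltage at node A in this simple model
print(f"I_D1 = {i_d1 * 1e3:.3f} mA, V_A = {v_a:.3f} V")
```

A quick sanity check is to substitute the returned current back into the bias equation and confirm that the three voltage terms add up to the supply.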
The shunt-peaking inductors can be implemented using passive spiral inductors, with the drawbacks of a large chip area and a strong interaction with the substrate [18]. It was shown in [19] that single-ended inductors can be synthesized using active devices, as shown in Fig. 3.14. The DC operating points of $M_1$, $M_2$, and $M_3$ in this case are determined from Eqs. (3.6), (3.7), and (3.8):

$$V_A = \frac{V_{DD} - \left[1 - \sqrt{\frac{(W/L)_1}{(W/L)_s}}\right] V_T}{1 + \sqrt{\frac{(W/L)_1}{(W/L)_s}}}, \tag{3.6}$$

$$V_B = V_{DD} - \left[1 - \sqrt{\frac{\mu_n(W/L)_2}{\mu_p(W/L)_4}}\right] V_T - V_A \sqrt{\frac{\mu_n(W/L)_2}{\mu_p(W/L)_4}}, \tag{3.7}$$

$$V_C = 25C'_{ox} \left[ \mu_p \left(\frac{W}{L}\right)_5 (V_{DD} - \overline{V_B} - V_T)^2 - \mu_n \left(\frac{W}{L}\right)_3 (V_A - V_T)^2 \right]. \tag{3.8}$$

The fully differential configuration of the driver ensures that the current drawn from the supply voltage is also constant. The selection pulses, realized by applying two adjacent clocks to the inputs of the DACs, ensure that only one DAC is activated at a time. Depending upon the input data, the activated DAC provides 4-level differential output currents, $\{0,3I\}$, $\{I,2I\}$, $\{2I,I\}$, and $\{3I,0\}$, to the driver for amplification. Neglecting channel length modulation and imposing $K_2 = K_1K_3$, one can readily show that the output currents of the driver are given by

$$\begin{aligned} i_{out}^{+} &= K_2 i_{in}^{+} - K_1 K_3 i_{in}^{-}, \\ i_{out}^{-} &= K_2 i_{in}^{-} - K_1 K_3 i_{in}^{+}. \end{aligned} \tag{3.9}$$

The output currents of the transmitter are given in Table 3.3. It is seen that the transmitter conveys two currents of the same amplitude but opposite polarities to the channels.

#### 2. Simulation Results

To quantify the performance of the transmitter, the transmitter with a 10-to-1 multiplexer is implemented in UMC's 1.2 V 0.13 $\mu$m CMOS technology and analyzed using Spectre from Cadence Design Systems with BSIM3v3 device models. Fig. 3.17 and Fig.
3.18 plot the voltage of node A and the output current of the transmitter with the input at 10 Gbps (200 ps symbol time) when passive inductors are applied. It is seen that the voltage swing of node A is only about 0.3 V, whereas the output current swing reaches 3.5 mA with equal and large level spacing. Inductive shunt peaking greatly improves the rising and falling edges of the voltage and the current. It should, however, be noted that due to the low output resistance of deep sub-micron MOS devices, $i_{out}^+$ and $i_{out}^-$ contain a DC component that is not shown in Eq. (3.9). The common-mode component of the output current imposes a design challenge on receivers. Fig. 3.19 and Fig. 3.20 plot the voltage at node A and the transmitter output current when active inductors are applied to the driver.

Figure 3.17: Voltage at the multiplexing node A with passive peaking inductors. The inductance of the shunt-peaking inductors is varied from 0 nH to 20 nH with step 5 nH.

Figure 3.18: Output current of the transmitter with passive peaking inductors. The inductance of the shunt-peaking inductors is varied from 0 nH to 20 nH with step 5 nH.

Figure 3.19: Voltage at the multiplexing node A with active peaking inductors. The width of the transistor forming the active inductor is varied from 5 $\mu$m to 30 $\mu$m with step 5 $\mu$m.

Figure 3.20: Output current of the transmitter with active peaking inductors. The width of the transistor forming the active inductor is varied from 5 $\mu$m to 30 $\mu$m with step 5 $\mu$m.

As can be seen from the plots, the voltage swing at node A is approximately 0.2 V with active inductors, smaller than that with passive inductors. As a result, the output current in Fig. 3.20 is also smaller. However, the loss of output current can be compensated by increasing the width of $M_5$ and $\overline{M}_5$, as shown in Fig. 3.21 and Fig. 3.22.
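As a reference against which the simulated waveforms can be read, the ideal current levels implied by Table 3.3 and Eq. (3.9) can be tabulated with a few lines of code. This is only a sketch of the ideal mapping: the overall gain of 3 is an assumption inferred from the entries of Table 3.3 (it is not stated explicitly in the text), the function names are hypothetical, and the DC component mentioned above is ignored.

```python
def dac_currents(d1, d0, unit=1.0):
    """Differential DAC output (i_in+, i_in-) in multiples of I, following Table 3.3."""
    code = 2 * d1 + d0
    return code * unit, (3 - code) * unit

def driver_currents(i_in_p, i_in_n, k2=3.0, k1k3=3.0):
    """Ideal driver output per Eq. (3.9); k2 == k1k3 is the condition imposed in the text.

    The value 3.0 is an assumption chosen so that the result matches Table 3.3.
    """
    i_out_p = k2 * i_in_p - k1k3 * i_in_n
    i_out_n = k2 * i_in_n - k1k3 * i_in_p
    return i_out_p, i_out_n

table = {}
for d1 in (0, 1):
    for d0 in (0, 1):
        i_in_p, i_in_n = dac_currents(d1, d0)
        table[(d1, d0)] = (i_in_p, i_in_n) + driver_currents(i_in_p, i_in_n)

for bits, row in table.items():
    print(bits, row)
```

In this ideal model the sum $i_{in}^+ + i_{in}^-$ is the same for every input code, which is the constant-current property claimed for the DACs, and the four output levels are equally spaced.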
As one of the advantages of the fully differential configuration, the total current drawn from $V_{dd}$, shown in Fig. 3.23, is nearly constant, with spikes caused by charging and discharging the parasitic capacitances.

Figure 3.21: Voltage at the multiplexing node A with active peaking inductors. The width of transistors $M_5$ and $\overline{M}_5$ is 60 $\mu$m. The width of the transistor forming the active inductors is varied from 5 $\mu$m to 30 $\mu$m with step 5 $\mu$m.

Figure 3.22: Output current of the transmitter with active peaking inductors. The width of transistors $M_5$ and $\overline{M}_5$ is 60 $\mu$m. The width of the transistor forming the active inductors is varied from 5 $\mu$m to 30 $\mu$m with step 5 $\mu$m.

Figure 3.23: Total current $i_{dd}$ drawn from $V_{dd}$ and output current of the 4-PAM current-mode transmitter with active peaking inductors.

# 3.2.3 2/4-PAM Transmitter with Class AB Driver

#### 1. 2/4-PAM transmitter design

The transmitter presented in Section 3.2.1 can be further developed into a 4-PAM current-mode transmitter. As shown in Fig. 3.24, the output of driver A is tuned to provide a current of J, while driver B generates a current of 2J. By applying the same clock to the two drivers, 4-PAM current signals are obtained at the near end of the transceiver. Table 3.4 shows the four current levels as a function of the input data.

Table 3.4: Output of 4-PAM transmitter.

| D1 | D0 | Driver $2J$ | Driver $J$ | $i_{out}^+$ | $i_{out}^-$ |
|----|----|-------------|------------|-------------|-------------|
| 0 | 0 | 0 | 0 | 0 | 3J |
| 0 | 1 | 0 | J | J | 2J |
| 1 | 0 | 2J | 0 | 2J | J |
| 1 | 1 | 2J | J | 3J | 0 |

Figure 3.24: Configuration of the 4-PAM current-mode transmitter with active peaking inductors.
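The level assignment of Table 3.4 amounts to summing a binary-weighted pair of drivers. The sketch below illustrates this for a short bit stream; the pairing order (first bit of each pair taken as D1), the function names, and the normalisation $J=1$ are assumptions made for illustration only.

```python
def bits_to_symbols(bits):
    """Group a bit stream into (D1, D0) pairs; the first bit of each pair is taken as D1."""
    if len(bits) % 2:
        raise ValueError("4-PAM signaling needs an even number of bits")
    return [(bits[i], bits[i + 1]) for i in range(0, len(bits), 2)]

def transmitter_output(d1, d0, j=1.0):
    """Sum of driver 2J (gated by D1) and driver J (gated by D0), following Table 3.4."""
    i_out_p = 2 * j * d1 + j * d0
    i_out_n = 3 * j - i_out_p  # complementary output, as listed in Table 3.4
    return i_out_p, i_out_n

bits = [0, 0, 0, 1, 1, 0, 1, 1]
levels = [transmitter_output(d1, d0) for d1, d0 in bits_to_symbols(bits)]

# Two bits are carried per symbol, so a 10 Gbps stream needs a 200 ps symbol time.
symbol_time_ps = 2 / 10e9 * 1e12
print(levels, symbol_time_ps)
```

Disabling one of the two drivers in this model leaves a two-level output, which is one way to picture the 2-PAM mode; how the mode is actually selected in the circuit is not modelled here.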
Compared with conventional 4-PAM transmitters, this configuration offers several advantages: 1) the fully differential signaling configuration minimizes the EMI that the channel currents exert on neighboring devices and the substrate, 2) the user-selectable 2-PAM or 4-PAM signaling mode offers application flexibility, 3) the tunable signal amplitude can further improve the symmetry of the signal, especially in 4-PAM signaling mode, and the tuning is easily achieved by adjusting the DC current source in the pre-amplifier, and 4) the DC current source also minimizes $V_{DD}$ fluctuation and ground bounce.

#### 2. Simulation Results

The simulated output current of driver J is shown in Fig. 3.25 (Left), while the output current of driver 2J is shown in Fig. 3.25 (Right). Summing the two output currents at the channel gives the 4-PAM output current, which is plotted in Fig. 3.26.

Figure 3.25: (Left): Current output of driver J, (Right): Current output of driver 2J.

Figure 3.26: 4-PAM current output of transmitter.

## 3.3 Summary

A state-of-the-art review of several conventional serial link transmitters with different drivers, including the inverter driver, LVDS driver, open-drain driver, and class AB driver, has been presented. The three proposed 10 Gbps transmitters are summarized as follows:

- 1) A new current-mode class AB transmitter with a low supply voltage sensitivity. The rail-to-rail swing mode ensures that the output current is insensitive to supply voltage fluctuation. The full push-pull operation of the driver minimizes the static power consumption of the transmitter. A high speed is achieved by multiplexing in the current domain at the input and by inductive shunt peaking with active inductors.
The fully differential configuration of the transmitter minimizes the effect of both common-mode disturbances and electromagnetic interference exerted from the channels to neighboring devices.

- 2) A new fully differential 4-PAM CMOS current-mode transmitter. The high speed is achieved by using current-domain multiplexing, current-mode digital-to-analog conversion, inductive shunt peaking with active inductors, and 4-PAM signaling. The fully differential configuration of the transmitter minimizes the effect of both common-mode disturbances and electromagnetic interference from channels, making the transmitter particularly attractive for high-speed data transmission over long interconnects and printed-circuit-board traces. The fully differential configuration of the driver and DACs also ensures that the total current drawn from the supply voltage is constant, thereby minimizing noise injection to the substrate.
- 3) A new 5/10 Gbps user-selectable 2-PAM/4-PAM serial link transmitter. The proposed transmitter provides considerable flexibility while inheriting the advantages of a current-mode transmitter, including: 1) user-selectable signaling mode, 2) 5:1 or 10:1 multiplexing ratio, 3) user-tunable signal amplitude, and 4) reduced capacitance at the multiplexing node through distributed multiplexing nodes.

# Chapter 4

# Transmitter Pre-emphasis

This chapter describes the pre-emphasis for a 2/4-PAM 5/10 Gbps serial link transmitter. Section 4.1 investigates why pre-emphasis is necessary for a high-speed serial link transmitter and reviews some existing pre-emphasis approaches. A proposed pre-emphasis scheme and its implementation are presented in Section 4.2. The simulation results and discussions are provided at the end of the chapter.
# 4.1 Pre-emphasis - A State-of-the-Art Review

As discussed in Chapter 2, a typical serial link transmitter consists of a multiplexer, where incoming parallel data are serialized by a high-frequency multi-phase selection clock into an analog waveform with timing information embedded in the waveform; a pre-emphasis block, where the analog waveform from the preceding parallel-to-serial converter is pre-coded to compensate for the high-frequency loss of the wire channels; and a driver that provides a sufficiently large signal and a matching impedance to the channel. The main design challenges encountered in the design of high-speed serial link transmitters include: 1) the time to charge/discharge the large capacitance encountered at the multiplexing node of the multiplexer to which all parallel input branches are connected, especially when the parallel-to-serial ratio and the voltage swing of the multiplexing node are large, 2) the large chip area and high power consumption of pre-emphasis blocks, especially when the number of pre-emphasis taps and the number of parallel input bits are large, 3) low-jitter multi-phase ring oscillators for serialization, and 4) high-speed drivers that provide pulse-amplitude-modulated signals with narrow pulse width.

Although many novel techniques have emerged in the design of high-speed multiplexers for serialization [4, 10, 20, 21], high-speed drivers [4, 17], and low-jitter ring oscillators [19, 22, 23, 24], the stringent requirements of high-speed serial links demand that novel transmitter architectures and design techniques be continuously developed.

The need for transmitter pre-emphasis arises from the high-frequency loss of copper channels, mainly due to 1) the *ohmic loss* of interconnects and PCB traces caused by the skin effect of conductors and 2) the *dielectric loss* to the substrate and ground planes.
The skin-induced resistance is quantified by [25]

$$R(f) = \frac{K_R}{d} \sqrt{f},\tag{4.1}$$

where d is the radius or width of the conductor, $K_R = 4.15 \times 10^{-8} \Omega^{1/2}$ for a round conductor or $1.3 \times 10^{-7} \Omega^{1/2}$ for a thin rectangular strip guide, and f is frequency. The attenuation of the signal when passing through the line, denoted by A(f,x), is calculated from

$$A(f,x) = e^{-\frac{R(f)}{Z_0}x}, \tag{4.2}$$

where $Z_0$ is the characteristic impedance of the line. Eq. (4.2) reveals that the signal amplitude drops exponentially along the channel as the channel resistance R(f) and the channel length x increase. The frequency-dependent channel resistance results in a large time constant at high frequencies and gives rise to inter-symbol interference (ISI). Other sources of signal attenuation include signal energy radiation, the finite bandwidth of package parasitics, and the lumped capacitance of the load.

To reduce the inter-symbol interference, transmitter pre-emphasis [20, 25, 26] and receiver post-equalization [27, 28] have been proven to be effective. The former emphasizes the high-frequency components of the signal by reducing the amplitude of the low-frequency components of the signal prior to transmission, whereas the latter de-emphasizes the low-frequency components of the received signal by amplifying the high-frequency components of the received signal, such that the resultant transfer function of the link is of an all-pass characteristic. For critical links, both transmitter pre-emphasis and receiver post-equalization are employed simultaneously to further reduce ISI [4].
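To get a feel for the size of the skin-effect loss, Eqs. (4.1) and (4.2) can be evaluated numerically. The sketch below assumes SI units throughout (d and x in metres, R(f) interpreted as a resistance per unit length) and uses the two $K_R$ constants quoted above. The trace width and length in the example are hypothetical, and dielectric loss is not included.

```python
import math

K_R_ROUND = 4.15e-8  # constant quoted with Eq. (4.1) for a round conductor
K_R_STRIP = 1.3e-7   # constant quoted with Eq. (4.1) for a thin rectangular strip

def skin_resistance(f_hz, d_m, k_r):
    """Eq. (4.1): skin-effect resistance, taken here as ohms per metre of line."""
    return k_r / d_m * math.sqrt(f_hz)

def attenuation(f_hz, x_m, d_m, z0=50.0, k_r=K_R_STRIP):
    """Eq. (4.2): fraction of the signal amplitude left after a line of length x_m."""
    return math.exp(-skin_resistance(f_hz, d_m, k_r) / z0 * x_m)

# Hypothetical example: 0.2 mm wide strip, 10 cm long, 50-ohm line.
for f in (0.5e9, 2.5e9, 5e9):
    print(f"{f / 1e9:.1f} GHz: A = {attenuation(f, x_m=0.10, d_m=0.2e-3):.3f}")
```

Because the loss grows with $\sqrt{f}$, the high-frequency content of a fast edge is attenuated more than the low-frequency content of a long run of identical symbols, which is the imbalance that pre-emphasis is meant to correct.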
As discussed in Chapter 2, transmitter pre-emphasis is often realized using a symbol-spaced finite impulse response (FIR) filter

$$V_o(n) = V_i(n) - \sum_{k=1}^{M} a_k V_i(n-k), \tag{4.3}$$

where $V_i(n)$ and $V_i(n-k)$ are the present and past $k^{th}$ inputs of the pre-emphasis, respectively, $V_o(n)$ is the output of the pre-emphasis, $a_k$ is the weighting factor of the past $k^{th}$ input of the pre-emphasis, and M is the number of taps. The values of the weighting factors and the number of taps are determined by the characteristics and the length of the channel, as well as by the design specifications [25].

Because most of the energy of a pulse-amplitude-modulation (PAM) signal is carried by its low-frequency components, transmitter pre-emphasis suffers from the drawback of signal loss, resulting in a reduced signal-to-noise ratio. Also, unlike receiver equalization, the transmitter has no prior knowledge of the characteristics of the channel over which data are to be transmitted, so the optimal parameters of the pre-emphasis FIR filter can only be determined once the characteristics and the length of the channel are known. The ease of implementation of transmitter pre-emphasis and its modest cost, in terms of both chip area and power consumption, however, make transmitter pre-emphasis a preferred choice in low-cost short-distance serial links.

# 4.1.1 Pre-emphasis in Digital Domain

The serial link transmitter proposed in [25] and shown in Fig. 4.1 is the first N-tap transmitter with digital pre-emphasis. In this design, all the FIR filter calculations are done by digital adders, and a digital-to-analog converter generates the output pulse. Driver parallelism is used to overcome the speed limitations of the process, and therefore each branch requires separate digital logic.
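The FIR relation of Eq. (4.3) can be sketched in a few lines. The example below is a behavioural illustration only, not a model of the circuit in [25]: it assumes that symbols before the start of the sequence are zero, and the single tap weight of 0.25 is an arbitrary illustrative choice.

```python
def pre_emphasis(symbols, taps):
    """Eq. (4.3): V_o(n) = V_i(n) - sum_k a_k * V_i(n - k), with V_i(n) = 0 for n < 0."""
    out = []
    for n, v in enumerate(symbols):
        acc = v
        for k, a_k in enumerate(taps, start=1):
            if n - k >= 0:
                acc -= a_k * symbols[n - k]
        out.append(acc)
    return out

# 2-PAM symbols (+1 / -1) filtered with a single tap, a_1 = 0.25 (illustrative value).
symbols = [1, 1, 1, -1, -1, 1]
shaped = pre_emphasis(symbols, [0.25])
print(shaped)  # [1, 0.75, 0.75, -1.25, -0.75, 1.25]
```

With one tap, a symbol that follows a transition is boosted in magnitude, while a symbol that repeats its predecessor is attenuated, which is how the filter shifts energy from low to high frequencies.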
The digital logic modules, having the present and the five previous bits, calculate the amplitude of the current pulse that should be transmitted, which is then multiplexed and conveyed to the channel. In this scheme, each drive module is a 6-bit DAC to reduce the quantization error of the FIR filter. The combination of these digital filters and high-resolution DACs operating at high frequency therefore makes the transmitter circuit quite complex and power hungry. Moreover, this complexity can be limiting when a multi-level signaling scheme is used. For example, to implement 4-PAM serial link transmitter pre-emphasis in the digital domain, all the digital logic needs to be doubled, and each DAC should be modified to have at least 7-bit resolution. Another drawback of this approach is that the digital logic becomes a bottleneck for high-speed signaling.

Figure 4.1: Serial link transmitter with digital pre-emphasis.

## 4.1.2 Pre-emphasis in Analog Domain

The approach proposed in [4] overcomes this deficiency by employing an analog approach to generate filtered pulses directly and independently of past symbols, as shown in Fig. 4.2. This design also uses parallelism, but with an analog technique to realize the pre-emphasis filter. The N-tap FIR filter is integrated with the driver/multiplexer. Each module sums the current at the output with the current data and the past data held by the module. As a result, the need for complex digital logic is removed, and both circuit complexity and power consumption are greatly reduced. The high-resolution DAC drivers are also replaced by a 1-bit driver for each filter tap. Another key advantage of this design is that multiplexing is performed at the output node to take full advantage of the low characteristic impedance of the channel, typically $50\sim75\Omega$, to ensure a small time constant at the multiplexing node.
Although this design is the first implementation in the analog domain, it still suffers from the following drawbacks: 1) one current flows in the channel due to the differential current steering configuration, 2) a large chip area is needed for duplicating the input pre-emphasis taps, especially when the number of taps is large, and 3) to drive the channel directly, the DAC nMOS transistors have to be large, which leads to a large parasitic capacitance at the multiplexing node and limits high-speed signaling.

Figure 4.2: Serial link transmitter with analog pre-emphasis.

# 4.1.3 Pre-emphasis Using Pseudo-nMOS Multiplexer

In [29], a serial link transmitter using a pseudo-nMOS multiplexer, as shown in Fig. 4.3, was proposed. Both the rising and falling edges of the clock are separated into two clock phases, making it possible to use four clock phases for each 4-to-1 multiplexer. Most importantly, the three-stage nMOS pull-down multiplexer avoids the need for the dynamic AND structure. This ensures that the interpolated multi-phases are aligned with the clock phases. This design also suffers from the following drawbacks: 1) to drive the channel directly, large and noisy buffers are needed, especially when the number of pre-emphasis taps is large, 2) each individual pre-emphasis tap requires one multiplexer, which leads to a large chip area, high dynamic power consumption, and large noise, and 3) the current signal flows in the same direction in the channel, which causes EMI to neighboring devices.

Figure 4.3: Serial link transmitter with pseudo-nMOS multiplexer pre-emphasis.

# 4.2 Power-Area Efficient Pre-emphasis in Analog Domain

The preceding pre-emphasis transmitters suffer from the common drawback that the past symbols for pre-emphasis $V_i(n-k)$, k=1,2,...,M, are constructed using the past parallel inputs $D_i(n-k)$, k=1,2,...,M.
This causes duplication of the hardware that implements the pre-emphasis taps, which is especially costly because the duplicated hardware must be sufficiently large to drive the channel. As a result, a large chip area and a high level of power consumption are needed, especially when the number of pre-emphasis taps and the parallel-to-serial ratio are large.

This section presents an area-power efficient 2/4-PAM pre-emphasis transmitter for 10 Gbps serial links. Unlike most pre-emphasis transmitters, where $V_i(n-k)$, k=1,2,...,M, are constructed from $D_i(n-k)$, i=1,2,...,N, k=1,2,...,M, the past symbols for transmitter pre-emphasis are obtained directly from a set of delay blocks whose input is $V_i(n)$, such that the re-construction of the past symbols in the digital domain is avoided. A significant reduction in both hardware cost and power consumption is achieved.

# 4.2.1 Power-Area Efficient Current-Mode Pre-Emphasis Transmitter

The block diagram of the proposed pre-emphasis transmitter is shown in Fig. 4.4.

Figure 4.4: Architecture of the proposed area-power efficient pre-emphasis serial link transmitter.

The incoming parallel data are first multiplexed by the fully differential multiplexer. The multiplexed signal is amplified by the class AB pre-amplifier. The output of the pre-amplifier is fed to both the current-symbol driver that generates $V_i(n)$ and the delay cells that output $V_i(n-k)$, k=1,2,...,M. The delay of each delay cell is one symbol time $T_{sym}$. The outputs of the delay cells are fed to the pre-emphasis drivers, which convert the input voltage into a current with adjustable amplitude. The output currents from the current-symbol driver and the pre-emphasis drivers are summed at the output node.

#### 4.2.2 The Multiplexer

The schematic of the fully differential N-to-1 multiplexer is shown in Fig. 4.5.

Figure 4.5: Fully differential multiplexer.

The multiplexer employs pseudo-nMOS logic to take advantage of its speed.
The upper bound of the output voltage is determined from

$$V_{o,max} = V_{DD} - V_p, \tag{4.4}$$

where $V_p$ is the pinch-off voltage of the pull-up pMOS transistor. To find the lower bound of the output voltage, we notice that the pull-up pMOS transistor is in triode if $V_o > |V_{Tp}|$. Clearly, the pull-up pMOS transistor is in saturation when the output voltage reaches its lower bound. It can be shown that the lower bound of the output voltage of the multiplexer is given by

$$V_{o,min} = \frac{3}{2} \mu_p C'_{ox} \left(\frac{W}{L}\right)_p \left[V_{DD} - |V_{Tp}|\right]^2 R_n, \tag{4.5}$$

where $R_n$ is the channel resistance of the pull-down nMOS transistors in triode, $\mu_p$ is the surface mobility of holes, $C'_{ox}$ is the gate capacitance per unit area, and $(W/L)_p$ is the aspect ratio of the pull-up pMOS transistor. Note that we have neglected the effect of channel-length modulation in the derivation of (4.5). The capacitance at the multiplexing node is estimated from

$$C \approx \sum_{j=0}^{N-1} (C_{gd,j,1} + C_{db,j,1}) + (C_{gs} + C_{sb})_p, \tag{4.6}$$

where $C_{gd}$ and $C_{gs}$ are the gate-drain and gate-source capacitances, respectively, $C_{db}$ and $C_{sb}$ are the drain-substrate and source-substrate capacitances, respectively, the subindex p identifies the pull-up pMOS transistor, and N is the parallel-to-serial ratio of the multiplexer.

Observe that (i) the slope of the rising edge is determined by $\tau_r = R_p C$ whereas that of the falling edge is set by $\tau_f = 3R_nC$, where $R_p$ is the channel resistance of the pull-up pMOS transistor in triode, and (ii) to increase the slope of the rising edge and that of the falling edge, $R_p$ and $R_n$ should be reduced. A decrease in $R_p$ increases $V_{o,min}$, whereas a reduction in $R_n$ increases C because of the large number of input branches connected to the multiplexing nodes.
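The trade-off described above can be explored numerically from Eqs. (4.4) and (4.5) together with the two time constants. The following sketch is a first-order estimate only: `beta_p` stands for $\mu_p C'_{ox}(W/L)_p$, and all numerical values except the 1.2 V supply are hypothetical placeholders rather than design values from this work.

```python
def mux_output_bounds(v_dd, v_tp_abs, beta_p, r_n, v_p=0.0):
    """Return (V_o,max, V_o,min) following Eqs. (4.4) and (4.5).

    beta_p = mu_p * C'_ox * (W/L)_p in A/V^2, r_n = triode resistance of one
    pull-down nMOS transistor, v_p = pinch-off voltage of the pull-up pMOS.
    """
    v_o_max = v_dd - v_p
    v_o_min = 1.5 * beta_p * (v_dd - v_tp_abs) ** 2 * r_n
    return v_o_max, v_o_min

def edge_time_constants(r_p, r_n, c_node):
    """Return (tau_r, tau_f) = (R_p*C, 3*R_n*C) for the multiplexing node."""
    return r_p * c_node, 3.0 * r_n * c_node

# Hypothetical numbers for illustration only.
v_max, v_min = mux_output_bounds(v_dd=1.2, v_tp_abs=0.35, beta_p=1e-3, r_n=100.0)
tau_r, tau_f = edge_time_constants(r_p=500.0, r_n=100.0, c_node=50e-15)
print(f"swing: {v_min:.3f} V to {v_max:.3f} V")
print(f"tau_r = {tau_r * 1e12:.1f} ps, tau_f = {tau_f * 1e12:.1f} ps")
```

Raising `beta_p` (a stronger pull-up, hence a smaller $R_p$) increases the computed $V_{o,min}$ in direct proportion, which reproduces the swing penalty noted in the text.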
To reduce the rise and fall times without sacrificing the output voltage swing and speed of the multiplexer, inductive shunt peaking is employed, as shown in Fig. 4.5. The rise and fall times of the output voltage are now determined by the RLC networks. By tuning L and R, the rise and fall times can be greatly reduced. It was shown in [17] that the peaking inductor can be implemented using active inductors to avoid the drawbacks of on-chip spiral inductors, including a very large area, a strong interaction with the substrate, and a fixed and small inductance. The schematic of the multiplexer employing the self-biased active inductor initially developed for MESFETs [30] and later re-developed for CMOS [19] is shown in Fig. 4.5. It can be shown that the equivalent inductance $L_s$ and series resistance $R_s$ of the self-biased active inductor are given by

$$\begin{aligned} R_s(\omega) &= \frac{g_m + \omega^2 C_{gs}^2 R}{g_m^2 + \omega^2 C_{gs}^2}, \\ L_s(\omega) &= \frac{C_{gs}(g_m R - 1)}{g_m^2 + \omega^2 C_{gs}^2}. \end{aligned} \tag{4.7}$$

Note that we have neglected $C_{gd}$, $C_{sb}$, and other second-order effects in the derivation of Eq. (4.7). Observe that $g_m R > 1$ is required to ensure that the network is inductive. Also, $R_s(0) = 1/g_m$ and $R_s(\infty) = R$. The series resistance of the active inductor is largely set by $g_m$, whereas the equivalent inductance $L_s$ can be tuned by varying R. The upper bound of the output voltage of the multiplexer with the active inductor is determined from

$$V_{o,max} = V_{DD} - |V_{Tp}|. \tag{4.8}$$

The lower bound of the output voltage is calculated from

$$\frac{V_{o,min}}{3R_n} = \frac{1}{2}\mu_n C'_{ox} \left(\frac{W}{L}\right)_{s} \left[V_{DD} - V_{o,min} - V_{Tn}\right]^2, \tag{4.9}$$

where $(W/L)_s$ is the aspect ratio of the active inductor transistor, $V_{Tn}$ is the threshold voltage of the nMOS transistor, and $\mu_n$ is the surface mobility of free electrons.
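The frequency dependence of Eq. (4.7) is easy to inspect numerically. The sketch below evaluates $R_s(\omega)$ and $L_s(\omega)$ for one set of device parameters; the values of `g_m`, `c_gs`, and `r` are hypothetical and chosen only so that $g_m R > 1$, and the function name is illustrative.

```python
import math

def active_inductor(freq_hz, g_m, c_gs, r):
    """Equivalent series resistance (ohm) and inductance (H) per Eq. (4.7)."""
    w = 2.0 * math.pi * freq_hz
    denom = g_m ** 2 + (w * c_gs) ** 2
    r_s = (g_m + (w * c_gs) ** 2 * r) / denom
    l_s = c_gs * (g_m * r - 1.0) / denom
    return r_s, l_s

# Hypothetical device values: g_m = 2 mS, C_gs = 20 fF, R = 2 kOhm (g_m*R = 4).
G_M, C_GS, R = 2e-3, 20e-15, 2e3
for f in (0.0, 1e9, 5e9, 20e9):
    r_s, l_s = active_inductor(f, G_M, C_GS, R)
    print(f"{f / 1e9:5.1f} GHz: R_s = {r_s:7.1f} ohm, L_s = {l_s * 1e9:6.2f} nH")
```

The low-frequency resistance equals $1/g_m$ and the high-frequency limit approaches $R$, as stated above, while choosing $R < 1/g_m$ makes the computed $L_s$ negative, that is, the network is no longer inductive.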
#### 4.2.3 The Pre-amplifier and Driver

The differential output voltage of the multiplexer is amplified by the downstream class AB pre-amplifier, as shown in Fig. 4.6. To avoid the speed penalty arising from the complete turn-on/off of the transistors of the N-differential and P-differential pairs of the class AB amplifier, the swing of the input voltage of the pre-amplifier, which is the output of the preceding multiplexer, is kept small. Together with the proper biasing of the differential pairs, this ensures that the transistors of the differential pairs are always in saturation. To compensate for the large capacitance encountered at the input nodes of the drivers, resulting from the large size of the output stage, four self-biased active inductors consisting of $M_{ak}$ and $R_{ak}$, k = 1, 2, 3, 4, are employed at these nodes to reduce the rise and fall times of the output voltage of the pre-amplifier. In what follows we analyze the DC and AC operations of the pre-amplifier and drivers.

Figure 4.6: 2-PAM serial link transmitter driver with pre-emphasis.

#### DC Operation

In the DC steady state, the differential output voltage of the preceding multiplexer is zero and only a common-mode DC voltage $V_{DC}$ is present at the output of the multiplexer. In this case, the output current of the transmitter is zero. Because the series resistance of the self-biased active inductor is given by $1/g_m$, as per Eq. (4.7), the voltages at the output of the pre-amplifier are determined from

$$\begin{aligned} V_P, \overline{V}_P &= V_{DD} - \frac{I_{1,2}}{g_{ma1,2}}, \\ V_N, \overline{V}_N &= \frac{I_{3,4}}{g_{ma3,4}}, \end{aligned} \tag{4.10}$$

where $I_{1,2} = \frac{J_n}{2}$, $I_{3,4} = \frac{J_p}{2}$, $J_n$ and $J_p$ are the biasing currents of the N-differential pair and P-differential pair of the pre-amplifier, respectively, and $g_{ma}$ is the transconductance of the active inductor transistors.
To ensure that the input transistors of the differential pairs remain in saturation at all times, using the pinch-off condition, one can show that

$$V_N - |V_{Tp}| < V_o^+, V_o^- < V_P + V_{Tn} \tag{4.11}$$

is required. Further, to ensure that the output current of the transmitter is zero in the DC steady state, we impose $I_{DP} = I_{DN}$. Making use of Eq. (4.10), we arrive at

$$\mu_p \left(\frac{W}{L}\right)_{DP} \left(\frac{I_{1,2}}{g_{ma1,2}}\right)^2 = \mu_n \left(\frac{W}{L}\right)_{DN} \left(\frac{I_{3,4}}{g_{ma3,4}}\right)^2. \tag{4.12}$$

Eq. (4.12) is the guiding equation in sizing the transistors of both the pre-amplifier and the driver.

#### **Transient Operation**

When an input branch of the multiplexer is activated, a differential voltage $\Delta V$ is generated at the output of the multiplexer. The voltages of the two output nodes of the multiplexer become

$$\begin{aligned} V_o &= V_{DC} + \frac{\Delta V}{2}, \\ \overline{V}_o &= V_{DC} - \frac{\Delta V}{2}. \end{aligned} \tag{4.13}$$

The corresponding output voltages of the pre-amplifier are given by

$$\begin{aligned} \Delta V_P = -\Delta \overline{V}_P &= -R_{s,n} g_{m,N} \left(\frac{\Delta V}{2}\right), \\ \Delta V_N = -\Delta \overline{V}_N &= -R_{s,p} g_{m,P} \left(\frac{\Delta V}{2}\right), \end{aligned} \tag{4.14}$$

where $R_{s,n}$ and $R_{s,p}$ are the series resistances of the nMOS and pMOS active inductors, respectively, and $g_{m,N}$ and $g_{m,P}$ are the transconductances of the nMOS and pMOS transistors of the differential pairs, respectively. Note that $R_{s,n} = 1/g_{ma3,4}$ and $R_{s,p} = 1/g_{ma1,2}$ at low frequencies. The corresponding currents of the transistors in the output stage are obtained from

$$\begin{aligned} \Delta I_{DP} = -\Delta \overline{I}_{DP} &= R_{s,n} g_{m,N} g_{m,DP} \frac{\Delta V}{2}, \\ \Delta I_{DN} = -\Delta \overline{I}_{DN} &= -R_{s,p} g_{m,P} g_{m,DN} \frac{\Delta V}{2}, \end{aligned} \tag{4.15}$$

where $g_{m,DN}$ and $g_{m,DP}$ are the transconductances of the nMOS and pMOS transistors of the output stage, respectively.
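The small-signal chain of Eqs. (4.14) and (4.15) can be evaluated numerically to see how the multiplexer swing maps to driver current. The sketch below is a low-frequency, linearised estimate for one half of the differential circuit; it assumes that the pMOS output device current rises when its gate voltage falls, and every parameter value in the example is a hypothetical placeholder.

```python
def output_stage_currents(delta_v, g_m_n, g_m_p, r_s_n, r_s_p, g_m_dp, g_m_dn):
    """Small-signal currents of the output devices for a multiplexer swing delta_v.

    Returns (dI_DP, dI_DN, dI_DP - dI_DN), all in amperes.
    """
    dv_p = -r_s_n * g_m_n * delta_v / 2.0  # Eq. (4.14), gate of the pMOS output device
    dv_n = -r_s_p * g_m_p * delta_v / 2.0  # Eq. (4.14), gate of the nMOS output device
    di_dp = -g_m_dp * dv_p                 # Eq. (4.15): pMOS current rises as its gate falls
    di_dn = g_m_dn * dv_n                  # Eq. (4.15): nMOS current falls as its gate falls
    return di_dp, di_dn, di_dp - di_dn

# Hypothetical small-signal parameters, for illustration only.
params = dict(g_m_n=1e-3, g_m_p=1e-3, r_s_n=500.0, r_s_p=500.0, g_m_dp=5e-3, g_m_dn=5e-3)
di_dp, di_dn, net = output_stage_currents(0.2, **params)
print(f"dI_DP = {di_dp * 1e3:.3f} mA, dI_DN = {di_dn * 1e3:.3f} mA, net = {net * 1e3:.3f} mA")
```

Because the model is linear, reversing the sign of the input swing reverses the sign of the net current, and doubling the swing doubles it.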
The currents conveyed to the channel are obtained from

$$\begin{aligned} \Delta I_{out}^{+} &= \Delta I_{DP} - \Delta I_{DN} = \left( R_{s,n} g_{m,N} g_{m,DP} + R_{s,p} g_{m,P} g_{m,DN} \right) \frac{\Delta V}{2}, \\ \Delta I_{out}^{-} &= \Delta \overline{I}_{DP} - \Delta \overline{I}_{DN} = -\left( R_{s,n} g_{m,N} g_{m,DP} + R_{s,p} g_{m,P} g_{m,DN} \right) \frac{\Delta V}{2}. \end{aligned} \tag{4.16}$$

The preceding analysis shows that the transmitter conveys two currents of the same amplitude but opposite polarities to the channel. The amplitude of the output current is directly proportional to the output voltage of the preceding multiplexer.

#### 4.2.4 The Pre-emphasis

To avoid the re-construction of the past symbols $V_i(n-k)$ using $D_i(n-k)$, i = 1, 2, ..., N and k = 1, 2, ..., M, in the digital domain, which is area- and power-hungry, $V_i(n-k)$ is generated in this design directly from $V_i(n)$ in the analog domain. In order to be able to tune the pre-emphasis coefficients, delay units are inserted between the output of the pre-amplifier and the pre-emphasis drivers, rather than placed at the output of the pre-emphasis drivers. As shown in Fig. 4.6, the inverter inserted between the delay unit and the pre-emphasis driver forms a de-emphasis filter and ensures that the output current is $J + aJ$ or $-J - aJ$ when a transition occurs, and $J - aJ$ or $-J + aJ$ when it does not, where $J$ / $-J$ is the output current from the current-symbol driver and $aJ$ / $-aJ$ is the output current from the pre-emphasis driver. As a result, the pre-emphasis amplitude is 2aJ, doubling the efficiency compared with conventional pre-emphasis schemes. $M_{CP}$ and $M_{CN}$ of the pre-emphasis drivers behave as voltage-controlled current sources that control the output current of the pre-emphasis drivers, whereas $M_{PP}$ and $M_{PN}$ are switches that steer the pre-emphasis currents. It should be noted that the complementary static inverter restores the output voltage of the pre-amplifier to full voltage swing.
As a result, the transistor size of the pre-emphasis driver can be greatly reduced while still generating the needed pre-emphasis currents. A key advantage is that the chip area and power consumption of the pre-emphasis blocks can be made significantly smaller, even when the number of pre-emphasis taps is large.

In what follows we use the examples of Fig. 4.7 and Fig. 4.8 to illustrate the operation of the proposed 2-PAM and 4-PAM pre-emphasis schemes.

Figure 4.7: 2-PAM signaling pre-emphasis analysis.

Figure 4.8: 4-PAM signaling pre-emphasis analysis.

In the circuits of Fig. 4.6, when $V_o: 1 \rightarrow 0$ and $\overline{V}_o: 0 \rightarrow 1$, we have $V_P: 0 \rightarrow 1$, $\overline{V}_P: 1 \rightarrow 0$, $V_N: 0 \rightarrow 1$, and $\overline{V}_N: 1 \rightarrow 0$. As a result, $M_{DP}$ and $\overline{M}_{DN}$ switch off, while $M_{DN}$ and $\overline{M}_{DP}$ switch on. The current conveyed to the channel by the current-symbol driver changes from J to -J. Because the input of the pre-emphasis block is delayed by one symbol time and then inverted, $M_{PP}$ and $\overline{M}_{PN}$ switch off while $M_{PN}$ and $\overline{M}_{PP}$ switch on, such that the current conveyed by the pre-emphasis drivers to the channel is -aJ, where a is the coefficient set by the control voltages of the pre-emphasis drivers. The net current conveyed to the channel is therefore -J - aJ.

Similarly, when $V_o: 0 \rightarrow 0$ and $\overline{V}_o: 1 \rightarrow 1$, we have $V_P: 1 \rightarrow 1$, $\overline{V}_P: 0 \rightarrow 0$, $V_N: 1 \rightarrow 1$, and $\overline{V}_N: 0 \rightarrow 0$.
As a result, the switches $M_{DN}$ and $\overline{M}_{DP}$ of the driver stage stay ON while $M_{DP}$ and $\overline{M}_{DN}$ stay OFF, so the driver stage conveys the current -J to the channel. In the pre-emphasis stage, the delayed and inverted signals switch on $M_{PP}$ and $\overline{M}_{PN}$ and switch off $M_{PN}$ and $\overline{M}_{PP}$, conveying the current +aJ to the channel. The total current to the channel is therefore -J + aJ. Fig. 4.9 shows the simulation result.

Figure 4.9: Output current of 2-PAM serial link transmitter with pre-emphasis.

## 4.2.5 4-PAM Current-Mode Pre-Emphasis Transmitter

The preceding pre-emphasized 2-PAM transmitter can be extended to a 4-PAM pre-emphasized transmitter, as shown in Fig. 4.10. Four-level signaling is achieved by making the transistors in the current-symbol driver and pre-emphasis driver of driver B twice the size of those of driver A.

Figure 4.10: Architecture of 4-PAM serial link transmitter.

Driver A generates the four output currents -J+aJ, -J-aJ, J+aJ, and J-aJ, while driver B provides the output currents 2(-J+aJ), -2(J+aJ), 2(J+aJ), and 2(J-aJ). By setting the input data as shown in Fig. 4.10, the sum of the output currents of the two drivers gives the output current of the 4-PAM transmitter. Extending the 2-PAM pre-emphasis analysis of Section 4.2.4, the 4-PAM pre-emphasis generates four values for each current level, as shown in Table 4.1.

Table 4.1: 4-PAM current-mode transmitter with pre-emphasis.

| Current level | Possible $I_{out}$ | | | |
|---------------|---------|--------|--------|---------|
| -3J level | -3J-3aJ | -3J+aJ | -3J-aJ | -3J+3aJ |
| -J level | -J-3aJ | -J+aJ | -J-aJ | -J+3aJ |
| J level | J-3aJ | J+aJ | J-aJ | J+3aJ |
| 3J level | 3J-3aJ | 3J+aJ | 3J-aJ | 3J+3aJ |

Here, an example in Fig. 4.7 [R] is given for the 4-PAM pre-emphasis. In symbol time 1, the current makes a transition from -3J to 3J. According to the previous analysis, the output currents of drivers A and B change from -J and -2J to J+aJ and 2(J+aJ), so that a total current of 3(J+aJ) is obtained for this largest transition (6J). In symbol time 2, the input signal does not change, so the currents of drivers A and B drop to J-aJ and 2(J-aJ), and the total output current stays at the level 3(J-aJ). In the same way, the total current in symbol time 3 is (2J-2aJ)+(-J-aJ)=J-3aJ. Following this analysis, one can easily reach the results shown in Fig. 4.8. The analysis is confirmed by the simulation result in Fig. 4.15 [L].

#### 4.2.6 Simulation Results

The proposed 4-PAM current-mode transmitter has been implemented in UMC's $0.13\mu$m 1.2 V CMOS technology. The circuit parameters of the transmitter are tabulated in Table 4.2. The circuits are analyzed using *Spectre* from Cadence Design Systems with BSIM3.3 device models that account for both the parasitics and the high-order effects of MOSFETs.

Fig. 4.11 plots the voltage of the multiplexing nodes of the multiplexer. It is seen that the voltage swing is approximately 0.3 V. This small voltage swing at the gates of the downstream differential pairs is not large enough to turn off their transistors. As a result, the differential pairs remain in saturation at all times.

Fig. 4.12 plots the voltage at the input nodes of the current-symbol drivers and the pre-emphasis drivers. It is seen that the voltage swing at the input of the pre-emphasis drivers is restored to the full supply voltage, whereas that at the input of the current-symbol drivers is only approximately 0.2 V. It is this large difference in voltage swing that enables the use of much smaller transistors in the pre-emphasis drivers to provide a sufficient pre-emphasis output current to the channel.
As a result, the chip area and power consumption of the pre-emphasis drivers can be reduced greatly. This is one of the key characteristics of the proposed pre-emphasis transmitter.

Table 4.2: Circuit parameters (L=0.13 $\mu m$ is used for all transistors).

| Blocks | Parameter | Driver-A | Driver-B |
|----------------|--------------------------------------|------------------------|------------------------|
| | $M_{01,\ldots,(N-1)3}$ | $2~\mu m$ | $2~\mu m$ |
| Multiplexer | $M_S$ | $4~\mu m$ | $4~\mu m$ |
| | $R_{in}$ | $8~\mathrm{k}\Omega$ | $8~\mathrm{k}\Omega$ |
| | $M_{a1,a2}$ | $5.5~\mu m$ | $5.5~\mu m$ |
| | $M_{a3,a4}$ | $10~\mu m$ | $10~\mu m$ |
| Pre-amplifier | $M_N$ | $6~\mu m$ | $6~\mu m$ |
| | $M_P$ | $12~\mu m$ | $12~\mu m$ |
| | $M_{c1,c2}$ | $4~\mu m$ | $4~\mu m$ |
| | $R_{a1,a2}$ | $1~\mathrm{k}\Omega$ | $1~\mathrm{k}\Omega$ |
| | $R_{a3,a4}$ | $1.5~\mathrm{k}\Omega$ | $1.5~\mathrm{k}\Omega$ |
| | $J_n$ | $0.6~\mathrm{mA}$ | $0.6~\mathrm{mA}$ |
| | $J_p$ | $1~\mathrm{mA}$ | $1~\mathrm{mA}$ |
| Current-symbol | $M_{DP}\ \overline{M}_{DP}$ | $20~\mu m$ | $40~\mu m$ |
| driver | $M_{DN}\ \overline{M}_{DN}$ | $10~\mu m$ | $20~\mu m$ |
| Pre-emphasis | $M_{PP,PN}\ \overline{M}_{PP,PN}$ | $2~\mu m$ | $2~\mu m$ |
| driver | $M_{CN}\ \overline{M}_{CN}$ | $1.5~\mu m$ | $3~\mu m$ |
| | $M_{CP}\ \overline{M}_{CP}$ | $3~\mu m$ | $6~\mu m$ |
| Delay block | $M_{DP,110}\ \overline{M}_{DP,110}$ | $12.5~\mu m$ | $12.5~\mu m$ |
| | $M_{DN,110}\ \overline{M}_{DN,110}$ | $5~\mu m$ | $5~\mu m$ |
| | $M_{dP,18}\ \overline{M}_{dP,18}$ | $5~\mu m$ | $5~\mu m$ |
| | $M_{dN,18}\ \overline{M}_{dN,18}$ | $20~\mu m$ | $20~\mu m$ |
| | J | $1.3~\mathrm{mA}$ | $1.3~\mathrm{mA}$ |

Figure 4.11: Voltage of multiplexing node of fully differential multiplexer.

Figure 4.12: Voltage of the critical nodes of pre-amplifier and driver. $V_{PN}, \overline{V}_{PN}, V_{PP}, \overline{V}_{PP}$ lag $V_N, \overline{V}_N, V_P, \overline{V}_P$ by $T_{sym}$ = 200 ps.

The 4-PAM output current of the transmitter without pre-emphasis is shown in Fig. 4.13 [L], together with the total current drawn from the supply. As shown in Fig. 4.13 [R], a large fluctuation of the supply current is seen when pre-emphasis is implemented. This fluctuation is caused by the delay block, which is built from a noisy inverter buffer chain. The problem can be avoided by building the delay block with the differential pairs of Fig. 4.6. Fig. 4.14 plots the 4-PAM output current together with the current drawn from the supply for this case. The trade-off of the differential-pair delay buffer is the high power consumed by the delay block.

Figure 4.13: 4-PAM transmitter output current and current drawn from $V_{DD}$. Left: without pre-emphasis. Right: with pre-emphasis and inverter buffer chains.

Figure 4.14: 4-PAM transmitter output current and current drawn from $V_{DD}$ with pre-emphasis and differential pair delay block.

The output current of the transmitter with pre-emphasis is plotted in Fig. 4.15 [L] with $V_{ctrln}=1.0~{\rm V}$ / $V_{ctrlp}=0.2~{\rm V}$. The four levels of the output current are evident. The spacing between adjacent output current levels is also approximately uniform, ensuring that the three eyes of the received 4-PAM data open uniformly. Fig. 4.15 [R] shows the variation of the output current of the transmitter with pre-emphasis for different pre-emphasis coefficients. The curves are obtained by varying $V_{ctrln}$ from 0.6 V to 1.0 V and $V_{ctrlp}$ from 0.6 V to 0.2 V in steps of 0.2 V and -0.2 V, respectively.

Fig. 4.16 shows the eye diagram of the current received at the far end of a 10 cm transmission line with and without pre-emphasis. The improvement in both the eye width and the eye height is evident.
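The delay-and-invert pre-emphasis rule and the level set of Table 4.1 can be checked with a short behavioural sketch. The following Python model is an illustration only and is not the circuit of Fig. 4.6: the function names, the mapping of bit 1 to a positive current, the single pre-emphasis tap, and the values J = 1 and a = 0.1 are all assumptions made for the example.

```python
from itertools import product


def _sign(bit):
    """Assumed mapping: bit 1 -> positive current, bit 0 -> negative current."""
    return 1 if bit else -1


def driver_current(curr_bit, prev_bit, weight, J=1.0, a=0.1):
    """Current of one driver pair (current-symbol driver + pre-emphasis driver).

    The current-symbol driver sends +/- weight*J for the present bit. The
    pre-emphasis driver sends the previous bit, delayed by one symbol time
    and inverted, scaled by the coefficient a.
    """
    main = weight * J * _sign(curr_bit)
    pre = -a * weight * J * _sign(prev_bit)
    return main + pre


def four_pam_current(curr, prev, J=1.0, a=0.1):
    """curr and prev are (bit_A, bit_B); driver B is twice the size of driver A."""
    return (driver_current(curr[0], prev[0], 1, J, a)
            + driver_current(curr[1], prev[1], 2, J, a))


def level_table(J=1.0, a=0.1):
    """For each nominal level, list the four currents reachable from any previous symbol."""
    table = {}
    for curr in product((0, 1), repeat=2):
        nominal = four_pam_current(curr, curr, J, 0.0)  # level without pre-emphasis
        values = sorted(round(four_pam_current(curr, prev, J, a), 9)
                        for prev in product((0, 1), repeat=2))
        table[nominal] = values
    return table


def transmit(symbols, J=1.0, a=0.1, initial=(0, 0)):
    """Output current for a sequence of (bit_A, bit_B) symbols."""
    out, prev = [], initial
    for curr in symbols:
        out.append(four_pam_current(curr, prev, J, a))
        prev = curr
    return out


if __name__ == "__main__":
    for level, values in sorted(level_table().items()):
        print(level, values)
    # Walk-through of Section 4.2.5: -3J -> 3J -> 3J -> J
    print([round(i, 3) for i in transmit([(1, 1), (1, 1), (0, 1)])])
```

With J = 1 and a = 0.1, the sequence -3J, 3J, 3J, J yields approximately 3.3, 2.7, and 0.7, that is 3(J+aJ), 3(J-aJ), and J-3aJ, in agreement with the symbol-by-symbol analysis of Section 4.2.5.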
The characteristics of the proposed 4-PAM pre-emphasis transmitter are summarized in Table 4.3.

Figure 4.15: Output current of 4-PAM transmitter. Left: with pre-emphasis ($V_{ctrl,n}=1.0~V$ and $V_{ctrl,p}=0.2~V$). Right: $V_{ctrl,n}$ varied from 0.6 V to 1.0 V and $V_{ctrl,p}$ varied from 0.6 V to 0.2 V in steps of 0.2 V.

Figure 4.16: Eye diagram of the received current after 10 cm FR-4 cable. Left: without pre-emphasis. Right: with pre-emphasis $(V_{ctrl,n}=0.8\ V,\ V_{ctrl,p}=0.4\ V)$.

Table 4.3: Performance of transmitter.

| Parameter | Value |
|-----------------------------|----------------------------------------|
| Technology | UMC $0.13\mu m$, 1.2 V CMOS |
| Multiplexing ratio | 5-to-1, 10-to-1 |
| Data rate of parallel input | 500 Mb/s |
| Data rate of serial output | 5 Gbps & 10 Gbps |
| Output current | 2/4-PAM tunable, 3.5 mA peak-to-peak |
| Power consumption | 57.6 mW with differential delay block, 19.2 mW with inverter buffer chain |
| Total transistor area | $26.845\mu m^2$ (excluding delay block) |
| 4-PAM eye width | 185 ps |
| 4-PAM eye height | 1.21 mA |

## 4.2.7 Transmitter Layout

The proposed 4-PAM transmitter with pre-emphasis has been implemented in TSMC 1.8 V $0.18\mu m$ technology. The layout is shown in Fig. 4.17. To minimize mismatches in the layout, all transistors wider than $5\mu m$ are implemented with multi-finger structures. Another advantage of the multi-finger structure is the reduced chip area.

As seen in Fig. 4.17, the DC biasing portion of the layout is surrounded by a guard ring, which ensures that the DC biasing is not disturbed by the digital portion of the chip. Metal layers M1 to M3 are used for local interconnections, while M4 to M6 are used for global interconnections, $V_{DD}$, and $V_{SS}$.

## 4.3 Summary

Pre-emphasis for high-speed serial link transmitters has been introduced in this chapter, and the existing pre-emphasis configurations have been reviewed.
An area-power efficient fully differential CMOS current-mode 2/4-PAM 5/10 Gbps serial link transmitter has been presented. Most reported pre-emphasis transmitters reconstruct the past symbols needed for pre-emphasis in the digital domain, which requires a large chip area and a high level of power consumption. To avoid this, the pre-emphasis of the proposed transmitter has been realized in the analog domain by employing a delay-inverter block for each pre-emphasis tap. The weight of each pre-emphasis tap can be tuned individually and independently of that of the current symbol. The multiplexing-at-input approach of the multiplexer, implemented using pseudo-nMOS logic with self-biased active inductors, ensures that the transistors of the multiplexing branches are minimum sized. A differential output current is obtained from a class AB pre-amplifier and push-pull configured current-symbol and pre-emphasis drivers.

The high-speed operation of the transmitter is achieved from the following key design techniques: 1) the small voltage swing of the critical nodes of the transmitter, in particular the multiplexing nodes of the multiplexer and the output of the pre-amplifier, guarantees that all transistors of the transmitter are biased and operated in saturation, avoiding the speed penalty of switching transistors on and off completely, 2) the use of active inductors at these critical nodes greatly reduces the rise and fall times of the voltages of these nodes, and 3) the effective utilization of the low characteristic impedance of the channels over which data are transmitted.

Figure 4.17: Layout of the proposed 4-PAM transmitter with pre-emphasis.

# Chapter 5

# Multi-phase Clock Generation

Multi-phase clock generation is critical in serial links. The quality of a clock is measured by its phase noise. This chapter details the design of a 1 GHz clock generator using a PLL. Section 5.1 introduces a typical PLL configuration.
Section 5.2 is devoted to the design of each block of the PLL. A new VCO delay cell with equal rise and fall times is proposed. Simulation results are presented at the end of this chapter.

## 5.1 Phase-Locked Loops

A PLL is a feedback system that compares the output phase with the input phase. PLLs have been widely used to generate on-chip clocks. They are categorized into two types [38]: 1) XOR/LPF type PLLs, and 2) charge-pump type PLLs, as shown in Fig. 5.1.

Figure 5.1: (a) XOR/LPF type PLL, (b) Charge-pump type PLL.

An XOR/LPF type PLL consists of a phase detector (PD), an LPF, and a VCO. The PD compares the phases of the reference clock and the output clock, generating an error signal that is converted to a DC voltage by the LPF. This voltage varies the VCO frequency until the phases are aligned and the loop is locked.

A charge-pump type PLL consists of a PFD, a charge pump (CP), an LPF, and a VCO. It senses the transitions at the input and output of the PLL, detects the phase or frequency difference, and activates the charge pump accordingly. When the output frequency is far from that of the reference clock, the PFD and the charge pump convert the difference to a current. The LPF integrates the phase errors and servos the VCO so that the steady-state phase error is driven towards zero. If there is no phase or frequency difference between the two clocks at the PFD inputs, the system is locked and the charge pump remains relatively idle. A detailed block diagram of the PLL used in this work is shown in Fig. 5.2.

Figure 5.2: PLL configuration.

## 5.1.1 Loop Dynamics

Fig. 5.3 shows a linear model of the charge-pump type PLL. The model gives the open-loop transfer function

$$\frac{\text{Output-clock}(s)}{\text{Reference-clock}(s)} = \frac{I_{cp}K_{VCO}}{2\pi s^2 C}, \tag{5.1}$$

where $I_{cp}$, $K_{VCO}$, and $C$ are the charge pump current, the VCO gain, and the loop filter capacitance, respectively.

Figure 5.3: Linear model of charge-pump type PLL.
The closed-loop transfer function, denoted by $H(s)$, is given by

$$H(s) = \frac{\frac{I_{cp}K_{VCO}}{2\pi C}}{s^2 + \frac{I_{cp}K_{VCO}}{2\pi C}}. \tag{5.2}$$

The two open-loop poles at the origin introduce a phase shift of -180 degrees, making the system unstable. To compensate for these poles, a resistor $R_f$ is placed in series with the loop filter capacitor to generate a zero. The open-loop transfer function then becomes

$$\frac{\text{Output-clock}(s)}{\text{Reference-clock}(s)} = \frac{I_{cp}K_{VCO}}{2\pi s} \left(R_f + \frac{1}{sC}\right). \tag{5.3}$$

Closing the loop around (5.3) gives $H(s) = \frac{I_{cp}K_{VCO}}{2\pi C}(1 + sR_fC) \Big/ \left(s^2 + \frac{I_{cp}K_{VCO}R_f}{2\pi}s + \frac{I_{cp}K_{VCO}}{2\pi C}\right)$, whose poles lie in the left half-plane for any $R_f > 0$. The system is therefore stabilized by the introduced zero.

The charge-pump type PLL has a drawback. Since the charge pump drives the series combination of $R_f$ and $C$, each time a current is injected into the loop filter, the control voltage experiences a large jump, even in the locked condition. Mismatches between the charging and discharging currents, as well as charge injection, also introduce voltage jumps in $V_{ctrl}$. The resulting ripples severely disturb the VCO, corrupting the output phase. One solution to this problem is to add another capacitor $C_p$ to reduce the ripples on $V_{ctrl}$ [39]. The loop filter is then a second-order system, yielding a third-order PLL and raising stability issues. Nonetheless, if the capacitance of the added capacitor $C_p$ is about one-fifth to one-tenth of $C$, the frequency response remains relatively unchanged.

## 5.2 Building Blocks

## 5.2.1 Phase-Frequency Detector

An ideal linear phase detector produces an output signal whose DC value is linearly proportional to the phase difference of its two input signals. Ideally, the input-output characteristic should be linear, non-periodic, and monotonic over a large range of phase differences. In reality, the gain $K_{PD}$ of the phase detector is not constant and may depend on either the amplitude or the duty cycle of the input signals. Another drawback of phase detectors is their periodic transfer characteristic, which implies that phase shifts of $2\pi$ cannot be distinguished.
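The periodic characteristic can be illustrated numerically. The following Python sketch is an idealized model, not the detector of this work: it assumes two 50% duty-cycle square waves and an ideal XOR gate, and the function name and sample count are chosen only for the example. It computes the average XOR output as a function of the phase offset.

```python
import math


def xor_pd_average(phase, samples=10000):
    """Average output of an ideal XOR phase detector over one period.

    Both inputs are modeled as 50% duty-cycle square waves; the second one
    is shifted by `phase` radians with respect to the first.
    """
    high = 0
    for k in range(samples):
        # Sample at mid-points so that no sample falls on a clock edge.
        theta = 2 * math.pi * (k + 0.5) / samples
        a = math.sin(theta) >= 0
        b = math.sin(theta - phase) >= 0
        high += a != b
    return high / samples


if __name__ == "__main__":
    for phi in (math.pi / 4, -math.pi / 4, math.pi / 2, math.pi / 2 + 2 * math.pi):
        print(round(phi, 4), round(xor_pd_average(phi), 4))
```

In this model, offsets of $\pi/2$ and $\pi/2 + 2\pi$ give the same average output, and so do offsets of $+\pi/4$ and $-\pi/4$. The detector output alone therefore cannot tell these cases apart.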
A solution to avoid false locking is the phase-frequency detector (PFD), which senses both the phase and the frequency differences [37]. A typical diagram of a PFD is shown in Fig. 5.4.

Figure 5.4: Phase-frequency detector diagram.

The PFD functions as follows:

1. $\omega_A > \omega_B$: the PFD produces positive pulses at $Q_A$, while $Q_B$ remains at zero.
2. $\omega_A < \omega_B$: the PFD produces positive pulses at $Q_B$, while $Q_A$ remains at zero.
3. $\omega_A = \omega_B$: the PFD produces positive pulses at $Q_A$ or $Q_B$ with a width proportional to the phase difference between the two input signals.

In this work, a D-FlipFlop type PFD is selected for its wide acquisition range, fast locking, and constant gain over a phase range of $\pm 2\pi$. The implementation of the PFD is shown in Fig. 5.5, where $\overline{V}_{out}$ is the output voltage of the PFD.

Figure 5.5: D-FlipFlop PFD clock diagram and characteristic.

The PFD works as follows. Starting in state 0 ($Q_A=Q_B=0$), a transition of A causes $Q_A$ to go HIGH; further transitions of A have no effect on $Q_A$. A transition of B then causes $Q_B$ to go HIGH, activating the AND gate and resetting both flip-flops. It is important to note the non-periodic behavior of the transfer characteristic of the PFD shown in Fig. 5.5. In this work, input A is the reference clock, which runs at 62.5 MHz and comes from an off-chip oscillator. Input B is the clock from the VCO. The simulation result of the D-FlipFlop type PFD used in this work is shown in Fig. 5.6.

Figure 5.6: Simulation result of DFF phase-frequency detector.

## 5.2.2 Charge Pump

A charge pump consists of two switched current sources driving the loop filter, as shown in Fig. 5.7. The UP and DN signals are non-overlapping to minimize the DC power consumption. Ideal switching is realized by a three-state phase-frequency detector. The states of the switches are as follows:

1. UP=1/DN=0: the charge pump enters the charge mode.
2. UP=0/DN=1: the charge pump enters the discharge mode.
3. UP=0/DN=0: this is the lock state; the voltage across the load capacitor remains unchanged.

Figure 5.7: Block diagram of charge pump.

Device mismatches and current leakage result in a phase error that is translated into timing jitter at the VCO output. These errors can be categorized as follows:

1. Current mismatch: the current sources $I_1$ and $I_2$ in the CP are essentially transistors biased in the saturation region. Non-idealities in the current mirrors cause current mismatches, which are converted to a ripple voltage on the control line and then into a phase error.
2. Charge injection: charge injection occurs when the UP/DN switches move from an ON state to an OFF state. When a MOS switch is ON, it operates in the triode region and holds mobile charges in its channel. When it turns OFF, this charge flows out of the channel into the drain and source. The loop filter is thus charged or discharged, causing a variation in $V_{ctrl}$.
3. Charge sharing: charge sharing occurs when the switches in a CP move from an OFF state to an ON state. As shown in Fig. 5.7, nodes X and Y are the drains of the charging and discharging current sources. When the switches are OFF, the voltages at nodes X and Y are $V_{dd}$ and $V_{ss}$, respectively, and the output node $V_{ctrl}$ is floating. When the switches are turned ON, the voltage at node X decreases while that at node Y increases, resulting in charge sharing between the loop filter capacitor $C$, $C_X$, and $C_Y$. As a result, a variation in $V_{ctrl}$ is seen.

The charge pump design in this work is shown in Fig. 5.8.
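The three-state behaviour of the D-FlipFlop PFD and the charge and discharge modes of the charge pump can be summarized in a small behavioural sketch. The Python model below is an idealization under stated assumptions: zero reset delay, UP driven by $Q_A$ and DN driven by $Q_B$, equal charging and discharging currents, and no mismatch, charge injection, or charge sharing. The function name and the edge times in nanoseconds are hypothetical.

```python
def pfd_charge(edges_a, edges_b, i_cp=1.0):
    """Net charge delivered to the loop filter by an ideal PFD and charge pump.

    edges_a, edges_b: rising-edge times of input A (reference) and input B (VCO).
    Returns i_cp * (total UP time - total DN time).
    """
    events = sorted([(t, "A") for t in edges_a] + [(t, "B") for t in edges_b])
    qa = qb = False          # state 0: Q_A = Q_B = 0
    last_t = None
    charge = 0.0
    for t, src in events:
        if last_t is not None:
            # Integrate the pump current over the interval since the last event.
            charge += i_cp * (t - last_t) * (int(qa) - int(qb))
        last_t = t
        if src == "A":
            qa = True        # a transition of A sets Q_A (UP)
        else:
            qb = True        # a transition of B sets Q_B (DN)
        if qa and qb:
            qa = qb = False  # the AND gate resets both flip-flops
    return charge


if __name__ == "__main__":
    ref = [0.0, 16.0, 32.0]            # 62.5 MHz reference: 16 ns period
    vco_late = [2.0, 18.0, 34.0]       # divided VCO clock lagging by 2 ns
    vco_early = [-2.0, 14.0, 30.0]     # divided VCO clock leading by 2 ns
    print(pfd_charge(ref, vco_late))   # positive: filter is charged
    print(pfd_charge(ref, vco_early))  # negative: filter is discharged
    print(pfd_charge(ref, ref))        # zero: lock state
```

In this model the net charge is positive when B lags A, negative when B leads A, and zero when the edges coincide, which corresponds to the charge, discharge, and lock states listed above.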
When the phase of the VCO output clock lags that of the reference clock, UP is ON and DN is OFF. The charging current source then charges the loop filter and raises $V_{ctrl}$, so that the VCO frequency increases and catches up with the reference clock. By contrast, if the phase leads that of the reference clock, DN is ON and UP is OFF. The discharging current source then discharges the loop filter and slows down the VCO until its phase matches that of the reference clock.

Figure 5.8: Charge pump implementation. $\overline{M}_{1,2}=8~\mu\text{m}$, $\overline{M}_{3,4}=15~\mu\text{m}$, $\overline{M}_{5,6}=40~\mu\text{m}$; $L=0.18~\mu\text{m}$ is used for all transistors.

#### 5.2.3 Voltage-Controlled Oscillator

The VCO is a key element of the PLL, and timing jitter is a critical figure of merit quantifying the performance of ring VCOs. The timing jitter of saturated ring oscillators mainly comes from the device and switching noise injected at the threshold crossings [46] and is estimated from

$$\overline{\Delta \tau^2} = \frac{\overline{v_n^2}}{(dV/dt)^2},\tag{5.4}$$

where $\overline{\Delta \tau^2}$ is the timing jitter, $\overline{v_n^2}$ is the power of the noise injected at the threshold crossings, and $dV/dt$ is the slew rate of the signal at the threshold crossings. It was shown in [47, 48] that the waveform symmetry of saturated ring oscillators, specifically the DC component of the impulse sensitivity function of the waveform, affects the flicker-noise-induced phase noise of the oscillators.
The single-sideband phase noise, denoted by $\mathcal{L}(\Delta\omega)$, where $\Delta\omega$ is the frequency offset from the oscillation frequency, due to the flicker noise source $\overline{i^2}_{n,1/f} = \overline{i_n^2}\left(\frac{\omega_{1/f}}{\Delta\omega}\right)$, where $\omega_{1/f}$ is the corner frequency of the flicker noise source, is given by

$$\mathcal{L}(\Delta\omega) = 10\log\left[\frac{c_o^2}{q_{max}^2} \frac{\overline{i_n^2}/\Delta f}{4(\Delta\omega)^2} \frac{\omega_{1/f}}{\Delta\omega}\right],\tag{5.5}$$

where $q_{max}$ is the maximum charge displacement of the node at which $\mathcal{L}(\Delta\omega)$ is measured, and $c_o$ is the DC component of the Fourier series coefficients of the impulse sensitivity function (ISF) of the oscillator at that node, denoted by $\Gamma(\omega_o\tau)$, where $\omega_o$ is the oscillation frequency. Eq. (5.5) reveals that the 1/f-induced phase noise can be lowered by reducing $c_o$. If we approximate the ISF of node $i$ as

$$\Gamma_i(\omega_o \tau) \approx \frac{f_i'(\omega_o \tau)}{f_{i,max}'^2}, \tag{5.6}$$

where $f_i(\omega_o \tau)$ is the waveform of the oscillator at the node and $f'_{i,max}$ is the maximum first-order derivative of the waveform, then to have $c_o = 0$, the first-order derivatives, i.e. the slopes, of the waveform of the oscillator in the rise and fall transition regions must have equal amplitudes but opposite polarities.

To minimize the phase noise of CMOS ring VCOs, the delay cell of the VCOs must be designed with the following characteristics: 1) the delay of the output voltage of the delay cells should be insensitive to supply voltage fluctuation and ground bouncing, for which a differential configuration of the VCO delay cells is usually mandatory, 2) the output voltage of delay cells should be symmetrical, i.e.
the rise and fall times should be the same, and 3) the output voltage of delay cells should have fast rising and falling edges to minimize the state transition time windows during which device and switching noise is converted into timing jitter.

Based on the above analysis, a new fully differential ring VCO, shown in Fig. 5.9, is proposed in this work. The positive latch and active inductors ensure that the state transition duration is minimized, reducing the device and switching noise converted into timing jitter during state transitions. The delay is controlled by adjusting the charging and discharging processes of the load. The channel resistance of $M_{9\sim10}$ is controlled by the control voltage $V_c$, and $M_{11\sim12}$ implement the capacitors that store the charge. When the drain-source-coupled transistors $M_{11\sim12}$ see a large resistance, the gate voltage of $M_{1\sim 2}$ rises quickly and a high-frequency clock is obtained at $V_O$. By contrast, when a smaller resistance is seen, a lower-frequency clock is obtained. In both the charging and discharging processes, the time constants are controlled by the added control network, achieving a symmetrical waveform. This is demonstrated in Fig. 5.11 (Left) and Fig. 5.11 (Right), where the rise and fall times of the output voltage of the proposed ring VCO are compared with those of the cross-coupled ring VCO with active load of [40, 41], shown in Fig. 5.10.

Figure 5.9: Proposed ring VCO delay cell. $\overline{M}_{1,2}=10~\mu\text{m}$, $\overline{M}_{3,4}=15~\mu\text{m}$, $\overline{M}_{5,6}=5~\mu\text{m}$, $\overline{M}_{7,8}=25~\mu\text{m}$, $\overline{M}_{9,10}=15~\mu\text{m}$, $\overline{M}_{11,12}=30~\mu\text{m}$; $L=0.18~\mu\text{m}$ is used for all transistors.

Figure 5.10: Cross-coupled ring VCO with active inductor loads in [40, 41].
$\overline{M}_{1,2}=10~\mu\text{m}$, $\overline{M}_{3,4}=5~\mu\text{m}$, $\overline{M}_{5,6}=15~\mu\text{m}$; $L=0.18~\mu\text{m}$ is used for all transistors.

It is observed that the rise and fall times of the proposed ring VCO are nearly independent of the control voltage, whereas those of the cross-coupled ring VCO with active load are strong functions of the control voltage.

Figure 5.11: Comparison of the rise and fall times of the proposed ring VCO and the cross-coupled ring VCO with active load [40, 41].

The improved waveform symmetry of the proposed ring VCO is further evident in the time-domain response of the output voltage shown in Fig. 5.12.

Figure 5.12: Output voltage waveform of proposed VCO with a single delay loop.

Because the proposed ring VCO bears a strong resemblance to the preceding cross-coupled ring VCO with active inductor load, the two offer comparable oscillation frequencies. For this design, the required on-chip clock frequency is relaxed to 1 GHz by the 4-PAM transmitter. As can be seen in Fig. 5.13, the frequency tuning range of the proposed ring VCO is much larger than that of the preceding cross-coupled ring VCO with active inductor load in [40, 41].

Figure 5.13: Frequency tuning range of 5-stage proposed ring VCO, $W=10~\mu\mathrm{m}$ for the control pMOS transistor.

### 5.2.4 Frequency Divider

A frequency divider takes a periodic input signal and generates a periodic output signal whose frequency is a fraction of that of the input signal. Many designs have been reported for both analog and digital applications [42, 43]. A different approach is to perform the frequency division in the analog/digital domain by using a chain of inverters [44]. This work uses the design from [45]. One primary advantage of this design is that it works in the gigahertz range with a simple configuration and a small chip area. The block diagram and the timing diagram are shown in Fig.
5.14. Two divide-by-four blocks are connected in series to realize a divide-by-sixteen function. The simulation result is shown in Fig. 5.15. The 1 GHz clock from the VCO is divided by sixteen, and this low-frequency clock is then fed to the phase-frequency detector to be compared with an off-chip reference clock running at 62.5 MHz. The difference is converted to a voltage that tunes the VCO clock frequency until lock is acquired.

Figure 5.14: Divide-by-four frequency divider. (a) Block diagram, (b) Timing diagram. The widths of the NMOS and PMOS transistors in the inverters are 2.5 $\mu$m and 5 $\mu$m, respectively; the width of all transmission gate transistors is 4 $\mu$m; a length of 0.18 $\mu$m is used for all transistors.

Figure 5.15: Simulation result of PFD.

#### 5.2.5 Simulation Result

The system-level simulation result is shown in Fig. 5.16. It can be seen that the system acquires lock at around 300 ns and maintains a stable VCO control voltage in the locked condition.

## 5.3 Summary

This chapter has addressed the design issues of generating multi-phase clocks for the transmitter and has presented the design of the building blocks of a charge-pump type PLL: the PFD, the charge pump, the VCO, and the frequency divider. System-level and block-level simulation results have been presented. Mismatches and non-linearities in the PFD and CP designs have been introduced. The design follows the principles of minimizing noise and using a small number of transistors to save power and chip area. The emphasis of this chapter is on the design of the VCO block, where ten evenly spaced clock phases are generated by tapping the five-stage differential ring VCO. To reduce the jitter of the clock, the VCO delay elements are built using a fully differential configuration with active inductor loads and latches. To minimize the jitter caused by unequal rising and falling edges, a delay cell with a tunable delay between the stages was proposed.
Another advantage of the proposed VCO delay cell is its large and linear tuning range.

Figure 5.16: PLL control voltage, reference clock and VCO output clock.

# Chapter 6

## Conclusions

### 6.1 Conclusions

This work deals with the design of 10 Gbps CMOS serial link transmitters over copper channels. An in-depth study of the limitations of serial links and of the related design techniques has been presented. A new 2-PAM $V_{DD}$-insensitive transmitter, two new 4-PAM current-mode serial link transmitters for 10 Gbps signaling systems, and a new pre-emphasis scheme have been proposed. These designs have the following characteristics:

1. Multiplexing-at-input: this approach minimizes the input transistor size, the parasitic capacitance at the multiplexing nodes, and the power consumption. A smaller time constant at the multiplexing nodes leads to a higher signaling speed.
2. Bandwidth extension with shunt-peaking inductors: inductors are used to compensate for the large capacitive loads at the multiplexing nodes and at some critical nodes in the driver to further increase the speed of the transmitter.
3. New area-power efficient pre-emphasis scheme: a new pre-emphasis scheme is proposed. This scheme considerably reduces the complexity of implementing a 4-PAM multi-tap FIR filter at the transmitter, such that a smaller chip area and a low level of power consumption are achieved.
4. Active inductors: the active inductors used for bandwidth improvement greatly reduce the chip size and power consumption, while being easy to implement.
5. 4-PAM signaling scheme: the 4-PAM signaling scheme reduces the signal bandwidth and the BER at the receiver end, and relaxes the on-chip clock frequency by a factor of two. The maximum on-chip data rate in a given technology is doubled.
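The factor of two in item 5 follows from the number of bits carried per symbol. The short Python sketch below only restates this arithmetic for the 5 Gbps 2-PAM and 10 Gbps 4-PAM modes; the function names are chosen for the example.

```python
import math


def symbol_rate(bit_rate_bps, levels):
    """Symbol rate in symbols per second for an M-level PAM signal."""
    bits_per_symbol = math.log2(levels)
    return bit_rate_bps / bits_per_symbol


def symbol_time_ps(bit_rate_bps, levels):
    """Symbol time in picoseconds."""
    return 1e12 / symbol_rate(bit_rate_bps, levels)


if __name__ == "__main__":
    # 2-PAM at 5 Gbps and 4-PAM at 10 Gbps use the same symbol rate.
    print(symbol_rate(5e9, 2), symbol_time_ps(5e9, 2))
    print(symbol_rate(10e9, 4), symbol_time_ps(10e9, 4))
```

Both modes run at 5 Gsymbol/s with a symbol time of 200 ps, so the data rate doubles without raising the symbol rate or the on-chip clock frequency.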
With the application of the techniques discussed above, the simulation results confirm that the proposed 2/4-PAM 5/10 Gbps pre-emphasis transmitter is capable of delivering a differential 3.5 mA peak-to-peak output current to the channels. It consumes 57.6 mW of DC power with differential delay blocks, or 19.2 mW of DC power with inverter buffer chains. The total transistor area of the transmitter is 26.845 $\mu m^2$ excluding the delay block. The current received at the far end of a 10 cm FR-4 microstrip has a 4-PAM eye width of 185 ps and a 4-PAM eye height of 1.21 mA.

## 6.2 Future Work

The work of this thesis can be extended in the following directions.

The first direction is the architecture of serial links. The use of pre-emphasis requires information about the channel to set the pre-emphasis coefficients. A "smart" receiver will probably need to feed this information back to the transmitter through a separate channel to "train" the transmitter pre-emphasis. How to design a serial link that performs such a function is an interesting direction.

The second direction, on circuit design, is dealing with transistor mismatches. How to perform offset correction in the receiver with very little performance overhead and at a very short cycle time could be another direction.

The third direction is noise generation and transmission in serial links. Noise generally comes from fluctuations of the power rails, clock jitter of the PLL, and device noise. Studies on the theory and modelling of noise generation and transmission will greatly benefit designers by providing design guidelines.

The fourth direction is the application of current-mode design techniques. Current-mode circuits use currents to represent signals with a small voltage swing, which is a primary advantage over voltage-mode circuits for high-speed data transmission. Applying this technique to the design of digital or mixed-signal blocks, such as ADCs, DACs, multiplexers, CCOs, and filters, can yield many novel circuit topologies.
## Glossary

| Abbreviation | Meaning |
|---|---|
| CMOS | Complementary Metal Oxide Semiconductor |
| IC | Integrated Circuit |
| PCB | Printed Circuit Board |
| CDR | Clock and Data Recovery |
| EMI | Electro-Magnetic Interference |
| SNR | Signal-to-Noise Ratio |
| BER | Bit Error Rate |
| LVDS | Low-Voltage Differential Signaling |
| ISI | Inter-Symbol Interference |
| SOC | System-On-Chip |
| Gbps | Giga Bits Per Second |
| Mbps | Mega Bits Per Second |
| PAM | Pulse Amplitude Modulation |
| ADC | Analog-to-Digital Converter |
| DAC | Digital-to-Analog Converter |
| FIR | Finite Impulse Response |
| NRZ | Non-Return to Zero |
| DC | Direct Current |
| AC | Alternating Current |
| PLL | Phase-Locked Loop |
| DLL | Delay-Locked Loop |
| VCO | Voltage Controlled Oscillator |
| CCO | Current Controlled Oscillator |
| FR4 | Flame Retardant 4 |
| LAN | Local Area Network |
| WAN | Wide Area Network |
| MUX | Multiplexer |
| BSIM | Berkeley Short-channel IGFET Model |
| ISF | Impulse Sensitivity Function |
| TSMC | Taiwan Semiconductor Manufacturing Company |
| UMC | United Microelectronics Corporation |