# Staggered Twisted-Bundle Interconnect for Crosstalk and Delay Reduction

Hao Yu and Lei He Electrical Engineering Department University of California, Los Angeles 90095 {hy255, lhe}@ee.ucla.edu

Abstract-To achieve small delay and low crosstalk for multiple signal nets with capacitive and inductive coupling, we propose in this paper a novel interconnect structure, staggered twisted-bundle wires where groups of twisted wires are staggered. This new structure is different from the previously proposed twisted bundle wires with one group of twisted wires and another group of normal wires. Using accurate circuit models and efficient algorithms to find the worst case noise and delay for comprehensive combinations of signal patterns and a range of arrival times, we assume signal and shielding ratio over 1:1 for area reduction and compare the aforementioned two structures to coplanar shielding for signal nets. The staggered twisted-bundle has the smallest worst case delay, up to 20% and 5% smaller than the coplanar shielding and twisted bundle, respectively. The staggered twisted bundle also has the smallest worst case noise, up to 6% and 12% less than coplanar shielding and twisted bundle. Furthermore, the staggered twisted bundle has the smallest delay/noise variation between signal nets. We conclude that without increasing routing area, the staggered twisted bundle is better than the twisted bundle and coplanar shielding in terms of performance and noise.

## I. INTRODUCTION

Continuous advancements in very large scale integrated circuits have resulted in smaller feature sizes (nanometer scale) and faster operating frequencies (gigahertz range). This makes it difficult to ensure the integrity of signals traveling in the global interconnects that connect tens to hundreds of modules on a chip. The crosstalk between signals becomes more severe with the increased signal slew rate and the decreased noise margin of logic devices. Existing work primarily studied the crosstalk noise caused by capacitive coupling. Gao et al. [1] presented an ILP formulation of track permutation for capacitive crosstalk reduction. Kahng et al. [2] placed each repeater in the middle of its neighboring repeaters to reduce the switching factor to 1. Gupta et al. [3] used wire swizzling to change the signal arrival time in order to reduce crosstalk-induced delay uncertainties. These design techniques all explicitly consider only capacitive coupling, a short-range effect.

Inductance becomes non-negligible when the slew rate of the switching signal is sharp yet the interconnect resistive loss is not large [4]. Moreover, because inductive coupling is a long-range effect, it creates two further difficulties: (1) it is not easy to determine the path of the return current; and (2) the crosstalk introduces strong spatial correlation between two signals, whether they are adjacent or far apart. The following work considers minimizing crosstalk introduced by inductive coupling. Massoud et al. [5] applied inter-digitized coplanar shielding to minimize self-inductance, He et al. [6] proposed simultaneous shield insertion and net ordering, and Kaul et al. [7] introduced active shields by applying complementary signals on shields. Zhong et al. [8] applied a twisted-bundle technique to reduce the mutual inductance between the twisted and normal groups, and Deng et al. [9] further developed an optimal algorithm to find the minimum number of twists for the differential twisted bus architecture [10], considering obstacles introduced by ECOs (engineering change orders). Coplanar shielding in Fig. 1 (a) is most effective at reducing crosstalk, since the capacitive coupling is shielded and the inductance has a dedicated return path. However, it increases the capacitance and delay. Moreover, this method becomes area-consuming when assigning one shield for

This research is partially supported by NSF CAREER Award CCR-0401682, SRC grant 1100, and a UC MICRO grant sponsored by Analog Devices, Fujitsu, Intel and LSI Logic. Address comments to lhe@ee.ucla.edu.



Fig. 1. The wire diagram for 6 signal nets with signal/shield ratio 3:1, where dark lines indicate signal nets and gray lines indicate shield nets: (a) coplanar shielding; (b) twisted bundle; and (c) staggered twisted bundle.

each signal net. To reduce area, several signal nets can share one common local ground (shield). In this case, there is a significant delay variation among signal nets as observed in the on-chip measurement [11] since each signal net has a different current-loop configuration. As to the twisted bundle [8] in Fig. 1 (b), there are two groups of wires: the twisted group and normal group. In the twisted group, the polarity of the inductance current-loop, composed by the signal net and its local ground (shield net), is symmetrically changed such that the mutual inductive coupling between the normal and twisted groups is significantly reduced. As stated in [8], the normal group is required to obtain such inductive coupling reduction. However, because of the existence of normal groups, the capacitive coupling between the twisted and normal groups and inside the normal group is still large. Moreover, the delay variation between different nets is non-negligible as well because each signal net has a different loop inductance and capacitive coupling length. In contrast to the twisted bundle with both the twisted and normal groups [8], we propose in this paper a novel interconnect structure staggered twisted bundle by staggering adjacent twisted groups without using normal groups (See Fig. 1 (c)). Compared to the twisted bundle, the new structure preserves the minimal inductive coupling yet further reduces the capacitive coupling. Therefore, both the delay and crosstalk noise are reduced in the staggered twisted bundle. Furthermore, shields are uniformly distributed for all signal wires such that the delay and noise variations among signal nets are also reduced. In this paper, we also present how to synthesize the staggered twisting pattern for the desired signal/shield ratio. Moreover, an accurate PEEC model with model reduction is used to quantitatively analyze the worst case noise and delay for comprehensive combinations of signal patterns and a range of arrival times. 
The impacts of different signal/shield ratios and staggering numbers are also studied. The experiments show that for 18 signal nets, the staggered twisted bundle (STWB), compared with coplanar shielding (COPS) and the twisted bundle (TWB), achieves:

- 1) the smallest worst case delay (up to 20% less than COPS and 5% less than TWB);
- 2) the smallest worst case noise when signal/shielding ratio is over 1:1 (up to 6% less than COPS and 12% less than TWB), and a similar worst case noise compared to the co-planar shielding when signal/shielding ratio is 1:1 (in this case, STWB and COPS reduce noise by 11% compared to TWB);
- 3) the smallest variation in worst case delay (up to 26% less than COPS and 19% less than TWB) and worst case noise (up to 17% less than COPS and 28% less than TWB).

Note that twisting wires will introduce additional vias t





Fig. 2. Wire diagram of twisted pair with N stage twists: two signal nets (aggressor, victim) and two local grounds (shields).



Fig. 3. The dedicated model and its equivalent model for inductive coupling introduced crosstalk.

doglegs. There are two types of vias in three-dimensional (3-D) stacked interconnect design: turn vias and terminal vias [12]. Because terminal vias link interconnects with the silicon surface, i.e., the terminals of devices such as repeaters, they can cause blockage at all metal layers in between. Turn vias, the type introduced by twisting wires, are essentially an internal part of an interconnect and do not cause blockage beyond that already caused by the doglegs of interconnects.

The rest of this paper is organized as follows: In Section II, we use the twisted pair structure and low-frequency crosstalk analysis to explain why staggering and twisting reduce the capacitive and inductive coupling. In Section III, we present how to generate the staggered twisting pattern for any signal/shielding ratio. We further discuss an accurate PEEC model and its macro-model to analyze the delay and noise for the staggered twisted-bundle. Finally, we present the experimental results considering worst case delay and noise in Section IV, and conclude in Section V.

## II. TWISTED SIGNAL-SHIELD PAIRS

The crosstalk and power consumption of twisted pair wires are analyzed in this part. In a twisted pair (TWP), each signal has a shield as its local ground. To explain why the twisted structure can significantly reduce crosstalk, we first present a low frequency description. As shown in Fig. 2, we assume an N-stage twisted pair with unit length l per stage, where the aggressor is a normal straight wire with a shield (local ground) and the victim is twisted together with another shield. We assume that the aggressor source voltage is  $V_{Asrc}$  with source/load impedance  $Z_{Asrc}/Z_{Ald}$ , and the victim has the source/load impedance  $Z_{Vsrc}/Z_{Vld}$ . When the operating frequency is not high (less than GHz), the current and voltage at each stage are approximately independent of its position. Note that the crosstalk-induced noise at the receiver contains two parts:  $V_{ind}$ and  $V_{cap}$ .

### A. Crosstalk of Twisted Pair

We first determine the noise introduced by inductive crosstalk,  $V_{ind}$ . Fig. 3 (a) is a dedicated description of the inductive crosstalk in twisted wires, and Fig. 3 (b) is its equivalent model, obtained by superposing the induced voltage at each twisting stage. We assume that the current variable at the *i*th stage of the aggressor is  $I_{Ai}$ , and that the mutual inductance (per unit length) between the loop of the aggressor with its local ground and the loop of the twisted victim with its ground is  $M_0$ . Then the superposed total  $V_{induced}$  is

$$V_{induced} = s M_0 l (I_{A1} - I_{A2} + I_{A3} - \dots) \tag{1}$$



Fig. 4. The complete model and its equivalent model for capacitive coupling introduced crosstalk.

where the aggressor current-variable  $I_{Ai}$  is

$$I_{A1} \cong I_{A2} \cong I_{A3} \dots \cong I_A = \frac{V_{Asrc}}{Z_{Asrc} + Z_{Ald}} \tag{2}$$

under the low frequency approximation. Therefore, the inductive crosstalk noise  $V_{ind}$  observed at receiver is

$$V_{ind} = \begin{cases} 0 & \text{if $N$ is even;} \\ (sM_0l) \frac{Z_{Vld}}{Z_{Vld} + Z_{Vsrc}} I_A & \text{if $N$ is odd.} \end{cases} \tag{3}$$

Clearly, to minimize inductive coupling the number of stages N should be even, which implies that the number of twists, N-1, is odd. Note that this finding is based on the low frequency analysis, where the current at each twisted stage is approximately the same (Eq. (2)). This approximation still holds in the high frequency range when the segment length l is sufficiently small that two neighboring current filaments are approximately equal and contribute a total magnetic flux of zero. In other words, once the wire length is fixed, we need to increase the number of twists to achieve minimum inductive crosstalk. However, the number of twists cannot be too large, because the via resistance is no longer negligible when the segment length becomes comparable to the dimension of the via.
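The parity argument behind Eq. (3) can be checked numerically. The following is a minimal sketch, assuming equal per-stage currents as in Eq. (2); the function name `alternating_current_sum` and the normalization to $I_A = 1$ are illustrative choices, not part of the original analysis.

```python
def alternating_current_sum(num_stages, stage_currents=None):
    """Return I_A1 - I_A2 + I_A3 - ... , the bracketed factor in Eq. (1).

    If no per-stage currents are given, every stage carries the same
    normalized current I_A = 1, which is the low-frequency assumption
    of Eq. (2).
    """
    if stage_currents is None:
        stage_currents = [1.0] * num_stages
    # Even-indexed stages (0-based) add flux, odd-indexed stages subtract it.
    return sum(current if i % 2 == 0 else -current
               for i, current in enumerate(stage_currents))


for n in (3, 4, 5, 6):
    print(n, alternating_current_sum(n))
```

Under this assumption the sum vanishes for even N and leaves one uncancelled stage for odd N, which matches the two cases of Eq. (3).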

We further determine the capacitive-coupling introduced noise:  $V_{cap}$ . As shown in Fig. 4, we assume that the coupling capacitance (unit length) between the two signal nets (the aggressor and victim) is  $C_0$  when the upper twisted wire is victim, and is  $\alpha C_0$  ( $0 < \alpha < 1$ ) when the lower twisted wire is victim, where the factor of  $\alpha$  reflects the effect of shielding between the aggressor and victim. Then we have a superposed total  $I_{induced}$ 

$$I_{induced} = sC_0 l(V_{A1} + \alpha V_{A2} + V_{A3} + \dots) \tag{4}$$

with the aggressor voltage-variable  $V_{Ai}$  at each stage

$$V_{A1} \cong V_{A2} \cong V_{A3} \dots \cong V_A = \frac{Z_{Ald}}{Z_{Asrc} + Z_{Ald}} V_{Asrc}. \tag{5}$$

The capacitive crosstalk  $V_{cap}$  observed at the receiver then becomes

$$V_{cap} = \left(sC_0 N l \frac{1+\alpha}{2}\right) \frac{Z_{Vsrc} Z_{Vld}}{Z_{Vsrc} + Z_{Vld}} V_A \tag{6}$$

where Nl is a constant when the wire length is given. Clearly, capacitive coupling becomes the dominant crosstalk contribution, as it is larger by a factor of N than the inductive crosstalk (when N is odd). In the design of Fig. 5 (a) [8], the two signal wires experience capacitive coupling (with no shield in between) over half of the wire length. This situation becomes even more severe when more signal nets share one shield (i.e., the twisted-bundle structure), where the capacitive coupling among the normal wires becomes the primary source of crosstalk. Therefore, the application of this layout technique is limited without proper treatment to reduce the coupling capacitance.
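The step from Eq. (4) to the factor $N(1+\alpha)/2$ in Eq. (6) can be verified by summing the stage contributions directly. This is a small sketch assuming an even number of stages, equal stage voltages as in Eq. (5), and a normalized $V_A = 1$; the function names are hypothetical.

```python
def stagewise_coupling_sum(num_stages, alpha):
    """Sum of the bracket in Eq. (4) with V_A1 = V_A2 = ... = 1.

    Odd-numbered stages (1st, 3rd, ...) couple through C0, while
    even-numbered stages couple through alpha * C0 because the shield
    sits between aggressor and victim there.
    """
    return sum(1.0 if i % 2 == 0 else alpha for i in range(num_stages))


def closed_form_coupling(num_stages, alpha):
    """Closed-form factor N * (1 + alpha) / 2 used in Eq. (6)."""
    return num_stages * (1.0 + alpha) / 2.0


print(stagewise_coupling_sum(6, 0.3), closed_form_coupling(6, 0.3))
```

For even N the two values agree up to floating-point rounding; for odd N the stage-wise sum contains one extra unshielded stage, so Eq. (6) is then only approximate.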

### B. Crosstalk of Staggered Twisted Pair

This situation can be alleviated by staggering the twists as shown in Fig. 5 (b), where shields are routed alternately with signal nets. Let the number of staggering stages be  $N_{stag}$ . Note that there are two twists in every stagger.





Fig. 5. Wire diagram with analysis of inductive and capacitive crosstalk for staggered and un-staggered twisted pair.

Clearly, when wires are managed in the staggered style, the capacitive coupling-length is effectively reduced by a factor of  $2N_{stag}$ . For example, for the case of one staggering stage in Fig. 5 (b), the coupling length is reduced to l/4. In general, the capacitive crosstalk voltage for staggered design reduces to:

$$V_{cap} = \left(sC_0 N l \frac{1+\alpha}{4N_{stag}}\right) \frac{Z_{Vsrc} Z_{Vld}}{Z_{Vsrc} + Z_{Vld}} V_A \tag{7}$$
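Dividing Eq. (7) by Eq. (6) makes the reduction factor explicit. The labels $V_{cap}^{(6)}$ and $V_{cap}^{(7)}$ below are introduced here only to tell the two expressions apart; all other factors are common to both equations and cancel:

$$\frac{V_{cap}^{(7)}}{V_{cap}^{(6)}} = \frac{(1+\alpha)/(4N_{stag})}{(1+\alpha)/2} = \frac{1}{2N_{stag}}$$

so each additional staggering stage shrinks the capacitive noise of the unstaggered twisted pair in proportion to $1/(2N_{stag})$, consistent with the coupling-length argument above.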

Further note that we still preserve the minimum inductive coupling by alternately twisting with a proper offset. A staggered twisted pair is shown in Fig. 5 (b), where we alternately place four types of unit cells: a twisting cell, a normal cell, and their complements. For the *i*th unit cell in the staggered structure, the overall coupling magnetic flux is the sum

$$\Phi_i = \sum_{j \neq i} (\phi_{ij} - \phi'_{ij}) \tag{8}$$

where the '$-$' sign marks the contribution  $\phi'_{ij}$  from the complement of the *j*th cell.  $\Phi_i$  approaches zero when we stagger wires uniformly with sufficient twists. In general, this value approaches that of the twisted pair with normal wires. For the example in Fig. 5 with wire length 4000 µm, width 1 µm, and spacing 2 µm, the loop inductance matrices extracted by FastHenry [13] for the twisted pair with normal wires and the staggered twisted pair are:

• Twisted-pair with normal wires:

$$\begin{bmatrix} 3.501e{-}09j & 4.\cdots & \cdots 1j & 3.147e{-}15j \\ 4.069e{-}14j & 3.\cdots & \cdots 4j & 5.160e{-}11j \\ 5.159e{-}11j & 4.\cdots & \cdots 9j & 4.069e{-}14j \\ 3.147e{-}15j & 5.\cdots & \cdots 4j & 3.504e{-}09j \end{bmatrix}, \tag{9}$$

• Staggered twisted-pair:

$$\begin{bmatrix} 3.491e{-}09j & 5.796e{-}14j & 5.151e{-}11j & 4.640e{-}15j \\ 5.796e{-}14j & 3.491e{-}09j & 5.796e{-}14j & 5.151e{-}11j \\ 5.151e{-}11j & 5.796e{-}14j & 3.491e{-}09j & 5.796e{-}14j \\ 4.640e{-}15j & 5.151e{-}11j & 5.796e{-}14j & 3.491e{-}09j \end{bmatrix}, \tag{10}$$

where the inductive coupling between adjacent groups is reduced by orders of magnitude. Therefore, both the capacitive and inductive couplings can be reduced with this staggered twisting structure.

## III. SYNTHESIS AND MACRO-MODELING FOR STAGGERED TWISTED-BUNDLE

Because it is area-expensive to design one shield for each signal net to form a twisted pair with staggering, typically a shield is shared among multiple signal nets, i.e., a bundle of wires. In this section, we first discuss how to generate the staggered twisting pattern for a bundle of wires, where both the signals and shields are uniformly distributed and twisted, and then we present a detailed modeling approach for the accurate crosstalk analysis.

| 1. Check parity                                                                                |
|------------------------------------------------------------------------------------------------|
| if ($N_{cell}$ is even) $n = N_{cell} + 2$;                                                    |
| else $n = N_{cell} + 1$;                                                                       |
| 2. Generate routing matrices for unit twisting-cell and normal-cell                            |
| 2.1 Generate routing matrix $T$ ($n \times n$) for unit twisting-cell;                         |
| 2.1.1 Generate the cyclic permutation matrix;                                                  |
| 2.1.2 Replace diagonal elements with 0 (shield) and attach them as an additional column (row); |
| 2.2 Generate the routing matrices $T_b$, $N$, and $N_b$;                                       |
| 3. Generate routing matrix $Rt_M$ for staggered twisting pattern                               |
| 3.1 Connect $(T, N_b, T_b, N)$ alternately;                                                    |
| 3.2 Permute each unit-cell of $Rt_M$ cyclically.                                               |
Fig. 6. Algorithm for staggered twisting pattern generation.

$$\begin{array}{ccccccccc} T & N_b & T_b & N & \cdots & T & N_b & T_b & N \\ N & T & N_b & T_b & \cdots & N & T & N_b & T_b \\ T_b & N & T & N_b & \cdots & T_b & N & T & N_b \\ N_b & T_b & N & T & \cdots & N_b & T_b & N & T \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\ T & N_b & T_b & N & \cdots & T & N_b & T_b & N \\ N & T & N_b & T_b & \cdots & N & T & N_b & T_b \\ T_b & N & T & N_b & \cdots & T_b & N & T & N_b \end{array}$$

Fig. 7. The staggered twisting patterns by unit cells:  $T, T_b, N$ , and  $N_b$ .

### A. Synthesis Algorithm

To synthesize the staggered twisting structure for a bundle of signal nets, we first present the algorithm to generate the staggered twisted pattern for multiple signal nets with one shield.

The synthesis algorithm is shown in Fig. 6. We assume the number of signal nets is  $N_{sig}$ , the signal/shield ratio is  $N_{cell}$:1, and the number of staggering stages is  $N_{stag}$ . We therefore need to synthesize  $N_{gp} = \frac{N_{sig}}{N_{cell}}$  groups of twisting wires. In each group we have  $N_{stag}$  stages, formed by alternately connecting a unit twisting-cell (T), a unit normal-cell (N), and their complements ( $T_b$  and  $N_b$ ). Adjacent groups of wires are generated by cyclically shifting one unit cell. Fig. 1 (c) shows the wire diagram with unit cells for the case of  $N_{sig} = 6$ ,  $N_{cell} = 3$ , and  $N_{stag} = 1$ . Fig. 7 further shows the general structure of the staggered twisting pattern composed of those unit cells ( $T, T_b, N, N_b$ ). Note that when the unit cells are placed in this staggered manner, the mutual inductive coupling between any two adjacent cells is minimized. Moreover, the capacitive coupling length is also reduced, both between two adjacent groups and inside one group.

Below, we first discuss how to synthesize a unit twisting-cell. A unit twisting-cell consists of  $N_{cell}$  signal nets with  $N_{cell}$  segments per net. We can use the method in [8] to synthesize the unit twisting-cell with a routing matrix T

$$T = \begin{bmatrix} t_{1,1} & t_{1,2} & \cdots & t_{1,n} \\ t_{2,1} & t_{2,2} & \cdots & t_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ t_{n,1} & t_{n,2} & \cdots & t_{n,n} \end{bmatrix}, \tag{11}$$

where each row  $[t_{i,1} \quad t_{i,2} \quad \cdots \quad t_{i,n-1} \quad t_{i,n}]$  represents the wire segments in the *i*th line, which is equally divided into  $n$  ( $n = N_{cell} + 1$ ) segments, and the changes between neighboring columns represent the changes of routing connections. For example, the neighboring pair  $(t_{k,i}, t_{k,j}) = (I, J)$  means that the *k*th wire changes from the *I*th track to the *J*th track. To minimize the inductive crosstalk, we need to twist both the signal and shield segments symmetrically to change the polarity of the current loop. This topology yields a valid routing matrix for the unit twisting-cell only when *n* is even [8], where *T* is obtained as follows:

- 1) Begin with an initial row  $T_0 = [n-1 \quad n-2 \quad \cdots \quad 1]$;
- 2) Cyclically shift  $T_0$  up by one segment  $(n-1)$  times to obtain  $(n-1)$  permuted rows, and construct a cyclic permutation matrix from them;
- 3) Replace each diagonal element in the cyclic permutation matrix by 0 (representing the shield), and attach the diagonal elements as an additional column (row) to form an  $n \times n$  routing matrix.



[Figure: routing matrices of the six groups. Only the leading cells  $T \oplus N_b$  of the first group are legible in the source:]

$$\left[\begin{array}{cccc|cccc} 0 & 1 & 2 & 3 & 3 & 3 & 3 & 3 \\ 1 & 0 & 3 & 2 & 2 & 2 & 2 & 2 \\ 2 & 3 & 0 & 1 & 1 & 1 & 1 & 1 \\ 3 & 2 & 1 & 0 & 0 & 0 & 0 & 0 \end{array}\right]$$

Fig. 8. The routing matrix of each group for a 2-stage 18 signal nets with signal/shield ratio 3:1.

For example, for the leftmost cell in the top row of Fig. 8, we have the following steps:

1)  $T_0 = \begin{bmatrix} 3 & 2 & 1 \end{bmatrix}$ ;

2) Cyclic permutation matrix:

$$\begin{bmatrix} 3 & 1 & 2 \\ 1 & 2 & 3 \\ 2 & 3 & 1 \end{bmatrix};$$

3) Routing matrix T:

$$T = \begin{bmatrix} 0 & 1 & 2 & 3\\ 1 & 0 & 3 & 2\\ 2 & 3 & 0 & 1\\ 3 & 2 & 1 & 0 \end{bmatrix}. \tag{12}$$

Note that since there is also an even number of twisting stages in one unit twisting-cell, zero inductive coupling can be achieved according to (3). On the other hand, when  $n = N_{cell} + 1$  is odd, we can add one dummy wire such that the total number of wires in one unit twisting-cell is still even  $(n = N_{cell} + 2)$. This avoids generating the additional preceding matrix (and hence additional design cost) as in [8] to enforce the permeability.
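The parity check and the three construction steps can be sketched in a few lines. This is only an illustrative sketch: the index formula `(i + j - 1) % m + 1` used to fill the cyclic permutation matrix is an assumption chosen so that $n = 4$ reproduces the matrix in (12); the construction in [8] may order the shifted rows differently. The function names `cell_size` and `unit_twisting_cell` are hypothetical.

```python
def cell_size(n_cell):
    """Step 1 of Fig. 6: pick an even n, adding one dummy wire if needed."""
    return n_cell + 2 if n_cell % 2 == 0 else n_cell + 1


def unit_twisting_cell(n):
    """Build an n x n routing matrix T (0 denotes the shield track)."""
    if n < 2 or n % 2 != 0:
        raise ValueError("n must be a positive even number")
    m = n - 1
    # Assumed cyclic permutation matrix: each row is a cyclic shift of the
    # previous one, entries 1..m, symmetric, with a permutation on the diagonal.
    cyclic = [[(i + j - 1) % m + 1 for j in range(m)] for i in range(m)]
    T = []
    for i in range(m):
        row = list(cyclic[i])
        diagonal = row[i]
        row[i] = 0              # the shield takes over the diagonal position
        row.append(diagonal)    # the displaced entry moves to the extra column
        T.append(row)
    # Extra row: the displaced diagonal entries, closed by the shield.
    T.append([cyclic[j][j] for j in range(m)] + [0])
    return T


for row in unit_twisting_cell(cell_size(3)):
    print(row)
```

Every row and every column of the result is a permutation of the track indices 0 to n-1, so each track carries exactly one wire in every segment, and each wire visits every track once.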

The complementary matrix  $T_b$  for the unit twisting-cell T in (11) is obtained by reversing the order of each row:

$$T_{b} = \begin{bmatrix} t_{1,n} & t_{1,n-1} & \cdots & t_{1,1} \\ t_{2,n} & t_{2,n-1} & \cdots & t_{2,1} \\ \vdots & \vdots & \ddots & \vdots \\ t_{n,n} & t_{n,n-1} & \cdots & t_{n,1} \end{bmatrix}. \tag{13}$$

Furthermore, we can define a unit normal-cell (N) and its complementary  $(N_b)$  by the following routing matrices:

$$N = \begin{bmatrix} t_{1,1} & t_{1,1} & \cdots & t_{1,1} \\ t_{2,1} & t_{2,1} & \cdots & t_{2,1} \\ \vdots & \vdots & \ddots & \vdots \\ t_{n,1} & t_{n,1} & \cdots & t_{n,1} \end{bmatrix} \tag{14}$$

and

$$N_{b} = \begin{bmatrix} t_{1,n} & t_{1,n} & \cdots & t_{1,n} \\ t_{2,n} & t_{2,n} & \cdots & t_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ t_{n,n} & t_{n,n} & \cdots & t_{n,n} \end{bmatrix}. \tag{15}$$
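Equations (13) to (15) translate directly into list operations on a routing matrix. The sketch below applies them to the example matrix of (12); the function names are illustrative and the matrices are stored as plain lists of rows.

```python
# Routing matrix of the unit twisting-cell from Eq. (12).
T = [
    [0, 1, 2, 3],
    [1, 0, 3, 2],
    [2, 3, 0, 1],
    [3, 2, 1, 0],
]


def twisting_complement(t):
    """Eq. (13): reverse the segment order of every wire."""
    return [list(reversed(row)) for row in t]


def normal_cell(t):
    """Eq. (14): every wire stays on the track of its first segment."""
    return [[row[0]] * len(row) for row in t]


def normal_complement(t):
    """Eq. (15): every wire stays on the track of its last segment."""
    return [[row[-1]] * len(row) for row in t]


print(twisting_complement(T))
print(normal_cell(T))
print(normal_complement(T))
```

Because $N$ repeats the first column of $T$ and $N_b$ repeats its last column, a normal cell can be connected to the matching end of a twisting cell without any track change at the boundary.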

To generate the staggered twisting pattern, we connect the twisting-cells  $(T, T_b)$  and the normal-cells  $(N, N_b)$  alternately according to Fig. 7. This can be realized by first constructing an *initial staggering row*:

$$Rt_M = (T \oplus N_b \oplus T_b \oplus N) \cdots (T \oplus N_b \oplus T_b \oplus N), \tag{16}$$

where we repeat the pattern  $(T \oplus N_b \oplus T_b \oplus N)$  by  $N_{stag}$  times. We then cyclically permute  $Rt_M$  by one unit-cell at a time to obtain the routing matrix for each group



(a) PEEC model for twisted wires: via and dogleg are modeled as RL branch



Fig. 9. The PEEC model for staggered twisted-bundle wires.

$$\begin{aligned} Rt_{M}^{(0)} &= Rt_{M} \\ Rt_{M}^{(1)} &= P_{c}^{(1)}(Rt_{M}) \\ Rt_{M}^{(2)} &= P_{c}^{(2)}(Rt_{M}) \\ &\;\;\vdots \\ Rt_{M}^{(N_{gp})} &= P_{c}^{(N_{gp})}(Rt_{M}). \end{aligned} \tag{17}$$
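At the level of unit-cell labels, the initial staggering row of (16) and its cyclic permutations can be sketched as follows. The shift direction (one cell to the right per group) is an assumption read off the rows of Fig. 7, and the function name `staggering_rows` and the string labels are hypothetical.

```python
def staggering_rows(n_stag, n_gp):
    """Return the unit-cell sequence of each group as a list of labels."""
    base = ["T", "Nb", "Tb", "N"] * n_stag   # initial staggering row, Eq. (16)
    rows = []
    for group in range(n_gp):
        shift = group % len(base)
        split = len(base) - shift
        # Rotate right by `shift` cells: the tail wraps around to the front.
        rows.append(base[split:] + base[:split])
    return rows


for row in staggering_rows(n_stag=1, n_gp=6):
    print(" ".join(row))
```

With one staggering stage the pattern repeats every four groups, so the fifth group reuses the sequence of the first.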

With the routing matrix of the unit twisting-cell in (12), we illustrate this procedure in Fig. 8 for an example of 18 signal nets with 1-stage staggering. There are 6 groups to synthesize when the signal/shield ratio is 3:1. The routing matrix is shown in Fig. 8, with dashed lines in different styles indicating different unit cells. The initial staggering row is cyclically permuted 6 times by one cell at a time, and the resulting patterns form the overall routing matrix. Note that due to the increased geometrical complexity, we need to apply a more general modeling approach to handle the staggered twisted-bundle structure, as discussed below.

### B. Detailed PEEC Model and its Macro-modeling

To accurately analyze twisted structures in the high frequency range and consider more complicated twisting topologies, we need a distributed partial element equivalent circuit (PEEC) model [14] with sufficient discretization and segmentation of the conductor. The conductor is volume-discretized according to the skin depth, and longitudinally segmented at one tenth of the wavelength. As shown in Fig. 9 (a), each discretized wire segment is modeled by R, L, and C elements, and the vias and doglegs are modeled as RL branches. One important property of the PEEC model is that it assumes the return current path is at infinity, and the partial inductance, instead of the loop inductance, is stamped in the model. In [8], it is assumed that every signal net in a wire group shares one common return path. This assumption does not hold in general when multiple signal nets share one local ground. For example, if two neighboring signals switch in anti-phase at high frequency, their coupling capacitance can act as a low-impedance return path even when the driver and receiver themselves have finite impedances. Fig. 9 (b) shows the PEEC model for two staggered twisting groups. For the coupling between any two wire segments, we consider (1) only adjacent capacitive coupling, as capacitive coupling is short-range; and (2) every inductive coupling between any pair of segments, as magnetic coupling is long-range.<sup>1</sup>
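To give a feel for the two discretization rules, the sketch below evaluates the standard skin-depth formula $\delta = \sqrt{\rho / (\pi f \mu_0)}$ and a one-tenth-wavelength segment length. The material values (copper resistivity $1.68 \times 10^{-8}\ \Omega\,$m, relative permittivity 3.9) and the frequencies used are assumptions for illustration only; the paper does not state which values it uses.

```python
import math

MU_0 = 4e-7 * math.pi          # vacuum permeability in H/m
LIGHT_SPEED = 299_792_458.0    # speed of light in vacuum in m/s


def skin_depth(freq_hz, resistivity=1.68e-8):
    """Skin depth in meters; the default resistivity assumes copper."""
    return math.sqrt(resistivity / (math.pi * freq_hz * MU_0))


def max_segment_length(freq_hz, eps_r=3.9):
    """One tenth of the wavelength in a dielectric with permittivity eps_r."""
    wavelength = LIGHT_SPEED / (freq_hz * math.sqrt(eps_r))
    return wavelength / 10.0


print(f"skin depth at 1 GHz: {skin_depth(1e9) * 1e6:.2f} um")
print(f"segment limit at 10 GHz: {max_segment_length(1e10) * 1e3:.2f} mm")
```

Under these assumed values the skin depth at 1 GHz is on the order of a couple of micrometers, comparable to the 1 µm wire width used in the example of Fig. 5, which indicates why volume discretization matters at these frequencies.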

Note that the analysis of crosstalk, especially of the worst case delay/noise (WCD/WCN), in the detailed PEEC model is computationally expensive. To efficiently evaluate the WCD/WCN, PRIMA [15]-based model order reduction is applied to reduce the detailed PEEC model and obtain a compact macro-model for twisted wires. We first

<sup>1</sup>For simplicity of illustration, the figure does not show the full inductive coupling, although we consider it in our implementation.





Fig. 10. The WCD/WCN comparison for each signal net of COPS/TWB/STWB structures with signal/shield ratio 3:1.

separate the interconnect parasitics from the nonlinear driver. The parasitic part can be formulated in the MNA form

$$(G+sC)x(s) = Bv_p \qquad i_p = L^T x(s) \tag{18}$$

where G and C are the conductance and susceptance matrices, x(s) is the vector of state variables, B and $L^T$ are the incidence matrices at the n ports, and $i_p$ and $v_p$ are the port current and voltage variables. By applying the congruence transformation with a low-order orthogonal basis X of the Krylov subspace,

$$\widetilde{G} = X^T G X \qquad \widetilde{C} = X^T C X \tag{19}$$
$$\widetilde{B} = X^T B \qquad \widetilde{L} = X^T L$$

we can obtain the transfer function (in admittance form) for the model-order reduced system

$$\widetilde{Y}(s) = \widetilde{L}^{T} (\widetilde{G} + s\widetilde{C})^{-1} \widetilde{B} = \begin{bmatrix} c^{1,1} + \sum_{i=1}^{q} \frac{k_{i}^{1,1}}{s-p_{i}} & \cdots & c^{1,n} + \sum_{i=1}^{q} \frac{k_{i}^{1,n}}{s-p_{i}} \\ \vdots & \ddots & \vdots \\ c^{n,1} + \sum_{i=1}^{q} \frac{k_{i}^{n,1}}{s-p_{i}} & \cdots & c^{n,n} + \sum_{i=1}^{q} \frac{k_{i}^{n,n}}{s-p_{i}} \end{bmatrix} \tag{20}$$

where q is the number of poles (the model order) of the approximation, $k_i$ and $p_i$ are the residues and poles, and $c$ is the constant term of each entry. For SPICE-compatible time-domain simulation, the reduced system is realized by a Jordan-canonical-form based synthesis [16], using RC elements and voltage-controlled current sources. The discussion is omitted here due to limited space but will be included in a technical report. As shown in the section below, the obtained macro-model is computationally efficient for determining the WCN/WCD during the STWB synthesis.
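The projection in (18)-(20) can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in our experiments: it builds the Krylov basis about $s = 0$ with a simple block Arnoldi loop, and the test system is a hypothetical normalized RC ladder with a single port, so that $B = L$ selects the first node. The function names `krylov_basis`, `reduce_system`, and `transfer` are introduced here for illustration only.

```python
import numpy as np

def krylov_basis(G, C, B, n_blocks):
    """Orthonormal basis X of span{R, A R, ..., A^(n_blocks-1) R},
    with R = G^-1 B and A = -G^-1 C (expansion about s = 0)."""
    X = np.zeros((G.shape[0], 0))
    block = np.linalg.solve(G, B)
    for _ in range(n_blocks):
        scale = np.linalg.norm(block)
        for _ in range(2):                          # two Gram-Schmidt passes
            block = block - X @ (X.T @ block)
        Q, R = np.linalg.qr(block)
        keep = np.abs(np.diag(R)) > 1e-10 * scale   # crude deflation test
        if not keep.any():
            break
        X = np.hstack([X, Q[:, keep]])
        block = -np.linalg.solve(G, C @ X[:, -int(keep.sum()):])
    return X

def reduce_system(G, C, B, L, X):
    """Congruence transformation of (19)."""
    return X.T @ G @ X, X.T @ C @ X, X.T @ B, X.T @ L

def transfer(G, C, B, L, s):
    """Evaluate L^T (G + s C)^-1 B at one complex frequency s."""
    return L.T @ np.linalg.solve(G + s * C, B)

# Hypothetical test system: a 20-node RC ladder with unit R and C values.
n = 20
G = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
C = np.eye(n)
B = np.zeros((n, 1))
B[0, 0] = 1.0
L = B.copy()

X = krylov_basis(G, C, B, n_blocks=4)
Gr, Cr, Br, Lr = reduce_system(G, C, B, L, X)
s = 1e-3j
err = abs(transfer(G, C, B, L, s) - transfer(Gr, Cr, Br, Lr, s)).max()
print(X.shape, err)
```

Because $G^{-1}B$ lies in the span of X, the reduced and original responses agree exactly at $s = 0$, and the error grows with frequency until more Krylov blocks are added. This matches the observation in Section IV-D that low-order models are not accurate enough for WCD/WCN analysis.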

## IV. EXPERIMENTAL RESULTS

With the PEEC model describing the staggered twisted-bundle structure, we can accurately analyze the delay and noise in the high-frequency range. Below, we first present a small example that uses the PEEC model to compare the WCD/WCN of three structures: COPS, TWB [8], and STWB. We then study large circuits with multiple staggering numbers $(2 * N_{stag})$ and different signal/shielding ratios, where we use the reduced macro-model for the time-domain simulation.

### A. Experiment Setting

We use 180nm (IBM) and 70nm (Berkeley Predictive Model) copper technologies, where in both cases the via resistivity is comparable to that of the interconnect metal and is much smaller than in aluminum-based technology. We assume that M6 is used to lay out the signals and shields with the minimum wire width (0.45um for 180nm, 0.2um for 70nm) and spacing (0.5um for 180nm, 0.2um for 70nm). The via is chosen as a $2 \times 2$ array of the minimum size $(0.2um^2)$ due to reliability concerns. The wire length is 4000um, and the driver size is about 100X the minimum inverter size. Note that in our design, the driver resistance and the interconnect resistive loss are both less than the interconnect characteristic impedance. Therefore, the inductive effect cannot be ignored. Furthermore, an exponential voltage source with a 50ps rise time is used as the input signal. The non-linear driver is modeled by the Berkeley BSIM3 model. The wire capacitance is extracted by FastCap [17], and the partial inductance is extracted by FastHenry [13].

Fig. 11. The changes of WCD/WCN when increasing the number of staggering stages for 6 signal nets with signal/shield ratio 3:1.
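The condition stated in the experiment setting, that inductance matters when both the driver resistance and the wire resistance are below the characteristic impedance, can be checked with a short calculation. The per-unit-length parasitics and the driver resistance below are hypothetical round numbers chosen for illustration, not extracted values; only the 4000um wire length comes from the setting above.

```python
import math

def characteristic_impedance(l_per_m, c_per_m):
    """Lossless-line characteristic impedance Z0 = sqrt(L'/C') in ohms."""
    return math.sqrt(l_per_m / c_per_m)

def inductance_matters(r_driver, r_wire_total, z0):
    """True when both resistances are below Z0, i.e. the line is under-damped."""
    return r_driver < z0 and r_wire_total < z0

# Hypothetical parasitics (illustrative only).
L_PER_M = 0.5e-6     # 0.5 nH/mm
C_PER_M = 0.2e-9     # 0.2 pF/mm
R_PER_M = 1.0e4      # 10 ohm/mm
R_DRIVER = 30.0      # assumed output resistance of a large driver (ohm)
WIRE_LEN = 4000e-6   # wire length from the experiment setting (m)

z0 = characteristic_impedance(L_PER_M, C_PER_M)
r_wire = R_PER_M * WIRE_LEN
print(f"Z0 = {z0:.1f} ohm, R_wire = {r_wire:.1f} ohm, "
      f"inductive: {inductance_matters(R_DRIVER, r_wire, z0)}")
```

With these assumed numbers both resistances fall below $Z_0$, so an RC-only model would miss the overshoot and the long-range inductive coupling that the PEEC model captures.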

In this paper, we use the method in [18] to find the WCN by aggressor alignment, taking into account the polarity changes introduced by the switching patterns. We assume the victim is quiet, and the peak noise is measured at the far end of the victim, i.e., at the inputs of the receivers. An exhaustive search is applied to determine the WCD, where all signal nets are switching and the delay is measured at the 65% Vdd level. In both cases, we consider the aggressors and the victim to have a switching window of 200ps, i.e., the earliest and latest arrival times differ by 200ps. The computational time depends linearly on the number of signal nets and on the SPICE simulation time for each alignment. To reduce the simulation time for large circuits, we apply model order reduction and stamp the macro-model into the time-domain analysis.
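The two measurements described above, the delay at the 65% Vdd level and the peak noise on a quiet victim, can be extracted from sampled waveforms as sketched below. The helper names `crossing_time` and `peak_noise`, the 20ps time constant, and the 1ps sampling step are assumptions for this example; in practice the samples would come from the SPICE simulation of each alignment.

```python
import math

def crossing_time(times, volts, level):
    """First time a sampled rising waveform crosses `level`,
    using linear interpolation between samples; None if it never does."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 < level <= v1:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    return None

def peak_noise(volts, quiet_level=0.0):
    """Largest deviation of a quiet victim from its steady level."""
    return max(abs(v - quiet_level) for v in volts)

# Hypothetical far-end waveform: an exponential rise to Vdd.
VDD = 1.8
TAU = 20e-12                                   # assumed time constant (s)
times = [i * 1e-12 for i in range(201)]        # 0 to 200 ps in 1 ps steps
volts = [VDD * (1.0 - math.exp(-t / TAU)) for t in times]

delay = crossing_time(times, volts, 0.65 * VDD)
print(f"65% Vdd crossing at {delay * 1e12:.2f} ps")
```

The WCD is then the largest such delay over all simulated alignments within the 200ps switching window, and the WCN is the largest `peak_noise` value over the aggressor alignments.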

### B. WCD/WCN Comparison of COPS, TWB, and STWB

As shown in Fig. 1, we assume 6 signal wires with a signal/shield ratio of 3:1 in the 180nm technology. Three shields are used for COPS, and two shields each for TWB and STWB. We use the detailed PEEC model (not the macro-model) for the 6 signal nets in Sections IV-B and IV-C. Fig. 10 compares the WCD and WCN of the three structures when each wire in turn acts as the victim. According to Fig. 10 (a), the WCD variation between signal nets is smaller in STWB than in COPS and TWB, because the inductive current loops and the capacitive coupling lengths are more uniformly distributed in the STWB structure. In terms of the overall WCD among all 6 bits, STWB has a delay 11ps smaller than COPS (51ps vs. 62ps). Moreover, the WCN of STWB (see Fig. 10 (b)) is also uniform among the 6 signal nets, with values that are small and comparable to those of COPS. For TWB, however, the WCNs of the signal nets in the normal group (net 4, net 5, net 6) are much larger (15% difference on average) than those in the twisted group (net 1, net 2, net 3) due to the large capacitive coupling among the normal wires. Therefore, the STWB structure is the best in terms of delay and noise reduction.

### C. Impact of Staggering Number

We then study the effect of the staggering number using the example of 6 signals with a signal/shield ratio of 3:1 for both the 180nm and 70nm technologies. As shown in Fig. 11, we study the WCD/WCN when increasing the staggering number from 2 to 32. The WCD and WCN both decrease initially as the staggering number increases, because the effective capacitive coupling length is reduced by staggering. The minimum WCD (47.2ps) occurs at a staggering number of 8 for the 180nm technology, and a minimum WCD of 48.3ps occurs at a staggering number of 4 for the 70nm technology. Beyond the optimal staggering number, the WCD instead increases slightly as the staggering number grows. This is because the via and dogleg resistances become non-negligible once the lengths of the vias and doglegs are comparable to those of the wire segments.

Fig. 12. The waveform comparison between the macro-model and the original one for 1-stage 18 signal nets with signal/shield ratio 9:1. The input is a 1.8 V exponential voltage source with rising time 50ps.

### D. Impact of Signal/Shielding Ratio

We further study the impact of the signal/shielding ratio using an example of a 1-stage structure with 18 signal nets and the reduced macro-model. Note that for the 6:1 ratio, where the number of signal nets is even, the dummy wire in the matrix is terminated to ground with $50\Omega$.

Fig. 12 compares the waveforms of the macro-models and of the detailed PEEC model for the signal/shield ratio of 9:1. It is a large circuit example with $8 \times 10^4$ circuit elements. Macro-models with 20, 40, and 80 poles are generated by the reduction. Each reduced model is connected back to the original non-linear driver, and the far-end responses at signal net 1 and shield net 1 are observed. Clearly, the reduced model with the 80-pole approximation captures both the delay and the peak voltage well, yet with a 258X speedup (82.9s vs. 20282.41s) compared to the original model, where the simulation time (82.9s) includes the model reduction time (12.2s). The lower-order (20- and 40-pole) models, on the other hand, have non-negligible errors in delay or peak voltage and hence are not suitable for accurate WCD/WCN analysis. Therefore, we study the impact of the signal/shielding ratio (18:1, 9:1, 6:1, 3:1, 1:1) with the macro-model of order 80 for both the 180nm and 70nm technologies.

Table I further compares the WCN/WCD for the COPS, TWB, and STWB structures. We find that the WCD and WCN both decrease when more shields are added, for all structures. STWB has the smallest WCD and is up to 10ps smaller than COPS (with signal/shield ratio 3:1) for the 180nm technology, and 12ps smaller (with signal/shield ratio 6:1) for the 70nm technology. Moreover, we find that STWB has better WCN than COPS when the signal/shield ratio is large (18:1, 9:1, 6:1) for both technologies. This is because, when multiple signal nets share one shield, the shields (and hence the inductive current loops and the capacitive coupling lengths) are more uniformly distributed for STWB than for COPS. COPS achieves the least WCN when the signal/shield ratio is 1:1. Because its area overhead is large, however, the signal/shielding ratio of 1:1 is seldom used in practice. TWB, in contrast, has up to 9% larger WCN (signal/shield ratio 9:1) for the 180nm technology and 12% larger WCN (signal/shield ratio 6:1) for the 70nm technology than the other two structures. Furthermore, regarding the delay and noise variations among the signal nets, we observe that compared to COPS and TWB, STWB has up to 26% and 19% less delay variation, and 17% and 28% less noise variation, respectively.

| Technology     | Metric      | Structure | 18:1 | 9:1  | 6:1  | 3:1  | 1:1  |
|----------------|-------------|-----------|------|------|------|------|------|
| 180nm Cu, 1.8V | WCD (ps)    | COPS      | 85.9 | 83.1 | 81.7 | 79.5 | 65.3 |
| 180nm Cu, 1.8V | WCD (ps)    | TWB       | 82.6 | 82.1 | 78.9 | 75.9 | 64.9 |
| 180nm Cu, 1.8V | WCD (ps)    | STWB      | 82.4 | 80.6 | 75.5 | 70.1 | 60.2 |
| 180nm Cu, 1.8V | WCN (% Vdd) | COPS      | 95.6 | 89.5 | 67.2 | 61.8 | 40.5 |
| 180nm Cu, 1.8V | WCN (% Vdd) | TWB       | 96.9 | 92.5 | 72.2 | 70.9 | 50.5 |
| 180nm Cu, 1.8V | WCN (% Vdd) | STWB      | 89.7 | 83.1 | 64.5 | 62.9 | 42.6 |
| 70nm Cu, 1.0V  | WCD (ps)    | COPS      | 82.2 | 80.2 | 78.4 | 76.3 | 70.9 |
| 70nm Cu, 1.0V  | WCD (ps)    | TWB       | 79.2 | 73.5 | 71.8 | 71.2 | 63.0 |
| 70nm Cu, 1.0V  | WCD (ps)    | STWB      | 79.7 | 73.9 | 66.1 | 64.8 | 62.4 |
| 70nm Cu, 1.0V  | WCN (% Vdd) | COPS      | 94.1 | 81.7 | 68.9 | 55.8 | 32.5 |
| 70nm Cu, 1.0V  | WCN (% Vdd) | TWB       | 93.8 | 83.9 | 78.5 | 63.9 | 43.2 |
| 70nm Cu, 1.0V  | WCN (% Vdd) | STWB      | 89.8 | 75.1 | 65.2 | 51.4 | 35.5 |

TABLE I

COMPARISONS OF WCD/WCN FOR COPS, TWB, AND STWB WITH DIFFERENT SIGNAL/SHIELDING RATIO FOR 1-STAGE 18 SIGNAL NETS.
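The WCD gaps between COPS and STWB quoted in the discussion can be reproduced directly from the Table I entries. The short script below is only a bookkeeping aid; the dictionary layout and the helper name `largest_gap` are introduced here for illustration.

```python
RATIOS = ["18:1", "9:1", "6:1", "3:1", "1:1"]

# WCD values in ps, copied from Table I.
WCD = {
    "180nm": {
        "COPS": [85.9, 83.1, 81.7, 79.5, 65.3],
        "TWB":  [82.6, 82.1, 78.9, 75.9, 64.9],
        "STWB": [82.4, 80.6, 75.5, 70.1, 60.2],
    },
    "70nm": {
        "COPS": [82.2, 80.2, 78.4, 76.3, 70.9],
        "TWB":  [79.2, 73.5, 71.8, 71.2, 63.0],
        "STWB": [79.7, 73.9, 66.1, 64.8, 62.4],
    },
}

def largest_gap(table, reference, candidate):
    """Return (ratio, gap) where `candidate` improves most over `reference`."""
    gaps = [round(r - c, 1) for r, c in zip(table[reference], table[candidate])]
    best = max(range(len(gaps)), key=gaps.__getitem__)
    return RATIOS[best], gaps[best]

for tech, table in WCD.items():
    print(tech, largest_gap(table, "COPS", "STWB"))
# 180nm ('3:1', 9.4)
# 70nm ('6:1', 12.3)
```

The largest improvements, 9.4ps at 3:1 for 180nm and 12.3ps at 6:1 for 70nm, correspond to the rounded figures of 10ps and 12ps given in the text.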

## V. CONCLUSIONS

We have presented a novel staggered twisted-bundle interconnect structure for crosstalk and delay reduction. This new structure differs from the previously proposed twisted-bundle wires, which have one group of twisted wires and another group of normal wires. Experiments show that the staggered twisted bundle has the smallest worst-case delay, up to 20% and 5% smaller than the coplanar shielding and the twisted bundle, respectively. It also has the smallest worst-case noise, up to 6% and 12% less than the coplanar shielding and the twisted bundle, respectively. We conclude that, without increasing the routing area, the staggered twisted bundle is better than the twisted bundle and coplanar shielding in terms of performance and noise.

## REFERENCES

- [1] T. Gao and C. L. Liu, "Minimum crosstalk channel routing," in *ICCAD*, 1993.
- [2] A. Kahng, S. Muddu, and E. Sarto, "Tuning strategies for global interconnects in high-performance deep submicron IC's," in *VLSI Design*, 1999.
- [3] P. Gupta and A. Kahng, "Wire swizzling to reduce delay uncertainty due to capacitive coupling," in *IEEE Intl. Conf. on VLSI Design*, 2004.
- [4] K. L. Shepard, D. Sitaram, and Y. Zheng, "Full-chip, three-dimensional, shapes-based RLC extraction," in *ICCAD*, 2000.
- [5] Y. Massoud, S. Majors, T. Bustami, and J. White, "Layout techniques for minimizing on-chip interconnect self inductance," in *DAC*, 1998.
- [6] L. He and K. M. Lepak, "Simultaneous shield insertion and net ordering for capacitive and inductive coupling minimization," in *ISPD*, 2000.
- [7] H. Kaul, D. Sylvester, and D. Blaauw, "Clock net optimization using active shielding," in *IEEE European Solid-State Circuits Conference*, 2003.
- [8] G. Zhong, C.-K. Koh, and K. Roy, "A twisted-bundle layout structure for minimizing inductive coupling noise," in *ICCAD*, 2000.
- [9] L. Deng and M. Wong, "Optimal algorithm for minimizing the number of twists in an on-chip bus," in *DATE*, 2004.
- [10] H. Hidaka et al., "Twisted bit-line architecture for multi-megabit DRAMs," *IEEE J. Solid-State Circuits*, pp. 21–28, 1989.
- [11] T. Sato and H. Masuda, "Design and measurement of an inductance-oscillator for analyzing inductance impact on on-chip interconnect delay," in *ISQED*, 2003.
- [12] Q. Chen, J. A. Davis, P. Ha, and J. D. Meindl, "A compact physical via blockage model," *IEEE T-VLSI*, vol. 8, no. 6, pp. 689–692, 2000.
- [13] M. Kamon, M. Tsuk, and J. White, "FastHenry: a multipole-accelerated 3D inductance extraction program," *IEEE T-MTT*, pp. 1750–1758, Sept. 1994.
- [14] A. E. Ruehli, "Equivalent circuit models for three dimensional multiconductor systems," *IEEE T-MTT*, pp. 216–220, 1974.
- [15] A. Odabasioglu, M. Celik, and L. Pileggi, "PRIMA: Passive reduced-order interconnect macromodeling algorithm," *IEEE T-CAD*, pp. 645–654, 1998.
- [16] R. Achar and M. Nakhla, "Simulation of high-speed interconnect," *Proceedings of the IEEE*, vol. 89, pp. 693–727, May 2001.
- [17] K. Nabors and J. White, "FastCap: A multipole accelerated 3D capacitance extraction program," *IEEE T-CAD*, vol. 10, no. 11, pp. 1447–1459, 1991.
- [18] J. Chen and L. He, "Determination of worst-case crosstalk noise for non-switching victims in GHz+ interconnects," in *ASPDAC*, 2002.

