# Wideband Passive Multiport Model Order Reduction and Realization of RLCM Circuits

Zhenyu Qi, Student Member, IEEE, Hao Yu, Student Member, IEEE, Pu Liu, Student Member, IEEE, Sheldon X.-D. Tan, Senior Member, IEEE, and Lei He, Member, IEEE

Abstract—This paper presents a novel compact passive modeling technique for high-performance RF passive and interconnect circuits modeled as high-order resistor-inductorcapacitor-mutual inductance circuits. The new method is based on a recently proposed general s-domain hierarchical modeling and analysis method and vector potential equivalent circuit model for self and mutual inductances. Theoretically, this paper shows that s-domain hierarchical reduction is equivalent to implicit moment matching at around s = 0 and that the existing hierarchical reduction method by one-point expansion is numerically stable for general tree-structured circuits. It is also shown that hierarchical reduction preserves the reciprocity of passive circuit matrices. Practically, a hierarchical multipoint reduction scheme to obtain accurate-order reduced admittance matrices of general passive circuits is proposed. A novel explicit waveform-matching algorithm is proposed for searching dominant poles and residues from different expansion points based on the unique hierarchical reduction framework. To enforce passivity, state-space-based optimization is applied to the model order reduced admittance matrix. Then, a general multiport network realization method to realize the passivity-enforced reduced admittance based on the relaxed one-port network synthesis technique using Foster's canonical form is proposed. The resulting modeling algorithm can generate the multiport passive SPICE-compatible model for any linear passive network with easily controlled model accuracy and complexity. Experimental results on an RF spiral inductor and a number of high-speed transmission line circuits are presented. In comparison with other approaches, the proposed reduction is as accurate as passive reduced-order interconnect macromodeling algorithm in the high-frequency domain due to the enhanced multipoint expansion, but leads to smaller realized circuit models. In addition, under the same reduction ratio, realized models by the new method have less error compared with reduced circuits by time-constant-based reduction techniques in time domain.

Index Terms—Behavioral modeling, circuit simulation, determinant decision diagrams, model order reduction, realization.

Manuscript received December 9, 2004; revised April 2, 2005. This work was supported by the National Science Foundation (NSF) under CAREER Award CCF-0448534 and Grant OISE-0451688, and by the UC Regent's Faculty Fellowship (04–05). The work of H. Yu and L. He was supported in part by NSF CAREER Award CCR-0401682, SRC Grant 1100, a UC MICRO grant sponsored by Analog Devices, Fujitsu Laboratories of America, Intel, and LSI Logic, and a Faculty Partner Award by IBM. This paper was presented in part at the 10th Asia and South Pacific Design Automation Conference (ASPDAC'05), Shanghai, China, January 2005 [30]. This paper was recommended by Associate Editor W. Schoenmaker.

Z. Qi, P. Liu, and S. X.-D. Tan are with the Department of Electrical Engineering, University of California at Riverside, Riverside, CA 92521 USA (e-mail: zhenyu@ee.ucr.edu; uliu@ee.ucr.edu; stan@ee.ucr.edu).

H. Yu and L. He are with the Department of Electrical Engineering, University of California at Los Angeles, Los Angeles, CA 90095 USA (e-mail: hy255@ee.ucla.edu; lhe@ee.ucla.edu).

Digital Object Identifier 10.1109/TCAD.2005.855937

#### I. INTRODUCTION

S VERY large scale integration (VLSI) technology advances with increased operating frequency and decreased feature size, parasitics from on-chip interconnects and off-chip packaging will detune the performance of high-speed circuits in terms of slew rate, phase margin, and bandwidth [2]. Reduction of design complexity especially for those extracted high-order resistor—inductor—capacitor—mutual inductance (*RLCM*) networks is important for efficient VLSI design verification.

Compact modeling of passive *RLC* interconnect networks has been a research-intensive area in the past decade due to increasing signal integrity effects and interconnect dominant delay in current system-on-a-chip (SOC) design [22]. Existing approaches can be classified into two categories. The first category is based on subspace projection [11], [12], [20], [26], [28], [37]. The projection-based method was pioneered by the asymptotic waveform evaluation (AWE) algorithm [28], where explicit moment matching was used to compute dominant poles at low frequency. Pade via Lanczos (PVL) [11] and the Arnoldi transformation method [37] improved the numerical stability of AWE, and the congruence transformation method [20] and passive reduced-order interconnect macromodeling algorithm (PRIMA) [26] can further produce passive models. However, reduced circuit matrices by PRIMA are larger than direct pole marching (having more poles than necessary) [1], and PRIMA does not preserve certain important circuit properties like reciprocity [12]. The latest development by structured projection can preserve reciprocity [12], but it does not realize the reduced circuit matrices. An efficient first few order moments matchingbased realization for interconnect RLC circuit is proposed in [19]. In general, no systematic approach has been proposed for realizing order-reduced circuit matrices.

Another quite different approach to circuit complexity reduction is by means of local node elimination and realization [3], [10], [31], [34], [35]. The major advantage of these methods over projection-based methods is that the reduction can be done in a local manner, no overall solution of the entire circuit is required, and reduced models can be easily realized using RLCM elements. This idea was first explored by selective node elimination for RC circuits [10], [34], where time-constant analysis is used to select nodes for elimination. Node reduction for magnetic coupling interconnect (RLCM) circuits has recently become an active research area. Generalized  $Y-\Delta$  transformation [31], RLCK circuit crunching [3], and branch merging [35] have been developed based on nodal analysis (NA), where inductance becomes susceptance

in the admittance matrix. Since mutual inductance is coupled via branch currents, to perform nodal reduction, an equivalent six-susceptance NA model is introduced in [31] to reduce two coupling current variables and template matching via geometrical programming is used to realize the model order reduced admittances, but its accuracy depends heavily on the selection of templates and only one-port realization has been reported. Meanwhile, RLCK circuit crunching and branch merging methods are first-order approximations based on nodal time-constant analysis. The drawbacks for this first-order approximation are as follows: 1) Error is controlled in a local manner and will be accumulated; hence, it is difficult to control the global error due to reduction. 2) Not too many nodes can be reduced if the elimination condition is not satisfied.

Another way to model and characterize complex interconnect structures in high frequency (in RF or even microwave ranges) is by means of rational approximation based on direct measurements or rigorous full-wave electromagnetic simulation [1], [7], [9], [13], [14], [24], [33]. Many of those methods have been used in RF and microwave circuit modeling as they are very flexible to be applied to different interconnect structures and wideband modeling.

In this paper, we focus on the realizable modeling of RLCMcircuits. We propose a new passive reduction and realization framework for general passive high-order RLCM circuits. Our method starts with large RLCM circuits that are extracted by existing geometry extraction tools like FastCap [25] and FastHenry [17] under some relaxation conditions of full-wave Maxwell equations (like electro-quasi-static for FastCap or magneto-quasi-static for FastHenry) instead of measured or simulated data. It is our ultimate goal to obtain compact models directly from a complex interconnect geometry without measurement or full-wave simulations. The new modeling method is based on the general s-domain hierarchical model reduction algorithm [39], [41] and an improved vector potential equivalent circuit (VPEC) [44] model for self and mutual inductance, which can be easily sparsified and are hierarchical reduction friendly.

On the theoretical side, we show that the s-domain hierarchical reduction is equivalent to the implicit moment matching at around s = 0 and that the existing hierarchical reduction method by one-point expansion [39], [41] is numerically stable for general tree-structured circuits. We also show that the proposed hierarchical reduction preserves the reciprocity of passive circuit matrices. Practically, we propose a hierarchical multipoint reduction scheme to obtain accurate-order reduced admittance matrices of general passive circuits. A novel explicit waveform-matching algorithm is proposed for searching dominant poles and residues from different expansion points based on the unique hierarchical reduction framework. To enforce passivity, a state-space-based convex programming optimization technique [7] is applied to the model order reduced admittance matrix. To realize the passivity-enforced admittance, we propose a general reciprocity-preserving passivity-preserving multiport network realization method based on the relaxed one-port network synthesis technique using Foster's canonical form in an error-free manner. The resulting modeling algorithm can take, in general, RLCM SPICE netlists and generate out SPICE netlists of passive multiport models for any linear passive network with easily controlled model accuracy and complexity.

The rest of this paper is organized as follows. Section II reviews the hierarchical reduction algorithm. Section III shows some theoretical results regarding hierarchical reduction. Section IV proves that hierarchical reduction can preserve the reciprocity. Section V presents a new hierarchical multipoint expansion scheme and a novel explicit waveform-matching algorithm for searching dominant poles and residues from different expansion points. Section VI briefly reviews the VPEC model and the comparison with nodal susceptance-based inductance models. Section VII presents the state-space-based convex programming for enforcing the passivity of the order reduced admittance circuit matrices. In Section VIII, we will describe the general n-port network realization method. Experimental results on an on-chip spiral inductor, high-speed partial electrical equivalent circuit (PEEC)-modeled bus circuits, and comparison with existing reduction approaches will be presented in Section IX. Section X concludes this paper.

## II. REVIEW OF HIERARCHICAL CIRCUIT REDUCTION ALGORITHM

Assume that the modified NA (MNA) is used for circuit matrix formulation. With this, the system equation set MX=b can be rewritten in the form (Schur's decomposition)

$$\begin{bmatrix} M^{II} & M^{IB} & 0 \\ M^{BI} & M^{BB} & M^{BR} \\ 0 & M^{RB} & M^{RR} \end{bmatrix} \begin{bmatrix} x^I \\ x^B \\ x^R \end{bmatrix} = \begin{bmatrix} b^I \\ b^B \\ b^R \end{bmatrix}. \tag{1}$$

The matrix  $M^{II}$  is the internal matrix associated with the internal variable vector  $\boldsymbol{x}^{I}$ .

Hierarchical reduction is used to eliminate all the variables in  $x^I$  and transform (1) into the reduced set of equations

$$\begin{bmatrix} M^{BB*} & M^{BR} \\ M^{RB} & M^{RR} \end{bmatrix} \begin{bmatrix} x^B \\ x^R \end{bmatrix} = \begin{bmatrix} b^{B*} \\ b^R \end{bmatrix}$$
 (2)

where  $M^{BB*}=M^{BB}-M^{BI}(M^{II})^{-1}M^{IB}$  and  $b^{B*}=b^B-M^{BI}(M^{II})^{-1}b^I$ . Suppose that the number of internal variables is t and the number of boundary variables is m. Then each matrix element in  $M^{BB*}$  and  $b^{B*}$  can be written in the expanded forms

$$a_{u,v}^{BB*} = a_{u,v}^{BB} - \frac{1}{\det(M^{II})} \sum_{k_1,k_2=1}^{m} a_{u,k_1}^{BI} \Delta_{k_2,k_1}^{II} a_{k_2,v}^{IB}$$
 (3)

where  $u,v=1,\ldots,m$ . We call  $a_{u,k_1}^{BI}\Delta_{k_2,k_1}^{II}a_{k_2,v}^{IB}/\det(M^{II})$  a composite admittance due to MNA formulation and

$$b_u^{B*} = b_u^B - \frac{1}{\det(M^{II})} \sum_{k_1, k_2 = 1}^m a_{u, k_1}^{BI} \Delta_{k_2, k_1}^{II} b_{k_2}^I \tag{4}$$

where u = 1, ..., m and  $\Delta_{u,v}$  is the first-order cofactor of det(M) with respect to  $a_{u,v}$ .

The hierarchical node reduction algorithm computes the new admittance  $a_{u,v}^{BB*}$  and new right-hand side element  $b_u^{B*}$  in terms of order-limited rational functions of s hierarchically.

One critical issue during hierarchical reduction is cancellation. Two types of cancellations have been observed [39], namely: 1) symbolic term cancellation, where two product terms consisting of composite admittances cancel out and 2) symbolic common-factor cancellation, where the numerator and the denominator of the resulting product term consist of composite admittances having a common symbolic factor, which happens when at least two first-order cofactors exist in a product term. A general s-domain reduction algorithm based on MNA was proposed in [39] and [41], where graph-based decancellation is carried out numerically in the frequency domain (s-domain). The hierarchical reduction algorithm is as that in [39] and [41].

## III. HIERARCHICAL REDUCTION VERSUS MOMENT MATCHING

In this section, we first discuss how the s-domain hierarchical reduction is related to implicit moment matching. Then, we discuss the numerical stability and reciprocity-preserving property of the hierarchical reduction process.

#### A. Moment Matching Connection

Consider a linear system with n state variables in vector x, i.e.,

$$sx = Ax + b \tag{5}$$

where A is an  $n \times n$  system matrix and b is the input vector to the circuit. Then we can obtain  $x = (Is - A)^{-1}b$ . Let us consider single-input single-output (SISO) systems where we have only one input  $b_j$  and we are interested in the state response at node i. In this case, we have

$$x_i(s) = H_{ij}(s)b_j = \frac{\Delta_{ij}}{\det(Is - A)}b_j \tag{6}$$

where  $\Delta_{ij}$  is the first-order cofactor of matrix M=(Is-A) with respect to the element at row i and column j, and  $H_{ij}(s)$  is the transfer function. So the exact solution of any state variable or its transfer function in s-domain can be represented by a rational function of s.

Hierarchical reduction is basically reducing the  $n \times n$  matrix M into a very small  $m \times m$  matrix M' based on block Gaussian elimination such that  $x_i$  can be trivially solved symbolically by using (6). During this reduction process, all the rational functions involved are truncated up to a fixed maximum order and the final solution will be a rational function with the same order for its numerator and its denominator. We then have the following theoretical result for the computed state variable  $x_i'(s)$  from the hierarchical reduction process in s-domain.

Theorem 1: The state variable  $x'_i(s)$  computed by s-domain hierarchical reduction with q as the maximum order for all the rational functions will match the first q moments of the exact solution  $x_i(s)$  expanded by Taylor series at s=0.

*Proof:* As we know, the exact solution of  $x_i(s)$  is a rational function as shown in (6). Due to truncation, the solution computed by the hierarchical reduction process will be given by

$$x_i'(s) = \frac{a_0 + a_1 s + \dots + a_q s^q}{b_0 + b_1 s + \dots + b_q s^q}, \qquad i = 1, \dots, q. \quad (7)$$

It was proven in [40] that a cancellation-free rational expression from the hierarchical reduction process is the exact expression obtained from the flat circuit matrix. If we do not perform any truncation, then  $x_i'(s)$  will be the exact solution  $x_i(s)$ , which is obtained from the flat circuit matrix by (6) when all cancellations are removed numerically during the hierarchical reduction process (under the assumption that no numerical error is introduced). With truncation, all the coefficients  $a_0,\ldots,a_q$  and  $b_0,\ldots,b_q$  are still exactly the same as that in  $x_i(s)$ . If we compute the moments of  $x_i'(s)=m_0+m_1s+\ldots$ , the first q moments can be uniquely determined by the 2q coefficients  $a_0,\ldots,a_q$  and  $b_0,\ldots,b_q$ , i.e.,

$$m_{i} = \frac{a_{i} - \sum_{k+l=i, k \leq i, l \leq i, k \neq 0}^{i} b_{k} m_{l}}{b}.$$
 (8)

If  $b_0$  is not zero, b is simply  $b_0$ , otherwise b will be the first nonzero coefficient  $b^t$  and the first q moments become the coefficients of  $s^{-t}(t>0)$  to that of  $s^{-t+q}$ . So the theorem is proved. Hence, the transfer function  $H'_{ij}(s)$  will also match the exact one up to the first q moments.

For a general multi-input and multi-output (MIMO) system, each element in the reduced  $m \times m$  admittance matrix M'(s) becomes a rational function [41]

$$a_{u,v}^{BB*} = \frac{\det(M[1,\dots,m,u|1,\dots,m,v])}{\det(M^{II})}$$
 (9)

where  $M[1,\ldots,m,u|1,\ldots,m,v]$  is a matrix that consists of matrix  $M^{II}$ . It is  $M[1,\ldots,m|1,\ldots,m]$  plus row u and column v of matrix M. Then we have the following results.

Corollary 1: Each rational admittance function  $a'^{BB*}_{u,v}(s)$  in the reduced  $m \times m$  matrix M'(s) by the hierarchical reduction process will match the first q moments of the exact rational function  $a^{BB*}_{u,v}(s)$  expanded by Taylor series at s=0.

#### B. Numerical Stability of the Hierarchical Reduction

The hierarchical reduction process is essentially equivalent to the implicit moment matching at s=0. As a result, the frequency response far away from s=0 will become less accurate due to the truncation of high-order terms. Another source of numerical error comes from the numerical decancellation process where polynomial division is required for removing the common factors (cancellation) in the newly generated rational function, which will in turn introduce error from the numerical term cancellation (the sum of two symbolic terms should have been zero, but it is not zero due to numerical error). Such numerical noise will cause the higher-order terms to be less accurate even if we try to keep them. In Fig. 1, we show the responses from the three-way two-level partitioned  $\mu$ A741 circuit [42] under different maximum reduction orders of rational functions. As we can see, increasing the rational



Fig. 1. Responses of  $\mu$ A741 circuit under different reduction orders.

function order does not increase the accuracy of the response after the order reaches 8. This is the typical numerical stability problem with the moment matching method [28]. However, unlike explicit moment matching methods, hierarchical reduction is numerically stable for tree-structured circuits. We then show the following results.

*Theorem 2:* For tree-structured circuits, the hierarchical reduction process can be performed such that there is no common-factor cancellation in the generated rational functions.

*Proof:* For tree-structured circuits, we can always partition the circuit in such a way that each subcircuit has only one node shared with its parent circuit. As a result, there is only one composite admittance  $a_{u,k_1}^{BI} \Delta_{k_2,k_1}^{II} a_{k_2,v}^{IB} / \det(M^{II})$  in (3) generated in its parent circuit for each subcircuit. According to the common-factor cancellation condition [39], at least four composite elements from the same subcircuit reduction are required for the existence of common-factor cancellation. So no common-factor cancellation will occur under such partitioning. The theorem is proved.

The significance of Theorem 2 is that the hierarchical reduction process becomes numerically stable for almost an arbitrary order of tree circuits. The only cancellation left is the term cancellation, where the sum of two symbolic terms is zero, which will not introduce any noticeable numerical error in the reduction process. Fig. 2 shows the voltage gain response (real part) of an *RC* tree with about 100 nodes (also 100 capacitors) under different reduction orders. As can be seen, the reduced voltage gain will match the exact one well when the kept orders reach about 60.

The fact that no common-factor decancellation (polynomial division) is required was also exploited in the direct truncation of transfer function method (DTT) [16], where only polynomial addition is required to compute the truncated transfer functions for tree-structured RLC circuits. The DTT reduction process can be viewed as a special case of our method. But for general nontree-structured circuits, polynomial division is required in node-elimination-based reduction methods due to common-factor cancellation, and polynomial division due to truncation



Fig. 2. Responses of an RC tree circuit under different reduction orders.

will not be numerically stable for a very high frequency range far away from dc as shown before.

To mitigate this problem, we propose using multipoint expansion for obtaining accurate rational functions or reduced admittance matrices for modeling a general MIMO linear system as will be shown in Section V.

#### IV. PRESERVATION OF RECIPROCITY

A reciprocal network is one in which the power losses are the same between any two ports regardless of the direction of propagation [43]. Mathematically, this is equivalent to the requirement that the circuit admittance matrix is symmetric (or scattering parameter S21=S12, S13=S31, etc.). A network is known to be reciprocal if it is passive and contains only isotropic materials. Reciprocity is an important network property. For hierarchical reduction, we have the following results.

*Theorem 3:* The hierarchical reduction method preserves the reciprocity of a passive circuit matrix.

*Proof:* The proof can be found by using (9) again. We first study a circuit with circuit matrix M. The circuit has one subcircuit with circuit matrix  $M^{II}$ . Assume that the original circuit matrix is symmetric (its subcircuit is also symmetric, i.e., both M and  $M^{II}$  are symmetric) due to reciprocity.

After reduction, the reduced circuit matrix becomes an  $m \times m$  matrix, where each matrix element at row u and column v appears in (9). Then we look at the element at row v and column u, which is

$$a_{u,v}^{BB*} = \frac{\det(M[1,\dots,m,u|1,\dots,m,v])}{\det(M^{II})}.$$
 (10)

Notice that matrix M is symmetric, so row u in (9) and column u in (10) are same. This is true for column v in (9) and row v in (10). As a result, we have

$$M[1, ..., m, v | 1, ..., m, u]$$

$$= M[1, ..., m, u | 1, ..., m, v]^{T}$$

$$\det(M[1, ..., m, v | 1, ..., m, u])$$

$$= \det(M[1, ..., m, u | 1, ..., m, v]^{T}).$$
(12)

Hence,  $a_{u,v}^{BB*} = a_{v,u}^{BB*}$  and reciprocity is preserved in the reduced circuit matrix when a subcircuit is reduced. In hierarchical reduction, we reduce one subcircuit at a time and the reduced circuit matrix is still symmetric after reduction. So the reduced circuit matrix after all the subcircuits are reduced is still symmetric. The theorem is proved.

#### V. MULTIPOINT EXPANSION HIERARCHICAL REDUCTION

The multipoint expansion scheme by real or complex frequency shift has been exploited before in projection-based reduction approaches for improving the modeling accuracy [8], [15]. The basic idea for such a strategy is that dominant poles close to an expansion point are more accurately captured than poles that are far away from the expansion point in the moment-matching-based approximation framework. Therefore, instead of expanding at only one point, we can expand at multiple points along the real or complex axis to accurately capture all the dominant points in the given frequency range.

In this paper, we extend this concept to the hierarchical reduction algorithm. Specifically, at each expansion point, the driving point function or each rational admittance function in a reduced admittance matrix can be written into the partial fraction form

$$f(s) = \sum_{i}^{n} k_i / (s - p_i).$$
 (13)

By intelligently selecting poles and their corresponding residues from different expansions and combining them into one rational function, we can obtain a more accurate rational function for a very high frequency range. In this paper, we propose an explicit waveform-matching scheme based on the hierarchical reduction framework to find dominant poles and their residues for both SISO and MIMO systems. It is shown experimentally to be superior that the existing pole searching algorithm.

#### A. Multipoint Expansion in Hierarchical Reduction

To expand the circuit at an arbitrary location in the complex s-plane, say  $s_k=\alpha_k+\omega_k j$ , we can simply substitute s in (6) by  $s+s_k$ . Then (6) becomes

$$x_i(s) = H_{ij}(s)b_j = \frac{\Delta_{ij}(s+s_k)}{\det(I(s+s_k) - A)}b_j.$$
 (14)

As shown in [8], poles that dominate the transient response in interconnect circuits are near the imaginary axis with large residues. Hence, we expand along the imaginary axis for RF passive and interconnect circuits. Since only capacitors and inductors are associated with the complex frequency variable s, expansion at a real point  $\alpha$  or a complex point  $\omega_i j$  point is essentially equivalent to analyzing a new circuit where each capacitor C has a new resistor (with real value  $\alpha_i C$  or complex value  $\omega_i C j$ ) connected in parallel with it and each inductor L

has a new resistor (with real value  $\alpha_i L$  or complex value  $\omega_i L j$ ) connected in series with it [29].

In this paper, we show that multipoint expansion can be done very efficiently in the hierarchical reduction framework. The rational functions are constructed in bottom-up fashion in Y-parameter determinant decision diagrams (YDDDs) in the hierarchical reduction algorithm [39]. When a capacitor C or an inductor L (its YDDD node) is visited, we build a simple polynomial 0 + Cs or 0 + Ls to multiply or add it with existing polynomials seen at that DDD node. In the presence of a nonzero expansion point  $\alpha_i$  or  $\omega_i j$ , we can simply build a new polynomial  $\alpha_i C + Cs$  or  $\omega_i Cj + Cs$  for the capacitor and  $\alpha_i L + Ls$  or  $\omega_i Lj + Ls$  for the inductor, respectively. So we do not need to rebuild the circuit matrix or the YDDD graphs used for reduction at s=0. Instead, we only need to rebuild the rational functions by visiting every YDDD node once, which has the time complexity linear with the YDDD graph size, a typical time complexity for DDD graph-based methods [36].

#### B. Explicit Waveform-Matching Algorithm

One critical issue in multipoint expansion is to determine at each expansion point which poles are accurate and should be included in the final rational function. In the complex frequency hopping method [8], a binary search strategy was used, where poles (common poles) seen by two expansion points and poles with distance to the expansion points shorter than the common poles are selected. Such a common-pole matching algorithm, however, is very sensitive to the numerical distance criteria for judging if two poles are actually the same poles. For accurately detecting common poles, a small distance is desirable, but it will lead to more expansion points, and even worse is that the same poles may be treated as different poles seen by two different expansion points. Also, this method may fail to detect some dominant poles as the circle for searching accurate poles might be too small as shown in our experimental results.

In this paper, we propose a new reliable pole searching algorithm, which is based on explicit frequency waveform matching. The new algorithm is based on the observation that a complex pole  $p_i$  and its residue  $k_i$  in partial fraction form  $k_i/(s-p_i)$  have the largest impact at frequency  $f_i$  when the imaginary part of the pole equals  $2\pi f_i$ . Fig. 3 shows a typical response of  $k_i/(s-p_i)$ , where  $k_i=2.78\times 10^{12}+2.34\times 10^{10}j$  and  $p_i=-4.93\times 10^8+2.58\times 10^{10}j$ . The peaks of both real (absolute value) and magnitude are around  $4.11\times 10^9$ , which is equal to  $2.58\times 10^{10}/(2\pi)$ . The reason for this is that both real and imaginary parts of  $k_i/(s-p_i)$  reach a peak when their denominator  $(pr)^2+(\omega-pi)^2$  reaches a minimum at  $\omega=pi$ , where  $s=\omega j$  and  $p_i=pr+pij$ . A complex pole with negative imaginary part typically will not have a significant impact on the upper half complex plane.

The idea of frequency waveform matching is to explicitly match the approximate frequency waveform with that of exact ones. Specifically, at an expansion point  $f_i$ , we perform the hierarchical reduction and then determine an accurate maximum frequency range  $[f_i, f_{i+1}]$  such that the errors between responses (magnitude) of the reduced rational function and that



Fig. 3. Responses of a typical  $k_i/(s-p_i)$ .

of the exact one are bounded by a prespecified error bound. The error is computed as

$$err = \frac{|dB20(V_e) - dB20(V_a)|}{|dB20(V_e)|}$$
(15)

where  $dB20(x)=20\log_{10}\left(|x|\right)$  and |x| is the magnitude of a complex number x.  $V_e$  is the exact response and  $V_a$  is the approximate response. If  $|dB20(V_e)|=0$ , then we use  $|dB20(V_a)|$  as the denominator in (15) if it is not zero. If  $|dB20(V_a)|=0$ , we have err =0.

Then,  $f_{i+1}$  will be the next expansion point. All the poles whose imaginary part falls within the range  $[2\pi f_i, 2\pi f_{i+1}]$  will be selected because their contribution in this frequency range is the largest. The new algorithm does not have the duplicate pole issue as accurate poles can only be located at one place. The accuracy of the found poles is assured by explicit waveform matching. Experimental results show that it tends to use less expansion points than the common-pole matching method and less CPU time.

#### C. Multipoint Expansion for MIMO System Reduction

For a MIMO system, by using MNA, the reduced circuit matrix  $M'(s) = [y_{ij}(s)]_{m \times m}$  will become an  $m \times m$  admittance matrix. Each admittance  $y_{ij}$  is a complex rational function with real or complex (if expansion points are on imaginary axis) coefficients. In this case, we explicitly watch for the error between each approximate rational admittance and the exact value of the admittance at each frequency. The exact value of each admittance can be computed by visiting the DDD graph representing the admittance. Since there is a lot of sharing among those admittances, the cost of evaluating all the admittances is similar to evaluating one admittance, considering that every DDD node just needs to be visited once at each frequency point [42].

#### VI. INDUCTANCE MODELS IN HIERARCHICAL REDUCTION

In this section, we discuss the VPEC used in our reduction method. We compare the VPEC model with another

inductance-based nodal susceptance concept. We will show that the inductance model by nodal susceptance is not physically equivalent to inductance as unwanted dc paths are created at low frequency.

#### A. Inductance Formulation in Hierarchical Reduction

For an *RLCM* circuit, when we assume that only independent current sources exist at external ports, the circuit matrix in *s*-domain starting with MNA formulation can be written as

$$\mathcal{G}x + s\mathcal{C}x = Bi(s) \quad v(s) = B^T x$$
 (16)

where x, v, and i are the state variable, output voltage, and input current vectors, and  $\mathcal{G}$ ,  $\mathcal{C}$ , and B are state and input—output matrices, respectively. Equation (16) can be further written as

$$\begin{bmatrix} G & A_l^T \\ -A_l & 0 \end{bmatrix} \begin{bmatrix} v_n \\ i_l \end{bmatrix} + s \begin{bmatrix} C & 0 \\ 0 & sL \end{bmatrix} \begin{bmatrix} v_n \\ i_l \end{bmatrix} = \begin{bmatrix} A_i i_n(s) \\ 0 \end{bmatrix}$$
 (17)

where G and C are the admittance matrices for resistors and capacitors, L is the inductance matrix, which includes the mutual inductance,  $v_n$  is a vector of node voltage,  $i_l$  is a branch current vector of inductors,  $A_l$  is the adjacency matrix for all inductors, and  $A_i$  is the adjacency matrix for all port current sources.

Circuit reduction means applying Gaussian elimination for state variables like node voltage  $v_n$  and branch current  $i_l$ . If we first reduce the branch current vector  $i_l$ , we actually result in a state equation with only nodal voltage variables, i.e.,

$$G + sC + \frac{1}{s} A_l S A_l^T [v_n] = [A_i i_n(s)].$$
 (18)

This is exactly like the NA formulation, where  $S=L^{-1}$  is the susceptance [5] and  $\Gamma=(1/s)A_lSA_l^T$  is the admittance form for the mutual inductance under NA. Circuit reduction by further eliminating the nodal voltage variable  $v_n$  is exactly like the  $Y-\Delta$  transformation in [31].

#### B. Inductance Models by Nodal-Susceptance

The nodal susceptance in (18) actually creates a dc path at the low-frequency range. We illustrate this with a 2-bit interconnect example shown in Fig. 4. The nodal voltage equation of susceptance at four nodes (A,B,C,D) becomes

$$\frac{S_{11}}{s}V_A - \frac{S_{11}}{s}V_B + \frac{S_{21}}{s}V_C - \frac{S_{21}}{s}V_D = I_1$$

$$-\frac{S_{11}}{s}V_A + \frac{S_{11}}{s}V_B - \frac{S_{12}}{s}V_C + \frac{S_{12}}{s}V_D = -I_1$$

$$\frac{S_{12}}{s}V_A - \frac{S_{12}}{s}V_B + \frac{S_{22}}{s}V_C - \frac{S_{22}}{s}V_D = I_2$$

$$-\frac{S_{12}}{s}V_A + \frac{S_{12}}{s}V_B - \frac{S_{22}}{s}V_C + \frac{S_{22}}{s}V_D = -I_2. \quad (19)$$

As shown in Fig. 5, it is mathematically equivalent to stamping six susceptance elements into the admittance matrix [31] when  $s \neq 0$ . However, the susceptance element  $S_{ij}/s$  approaches infinity (thus 0 impedance or short circuit) when s = 0; there exist four unwanted dc paths between nodes



Fig. 4. Example of a coupled 2-bit *RLCM* circuit under PEEC model.



Fig. 5. Example of a coupled 2-bit RLCM circuit under nodal susceptance model.



Fig. 6. Frequency responses of the PEEC model in SPICE, susceptance under NA, and VPEC models for the 2-bit bus.

(A, B, C, D), which do not exist before. As a result, it leads to wrong dc values and inaccurate low-frequency simulation results even for the 2-bit bus example in Fig. 4.

We compute the exact driving-point impedance responses using inductance under MNA and nodal susceptance under NA, respectively, using the symbolic analysis tool [36].

As shown in Fig. 6, NA formulation (by using nodal susceptance for inductance) gives the exact response as SPICE does in the high-frequency range, but the response is not correct in the low-frequency range. When s approaches zero, the actual driving-point impedance in Fig. 4 should be dominated by three



Fig. 7. Example of a coupled 2-bit *RLCM* circuit under the VPEC model.

capacitors with total capacitance value 43fF. However, for Fig. 5 at dc, the driving point impedance becomes a resistor with a total resistance value of 234  $\Omega$  [or 49  $\Omega$  (dB)] due to unwanted dc paths.

The reason for such a discrepancy is that when s=0,  $L^{-1}$  cannot be computed as L becomes singular. As a result, the NA formulation of inductance, which is based on  $L^{-1}$ , is no longer equivalent to the original circuit matrix. Hence, circuit reduction starting with nodal susceptance formulation cannot give the correct low-frequency response, in general, and is not suitable for generating the wideband macromodel of interconnects.

#### C. Formulation by VPEC Model

From the above discussion, we know that inductance formulation by nodal susceptance leads to an inaccurate low-frequency response. It is not suitable for generating the reduced interconnect model for wideband applications. However, directly handling mutual inductance in a dense MNA formulation as in [3] will be computationally expensive. As shown in [44], the sparsified VPEC model actually not only achieves the runtime speedup but also has the high accuracy compared to the original full model. Therefore, we use1 the VPEC model to represent inductance in our circuit reduction flow, as it enables passive presparsification [27], [44].

The significant difference between VPEC and nodal susceptance models for mutual inductance is that VPEC is a physically equivalent model and can exactly represent the original system [44]. As shown in Fig. 7, this model consists of an electrical circuit (PEEC resistance and capacitance) and a magnetic circuit (VPEC effective resistance and controlled source). It includes the following components: 1) wire resistance and capacitance are the same as in the PEEC model; 2) a dummy voltage source (sensing electrical current  $I_i$ ) to control  $\hat{I}_i$ ; 3) a voltage-controlled current source to relate  $\hat{V}_i$  and  $\hat{I}_i$  with gain g=1; 4) an electrical voltage source  $V_i$  controlled by  $\hat{V}_i$ ; 5) effective resistors including ground  $\hat{R}_{i0}$  and coupling  $\hat{R}_{ij}$  to consider the strength of inductances; and 6) a unit inductance  $L_i$  to a) take into account the time derivative of  $A_i$  and b) preserve the magnetic energy from the electronic circuit.

Clearly, this SPICE-compatible implementation does not introduce unwanted dc paths when s=0 as that by the nodal

susceptance. Moreover, Fig. 6 shows the response of the VPEC model for the 2-bit circuit, which is identical to SPICE for the entire frequency range. Detailed analysis also shows that the impedance function of the 2-bit circuit modeled by VPEC model is the same as the impedance function by the PEEC model [45].

As shown in [44], although the VPEC model introduces more circuit elements, it has faster runtime because this model dramatically reduces reactive elements (i.e., inductors), leads to less numerical derivatives and integrals, and makes the simulation converge faster. To further improve the sparsified VPEC model extraction without full inversion as in [44], we extend a windowing technique [5]. It reduces the computation complexity to  $[O(Nb^3)]$ , where b is the size of the window (i.e., the size of the submatrix). Note that although the VPEC model enables efficient inductance simulation, the order of the circuit matrix is still high. Moreover, its SPICE-compatible model still contains controlled sources and cannot be handled by the existing realizable circuit reduction approaches [3], [31].

#### VII. PASSIVITY ENFORCEMENT

In this section, we present the state-space-based passivity enforcement method, which is based on the method used in [7]. But we show how this method can be used in our hierarchical model order reduction (HMOR) framework to enforce passivity of the model order reduced admittance matrix  $\tilde{\mathbf{Y}}(s)$ .

Passivity is an important property of many physical systems. Brune [6] has proved that the admittance and impedance matrices of an electrical circuit consisting of an interconnection of a finite number of positive R, positive C, positive L, and transformers are passive if and only if (iff) its rational functions are positive real (PR). It can be proved that the following statements are equivalent:

- 1) A transfer function matrix  $\mathbf{Y}(s)$  is PR.
- 2) Let (ABCD) be a minimal controllable state-space representation of  $\mathbf{Y}(s)$ .  $\exists K$

$$K = K^T, \qquad K \ge 0 \tag{20}$$

such that the linear matrix inequality (LMI)

$$\begin{bmatrix} A^T K + KA & KB - C^T \\ B^T K - C & -D - D^T \end{bmatrix} \ge 0 \tag{21}$$

holds.

Constraint (21) actually comes from the Kalman Yakubovich Popov (KYP) Lemma, which establishes important relations between state-space and frequency-domain objects. KYP was introduced in control theory and later used in network synthesis [4].

If we include the term proportional to s in the transfer function, which means we need to know what happens in infinite frequency, we can write the admittance matrix in terms of (ABCD) as

$$\mathbf{Y}(s) = sY^{\infty} + D + C(sI - A)^{-1}B$$
 (22)

where I denotes the identify matrix with the same dimension as A. To keep the transfer function PR,  $Y^{\infty}$  must satisfy

$$Y^{\infty} = (Y^{\infty})^T, \qquad Y^{\infty} \ge 0. \tag{23}$$

Therefore, we can transform the problem of checking whether the admittance matrix  $\mathbf{Y}(s)$  is PR into the problem of checking whether its corresponding state-space model in terms of (ABCD) is positive semidefinite. More important is that we can use the PR criterion in terms of the state-space form to enforce the passivity of the reduced circuit matrices as shown in the next section.

### A. State-Space Model Representation of $\tilde{\mathbf{Y}}(s)$

After the multipoint hierarchical model reduction, an n-port-order reduced admittance matrix is generated as

$$\tilde{\mathbf{Y}}(s) = \begin{bmatrix} \tilde{\mathbf{Y}}_{1,1} & \cdots & \tilde{\mathbf{Y}}_{1,n} \\ \vdots & \ddots & \vdots \\ \tilde{\mathbf{Y}}_{n,1} & \cdots & \tilde{\mathbf{Y}}_{n,n} \end{bmatrix}$$
(24)

where each  $\mathbf{\tilde{Y}}_{p,q}$  is a rational function of s. The reduction process can capture the entire dominant complex poles, which means that there are no poles in the right hand plane (RHP) of the complex plane.

The first step that we do is to transform the admittance matrix  $\tilde{\mathbf{Y}}(s)$  into its state-space representation. We assume that all rational functions in the matrix share the common poles of the system. If there are private poles appearing on the leading diagonal element, we can separate them and their residues from the whole rational function after partial fraction decomposition and realize them separately.

Given a multivariable n-port network, each rational function  $\tilde{\mathbf{Y}}_{p,q}$  is considered as an SISO subsystem and mapped to its state-space representation in the controllable canonical form, which corresponds to  $(A_{q,q}B_{q,q}C_{p,q}D_{p,q})$  in the matrix of (ABCD), respectively. Now we can write its state-space representation as

$$A = \begin{bmatrix} A_{1,1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & A_{n,n} \end{bmatrix}$$

$$B = \begin{bmatrix} B_{1,1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & B_{n,n} \end{bmatrix}$$

$$C = \begin{bmatrix} C_{1,1} & \cdots & C_{1,n} \\ \vdots & \ddots & \vdots \\ C_{n,1} & \cdots & C_{n,n} \end{bmatrix}$$

$$D = \begin{bmatrix} D_{1,1} & \cdots & D_{1,n} \\ \vdots & \ddots & \vdots \\ D_{n,1} & \cdots & D_{n,n} \end{bmatrix}.$$
(25)

Also, this mapping process could be viewed as an n set of single-input multiple-output (SIMO) subsystems. If we choose

the *m*th port as the input port, the *m*th column admittance rational function can be mapped into  $(A_{m,m}B_{m,m}C_{:,m}D_{:,m})$ .

#### B. Passivity Enforcement Optimization

In this section, we briefly mention how passivity enforcement is done via a convex optimization process on the state-space representation of the admittance matrix.

Assume that we have obtained the admittance matrix of a model order reduced system  $\tilde{\mathbf{Y}}(s)$  with a set of N sampling points. Let  $\tilde{\mathbf{Y}}_{p,q}(s)$  denote the (p,q) entry of the transfer function  $\tilde{\mathbf{Y}}(s)$ . Let  $\hat{\mathbf{Y}}_{p,q}(s_k)$  be the exact values of admittance at the entry (p,q) at the kth frequency point, which can be obtained by the exact hierarchical symbolic analysis [42].

The optimization problem is to determine C, D, and  $Y^{\infty}$  such that a cost function is minimized with constraints on the error  $(\hat{\mathbf{Y}}_{p,q} - \tilde{\mathbf{Y}}_{p,q})$ . Here, the constraints are on the weighted least square error taken over N frequencies as

$$\sum_{k=1}^{N} w_{k,p,q} \left\| \hat{\mathbf{Y}}_{p,q}(s_k) - \tilde{\mathbf{Y}}_{p,q}(s_k) \right\|_{2}^{2} \le t_{p,q}.$$
 (26)

It is shown in [7] that the optimization problem in (26) subject to constrains in (21) can be transformed into a convex programming problem, which can be solved efficiently by some existing convex programming programs.

We notice that passivity enforcement was done by the compensation-based approach proposed in [1]. But this method does not ensure accurate matching because the compensated part may have significant impacts on the frequency range that we are interested.

#### VIII. MULTIPORT CIRCUIT REALIZATION

Once the new C, D, and  $Y^{\infty}$  are obtained by convex programming, the new passivity-enforced  $\mathbf{Y}(s)$  are constructed again by (22). We now discuss how to generate realized macromodels for both frequency and time-domain simulation.

#### A. Relaxed One-Port Realization

We start with one-port network realization. For a one-port model with driving-point admittance function, we propose to use a generalized Foster's canonical form based realization to directly synthesize the admittance function.

To synthesize the one-port model from the driving-point admittance rational function Y(s), we first rewrite it in Foster's canonical form [43]

$$Y(s) = sY_{\infty} + Y_0 + \sum_{m=1}^{M} \frac{a_m}{s - p_m} + \sum_{n=1}^{N} \left( \frac{a_n}{s - p_n} + \frac{a_n^*}{s - p_n^*} \right)$$
(27)

where we expand the rational function into the partial fraction form with N conjugate-poles  $p_n$  and M real poles  $p_m$ .



Fig. 8. One-port Foster admittance realization.

The admittance function in Foster's canonical form can then be synthesized by an equivalent circuit in Fig. 8 with the relations to determine R, L, C, and G elements given as

$$G_{s} = Y_{0}$$

$$C_{s} = Y_{\infty}$$

$$R_{m_{-}m} = \frac{1}{a_{m}}$$

$$L_{m_{-}m} = -\frac{p_{m}}{a_{m}}$$

$$L_{n_{-}n} = \frac{1}{2Re\{a_{n}\}}$$

$$L_{n_{-}n}C_{n_{-}n}|p_{n}|^{2} = R_{n_{-}n}G_{n_{-}n} + 1$$

$$\frac{G_{n_{-}n}}{C_{n_{-}n}} = -\frac{Re\{a_{n}p_{n}^{*}\}}{Re\{a_{n}\}}$$

$$\frac{R_{n_{-}n}}{L_{n_{-}n}} = \frac{Re\{a_{n}p_{n}^{*}\}}{Re\{a_{n}\}} - 2Re\{p_{n}\}.$$
(28)

Some existing works like PRIME [24] require every complex pole pairs to be physically realizable (every RCL element is positive), which is over constrained and may lead to significant errors when unrealizable pole pairs are discarded or their residues are changed. In our approach, we relax those constraints by allowing some negative RLC elements. But the passivity of the admittance function will still be guaranteed by the passivity enforcement procedure since the realization is error free and reversible and does not change the passivity of the realized system.

#### B. Multiport Realization

For a passive multiport order reduced admittance matrix, we propose a general complete graph structure (in case of full admittance matrix) to realize the admittance matrix based on one-port realization. In the following, we first illustrate how a two-port network is realized and then extend this concept for general n-port network realization.

Given a  $2 \times 2$  passive admittance matrix, which is obtained by the HMOR method

$$Y_{2\times 2}(s) = \begin{bmatrix} y_{11}(s) & y_{12}(s) \\ y_{21}(s) & y_{22}(s) \end{bmatrix}$$
 (29)

it can be realized exactly by using the  $\Pi$  structure template shown in Fig. 9, where each branch admittance will be realized by the one-port Foster's expansion method shown in Fig. 8.



Fig. 9. General two-port realization  $\Pi$  model.



Fig. 10. Six-port realization based on the  $\Pi$  structure.

Based on this template, such a realization can be easily extended to the multiport case, i.e.,

$$Y_{n\times n}(s) = \begin{bmatrix} y_{11}(s) & \cdots & y_{1n}(s) \\ \vdots & \ddots & \vdots \\ y_{n1}(s) & \cdots & y_{nn}(s) \end{bmatrix}.$$
 (30)

Generally, for a reduced n-port network with a full  $n \times n$  admittance matrix as shown in (30), the realized network will be a complete graph where each branch represents an admittance, which is realized by the one-port realization method. For instance, Fig. 10 shows a realization of a synthesized sixport network. The branch admittance of the mth port branch (the branch between the port and ground) is the sum of all the mth row admittances, and the admittance of the branch between the port and any other port is the negative value of the corresponding admittance.

Notice that our realized circuits automatically preserve the reciprocity of the original circuit matrix as it requires the admittance to be symmetric.

#### IX. EXPERIMENTAL RESULTS

The proposed algorithm has been implemented in C++, and all the data are collected on a Linux workstation with dual 1.6-GHz AMD Althon CPUs and 2-GB memory. The convex programming problem is solved using some standard optimization packages. We use SeDuMi [38] and SeDuMi Interface to solve the convex programming problem for passivity enforcement.

#### A. One-Port Macromodel for Spiral Inductor

We first construct the detailed PEEC model for a three-turn spiral inductor with its substrate. We assume copper ( $\rho=1.7\times10^{-8}~\Omega\cdot \text{m}$ ) for the metal and the low-k dielectric ( $\epsilon=2.0$ ). The substrate is modeled as a lossy ground plane (heavily doped) with  $\rho=1.0\times10^{-5}~\Omega\cdot \text{m}$ . The conductor is volume



Fig. 11. Frequency response of the three-turn spiral inductor and its reduced model by using waveform matching and common-pole method.

discretized according to skin depth and longitudinal segmented by one tenth of a wavelength. The substrate is also discretized as in [23]. The capacitance is extracted by FastCap [25], and only adjacent capacitive coupling is considered since capacitive coupling is short range. Partial inductance is extracted by FastHenry at 50 GHz [17]. Inductive coupling between any pair of segments (including segments in the same line) is considered. Then, we generate the distributed PEEC model by  $\pi$ -type of RLC topology to connect each segment, and it results in a SPICE netlist with 232 passive RLCM elements. The substrate parasitic contribution (Eddy current loss) is lumped into the above conductor segment. Note that for more accurate extraction at ultrahigh frequency, it needs full-wave PEEC model description [18]. For mutual inductance, a VPEC is used [44], which is more hierarchical reduction friendly as no coupling inductor branch currents are involved and circuit partition can be done easily.

1) Comparison With Common-Pole Matching Method in Frequency Domain: For the spiral inductor, the driving point impedance is obtained by the multipoint hierarchical reduction process. We use both the common-pole matching algorithm in the complex frequency hopping method and the new waveformmatching algorithm to search for dominant poles along the imaginary axis.

For a fair comparison, we make sure that the resulting rational functions will have similar accuracy. For a commonpole matching algorithm, if two poles are located within 1% of their magnitude, they are regarded as the same pole. For the waveform-matching algorithm, the error bound between the approximate one and the exact is set to 0.1%. As a result, a common-pole approach takes 26 expansions with 37.1 s, while the waveform marching method uses 15 expansions with 22.57 s. The responses obtained using both methods versus the exact response up to 100 GHz are shown in the Fig. 11. The responses from both methods match the exact ones very well all the way to 100 GHz. Our experience shows that CPU time of the common-pole method highly depends on the



Fig. 12. Colpitts LC oscillator with spiral inductors.



Fig. 13. Time domain comparison between original and synthesized models for a Colpitts LC oscillator with a three-turn spiral inductor.

common-pole detection criteria. For instance, if we set the criteria for common-pole detection to 0.5%, then 65 expansions are carried out. Also, as more expansions are carried out, the chances that a single pole is seen by two consecutive expansion points becomes larger, but it may be treated as different poles due to a small distance criteria, which in turn leads to significant distortion of the frequency response.

2) Time-Domain Simulation of a LC Oscillator: We further demonstrate the accuracy and efficiency of the inductor macromodel in the time-domain harmonic simulation. Note that the synthesized one-port macromodel can be used to efficiently predict the critical performance parameters of the spiral inductor, such as  $\omega_T$ , Q factor, and even the resonance starting condition for an oscillator [32].

We use Colpitts LC oscillator as an example as shown in Fig. 12(b), where the active circuit behaves like a negative resistance to make the oscillator work as shown in Fig. 12(a).

In this experiment, the synthesized one-port model is from a 25-order rational function with 24 poles and results in a macromodel with 40 RLC elements. As shown in Fig. 13, the waveform at steady state of synthesized model and original model matches very well, but we observe a  $10\times (5.17 \text{ s versus } 0.52 \text{ s})$  runtime speedup by using the reduced model.



Fig. 14. Frequency responses of  $Y_{11}$  of a 2-bit transmission line.



Fig. 15. Frequency responses of  $Y_{12}$  of a 2-bit transmission line.

#### B. Multiport Macromodel of Coupled Transmission Line

We then use a 2-bit coupled transmission line as the example for multiport reduction and synthesis. The original PEEC model contains 42 resistors, 63 capacitors, 40 self-inductors, and 760 mutual inductors, where we consider inductive coupling between any two segments including those in the same line. Still, for mutual inductance, a VPEC is used.

The matching frequency is up to 31 GHz, and we find 24 dominant poles in this range. There are 150 RLC elements in the synthesized circuit compared with 364 devices in the original circuit, which represent a 58.79% reduction rate. The frequency responses for  $Y_{11}(s)$  and  $Y_{12}(s)$  are shown in Figs. 14 and 15, respectively. If we only match to 14 GHz, 12 poles are required and we can achieve a 78.5% reduction rate instead. The time-domain step responses from the original circuit, the 14-GHz synthesized circuit, and the 31-GHz synthesized circuit



Fig. 16. Transient responses of a 2-bit transmission line.



Fig. 17. Frequency responses of a 2-bit transmission line at two ports.

are shown in Fig. 16. The difference among these three circuits is fairly small.

In Fig. 17, we further compare the frequency responses of the 24th-order macromodel by HMOR, the 24th-order macromodel by PRIMA, and the time-constant-based reduction (with similar reduction ratio) with the original circuit. The frequency response of port-1 is observed at the input port of the first bit, and that of port-2 is at the far-end of the first bit. Due to the preserved reciprocity, the reduced mode is easily realized by Foster's synthesis, and the model size is half of the SPICE-compatible circuit by PRIMA (via recursive convolution for each  $Y_{ij}$ ). Moreover, as shown in Fig. 17, the accuracy of the 24th-order model by H-reduction can match up to 30 GHz, but the same order model by PRIMA can only match up to 20 GHz. Note that under similar reduction ratio with H-reduction, the time-constant-based reduction can only match up to 5 GHz.

TABLE I COMPARISON OF REDUCTION CPU TIMES

| ſ | Circuits | #Elements | PRIMA(sec) | HMOR(sec) |
|---|----------|-----------|------------|-----------|
|   | ckt1     | 84        | 0.042      | 0.6       |
|   | ckt2     | 258       | 0.077      | 11.7      |
| ſ | ckt3     | 905       | 0.16       | 17.8      |
| ſ | ckt4     | 5255      | 1.73       | 44.4      |
|   | ckt5     | 20505     | 32.37      | 96.8      |

#### C. Scalability Comparison With Existing Methods

Table I gives the reduction CPU time comparison for the two methods. HMOR denotes the CPU times of hierarchical reduction. We notice that HMOR is slower than PRIMA. But the difference becomes less for large circuits. Theoretically, PRIMA and the one-point hierarchical model reduction have the same time complexity, that is, time of complexity of one Gaussian elimination. In PRIMA, we have to solve the circuit matrix at least once using Gaussian elimination or LU decomposition to solve for all the Krylov space base vectors (or moments). In the hierarchical reduction method, if we reduce one node at a time, it becomes the Gaussian elimination process. All the polynomial operations with fixed order have fixed computing costs. The efficiency difference is mainly due to expensive recursive operations used in graph operations, which can be further improved, and multipoint matching. However, multipoint matching makes our method closed-loop, which gives us good control on the model accuracy. For PRIMA, the model accuracy cannot be determined without several trials using different reduction orders.

We finally present a scalability comparison in Table II by the time-domain transient simulation for the following aspects: 1) runtime of simulation; 2) realization efficiency (realized model size); and 3) accuracy in terms of delay. Several different sized *RLCM* circuits are used. We compare the our method with the time-constant-based circuit reduction [3] and projection-based reduction PRIMA implemented at [21]. The same number of poles are used for the reduction when we compare our H-reduction with PRIMA. The reduced model by time-constant reduction is obtained with the similar model size as H-reduction.

First, we found that our realized RLCG circuit model size is up to  $10\times$  smaller on average than the SPICE-compatible circuit from PRIMA. Therefore, a similar simulation speedup  $(8\times)$  is observed when we run both circuits in SPICE3. When we further compare the simulation time of our reduced models with the PEEC circuits, a significant speedup (up to 2712x for ckt5) is obtained. Furthermore, the waveform accuracy in terms of delay is given in columns 12–14. The reduced models are very accurate with the worst case delay error being -1.04% even with  $478\times(957.9~\text{kb})$  versus 1.92~kb) reduction ratio in terms of model size. But for the same reduction ratio as our reduction, we found that time-constant-based reduction introduces large errors (up to 12.91%) because too many nodes are to be eliminated and the reduction criteria cannot be satisfied.

Note that the sparsification in the VPEC model can dramatically reduce the number of mutual inductive couplings but can also maintain the accuracy [44]. As a result, we use this technique during our reduction for larger circuits. For example,

|          |           |        |              | -           |       |          |                |             | 10    |          | 10              | 1.2         | 1.1    |
|----------|-----------|--------|--------------|-------------|-------|----------|----------------|-------------|-------|----------|-----------------|-------------|--------|
| I        | 2         | 3      | 4            | 5           | 6     | 7        | 8              | 9           | 10    | 11       | 12              | 13          | 14     |
| Circuits | #Elements | #Poles | Simu-time(s) |             |       |          | Model-size(Kb) |             |       |          | Delay-error (%) |             |        |
|          |           |        | H-redu       | time-const. | Prima | Original | H-redu         | time-const. | Prima | Original | H-redu          | time-const. | Prima  |
| ckt1     | 84        | 10     | 0.15         | 0.12        | 0.50  | 0.51     | 0.85           | 0.86        | 3.51  | 3.52     | -0.16%          | -0.86%      | -0.15% |
| ckt2     | 258       | 10     | 0.15         | 0.15        | 0.92  | 5.31     | 0.85           | 0.86        | 3.51  | 11.23    | -0.24%          | -1.12%      | -0.22% |
| ckt3     | 905       | 25     | 0.32         | 0.43        | 2.25  | 19.51    | 1.68           | 1.70        | 19.8  | 41.1     | -0.41%          | -4.43%      | -0.37% |
| ckt4     | 5255      | 30     | 0.52         | 0.89        | 3.13  | 661.46   | 1.92           | 2.23        | 28.2  | 243.6    | -0.62%          | -6.83%      | -0.58% |
| ckt5     | 20505     | 30     | 0.52         | 1.08        | 5.98  | 1356.66  | 1.92           | 2.53        | 28.2  | 957.9    | -1.04%          | -12.91%     | -0.83% |

TABLE II
SIMULATION EFFICIENCY COMPARISON BETWEEN ORIGINAL
AND SYNTHESIZED MODELS

in the case of ckt5 (the largest one) in Table II, we obtain a 97.5% sparsification from 19 900 to 498 mutual inductors. Due to this sparsification, it reduces the reduction time by  $10 \times (365.4-47.8 \text{ s})$ .

#### X. CONCLUSION

We have proposed a new hierarchical multipoint reduction algorithm for the wideband modeling of high-performance RF passive and linear(ized) analog circuits. In the theoretical side, we showed that s-domain hierarchical reduction is equivalent to the implicit moment matching at around s = 0 and that hierarchical one-point reduction is numerically stable for general tree-structured circuits. We also showed that hierarchical reduction preserves the reciprocity of passive circuit matrices. In the practical side, we have proposed a hierarchical multipoint reduction scheme for the high-fidelity wideband modeling of general passive and active linear circuits. A novel explicit waveform-matching algorithm is proposed for searching dominant poles and their residues from different expansion points, which is shown to be more efficient than the existing pole search algorithm. The passivity of reduced models is enforced by the state-space-based optimization method. We also proposed a general multiport network realization framework to generate SPICE-compatible circuits as the macromodels of the reduced circuit admittance matrices. The resulting modeling algorithm can generate the multiport passive SPICE-compatible model for any linear passive network with easily controlled model accuracy and complexity. Experimental results on a number of passive RF and interconnect circuits have shown that the new proposed macromodeling technique generates more compact models given the same accuracy requirements than existing approaches like PRIMA and time-constant methods.

#### REFERENCES

- [1] R. Achar, P. K. Gunupudi, M. Nakhla, and E. Chiprout, "Passive interconnect reduction algorithm for distributed/measured networks," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 47, no. 4, pp. 287–301, Apr. 2000.
- [2] D. Allstot, K. Choi, and J. Park, Parasitic-Aware Optimization of CMOS RF Circuits. Boston, MA: Kluwer, 2003.
- [3] C. S. Amin, M. H. Chowdhury, and Y. I. Ismail, "Realizable RLCK circuit crunching," in Proc. DAC, Anaheim, CA, 2003, pp. 226–231.
- [4] B. D. O. Anderson and S. Vongpanitlerd, Network Analysis and Synthesis. Englewood Cliffs, NJ: Prentice-Hall, 1973.
- [5] M. Beattie and L. Pileggi, "Efficient inductance extraction via windowing," in *Proc. Eur. DATE*, Munich, Germany, 2001, pp. 430–436.
- [6] O. Brune, "Synthesis of a finite two-terminal network whose driving point impedance is a prescribed function of frequency," *J. Math. Phys.*, vol. 10, no. 3, pp. 191–236, 1931.
- [7] J. P. C. P. Coelho and L. M. Silveira, "A convex programming approach for generating guaranteed passive approximations to tabulated frequency-

- data," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 23, no. 2, pp. 293–301, Feb. 2004.
- [8] E. Chiprout and M. S. Nakhla, "Analysis of interconnect networks using complex frequency hopping," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 14, no. 2, pp. 186–200, Feb. 1995.
- [9] C. P. Coelho, J. R. Phillips, and L. M. Silveira, "A convex programming approach to positive real rational approximation," in *Proc. ICCAD*, San Jose, CA, Nov. 2001, pp. 245–251.
- [10] P. J. H. Elias and N. P. van der Meijs, "Including higher-order moments of RC interconnections in layout-to-circuit extraction," in Proc. Eur. DATE, Paris, France, Mar. 1996, pp. 362–366.
- [11] P. Feldmann and R. W. Freund, "Efficient linear circuit analysis by Padé approximation via the Lanczos process," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 14, no. 5, pp. 639–649, May 1995.
- [12] R. W. Freund, "SPRIM: Structure-preserving reduced-order interconnect macromodeling," in *Proc. ICCAD*, San Jose, CA, 2004, pp. 80–87.
- [13] S. Grivet-Talocia, I. A. Maio, and F. Canavero, "Recent advances in reduced-order modeling of complex interconnects," in *Proc. Dig. Electrical Performance Electronic Packaging*, Cambridge, MA, Oct. 2001, vol. 10, pp. 243–246.
- [14] B. Gustavsen and A. Semlyen, "Rational approximation of frequency domain responses by vector fitting," *IEEE Trans. Power Del.*, vol. 14, no. 3, pp. 1052–1061, Mar. 1999.
- [15] X. Huang, V. Rahjavan, and R. A. Rohrer, "Awesim: A program for efficient analysis of linear(ized) circuits," in *Proc. ICCAD*, Santa Clara, CA, 1990, pp. 534–537.
- [16] Y. Ismail and E. G. Friedman, "DTT: Direct truncation of the transfer function—An alternative to moment matching for tree structured interconnect," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 21, no. 2, pp. 131–144, Feb. 2003.
- [17] M. Kamon, M. Tsuk, and J. White, "FastHenry: A multipole-accelerated 3-D inductance extraction program," *IEEE Trans. Microw. Theory Tech.*, vol. 42, no. 9, pp. 1750–1758, Sep. 1994.
- [18] S. Kapur and D. Long, "IES<sup>3</sup>: A fast integral equation solver for efficient 3-dimensional extraction," in *Proc. ICCAD*, San Jose, CA, 1997, pp. 448–455.
- [19] C. Kashyap and B. Krauter, "A realizable driving point model for on-chip interconnect with inductance," in *Proc. DAC*, Los Angeles, CA, 2000, pp. 190–195.
- [20] K. J. Kerns and A. T. Yang, "Stable and efficient reduction of large, multiport RC network by pole analysis via congruence transformations," IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 16, no. 7, pp. 734–744, Jul. 1998.
- [21] Y. Kim and D. Sylvester, *Indigo: PEEC based RLC extraction and analysis tool for High-Speed On-Chip Interconnect, http://www.eecs.umich.edu/~kimyz/peec.htm*, EECS Dept. Univ. Michigan.
- [22] J. Lillis, C. Cheng, S. Lin, and N. Chang, *Interconnect Analysis and Synthesis*. New York: Wiley, 1999.
- [23] Y. Massoud and J. White, "Simulation and modeling of the effect of substrate conductivity on coupling inductance and circuit crosstalk," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 10, no. 3, pp. 286–291, Jun. 2002.
- [24] J. Morsey and A. C. Cangellaris, "PRIME: Passive realization of interconnect models from measured data," in *Proc. Electrical Performance Electronic Packaging*, Cambridge, MA, Oct. 2001, pp. 47–50.
- [25] K. Narbos and J. White, "FastCap: A multipole accelerated 3D capacitance extraction program," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 10, no. 11, pp. 1447–1459, Nov. 1991.
- [26] A. Odabasioglu, M. Celik, and L. Pileggi, "PRIMA: Passive reducedorder interconnect macromodeling algorithm," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 17, no. 8, pp. 645–654, Aug. 1998.
- [27] A. Pacelli, "A local circuit topology for inductive parasitics," in *Proc. ICCAD*, San Jose, CA, 2002, pp. 208–214.

- [28] L. T. Pillage and R. A. Rohrer, "Asymptotic waveform evaluation for timing analysis," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 9, no. 4, pp. 352–366, Apr. 1990.
- [29] L. T. Pillage, R. A. Rohrer, and C. Visweswariah, Electronic Circuit and System Simulation Methods. New York: McGraw-Hill, 1994.
- [30] Z. Qi, S. X.-D. Tan, H. Yu, L. He, and P. Liu, "Wideband modeling of RF/analog circuits via hierarchical multi-point model order reduction," in *Proc. ASPDAC*, Shanghai, China, Jan. 2005, pp. 224–229.
- [31] Z. Qin and C. Cheng, "Realizable parasitic reduction using generalized  $Y-\Delta$  transformation," in *Proc. DAC*, Anaheim, CA, 2003, pp. 220–225.
- [32] B. Razavi, *RF Microelectronics*. Upper Saddle River, NJ: Prentice-Hall, 1998
- [33] D. Saraswat, R. Achar, and M. Nakhla, "A fast algorithm and practical considerations for passive macromodeling of measured/simulated data," *IEEE Trans. Adv. Packag.*, vol. 27, no. 1, pp. 57–70, Feb. 2004.
- [34] B. N. Sheehan, "TICER: Realizable reduction of extracted *RC* circuits," in *Proc. ICCAD*, San Jose, CA, 1999, pp. 200–203.
- [35] —, "Branch merge reduction of RLCM networks," in Proc. ICCAD, San Jose, CA, 2003, pp. 658–664.
- [36] C.-J. Shi and X.-D. Tan, "Canonical symbolic analysis of large analog circuits with determinant decision diagrams," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 19, no. 1, pp. 1–18, Jan. 2000.
- [37] M. Silveira, M. Kamon, I. Elfadel, and J. White, "A coordinate-transformed Arnoldi algorithm for generating guaranteed stable reduced-order models of *RLC* circuits," in *Proc. ICCAD*, San Jose, CA, 1996, pp. 288–294.
- [38] J. Sturm, "Using SeDuMi 1.02, a Matlab Toolbox for optimization over symmetric cones," *Optim. Methods. Softw.*, vol. 11–12, pp. 625–653, 1999.
- [39] S. X.-D. Tan, "A general s-domain hierarchical network reduction algorithm," in Proc. ICCAD, San Jose, CA, 2003, pp. 650–657.
- [40] S. X.-D. Tan, W. Guo, and Z. Qi, "Hierarchical approach to exact symbolic analysis of large analog circuits," in *Proc. DAC*, San Diego, CA, Jun. 2004, pp. 860–863.
- [41] S. X.-D. Tan, Z. Qi, and H. Li, "Hierarchical modeling and simulation of large analog circuits," in *Proc. Eur. DATE*, Paris, France, Feb. 2004, pp. 740–741.
- [42] X.-D. Tan and C.-J. Shi, "Hierarchical symbolic analysis of large analog circuits via determinant decision diagrams," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 19, no. 4, pp. 401–412, Apr. 2000.
- [43] G. C. Temes and J. Lapatra, *Introduction to Circuit Synthesis and Design*. New York: McGraw-Hill, 1977.
- [44] H. Yu and L. He, "Vector potential equivalent circuit based on PEEC inversion," in *Proc. DAC*, Anaheim, CA, 2003, pp. 723–781.
- [45] H. Yu, L. He, Z. Qi, and S. X.-D. Tan, "A wideband realizable circuitreduction for *RLCM* interconnects," in *Proc. ASPDAC*, Shanghai, China, Jan. 2005, pp. 111–114.



Zhenyu Qi (S'04) received the B.S. degree in electrical engineering from Fudan University, Shanghai, China, in 2003, the M.S. degree in electrical engineering from the University of California at Riverside, in 2004, and is currently working toward the Ph.D. degree at the University of California at Riverside.

His research interests include interconnect modeling, high-performance on-chip power/ground network optimization, and fast simulation for RF and thermal circuits.



**Hao Yu** (S'02) received the B.S. degree in physics from Fudan University, Shanghai, China, in 1999, and is currently working toward the Ph.D. degree in electrical engineering at the University of California at Los Angeles.

In the summer of 2001–2003, he was with the Mixed-Mode CAD Development Group, Analog Devices Inc., Norwood, MA. His research interests include computer-aided design of very large scale integration (VLSI) circuits and systems, interconnect modeling/simulation, and signal integrity characterization/optimization.



**Pu Liu** (S'05) received the M.S. degree in control theory and control engineering from the Beijing Institute of Technology, Beijing, China, in 2002, and is currently working toward the Ph.D. degree in electrical engineering at the University of California at Riverside.

His research interests focus on behavioral modeling and model order reduction techniques for linear, linear time varying system, and nonlinear system, fast electrical and thermal analysis, and optimization.



**Sheldon X.-D. Tan** (S'96–M'99–SM'06) received the B.S. and M.S. degrees in electrical engineering from Fudan University, Shanghai, China, and the Ph.D. degree in electrical and computer engineering from the University of Iowa, Iowa City, in 1992, 1995, and 1999, respectively.

He is an Associate Professor in the Department of Electrical Engineering, University of California (UC), Riverside. He was a Faculty Member in the Electrical Engineering Department of Fudan University from 1995 to 1996. He was with Monterey

Design Systems Inc., Mountain View, CA, from 1999 to 2001, and Altera Corporation, San Jose, CA, from 2001 to 2002. His research interests include several aspects of design automation for very large scale integration (VLSI) integrated circuits, namely modeling and simulation of analog/radio frequency/mixed-signal VLSI circuits, high-performance power and clock distribution network simulation and design, signal integrity, power modeling, thermal modeling, thermal optimization in VLSI physical and architecture levels, and embedded system designs based on field-programmable gate array platforms. He is a coauthor of the book, *Symbolic Analysis and Reduction of VLSI Circuits* (Springer/Kluwer, 2005).

Dr. Tan is the recipient of the NSF CAREER Award in 2004. He also received the UC Regent's Faculty Fellowship in 2004. He received a Best Paper Award Nomination from the 2005 IEEE/ACM Design Automation Conference, the Best Paper Award from the 1999 IEEE/ACM Design Automation Conference, and the Best Poster Award from the 1999 Spring Meeting of the NSF Center for Design of Analog and Digital Integrated Circuits. He is a Technical Program Committee Member of ASPDAC'05, BMAS'05, ASPDAC'06, ISQED'06, and ICCAD'06.



**Lei He** (S'01–M'04) received the Ph.D. degree in computer science from the University of California at Los Angeles (UCLA), in 1999.

From 1999 to 2001, he was a faculty member at the Electrical and Computer Engineering Department, University of Wisconsin–Madison. Since 2002, he has been an Assistant Professor at the Electrical Engineering Department, UCLA. He has published over 100 technical papers. His research interests include the modeling and design considering inductance effects and process variations for in-

tegrated circuits and packages, programmable logic fabrics and reconfigurable systems, and power-efficient circuits and systems.

Dr. He is a TPC Member of the Design Automation Conference, International Symposium on Low Power Electronics and Design, and the International Symposium on Field Programmable Gate Array. He was the recipient of the National Science Foundation CAREER award in 2000, the UCLA Chancellor's Faculty Career Development Award (highest class) in 2003, and the IBM Faculty Partner Award in 2003.