# Wideband Passive Multi-Port Model Order Reduction and Realization of RLCM Circuits

Zhenyu Qi, Student Member, IEEE, Hao Yu, Student Member, IEEE, Pu Liu, Student Member, IEEE Sheldon X.-D. Tan, Member, IEEE and Lei He Member, IEEE

Abstract—This paper presents a novel compact passive modeling technique for high-performance RF passive and interconnect circuits modeled as high-order RLCM circuits. The new method is based on a recently proposed general s-domain hierarchical modeling and analysis method and VPEC (vector potential equivalent circuit) model for self and mutual inductance. Theoretically, we show that the s-domain hierarchical reduction is equivalent to implicit moment matching around s = 0, and that the existing hierarchical reduction method by onepoint expansion is numerically stable for general tree-structured circuits. We also show that the hierarchical reduction preserves reciprocity of the passive circuit matrices. Practically, we propose a hierarchical multi-point reduction scheme to obtain accurate order reduced admittance matrices of general passive circuits. A novel explicit waveform matching algorithm is proposed for searching dominant poles and residues from different expansion points based on the unique hierarchical reduction framework. To enforce passivity, state-space based optimization is applied to the model order reduced admittance matrix. Then we propose a general multi-port network realization method to realize the passivity-enforced reduced admittance based on relaxed oneport network synthesis technique using Foster's canonical form. The resulting modeling algorithm can generate the multiple-port passive SPICE-compatible model for any linear passive network with easily controlled model accuracy and complexity.

Experimental results on an RF spiral inductor and a number of high-speed transmission line circuits are presented. In comparison with other approaches, the proposed reduction is as accurate as PRIMA in high frequency-domain due to the enhanced multipoint expansion, but leads to smaller realized circuit models. In addition, under the same reduction ratio, realized models by the new method have less error compared with reduced circuits by the time-constant based reduction techniques in time domain.

*Index Terms*—Circuit Simulation, Model Order Reduction, Determinant Decision Diagrams, Realization, Behavioral Modeling

### I. INTRODUCTION

As VLSI technology advances with increased operating frequency and decreased feature size, parasitics from onchip interconnects and off-chip packaging will de-tune the

Some preliminary results of this paper appeared in *Proc. Asia South Pacific Design Automation Conference (ASPDAC'05) [30], [45]* 

Zhenyu Qi, Pu Liu and Sheldon X.-D. Tan are with Department of Electrical Engineering, University of California, Riverside, Riverside, CA 92521 USA (e-mail: {zhenyu,uliu,stan}@ee.ucr.edu)

Hao Yu and Lei He are with Department of Electrical Engineering, University of California, Los Angeles, CA 90095 USA (e-mail: {hy255,lhe}@ee.ucla.edu)

This work is funded by NSF CAREER Award CCF-0448534, NSF Grant OISE-0451688 and UC Regent's Faculty Fellowship(04-05). The work of H. Yu and L. He is partially supported by NSF CAREER award CCR-0401682, SRC grant 1100, a UC MICRO grant sponsored by Analog Devices, Fujitsu Laboratories of America, Intel and LSI Logic, and a Faculty Partner Award by IBM.

performance of high-speed circuits in terms of slew rate, phase margin and bandwidth [2]. Reduction of design complexity especially for those extracted high-order RLCM network is important for efficient VLSI design verification.

Compact modeling of passive RLC interconnect networks has been an research intensive area in the past decade due to increasing signal integrity effects and interconnect-dominant delay in current system on a chip (SoC) design [22]. Existing approaches can be classified into two categories. The first category is based on subspace projection [11], [12], [20], [26], [28], [37]. Projection-based method was pioneered by Asymptotic Waveform Evaluation (AWE) algorithm [28] where explicit moment matching was used to compute dominant poles at low frequency. Pade via Lanczos (PVL) [11], Arnoldi Transformation method [37] improved the numerical stability of AWE, congruence transformation method [20] and PRIMA [26] can further produce passive models. However, reduced circuit matrices by PRIMA are larger than direct pole marching (having more poles than necessary) [1] and PRIMA does not preserve certain important circuit properties like reciprocity [12]. The latest development by structured projection can preserve reciprocity [12], but it does not realize the reduced circuit matrices. An efficient first few order moments matching based realization for interconnect RLC circuit is proposed in [19]. In general, no systematic approach has been proposed for realizing order-reduced circuit matrices.

Another quite different approach to circuit complexity reduction is by means of local node elimination and realization [3], [10], [31], [34], [35]. The major advantage of these methods over projection-based methods is that the reduction can be done in a local manner and no overall solution of the entire circuit is required and reduced models can be easily realized using RLCM elements. This idea was first explored by selective node elimination for RC circuits [10], [34], where time-constant analysis is used to select nodes for elimination. Node reduction for magnetic coupling interconnect (RLCM) circuits has recently become an active research area. Generalized Y- $\Delta$  transformation [31], RLCK circuit crunching [3], and branch merging [35] have been developed based on nodal analysis (NA), where inductance becomes susceptance in the admittance matrix. Since mutual inductance is coupled via branch currents, to perform nodal reduction, an equivalent 6-susceptance NA model is introduced in [31] to reduce two coupling current variables and template matching via geometrical programming is used to realize the model order reduced admittances, but its accuracy depends heavily on the selection of templates and only 1-port realization has been

reported. Meanwhile, RLCK circuit crunching and branch merging methods are first-order approximation based on the nodal time-constant analysis. The drawbacks for this first-order approximation are: (1) error is controlled in a local manner and will be accumulated, hence it is difficult to control the global error due to reduction; (2) not too many nodes can be reduced if the elimination condition is not satisfied.

Another way to model and characterize complex interconnect structures in high frequency (in RF or even microwave ranges) is by means of rational approximation based on the direct measurements or rigorous full-wave electromagnetic simulation [1], [7], [9], [13], [14], [24], [33]. Many of those methods have been used in the RF and microwave circuits modeling as they are very flexible to be applied to different interconnect structures and wideband modeling.

In this paper, we focus on realizable modeling of RLCM circuits. We propose a new passive reduction and realization framework for general passive high-order RLCM circuits. Our method starts with large RLCM circuits which are extracted by existing geometry extraction tools like FastCap [25] and FastHenry [17] under some relaxation conditions of the fullwave Maxwell equations (like electro-quasi-static for FastCap or magneto-quasi-static for FastHenry) instead of measured or simulated data. It is our ultimate goal that we can obtain the compact models directly from complex interconnect geometry without measurement or full-wave simulations. The new modeling method is based on the general s-domain hierarchical model reduction algorithm [39], [41] and an improved VPEC (vector potential equivalent circuit) [44] model for self and mutual inductance, which can be easily sparsified and is hierarchical reduction friendly.

On the theoretical side, we show that the s-domain hierarchical reduction is equivalent to implicit moment matching around s = 0, and that the existing hierarchical reduction method by one-point expansion [39], [41] is numerically stable for general tree-structured circuits. We also show that the proposed hierarchical reduction preserves reciprocity of passive circuit matrices. Practically, we propose a hierarchical multipoint reduction scheme to obtain accurate order reduced admittance matrices of general passive circuits. A novel explicit waveform matching algorithm is proposed for searching dominant poles and residues from different expansion points based on the unique hierarchical reduction framework. To enforce passivity, state-space based convex programming optimization technique [7] is applied to the model order reduced admittance matrix. To realize the passivity-enforced admittance, we propose a general, reciprocity-preserving, passivity-preserving, multi-port network realization method based on relaxed oneport network synthesis technique using Foster's canonical form in an error-free manner. The resulting modeling algorithm can take in general RLCM SPICE netlists and generate out SPICE netlists of passive multiple-port models for any linear passive network with easily controlled model accuracy and complexity.

The rest of this paper is organized as follows. Section II reviews the hierarchical reduction algorithm. Section III shows some theoretical results regarding the hierarchical reduction. Section IV proves that the hierarchical reduction can preserve the reciprocity. Section V presents a new hierarchical

multi-point expansion scheme and a novel explicit waveform matching algorithm for searching dominant poles and residues from different expansion points. Section VI briefly reviews the VPEC model and comparison with nodal-susceptance based inductance models. Section VII present the state-space based convex programming for enforcing the passivity of the order reduced admittance circuit matrices. In section VIII, we will describe the general *n*-port network realization method. Experimental results on an on-chip spiral inductor, high-speed PEEC modeled bus circuits and comparison with existing reduction approaches will be presented in Section IX. Section X concludes the paper.

## II. REVIEW OF HIERARCHICAL CIRCUIT REDUCTION Algorithm

Assume that modified nodal analysis (MNA) is used for circuit matrix formulation. With this, the system-equation set MX = b, can be rewritten in the following form (Schur's decomposition):

$$\begin{bmatrix} M^{II} & M^{IB} & 0\\ M^{BI} & M^{BB} & M^{BR}\\ 0 & M^{RB} & M^{RR} \end{bmatrix} \begin{bmatrix} x^{I}\\ x^{B}\\ x^{R} \end{bmatrix} = \begin{bmatrix} b^{I}\\ b^{B}\\ b^{R} \end{bmatrix}.$$
 (1)

The matrix,  $M^{II}$ , is the *internal* matrix associated with internal variable vector  $x^{I}$ .

Hierarchical reduction is to eliminate all the variables in  $x^{I}$ , and transform (1) into the following reduced set of equations:

$$\begin{bmatrix} M^{BB*} & M^{BR} \\ M^{RB} & M^{RR} \end{bmatrix} \begin{bmatrix} x^B \\ x^R \end{bmatrix} = \begin{bmatrix} b^{B*} \\ b^R \end{bmatrix}, \quad (2)$$

where  $M^{BB*} = M^{BB} - M^{BI}(M^{II})^{-1}M^{IB}$  and  $b^{B*} = b^B - M^{BI}(M^{II})^{-1}b^I$ . Suppose that the number of internal variables is t, and the number of boundary variables is m. Then each matrix element in  $M^{BB*}$  and  $b^{B*}$  can be written in the following expanded forms:

$$a_{u,v}^{BB*} = a_{u,v}^{BB} - \frac{1}{det(M^{II})} \sum_{k_1,k_2=1}^{m} a_{u,k_1}^{BI} \Delta_{k_2,k_1}^{II} a_{k_2,v}^{IB},$$
(3)

where u, v = 1, ..., m. We call  $a_{u,k_1}^{BI} \Delta_{k_2,k_1}^{II} a_{k_2,v}^{IB} / det(M^{II})$  a *composite admittance* due to MNA formulation, and

$$b_u^{B*} = b_u^B - \frac{1}{det(M^{II})} \sum_{k_1, k_2=1}^m a_{u, k_1}^{BI} \Delta_{k_2, k_1}^{II} b_{k_2}^I, \tag{4}$$

where u = 1, ..., m and  $\Delta_{u,v}$  is the first-order *cofactor* of det(M) with respect to  $a_{u,v}$ .

Hierarchical node reduction algorithm computes the new admittance  $a_{u,v}^{BB*}$  and new right-hand side element  $b_u^{B*}$  in terms of order-limited rational functions of *s* hierarchically.

One critical issue during the hierarchical reduction is cancellation. Two type of cancellations have been observed [39]: *symbolic term cancellation*, where two product terms consisting of composite admittances cancel out; *symbolic commonfactor* cancellation, where the numerator and the denominator of the resulting product term consist of composite admittances have a common symbolic factor, which happens when there are at least two first-order cofactors exist in a product term. A general s-domain reduction algorithm based on MNA (modified nodal analysis) was proposed in [39], [41] where graphbased de-cancellation is carried out numerically in frequency domain (s-domain). The hierarchical reduction algorithm [39], [41].

## III. HIERARCHICAL REDUCTION VERSUS MOMENT MATCHING

In this section, we first discuss how the s-domain hierarchical reduction is related to the implicit moment matching. Then we discuss the numerical stability and reciprocity-preserving property of the hierarchical reduction process.

## A. Moment Matching Connection

Consider a linear system with n state variables in vector x, the system is given by

$$sx = Ax + b, (5)$$

where A is an  $n \times n$  system matrix, b is the input vector to the circuit. Then we can obtain  $x = (Is - A)^{-1}b$ . Let's consider single-input single-output systems where we have only one input  $b_j$  and we are interested in state response at node *i*. In this case we have

$$x_i(s) = H_{ij}(s)b_j = \frac{\Delta_{ij}}{det(Is - A)}b_j,$$
(6)

where  $\Delta_{ij}$  is the first-order cofactor of matrix M = (Is - A)with respect to the element at the row *i* and column *j*.  $H_{ij}(s)$  is the transfer function. So the exact solution of any state variable or its transfer function in *s*-domain can be represented by a rational function of *s*.

Hierarchical reduction basically is to reduce the  $n \times n$ matrix M into a very smaller  $m \times m$  matrix M' based on block Gaussian elimination such that  $x_i$  can be trivially solved symbolically by using Eq. (6). During this reduction process, all the rational functions involved are truncated up to a fixed maximum order and the final solution will be a rational function with the same order for its numerator and its denominator. We then have the following theoretical result for the computed state variable  $x'_i(s)$  from the hierarchical reduction process in s-domain.

Theorem 1: The state variable  $x'_i(s)$  computed by the sdomain hierarchical reduction with q as the maximum order for all the rational functions will match the first q moments of the exact solution  $x_i(s)$  expanded by Taylor series at s = 0. **Proof:** As we know that the exact solution of  $x_i(s)$  is a rational function as shown in Eq.(6). Due to the truncation, the solution computed by the hierarchical reduction process will be given by

$$x_i'(s) = \frac{a_0 + a_1s + \dots + a_qs^q}{b_0 + b_1s + \dots + b_qs^q}, i = 1, \dots, q.$$
(7)

It is proved in [40] that a cancellation-free rational expression from the hierarchical reduction process are the exact expressions obtained from the flat circuit matrix. If we do not perform any truncation, then  $x'_i(s)$  will be the exact solution,  $x_i(s)$ , which is obtained from the flat circuit matrix by Eq.(6) when all cancellations are removed numerically during the hierarchical reduction process (under assumption that no numerical error is introduced). With truncation, all the coefficients  $a_0, ..., a_q$  and  $b_0, ..., b_q$  are still exactly the same as that in  $x_i(s)$ . If we compute the moments of  $x'_i(s) = m_0 + m_1 s + ...$ , the first q moments can be uniquely determined by the 2q coefficients  $a_0, ..., a_q$  and  $b_0, ..., b_q$ :

$$m_{i} = \frac{a_{i} - \sum_{k+l=i, k \le i, l \le i, k \ne 0}^{i} b_{k} m_{l}}{b}.$$
 (8)

If  $b_0$  is not zero, b is simply  $b_0$ , otherwise b will be the first non-zero coefficient  $b^t$  and the first q moments become the coefficients of  $s^{-t}(t > 0)$  to that of  $s^{-t+q}$ . So the theorem is proved. Hence the transfer function  $H'_{ij}(s)$  will also match the exact one up to the first q moments.

For a general multi-input and multi-output system, each element in the reduced  $m \times m$  admittance matrix M'(s) becomes a rational function [41]:

$$a_{u,v}^{BB*} = \frac{\det(M[1,...,m,u|1,...,m,v])}{\det(M^{II})},$$
(9)

where, M[1, ..., m, u|1, ..., m, v] is a matrix that consists of matrix  $M^{II}$ . It is M[1, ..., m|1, ..., m], plus row u and column v of matrix M. Then we have the following results:

Corollary 1: Each rational admittance function  $a'_{u,v}^{BB*}(s)$  in the reduced  $m \times m$  matrix, M'(s), by the hierarchical reduction process, will match the first q moments of the exact rational function  $a_{u,v}^{BB*}(s)$  expanded by Taylor series at s = 0.

#### B. Numerical Stability of the Hierarchical Reduction

The hierarchical reduction process is essentially equivalent to implicit moment matching at s = 0. As a result, the frequency response far away from s = 0 will become less accurate due to the truncation of high order terms. Another source of numerical error comes from the numerical decancellation process where polynomial division are required for removing the common-factors (cancellation) in the newly generated rational function, which will in turn introduces error from numerical term cancellation (the sum of two symbolic terms should have been zero, but is not zero due to numerical error). Such numerical noise will cause the higher order terms less accurate even we try to keep them. In Fig. 1, we show that the responses from the 3-way, 2-level partitioned  $\mu A741$ circuit [42] under different maximum reduction orders of rational functions. As we can see that increasing the rational function order does not increase the accuracy of the response after the order reaches 8. This is the typical numerical stability problem with the moment matching method [28]. However, unlike explicit moment matching methods, the hierarchical reduction is numerical stable for tree-structured circuits. We then show the following results:

*Theorem 2:* For tree-structured circuits, the hierarchical reduction process can be performed such that there is no common-factor cancellation in the generated rational functions.

**Proof:** For tree structured circuits, we can always partition the circuit in such a way that each subcircuit only has one node shared with its parent circuit. As a result, there is only



Fig. 1. Responses of  $\mu A741$  circuit under different reduction orders.

one composite admittance  $a_{u,k_1}^{BI} \Delta_{k_2,k_1}^{II} a_{k_2,v}^{IB}/det(M^{II})$  in Eq.(3) generated in its parent circuit for each subcircuit. According to common-factor cancellation condition [39], at least four composite elements from the same subcircuit reduction are required for the existence of common-factor cancellation. So no common-factor cancellation will occur under such partitioning. The theorem is proved.

The significance of Theorem 2 is that the hierarchical reduction process becomes numerical stable for almost arbitrary order for tree circuits. The only cancellation left is the term cancellation, where the sum of two symbolic terms is zero, which will not introduce any noticeable numerical error in the reduction process. Fig. 2 shows the voltage gain response (real part) of a RC tree with about 100 nodes (also 100 capacitors) under different reduction orders. As can be seen, the reduced voltage gain will match the exact one well when the kept orders reach about 60.



Fig. 2. Responses of a RC tree circuit under different reduction orders.

The fact that no common-factor de-cancellation (polynomial division) is required was also exploited in the direct truncation of transfer function method (DTT) [16] where only polynomial addition is required to compute the truncated transfer functions for tree-structured RLC circuits only. The DTT reduction process can be viewed as a special case of our method. But for general non-tree structured circuits, polynomial division is required in node elimination based reduction methods due to common-factor cancellation, and polynomial division due to truncation will not be numerically stable for very high frequency range far away from DC as shown before.

To mitigate this problem, we propose using multi-point expansion for obtaining accurate rational functions or reduced admittance matrices for modeling a general multi-input and multi-output linear system as will be shown in Section V.

## IV. PRESERVATION OF RECIPROCITY

A reciprocal network is one in which the power losses are the same between any two ports regardless of direction of propagation [43]. Mathematically, this is equivalent to the requirement that the circuit admittance matrix is symmetric (or scattering parameter S21=S12, S13=S31, etc). A network is known to be reciprocal if it is passive and contains only isotropic materials. Reciprocity is an important network property. For the hierarchical reduction, we have the following results:

*Theorem 3:* The hierarchical reduction method preserves the reciprocity of a passive circuit matrix.

**Proof:** The proof can be found by using Eq.(9) again. We first study a circuit with circuit matrix M. The circuit has one subcircuit with circuit matrix  $M^{II}$ . Assume that original circuit matrix is symmetric (its subcircuit is also symmetric, i.e. both M and  $M^{II}$  are symmetric) due to reciprocity.

After reduction, the reduced circuit matrix becomes an  $m \times m$  matrix where each matrix element at row u and column v appears in Eq.(9). Then we look at the element at row v and column u, which is

$$a_{v,u}^{BB*} = \frac{\det(M[1,...,m,v|1,...,m,u])}{\det(M^{II})},$$
(10)

Notice that matrix M is symmetric, so the row u in Eq.(9) and column u in Eq.(10) are same. This is true for column v in Eq.(9) and row v in Eq.(10). As a result we have

$$M[1,...,m,v|1,...,m,u] = M[1,...,m,u|1,...,m,v]^T$$
(11)

$$det(M[1,...,m,v|1,...,m,u]) = det(M[1,...,m,u|1,...,m,v]^T) \quad (12)$$

Hence,  $a_{u,v}^{BB*} = a_{v,u}^{BB*}$  and reciprocity is preserved in the reduced circuit matrix when a subcircuit is reduced. In the hierarchical reduction, we reduce one subcircuit at a time and the reduced circuit matrix is still symmetric after reduction. So the reduced circuit matrix after all the subcircuits are reduced is still symmetric. This theorem is proved.

#### V. MULTI-POINT EXPANSION HIERARCHICAL REDUCTION

The multi-point expansion scheme by real or complex frequency shift has been exploited before in projection based reduction approaches for improving the modeling accuracy [8], [15]. The basic idea for such a strategy is that dominant poles close to an expansion point is more accurately captured than the poles that are far away from the expansion point in the moment matching based approximation framework. Therefore, instead of expanding at only one point, we can expand at multiple points along real or complex axis to accurately capture all the dominant points in the given frequency range.

In this paper, we extend this concept to the hierarchical reduction algorithm. Specifically, at each expansion point, the driving point function or each rational admittance function in a reduced admittance matrix can be written into the partial fraction form

$$f(s) = \sum_{i}^{n} k_i / (s - p_i).$$
(13)

By intelligently selecting poles and their correspond residues from different expansions and combining them into one rational function, we can obtain a more accurate rational function for very high frequency range. In this paper, we propose an explicit waveform matching scheme based on hierarchical reduction framework to find dominant poles and their residues for both SISO (single-input single output) and MIMO (multiinput multi-output) systems. It is shown experimentally to be superior to the existing pole searching algorithm.

## A. Multi-Point Expansion In Hierarchical Reduction

To expand the circuit at an arbitrary location in the complex s-plane, say  $s_k = \alpha_k + \omega_k j$ , we can simply substitute s in Eq.(6) by  $s + s_k$ . Then Eq.(6) becomes

$$x_{i}(s) = H_{ij}(s)b_{j} = \frac{\Delta_{ij}(s+s_{k})}{det(I(s+s_{k})-A)}b_{j}.$$
 (14)

As shown in [8], poles that dominate the transient response in interconnect circuits are near the imaginary axis with large residues. Hence we expand along the imaginary axis for RF passive and interconnect circuits. Since only capacitors and inductors are associated with the complex frequency variable s, expansion at a real point  $\alpha$  or a complex point  $\omega_i j$  point is essentially equivalent to analyzing a new circuit where each capacitor C has a new resistor (with real value  $\alpha_i C$  or complex value  $\omega_i C j$ ) connected in parallel with it and each inductor L has a new resistor (with real value  $\alpha_i L$  or complex value  $\omega_i L j$ ) connected in series with it [29].

In this paper, we show that the multi-point expansion can be done very efficiently in the hierarchical reduction framework. The rational functions are constructed in a bottom up fashion in a Y-parameter determinant decision diagrams (YDDDs) in the hierarchical reduction algorithm [39]. When a capacitor C or an inductor L (its YDDD node) is visited, we build a simple polynomial 0 + Cs or 0 + Ls to multiply or add it with existing polynomials seen at that DDD node. In the presence of a non-zero expansion point,  $\alpha_i$  or  $\omega_i j$ , we can simply build a new polynomial  $\alpha_i C + Cs$  or  $\omega_i Cj + Cs$  for the capacitor and  $\alpha_i L + Ls$  or  $\omega_i Lj + Ls$  for the inductor respectively. So we do not need to rebuild the circuit matrix or the YDDD graphs used for reduction at s = 0. Instead we only need to rebuild the rational functions by visiting every YDDD node once, which has the time complexity linear with the YDDD graph size, a typical time complexity for DDD graph based methods [36].

#### B. Explicit Waveform Matching Algorithm

One critical issue in multi-point expansion is to determine at each expansion point which poles are accurate and should be included in the final rational function. In the complex frequency hopping method [8], a binary search strategy was used where poles (common poles) seen by two expansion points and poles with distance to the expansion points shorter than the common poles are selected. Such a common-pole matching algorithm, however, is very sensitive to the numerical distance criteria for judging if two poles are actually a same pole. For accurately detecting common poles, small distance is desirable, but it will lead to more expansion points; and even worse is that the same pole may be treated as different poles seen by two different expansion points. Also this method may fail to detect some dominant poles as the circle for searching accurate poles might be too small as shown in our experimental results.

In this paper, we propose a new reliable pole searching algorithm, which is based on explicit frequency waveform matching. The new algorithm is based on the observation that a complex pole  $p_i$  and its residue  $k_i$  in the partial fraction form,  $k_i/(s - p_i)$ , has the largest impact at frequency  $f_i$  when the imaginary part of the pole equals  $2\pi f_i$ . Fig. 3 shows a typical response of  $k_i/(s - p_i)$ , where  $k_i = 2.78 \times 10^{12} + 2.34 \times 10^{10} j$  and  $p_i = -4.93 \times 10^8 + 2.58 \times 10^{10} j$ . The peaks of both real (absolute value) and magnitude are around  $4.11 \times 10^9$ , which is equal to  $2.58 \times 10^{10}/(2\pi)$ . The reason for this is that both



Fig. 3. Responses of a typical  $k_i/(s-p_i)$ .

real and imaginary parts of  $k_i/(s-p_i)$  reach a peak when their denominator  $(pr)^2 + (\omega - pi)^2$  reaches a minimum at  $\omega = pi$ , where  $s = \omega j$  and  $p_i = pr + pij$ . Complex pole with negative imaginary part typically will not have significant impact on the upper half complex plane.

The idea of *frequency waveform matching* is to explicitly match the approximate frequency waveform with that of exact ones. Specifically, at an expansion point,  $f_i$ , we perform the hierarchical reduction and then determine an accurate maximum frequency range  $[f_i, f_{i+1}]$  such that the error between responses (magnitude) of the reduced rational function and that of the exact one are bounded by a pre-specified error bound. The error is computed as follows:

$$err = \frac{|dB20(V_e) - dB20(V_a)|}{|dB20(V_e)|}$$
(15)

where  $dB20(x) = 20 * log_{10}(|x|)$  and |x| is the magnitude of a complex number x.  $V_e$  is the exact response and  $V_a$  is the approximate response. If  $|dB20(V_e)| = 0$ , then we use  $|dB20(V_a)|$  as the denominator in Eq.(15 if it is not zero. If  $|dB20(V_a)| = 0$ , we have err = 0.

Then  $f_{i+1}$  will be the next expansion point. All the poles whose imaginary part fall within the range  $[2\pi f_i, 2\pi f_{i+1}]$  will be selected because their contribution in this frequency range is the largest. The new algorithm does not have the duplicate pole issue as accurate poles can only be located at one place. The accuracy of the found poles is assured by the explicit waveform matching. Experimental results show that it tends to use less expansion points than the common-pole matching method, and less CPU time.

## C. Multi-Point Expansion for MIMO System Reduction

For a multi-input multi-output system, by using the modified nodal analysis, the reduced circuit matrix  $M'(s) = [y_{ij}(s)]_{m \times m}$  will become an  $m \times m$  admittance matrix. Each admittance  $y_{ij}$  is a complex rational function with real or complex (if expansion points are on imaginary axis) coefficients. In this case, we explicitly watch for the error between each approximate rational admittance and the exact value of the admittance at each frequency. The exact value of each admittance can be computed by visiting the DDD graph representing the admittance. Since there is a lot of sharing among those admittances, the cost of evaluating all the admittances are similar to evaluating one admittance, considering that every DDD node just needs to be visited once at each frequency point [42].

## VI. INDUCTANCE MODELS IN HIERARCHICAL REDUCTION

In this section, we discuss the VPEC (vector potential equivalent circuit) used in our reduction method. We compare the VPEC model with another inductance based nodal susceptance concept. We will show that the inductance model by nodal-susceptance is not physically equivalent to inductance as unwanted dc paths are created at low frequency.

#### A. Inductance Formulation In Hierarchical Reduction

For an RLCM circuit, when we assume only independent current sources exist at external ports, the circuit matrix in *s*-domain starting with MNA formulation can be written as

$$\mathcal{G}x + s\mathcal{C}x = Bi(s), \qquad v(s) = B^T x$$
 (16)

where x, v, i are the state variable, output voltage and input current vectors, and  $\mathcal{G}, \mathcal{C}, B$  are state and input-output matrices, respectively. (16) can be further written as:

$$\begin{bmatrix} G & A_l^T \\ -A_l & 0 \end{bmatrix} \begin{bmatrix} v_n \\ i_l \end{bmatrix} + s \begin{bmatrix} C & 0 \\ 0 & sL \end{bmatrix} \begin{bmatrix} v_n \\ i_l \end{bmatrix} = \begin{bmatrix} A_i i_n(s) \\ 0 \end{bmatrix}, (17)$$

where G and C are the admittance matrices for resistors and capacitors, L is the inductance matrix, which includes mutual inductance,  $v_n$  is a vector of node voltage,  $i_l$  is a branch



Fig. 4. An example of coupled 2-bit RLCM circuit under PEEC model.

current vector of inductors,  $A_l$  is the adjacency matrix for all inductors, and  $A_i$  is the adjacency matrix for all port current sources.

The *circuit-reduction* means applying the Gaussian elimination for state variables like node voltage  $v_n$  and branch current  $i_l$ . If we first reduce the branch current vector  $i_l$ , we actually result in the state equation only with nodal voltage variables.

$$[G + sC + \frac{1}{s}A_l SA_l^T][v_n] = [A_i i_n(s)].$$
(18)

This is exactly the nodal analysis formulation, where  $S = L^{-1}$  is the susceptance [5], and  $\Gamma = \frac{1}{s}A_lSA_l^T$  is the admittance form for the mutual inductance under NA. The circuit-reduction by further eliminating the nodal voltage-variable  $v_n$  is exactly the  $Y - \Delta$  transformation in [31].

### B. Inductance Models by Nodal-Susceptance

The nodal-susceptance in (18) actually creates dc path at low-frequency range. We illustrate this with a 2-bit interconnect example shown in Fig. 4. The nodal voltage equation of susceptance at four nodes (A,B,C,D) becomes

$$\frac{S_{11}}{s}V_A - \frac{S_{11}}{s}V_B + \frac{S_{21}}{s}V_C - \frac{S_{21}}{s}V_D = I_1$$

$$-\frac{S_{11}}{s}V_A + \frac{S_{11}}{s}V_B - \frac{S_{12}}{s}V_C + \frac{S_{12}}{s}V_D = -I_1$$

$$\frac{S_{12}}{s}V_A - \frac{S_{12}}{s}V_B + \frac{S_{22}}{s}V_C - \frac{S_{22}}{s}V_D = I_2$$

$$-\frac{S_{12}}{s}V_A + \frac{S_{12}}{s}V_B - \frac{S_{22}}{s}V_C + \frac{S_{22}}{s}V_D = -I_2$$
(19)

As shown in Fig. 5, it is mathematically equivalent to stamping six susceptance elements into the admittance matrix [31] when  $s \neq 0$ . However, the susceptance element  $S_{ij}/s$  approaches infinite (thus 0 impedance or short circuit) when s = 0, there exist four unwanted *dc*-paths between nodes (A,B,C,D), which do not exist before. As a result it leads to the wrong *dc* values and inaccurate low-frequency simulation results even for the 2-bit bus example in Fig. 4.

We compute the exact driving-point impedance responses using inductance under MNA and nodal-susceptance under NA, respectively using the symbolic analysis tool [36].

As shown in Fig. 6, NA formulation (by using nodal susceptance for inductance) gives the exact response as SPICE does in the high-frequency range, but the response is not correct in the low-frequency range. When *s* approaches zero, the actual driving-point impedance in Fig.4 should be dominated by three capacitors with total capacitance value 43fF. However, for Fig.5 at *dc*, the driving-point impedance becomes a resistor



Fig. 5. An example of coupled 2-bit RLCM circuit under nodal susceptance model.

with total resistance value 234 ohms (or 49 ohms(dB)) due to the unwanted dc paths.



Fig. 6. Frequency responses of PEEC model in SPICE, susceptance under NA and VPEC models for the 2-bit bus.

The reason for such discrepancy is that when s = 0,  $L^{-1}$  can't be computed as L becomes singular. As a result, the NA formulation of inductance, which is based on  $L^{-1}$ , is no longer equivalent to the original circuit matrix. Hence circuit-reduction starting with nodal-susceptance formulation can not give the correct low-frequency response in general, and it is not suitable for generating the wideband macro-model of interconnects.

#### C. Formulation by VPEC Model

From the above discussion, we know that the inductance formulation by the nodal-susceptance leads to inaccurate lowfrequency response. It is not suitable for generating the reduced interconnect model for wideband applications. However, directly handling mutual inductance in a dense MNA formulation as in [3] will be computationally expensive. As shown in [44], the sparsified VPEC model actually not only achieves the runtime speedup, but also has the high accuracy compared to the original full model. Therefore, we usel the VPEC model to represent inductance in our circuit-reduction flow, as it enables passive pre-sparsification [27], [44].



Fig. 7. An example of coupled 2-bit RLCM circuit under VPEC model.

The significant difference between VPEC and nodalsusceptance models for mutual inductance is that VPEC is a physically equivalent model, and it can exactly represent the original system [44]. As shown in Fig.7, this model consists of an electrical circuit (PEEC resistance and capacitance) and magnetic circuit (VPEC effective resistance and controlled source). It includes the following components: (1) the wire resistance and capacitance the same as in the PEEC model; (2) a dummy voltage source (sensing electrical current  $I_i$ ) to control  $\hat{I}_i$ ; (3) a voltage controlled current source to relate  $\hat{V}_i$ and  $I_i$  with gain g = 1; (4) an electrical voltage source  $V_i$ controlled by  $V_i$ ; (5) effective resistors including ground  $R_{i0}$ and coupling  $\hat{R}_{ij}$  to consider the strength of inductances; and (6) a unit inductance  $L_i$  to: (i) take into account of the time derivative of  $A_i$ ; and (ii) preserve the magnetic energy from the electronic circuit.

Clearly, this SPICE compatible implementation does not introduce unwanted dc paths when s = 0 as by the nodalsusceptance. Moreover, Fig. 6 shows the response of VPEC model for the 2-bit circuit, which is identical to SPICE for the entire frequency range. Detailed analysis also shows the impedance function of the 2-bit circuit modeled by VPEC model is the same as the impedance function by PEEC model [45].

As shown in [44], although VPEC model introduces more circuit elements, it has faster runtime because this model dramatically reduces reactive elements (i.e., inductors), and leads to less numerical derivatives and integrals and makes the simulation converge faster. To further improve the sparsified VPEC model extraction without full inversion as in [44], we extend a windowing technique [5]. It reduces the computation complexity to  $(O(Nb^3))$ , where b is the size of the window (i.e. the size of sub-matrix). Note that although VPEC model enables efficient inductance simulation, the order of the circuit matrix is still high. Moreover, its SPICE compatible model still contains controlled sources, and can not be handled by the existing realizable circuit-reduction approaches [3], [31].

#### VII. PASSIVITY ENFORCEMENT

In this section, we present the state-space based passivity enforcement method, which is based on the method used in [7]. But we show how this method can be used in our hierarchical model order reduction framework to enforce passivity of the model order reduced admittance matrix  $\tilde{\mathbf{Y}}(s)$ .

Passivity is an important property of many physical systems. O. Brune [6] has proved that the admittance and impedance matrix of an electrical circuit consisting of an interconnection of a finite number of positive R, positive C, positive L, and transformers are passive iff (if and only if) its rational function are positive real (PR). It can be proved that the following statements are equivalent:

(a) a transfer function matrix  $\mathbf{Y}(s)$  is positive real.

(b) Let  $(A \ B \ C \ D)$  be a minimal controllable state space representation of  $\mathbf{Y}(s)$ .  $\exists K$ ,

$$K = K^T, K \ge 0, \tag{20}$$

such that the Linear Matrix Inequality (LMI)

$$\begin{bmatrix} A^T K + KA & KB - C^T \\ B^T K - C & -D - D^T \end{bmatrix} \ge 0$$
(21)

holds.

Constraint (21) actually comes from the Kalman Yakubovich Popov (KYP) Lemma, which establishes important relations between state space and frequency domain objects. KYP was introduced in control theory and later used in the network synthesis [4].

If we include the term proportional to s in the transfer function, which means we need to know what happens in infinite frequency, we can write the admittance matrix in terms of  $(A \ B \ C \ D)$  as

$$\mathbf{Y}(s) = sY^{\infty} + D + C(sI - A)^{-1}B \tag{22}$$

where I denote the identify matrix with the same dimension as A.To keep the transfer function positive real,  $Y^{\infty}$  must satisfy

$$Y^{\infty} = (Y^{\infty})^T, Y^{\infty} \ge 0 \tag{23}$$

Therefore, we can transform the problem of checking whether the admittance matrix  $\mathbf{Y}(s)$  is positive real into the problem of checking whether its corresponding state space model in terms of  $(A \ B \ C \ D)$  is positive semidefinite. More important is that we can use the PR criterion in terms of the state-space form to enforce the passivity of the reduced circuit matrices as shown in the next section.

# A. State-Space Model Representation of $\tilde{\mathbf{Y}}(s)$

After the multi-point hierarchical model reduction, an *n*-port order reduced admittance matrix is generated as shown in Eq.(24), where each  $\tilde{\mathbf{Y}}_{p,q}$  is a rational function of *s*. The reduction process can capture the entire dominant complex poles, which means there is no poles in the RHP (right hand plane) of the complex plane.

$$\tilde{\mathbf{Y}}(s) = \begin{bmatrix} \tilde{\mathbf{Y}}_{1,1} & \cdots & \tilde{\mathbf{Y}}_{1,n} \\ \vdots & \ddots & \vdots \\ \tilde{\mathbf{Y}}_{n,1} & \cdots & \tilde{\mathbf{Y}}_{n,n} \end{bmatrix}$$
(24)

The first step we do is to transform the admittance matrix  $\tilde{\mathbf{Y}}(s)$  into its state-space representation. We assume that all rational functions in the matrix share the common poles

of the system. If there are private poles appearing on the leading diagonal element, we can separate them and their residues from the whole rational function after partial fraction decomposition and realize them separately.

Given a multivariable *n*-port network, each rational function  $\tilde{\mathbf{Y}}_{p,q}$  is considered as a single-input and single-output (SISO) subsystem and mapped to its state-space representation in the controllable canonical form, which corresponds to  $(A_{q,q} \ B_{q,q} \ C_{p,q} \ D_{p,q})$  in the matrix of  $(A \ B \ C \ D)$  respectively. Now we can write its state-space representation as Eq.(25)

$$A = \begin{bmatrix} A_{1,1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & A_{n,n} \end{bmatrix} \quad B = \begin{bmatrix} B_{1,1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & B_{n,n} \end{bmatrix}$$
$$C = \begin{bmatrix} C_{1,1} & \cdots & C_{1,n} \\ \vdots & \ddots & \vdots \\ C_{n,1} & \cdots & C_{n,n} \end{bmatrix} \quad D = \begin{bmatrix} D_{1,1} & \cdots & D_{1,n} \\ \vdots & \ddots & \vdots \\ D_{n,1} & \cdots & D_{n,n} \end{bmatrix}$$

Also this mapping process could be viewed as n set singleinput and multiple-output (SIMO) subsystems. If we choose the *m*-th port as input port, the *m*-th column admittance rational function can be mapped into  $(A_{m,m} B_{m,m} C_{:,m} D_{:,m})$ .

#### **B.** Passivity Enforcement Optimization

In this subsection, we briefly mention how passivity enforcement is done via a convex optimization process on the state-space representation of the admittance matrix.

Assume that we have obtained the admittance matrix of a model order reduced system  $\tilde{\mathbf{Y}}(s)$  with a set of N sampling points. Let  $\tilde{\mathbf{Y}}_{p,q}(s)$  denote the (p,q) entry of the transfer function  $\tilde{\mathbf{Y}}(s)$ . Let  $\hat{\mathbf{Y}}_{p,q}(s_k)$  be the exact values of the admittance at the entry (p,q) at the *k*th frequency point, which can be obtained by the exact hierarchical symbolic analysis [42].

The optimization problem is to determine C, D,  $Y^{\infty}$  such that a cost function is minimized with constraints on the error  $(\hat{\mathbf{Y}}_{p,q} - \tilde{\mathbf{Y}}_{p,q})$ . Here the constraints are on the weighted least square error, taken over N frequencies

$$\sum_{k=1}^{N} w_{k,p,q} \| \hat{\mathbf{Y}}_{p,q}(s_k) - \tilde{\mathbf{Y}}_{p,q}(s_k) \|_2^2 \le t_{p,q}.$$
 (26)

It is shown in [7] that the optimization problem in Eq.(26) subject to constrains in Eq.(21) can be transformed into a convex programming problem, which can be solved efficiently by some existing convex programming programs.

We notice that passivity enforcement was done by the compensation-based approach proposed in [1]. But this method does not ensure the accurate matching because the compensated part may have significant impacts on the frequency range that we are interested.

## VIII. MULTI-PORT CIRCUIT REALIZATION

Once the new C, D,  $Y^{\infty}$  are obtained by the convex programming, the new passivity-enforced  $\mathbf{Y}(s)$  are constructed again by Eq.(22). We now discuss how to generate realized macro-models for both frequency and time domain simulation.

#### A. Relaxed one-port realization



Fig. 8. One-port Foster admittance realization.

We start with one-port network realization. For a one-port model with driving-point admittance function, we propose to use a generalized Foster's canonical form based realization to directly synthesize the admittance function.

To synthesize the one-port model from the driving-point admittance rational function Y(s), we first rewrite it in the Foster's canonical form [43]:

$$Y(s) = sY_{\infty} + Y_0 + \sum_{m=1}^{M} \frac{a_m}{s - p_m} + \sum_{n=1}^{N} \left(\frac{a_n}{s - p_n} + \frac{a_n^*}{s - p_n^*}\right)$$
(27)

where we expand the rational function into the partial fraction form with N conjugate-poles  $p_n$  and M real-poles  $p_m$ .

The admittance function in Foster's canonical-form can be then synthesized by an equivalent circuit in Fig. 8 with the following relations to determine R, L, C, G elements:

$$G_{s} = Y_{0}, \qquad C_{s} = Y_{\infty};$$

$$R_{m\_m} = \frac{1}{a_{m}}, \qquad L_{m\_m} = -\frac{p_{m}}{a_{m}};$$

$$L_{n\_n} = \frac{1}{2Re\{a_{n}\}}, \qquad L_{n\_n}C_{n\_n}|p_{n}|^{2} = R_{n\_n}G_{n\_n} + 1,$$

$$\frac{G_{n\_n}}{C_{n\_n}} = -\frac{Re\{a_{n}p_{n}^{*}\}}{Re\{a_{n}\}}, \qquad \frac{R_{n\_n}}{L_{n\_n}} = \frac{Re\{a_{n}p_{n}^{*}\}}{Re\{a_{n}\}} - 2Re\{p_{n}\}.$$
(28)

Some existing works like PRIME [24], require every complex pole pair to be physically realizable (every RCL element is positive), which is over constrained and may lead to significant error when unrealizable pole pairs are discarded or their residues are changed. In our approach, we *relax* those constraints by allowing some negative RLC elements. But the passivity of the admittance function will still be guaranteed by the passivity enforcement procedure since the realization is error-free and reversible and does not change the passivity of the realized system.

#### B. Multiple-port realization

For passive multi-port order reduced admittance matrix, we propose a general complete-graph structure (in case of full admittance matrix) to realize the admittance matrix based on the one-port realization. In the following, we first illustrate how a 2-port network is realized and then we extend this concept for general *n*-port network realization.

Given a  $2 \times 2$  passive admittance matrix, which is obtained by the hierarchical model order reduction method,

$$Y_{2\times 2}(s) = \begin{bmatrix} y_{11}(s) & y_{12}(s) \\ y_{21}(s) & y_{22}(s) \end{bmatrix},$$
 (29)

it can be realized exactly by using the  $\Pi$ -structure template shown in Fig. 9, where each branch admittance will be realized by the one-port Foster's expansion method shown in Fig. 8. Based on this template, such realization can be easily extended to the multi-port case.



Fig. 9. A general two-port realization Π model.

$$Y_{n \times n}(s) = \begin{bmatrix} y_{11}(s) & \cdots & y_{1n}(s) \\ \vdots & \ddots & \vdots \\ y_{n1}(s) & \cdots & y_{nn}(s) \end{bmatrix}$$
(30)

Generally, for a reduced *n*-port network with a full  $n \times n$  admittance matrix as shown in Eq.(30), the realized network will be a complete graph where each branch represents an admittance, which is realized by one-port realization method. For instance, Fig. 10 shows an realization of a synthesized 6-port network. The branch admittance of the *m*th port branch (the branch between the port and ground) is the sum of all the *m*th row admittances, and the admittance of the branch between the port is the negative value of the corresponding admittance.

Notice that our realized circuits automatically preserve the reciprocity of the original circuit matrix as it requires the admittance to be symmetric.



Fig. 10. A six-port realization based on Π-structure.

#### IX. EXPERIMENTAL RESULTS

The proposed algorithm has been implemented in C++ and all the data are collected on a Linux workstation with dual 1.6GHz AMD Althon CPUs and 2G memory. The convex programming problem is solved using some standard optimization packages. We use SeDuMi [38] and SeDuMi Interface to solve the convex programming problem for passivity enforcement.

## A. One-port Macro-model for Spiral Inductor

We first construct the detailed PEEC model for a 3-turn spiral inductor with its substrate. We assume copper ( $\rho = 1.7 \times$  $10^{-8}\Omega \cdot m$ ) for the metal and the low-k dielectric ( $\epsilon = 2.0$ ). The substrate modeled as a lossy ground plane (heavily doped) with  $\rho = 1.0 \times 10^{-5} \Omega \cdot m$ . The conductor is volume-discretized according to the skin-depth, and longitudinal segmented by one 10th of wave-length. The substrate is also discretized as in [23]. The capacitance is extracted by FastCap [25] and only adjacent capacitive coupling is considered since capacitive coupling is short-range. The partial inductance is extracted by FastHenry at 50GHz [17]. The inductive coupling between any pair of segments (including segments in the same line) is considered. Then we generate the distributed PEEC model by  $\pi$ -type of RLC-topology to connect each segment, and it results in a SPICE netlist with 232 passive RLCM elements. The substrate parasitic contribution (Eddy current loss) is lumped into the above conductor segment. Note that for more accurate extraction at ultra-high frequency, it needs full-wave PEEC model description [18]. For mutual inductance, a vector potential equivalent model (VPEC) is used [44], which is more hierarchical reduction friendly as no coupling inductor branch currents are involved and circuit partition can be done easily.

1) Comparison with Common-Pole Matching Method in Frequency Domain: For the spiral inductor, driving point impedance is obtained by the multi-point hierarchical reduction process. We use both the common-pole matching algorithm in the complex frequency hopping method and the new waveform matching algorithm to search for dominant poles along the imaginary axis.

For a fair comparison, we make sure the resulting rational functions will have similar accuracy. For common-pole matching algorithm, if two poles are located within 1% of their magnitude, they are regarded as the same pole. For waveform matching algorithm, the error bound between the approximate one and the exact is set to 0.1%. As a result, common-pole approach takes 26 expansion with 37.1 seconds, waveform marching method use 15 expansion with 22.57 seconds. The responses obtained using both method versus the exact response up to 100GHz are shown in the Fig. 11. The responses from both methods match the exact ones very well all the way to 100GHz. Our experience shows that CPU time of common-pole method is highly depends on the commonpole detection criteria. For instance if we set the criteria for common-pole detection to 0.5%, then 65 expansions are carried out. Also as more expansions are carried out, chances that a single pole is seen by two consecutive expansion points become larger, but it may be treated as different poles due to a small distance criteria, which in turn leads to significant distortion of the frequency response.

2) Time-domain Simulation of a LC Oscillator: We further demonstrate the accuracy and efficiency of the inductor macromodel in the time-domain harmonic simulation. Note that the synthesized one-port macro-model can be used to efficiently predict the critical performance parameters of spiral inductor, such as the  $\omega_T$ , Q factor, and even the resonance startingcondition for an oscillator [32].



Fig. 11. Frequency Response of the 3-turn spiral inductor and its reduced model by using waveform matching and common-pole method.

We use the Colpitts LC oscillator as an example as shown in Fig. 13 (b) where the active circuit behaves like *negative resistance* to make the oscillator work as shown in Fig. 13 (a).

In this experiment, the synthesized one-port model is from a 25-order rational function with 24 poles and results in a macro-model with 40 RLC elements. As shown in Fig. 12, the waveform at steady state of synthesized model and original model match very well but we observe a 10X times (5.17s vs. 0.52s) runtime speedup by using the reduced model.



Fig. 12. Time-domain comparison between original and synthesized models for a Colpitts LC oscillator with a 3-turn spiral inductor.

#### B. Multi-Port Macro-model of Coupled Transmission Line

We then use a 2-bit coupled transmission line as the example for multi-port reduction and synthesis. The original PEEC model contains 42 resistors, 63 capacitors, 40 self-inductors, and 760 mutual-inductors, where we consider inductive coupling between any two segments including those in the same line. Still, for mutual inductance, a vector potential equivalent model (VPEC) is used.



Fig. 13. Colpitts LC oscillator with spiral inductors.



Fig. 14. Frequency Responses of  $Y_{11}$  of a 2-bit Transmission Line.



Fig. 15. Frequency Responses of  $Y_{12}$  of a 2-bit Transmission Line.

The matching frequency is up to 31GHz and we find 24 dominant poles in this range. There are 150 RLC elements in the synthesized circuit compared with 364 devices in the original circuit, which represent a 58.79% reduction rate. The frequency response for  $Y_{11}(s)$  and  $Y_{12}(s)$  are shown in Fig. 14 and Fig. 15, respectively. If we only match to 14G, 12 poles are required and we can achieve a 78.5% reduction rate instead. The time domain step responses from the original circuit, the 14GHz synthesized circuit and the 31GHz synthesized circuit are shown in Fig. 16. The difference among these three circuits is fairly small.

In Fig. 17, we further compare the frequency responses of the 24th-order macro-model by hierarchical model order reductoin (HMOR), 24th-order macro-model by PRIMA, and time-constant based reduction (with similar reduction ratio) with the original circuit. The frequency response of port-1 is observed at the input port of first bit, and that of port-2 is at the far-end of the first bit. Due to the preserved reciprocity, the reduced mode is easily realized by Foster's synthesis, and the model size is half of the SPICE-compatible circuit by PRIMA (via recursive convolution for each  $Y_{ij}$ ). Moreover, as shown in Fig. 17 the accuracy of the 24th-order model by H-reduction can match up to 30GHz but the same order model by PRIMA can only match up to 20GHz. Note that under the similar reduction ratio with H-reduction, the time-constant based reduction can only match up to 5GHz.



Fig. 16. Transient Responses of a 2-bit Transmission Line.

#### C. Scalability Comparison with Existing Methods

Table II gives the reduction CPU time comparison for two methods. *HMOR* denotes the CPU times of hierarchical reduction. We notice that the HMOR is slow than the PRIMA. But the difference become less for large circuits. Theoretically PRIMA and the one-point hierarchical model reduction have the same time complexity, that is time of complexity of one Gaussian eliminations. In PRIMA, we have to solve the circuit matrix at least once using Gaussian elimination or LU decomposition to solve for all the Krylov space base vectors

| 1        | 2         | 3      | 4            | 5           | 6     | 7              | 8      | 9           | 10    | 11              | 12     | 13          | 14     |
|----------|-----------|--------|--------------|-------------|-------|----------------|--------|-------------|-------|-----------------|--------|-------------|--------|
| Circuits | #Elements | #Poles | Simu-time(s) |             |       | Model-size(Kb) |        |             |       | Delay-error (%) |        |             |        |
|          |           |        | H-redu       | time-const. | Prima | Original       | H-redu | time-const. | Prima | Original        | H-redu | time-const. | Prima  |
| ckt1     | 84        | 10     | 0.15         | 0.12        | 0.50  | 0.51           | 0.85   | 0.86        | 3.51  | 3.52            | -0.16% | -0.86%      | -0.15% |
| ckt2     | 258       | 10     | 0.15         | 0.15        | 0.92  | 5.31           | 0.85   | 0.86        | 3.51  | 11.23           | -0.24% | -1.12%      | -0.22% |
| ckt3     | 905       | 25     | 0.32         | 0.43        | 2.25  | 19.51          | 1.68   | 1.70        | 19.8  | 41.1            | -0.41% | -4.43%      | -0.37% |
| ckt4     | 5255      | 30     | 0.52         | 0.89        | 3.13  | 661.46         | 1.92   | 2.23        | 28.2  | 243.6           | -0.62% | -6.83%      | -0.58% |
| ckt5     | 20505     | 30     | 0.52         | 1.08        | 5.98  | 1356.66        | 1.92   | 2.53        | 28.2  | 957.9           | -1.04% | -12.91%     | -0.83% |

TABLE I

SIMULATION EFFICIENCY COMPARISON BETWEEN ORIGINAL AND SYNTHESIZED MODEL.



Fig. 17. Frequency responses of a 2-bit transmission line at two ports.

| Circuits | #Elements | PRIMA(sec) | HMOR(sec) |
|----------|-----------|------------|-----------|
| ckt1     | 84        | 0.042      | 0.6       |
| ckt2     | 258       | 0.077      | 11.7      |
| ckt3     | 905       | 0.16       | 17.8      |
| ckt4     | 5255      | 1.73       | 44.4      |
| ckt5     | 20505     | 32.37      | 96.8      |

TABLE II THE COMPARISON OF REDUCTION CPU TIMES.

(or moments). In hierarchical reduction method, if we reduce one node at a time, it becomes the Gaussian elimination process. All the polynomial operations with fixed order have fixed computing costs. The efficiency difference is mainly due to expensive recursive operations used in graph operations, which can be further improved, and multi-point matching. However the multi-point matching makes our method closedloop, which gives us the good control on the model accuracy. For PRIMA, the model accuracy can't be determined without several trials using different reduction orders.

We finally present a scalability comparison in TableI by the time-domain transient simulation for the following aspects: (i) runtime of simulation; (ii) realization efficiency (realized model size); and (iii) accuracy in terms of delay. Several different sized RLCM circuits are used. We compare the our method with the time-constant based circuit-reduction [3] and projection-based reduction PRIMA implemented at [21]. The same number of poles are used for the reduction when we compare our H-reduction with PRIMA. The reduced model by time-constant reduction is obtained with the similar model size as H-reduction.

First, we find our realized RLCG circuit model size is up to 10X smaller on average than the SPICE compatible circuit from PRIMA. Therefore, similar simulation speedup (8X) is observed when we run both circuits in SPICE3. When we further compare the simulation time of our reduced models with the PEEC circuits, a significant speedup (up to 2712X for ckt5) is obtained. Furthermore, the waveform accuracy in terms of delay is given in Column 12-14. The reduced models are very accurate with the worst case delay error being -1.04% even with 478X (957.9Kb vs. 1.92Kb) reduction ratio in terms of model size. But for the same reduction ratio as our reduction, we find the time-constant based reduction introduces large error (up to 12.91%) because too many nodes are to be eliminated and the reduction criteria cannot be satisfied.

Note that the sparsification in the VPEC model can dramatically reduce the number of mutual inductive couplings, but can also maintain the accuracy [44]. As a result, we use this technique during our reduction for larger circuits. For example, in the case of ckt5 (the largest one) in TableI, we obtain a 97.5% sparsification from 19,900 to 498 mutual-inductors. Due to this sparsification, it reduces the reduction time by 10X (365.4s to 47.8s).

## X. CONCLUSION

We have proposed a new hierarchical multi-point reduction algorithm for wideband modeling of high-performance RF passive and linear(ized) analog circuits. In the theoretical side, we showed that the s-domain hierarchical reduction is equivalent to the implicit moment matching around s = 0and showed that the hierarchical one-point reduction is numerically stable for general tree-structured circuits. We also showed that the hierarchical reduction preserves reciprocity of passive circuit matrices. In the practical side, we have proposed a hierarchical multi-point reduction scheme for highfidelity, wideband modeling of general passive and active linear circuits. A novel explicit waveform matching algorithm is proposed for searching dominant poles and their residues from different expansion points, which is shown to be more efficient than the existing pole search algorithm. The passivity of reduced models are enforced by state-space based optimization method. We also proposed a general multi-port network realization framework to generate SPICE-compatible circuits as the macro models of the reduced circuit admittance matrices. The resulting modeling algorithm can generate the multiple-port passive SPICE-compatible model for any linear passive networks with easily controlled model accuracy and complexity. Experimental results on a number of passive RF and interconnect circuits have shown that the new proposed macro modeling technique generate more compact models given the same accuracy requirements than existing approaches like PRIMA and time-constant methods.

#### REFERENCES

- [1] R. Achar, P. K. Gunupudi, M. Nakhla, and E. Chiprout, "Paasive K. Achar, P. K. Gunupudi, M. Nakhia, and E. Chiprout, Paasive interconnect reduction algorithm for distributed/measured networks," *IEEE Trans. on Circuits and Systems II: analog and digital signal* processing, vol. 47, no. 4, pp. 287–301, April 2000.
   D. Allstot, K. Choi, and J. Park, *Parasitic-Aware Optimization of CMOS RF circuits.* Kluwer Academic Publishers, 2003.
   C. S. Amin, M. H. Chowdhury, and Y. I. Ismail, "Realizable RLCK circuit comprehens" in *Proc. Ductor Actornation Conf.* (MCC) 2002.
- [2]
- [3] circuit crunching," in Proc. Design Automation Conf. (DAC), 2003, pp. 226-231.
- [4] B. D. O. Anderson and S. Vongpanitlerd, Network Analysis and Synthe-
- B. D. O. Alevoid Chiffs, NJ: Prentice-Hall, 1973.
   M. Beattie and L. Pileggi, "Efficient inductance extraction via windowing," in *Proc. European Design and Test Conf. (DATE)*, 2001, pp. 430– [5] 436
- [6] O. Brune, "Synthesis of a finite two-terminal network whose driving
- [6] O. Brune, "Synthesis of a finite two-terminal network whose driving point impedance is a prescribed function of frequency," *Journal of Math. and Phys.*, vol. 10, pp. 191–236, 1931.
  [7] J. P. C. P. Coelho and L. M. Silveira, "A convex programming approach for generating guaranteed passive approximations to tabulated frequency-data," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. 23, no. 2, pp. 293–301, Feb. 2004.
  [8] E. Chiprout and M. S. Nakhla, "Analysis of interconnect networks using complex frequency hopping," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. CAD-14, no. 2, pp. 186–200, Feb 1995
- Of Integrated Circuits and Systems, vol. CAD-14, no. 2, pp. 100–200, Feb. 1995.
  C. P. Coelho, J. R. Phillips, and L. M. Silveira, "A convex programming approach to positive real rational approximation," in *Proc. Int. Conf. on Computer Aided Design (ICCAD)*, Nov. 2001, pp. 245–251.
  P. Elias and N. van der Meijs, "Including higher-order moments of RC interconnections in layout-to-circuit extraction," in *Proc. European Device* and *Caref (DATE)* 1006, pp. 262, 266. [9]
- [10]
- [11] P. Feldmann and R. W. Freund, "Efficient linear circuit analysis by pade approximation via the lanczos process," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. 14, no. 5, pp. 639–649,
- Design of Integrated Circuits and Cystern, May 1995.
  [12] R. W. Freund, "SPRIM: structure-preserving reduced-order interconnect macromodeling," in *Proc. Int. Conf. on Computer Aided Design (IC-CAD)*, 2004, pp. 80–87.
  [13] S. Grivet-Talocia, I. A. Maio, and F. Canavero, "Recent advances in reduced-order modeling of complex interconnects," in *Proc. Dig.* 2020.
- in reduced-order modeling of complex interconnects," in Proc. Dig. Electrical Performance Electronic Packaging, Oct. 2001, pp. 243-246, vol. 10.
- 14] B. Gustavsen and A. Semlyen, "Rational approximation of frequency domain responses by vector fitting," *IEEE Transactions on Power Delivery*, vol. 14, no. 3, pp. 1052–1061, March 1999.
  [15] X. Huang, V. Rahjavan, and R. A. Rohrer, "Awesim: a program for efficient analysis of linear(ized) circuits," in *Proc. Int. Conf. on Computer Aided Device (ICCAD)*, 1000.
- [16] cient analysis of linear(ized) circuits," in *Proc. Int. Conf. on Computer Aided Design (ICCAD)*, 1990.
  [16] Y. Ismail and E. G. Friedman, "DTT: direct truncation of the transfer function an alternative to moment matching for tree structured interconnect," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. 21, no. 2, pp. 131–144, Feb. 2003.
  [17] M. Kamon, M. Tsuk, and J. White, "FastHenry: a multipole-accelerated 3D inductance extraction program," *IEEE Trans. on Microwave Theory and Techniques*, pp. 1750–1758, Sept. 1994.
  [18] S. Kapur and D. Long, "IES3: A fast integral equation solver for efficient 3-dimentsional extraction," in *Proc. Int. Conf. on Computer Aided Design (ICCAD)*, 1997.
  [19] C. Kashyap and B. Krauter, "A realizable driving point model for onchip interconnect with inductance," in *Proc. Design Automation Conf. (DAC)*, 2000.
  [20] K. J. Kerns and A. T. Yang, "Stable and efficient reduction of large,

- (DAC), 2000.
  [20] K. J. Kerns and A. T. Yang, "Stable and efficient reduction of large, multiport rc network by pole analysis via congruence transformations," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. 16, no. 7, pp. 734–744, July 1998.
  [21] Y. Kim and D. Sylvester, "Indigo: http://vlsida.eecs.umich.edu/," *EECS Dept. at Univ. Of Michigan.*[22] I. Iilis, C. Cheng, S. Lin, and N. Chang, Interconnect, analysis, and
- [22] J. Lillis, C. Cheng, S. Lin, and N. Chang, *Interconnect analysis and synthesis*. John Wiley, 1999.
  [23] Y. Massoud and J. White, "Simulation and modeling of the effect of

- [23] Y. Massoud and J. White, "Simulation and modeling of the effect of substrate conductivity on coupling inductance and circuit crosstalk," *IEEE Trans. on Very Large Scale Integration (VLSI) Systems*, 2002.
  [24] J. Morsey and A. C. Cangellaris, "PRIME: passive realization of interconnect models from measured data," *Electrical Performance of Electronic Packaging*, pp. 47–50, Oct. 2001.
  [25] K. Narbos and J. White, "FastCap: A multipole accelerated 3D capacitance extraction program," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. 10, no. 11, pp. 1447–1459, 1991.

- [26] A. Odabasioglu, M. Celik, and L. Pileggi, "PRIMA: Passive reduced-order interconnect macromodeling algorithm," IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, pp. 645-1998.

- [27] A. Pacelli, "A local circuit topology for inductive parasitics," in *Proc. Int. Conf. on Computer Aided Design (ICCAD)*, 2002, pp. 208–214.
  [28] L. T. Pillage and R. A. Rohrer, "Asymptotic waveform evaluation for timing analysis," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, pp. 352–366, April 1990.
  [29] L. T. Pillage, R. A. Rohrer, and C. Visweswariah, *Electronic Circuit and System Simulation Methods.* New York: McGraw-Hill, 1994.
  [30] Z. Qi, S. X.-D. Tan, H. Yu, L. He, and P. Liu, "Wideband modeling of RF/analog circuits via hierarchical multi-point model order reduction," in *Proc. Asia South Pacific Design Automation Conf. (ASPDAC)*, Jan. 2005, pp. 224–229.
- 2005, pp. 224–229. [31] Z. Qin and C. Cheng, "Realizable parasitic reduction using generalized  $Y \Delta$  transformation," in *Proc. Design Automation Conf. (DAC)*, 2003, 220-22
- B. Razavi, *RF Microelectronics*. Prentice Hall, 1998.
   D. Saraswat, R. Achar, and M. Nakhla, "A fast algorithm and practical considerations for passive macromodeling of measured/simulated data," [33] IEEE Trans. on Advanced Packaging, vol. 27, no. 1, pp. 57-70, Feb. 2004
- [34] B. N. Sheehan, "TICER: Realizable reduction of extracted RC circuits," in Proc. Int. Conf. on Computer Aided Design (ICCAD), 1999, pp. 200-203.
- [35]
- 205. "Branch merge reduction of RLCM networks," in *Proc. Int. Conf.* on *Computer Aided Design (ICCAD)*, 2003, pp. 658–664.
   C.-J. Shi and X.-D. Tan, "Canonical symbolic analysis of large analog circuits with determinant decision diagrams," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. 19, no. 1, pp. 1–18, Jan. 2000. [36]
- circuits with determinant decision diagrams, *TEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. 19, no. 1, pp. 1–18, Jan. 2000.
  [37] M. Silveira, M. Kamon, I. Elfadel, and J. White, "A coordinate-transformed Arnoldi algorithm for generating guaranteed stable reduced-order models of RLC circuits," in *Proc. Int. Conf. on Computer Aided Design (ICCAD)*, 1996, pp. 288–294.
  [38] J. Sturm, "Using SuDuMi 1.02, a matlab toolbox for optimization over symmetric cones," *Optim. Meth. Softw.*, vol. 10, pp. 625–653, 1999.
  [39] S. X.-D. Tan, "A general s-domain hierarchical network reduction algorithm," in *Proc. Int. Conf. on Computer Aided Design (ICCAD)*, 2003, pp. 650–657.
  [40] S. X.-D. Tan, W. Guo, and Z. Qi, "Hierarchical approach to exact symbolic analysis of large analog circuits," in *Proc. Design Automation Conf. (DAC)*, June 2004.
  [41] S. X.-D. Tan, Z. Qi, and H. Li, "Hierarchical modeling and simulation of large analog circuits," in *Proc. European Design and Test Conf. (DATE)*, Feb. 2004, pp. 740–741.
  [42] X.-D. Tan and C.-J. Shi, "Hierarchical symbolic analysis of large analog circuits via determinant decision diagrams," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. 19, no. 4, pp. 401–412, April 2000.
  [43] G. C. Temes and J. Lapatra, *Introduction to circuit synthesis and design*. McGraw-Hill Book Company, 1977.

- [43] G. C. Temes and J. Lapatra, Introduction to circuit synthesis and design. McGraw-Hill Book Company, 1977.
  [44] H. Yu and L. He, "Vector potential equivalent circuit based on PEEC inversion," in Proc. Design Automation Conf. (DAC), 2003, pp. 781–723.
  [45] H. Yu, L. He, Z. Qi, and S. X.-D. Tan, "A wideband realizable circuit-reduction for rlcm interconnects," in Proc. Asia South Pacific Design Automation Conf. (ASPDAC), Jan. 2005, pp. 111–114.



Zhenyu Qi (S'04) received his B.S. and M.S. degree

**Zhenyu Qi** (S<sup>'04)</sup> received his B.S. and M.S. degree in Electrical Engineering from Fudan University, Shanghai, P.R. China, and University of California Riverside in 2003 and 2004 respectively. He is currently a Ph.D. candidate at University of California at Riverside. His research interests in-clude interconnect modeling, high-performance on-chip power/ground network optimization, and fast simulation for RF and thermal circuits.



**Hao Yu** (S'02) received B.S. degree in physics from Fudan University, Shanghai, China, in 1999. He is currently a Ph.D. candidate in the electrical engineering department at University of California at Los Angeles (UCLA).

He was with mixed-mode CAD development group at Analog Devices Inc. (MA) in the sum-mer of (2001-2003). His research interests include computer-aided design of VLSI circuits and sys-tems, interconnect modeling/simulation, and signal intercrity characterization/continuitation integrity characterization/optimization.



Pu Liu (S'05) received M.S. degree in control throry

and control engineering from Beijing Institute of Technology, Beijing, China, in 2002. He is currently a Ph.D. candidate in Electrical Engineering at University of California at Riverside. His research interests focus on behavioral modeling and model order reduction techniques for linear, linear time varying system and nonlinear system, fast eletrical and thermal analysis and optimization.



Sheldon X.-D. Tan (S'96-M'99) received his B.S. and M.S. degrees in electrical engineering from Fu-dan University, Shanghai, China in 1992 and 1995, respectively and the Ph.D. degree in electrical and computer engineering from the University of Iowa, Iowa City, in 1999.

Iowa City, in 1999. He is an Assistant Professor in the Department of Electrical Engineering, University of California, Riverside. He was a faculty member in the Electrical Engineering Department of Fudan University from 1995 to 1996. His research interests include several aspects of design automation for VLSI integrated circuits – modeling, analysis and optimization of systems, signal integrity issues in VLSI physical design, high performance power/ground distribution network design and optimization and VLSI thermal analysis and optimization.

power/ground distribution network design and optimization and VLSI thermal analysis and optimization. Dr. Tan is the recipient of NSF CAREER Award in 2005. He also received the UC Regent's Faculty Fellowship in 2004. Dr. Tan received a Best Paper Award Nomination from 2005 IEEE/ACM Design Automation Conference, Best Paper Award from 1999 IEEE/ACM Design Automation Conference and the Best Poster Award from 1999 Spring Meeting of the NSF Center for Design of Analog and Digital Integrated Circuits (CDADIC). He also co-authored book "Symbolic Analysis and Reduction of VLSI Circuits" by Springer/Kluwer 2005. He is a technical program committee member of ISCAS'04, ASPDAC'05, ISCAS'05, BMAS'05.



**Dr. Lei He** obtained Ph.D. degree in computer science from UCLA in 1999, and has been an assistant professor at electrical engineering, UCLA since 2002. From 1999-2001, he was a faculty member at electrical and computer engineering, University of His research interests include (i) modeling and

design considering inductance effects and process variations for integrated circuits and packages,(ii) programmable logic fabrics and reconfigurable sys-tems, and (iii) power-efficient circuits and systems.

He has published over 100 technical papers and is TPC members of Design Automation Conference, International Symposium on Low Power Electronics and Design, and Inter-

International Symposium on Low Power Electronics and Design, and Inter-national Symposium on Field Programmable Gate Array. He was granted the NSF CAREER award in 2000, UCLA Chancellor's faculty career development award (highest class) in 2003, and IBM faculty partner award in 2003.