## **Timing-Driven Interplane Via Placement in 3-D ICs**

## ABSTRACT

The interconnect delay in 3-D circuits can be minimized by optimally placing the through-silicon vias. Due to obstacles that limit the placement of the through-silicon vias, the task of determining the via locations for minimum interconnect delay is described as a constrained optimization problem under the distributed Elmore delay model. Two different approaches are applied. First, a geometric programming solver is utilized to determine the optimum via locations. Second, a heuristic based on simple analytic expressions and the electrical properties of the interconnect is introduced and an efficient algorithm that provides optimum or near-optimum solutions is developed. The algorithm performs computationally faster as compared to the geometric programming solver, reducing the runtime by about two orders of magnitude. Simulation results indicate delay improvements for relatively short interconnects of up to 16% where the throughsilicon vias are optimally placed as compared to the case where the through-silicon vias are placed in the middle of the allowed regions and up to 32% as compared to a random through-silicon via placement. The proposed algorithm can be integrated into a design flow for 3-D circuits to enhance placement and routing where timing is the primary design objective.

#### I. INTRODUCTION

With the increase in die size due to the demand for greater functionality and increase in interconnect resistance due to technology scaling, interconnect delay has become the primary bottleneck in modern ICs. These performance challenges call for novel design paradigms that expand the boundaries of the IC design space.

Three-dimensional (3-D) integration is a promising alternative that can mitigate the next generation performance and integration challenges. By vertically stacking the die rather than horizontally expanding the integrated circuit, a considerable reduction in interconnect length is possible. Furthermore, components from dissimilar technologies and fabrication processes can be integrated onto a single multiplane system, offering high yield for each individual component. 3-D integration affects various design abstraction levels. At the package level, bare die are stacked on top of each other and communication among the die is achieved with wires bonded at

This research is supported in part by the Semiconductor Research Corporation under Contract No. 2003-TJ-1068 and 2004-TJ-1207, the National Science Foundation under Contract No. CCR-O304574, the Fulbright Program under Grant No. 87481764, grants from the New York State Office of Science, Technology & Academic Research to the Center for Advanced Technology in Electronic Imaging Systems, and by grants from Intel Corporation, Eastman Kodak Company, Manhattan Routing, and Intrinsix Corporation.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Conference'04, Month 1-2, 2004, City, State, Country.

Copyright 2004 ACM 1-58113-000-0/00/0004...\$5.00.

the die periphery. Alternatively, the die can be separately imbedded into packages that are adhered onto a stack with layers of bumps sandwiched between subsequent packages dedicated to provide interconnection among the individual planes of the stack [1]. The greatest reduction in wirelength that the third dimension however, offers comes from those 3-D technologies, where the interconnections among the planes of the stack are implemented with through-silicon vias that are not constrained to the periphery of the circuit (the interplane interconnects) [2]-[4]. In this paper, the terms, through-silicon vias and simply vias, are used interchangeably. A schematic of a 3-D circuit based on these technologies, which are assumed to be the target technologies in this paper, is shown in Figure 1. As illustrated in Figure 1, each physical plane of the stack is similar to a conventional 2-D circuit in that a plane includes a device layer and multiple metal layers to connect individual circuits located on the same physical plane (the intraplane interconnects).



Figure 1. Schematic of a three-dimensional circuit.

Several a priori interconnect prediction models have been described in the literature to evaluate the performance benefits of three-dimensional circuits resulting from the reduction in interconnect length [5]-[8]. All of these models are based on the well known Rent's rule [9] and predict significant reductions in interconnection lengths, particularly global interconnects. To fully exploit the potential of 3-D circuits, sophisticated placement and routing algorithms tailored to the 3-D structures are necessary [10]-[13]. The delay savings that can be achieved by the optimum placement of the vias in 3-D ICs have not been thoroughly explored. In [14], analytic expressions for the optimum via location, where non-uniform interconnect impedance characteristics are considered, however, the presence of obstacles that limit the valid via locations are not considered. In addition, the interplane interconnect consists of only one via.

The contribution of this paper is twofold. First, an algorithm for via placement for two-terminal interplane interconnects consisting of any number of through-silicon vias is presented. Second, a heuristic that provides optimum or near-optimum solutions for timing-driven via placement under physical constraints is introduced. The proposed heuristic is used to implement an algorithm with considerably smaller computational times as compared to general optimization solvers.

The rest of the paper is organized as follows. In the next section, the problem of determining the optimum via location for

minimum interconnect delay under the distributed Elmore delay model is analyzed. In Section 3, a heuristic for the timing-driven placement of interplane vias is described considering the particular characteristics of the circuits and technology. An algorithm based on this heuristic producing significantly smaller runtimes is also presented. Both SPICE measurements and optimization results are provided in Section 4 along with a comparison of the proposed algorithm with optimization solvers, both in terms of accuracy and efficiency. Finally, in Section 5, some conclusions are offered.

#### II. PROBLEM FORMULATION

The primary problem considered in this work is introduced in this section. Consider the interplane interconnect shown in Figure 2 that connects two circuits located m physical planes apart from each other. Each of the horizontal segments of the line corresponds to a metal layer of some physical plane of the stack. The horizontal segments of the line are connected through the vias which can traverse more than one plane. Consequently, the number of horizontal segments of the line is smaller than or at most equal to the number of physical planes between the two circuits, *i.e.*  $m \ge n$ , where the equality only applies when each of the vias connects metal layers from two adjacent physical planes. In conventional two-dimensional (2-D) circuits, a two terminal net such as the structure shown in Figure 2 is typically modeled as a uniform line where the vias are either ignored or considered as lumped capacitive loads. The heterogeneity of 3-D circuits, however, does not support a uniform line model. The interplane interconnects are therefore modeled as wire segments with nonuniform impedance characteristics. The via locations or, alternatively, the length of each horizontal wire segment therefore affects the delay of the line. Thus, the objective is to place the vias such that the interconnect delay is minimum.

Each horizontal segment *i* of the line is located on a different physical plane with length  $l_i$ . The vias are denoted by the index of the first of the two connected segments. For example, if a via connects segment *i* and *i*+1, the via is denoted as  $v_i$  with length  $l_{v_i}$ . The total length of the line *L* is equal to the summation of the length of the horizontal segments and vias,

$$L = l_1 + l_{y1} + \dots + l_i + l_{yi} + \dots + l_n .$$
(1)

The length of each horizontal segment of the line is bounded,

$$l_{i\min} \le l_i \le l_{i\min} + \Delta x_i \,, \tag{2}$$

or, equivalently, with a simple variable change

$$0 \le x_i \le \Delta x_i , \tag{3}$$

where  $l_{imin}$  is the minimum length of the interconnect segment on plane *i*, and  $\Delta x_i$  is the length of the region in which the via that connects planes *i* and *i*+1 can be placed, and called "allowed regions" here for clarity.  $x_i$  is the distance of the via location from the edge of the allowed region.  $l_{imin}$  is the length of an interconnect segment connecting two allowed regions or an allowed region and a placed cell. These lengths are considered fixed. Alternatively, the routing path of a net is not altered except for the via locations in the allowed regions, where these regions are generated by a placement tool for 3-D circuits.

The corresponding electrical model of the line is shown in Figure 3. The total resistance and capacitance of a horizontal (vertical) segment  $i(v_i)$  is  $R_{i(v_i)} = r_{i(v_i)}l_{i(v_i)}$  and  $C_{i(v_i)} = c_{i(v_i)}l_{i(v_i)}$ , where  $r_{i(v_i)}$  and  $c_{i(v_i)}$  denote the resistance and capacitance, respectively, per unit length. The driver is modeled as a step input voltage and a linear resistance  $R_s$ , and the following stage as a capacitive load  $C_L$ . The distributed Elmore delay of the line in matrix form is

$$T(\mathbf{l}) = 0.5 \, \mathbf{l}^{\mathrm{T}} \mathbf{A} \mathbf{l} + \mathbf{b} \mathbf{l} + D \,, \tag{4}$$

$$\mathbf{l} = \begin{bmatrix} l_1 & l_2 & \cdots & l_{n-1} & l_n \end{bmatrix}^{\mathrm{T}},$$
 (5)

$$\mathbf{A} = \begin{bmatrix} r_1 c_1 & r_1 c_2 & \cdots & r_1 c_n \\ \vdots & \vdots & \cdots & \vdots \\ r_1 c_n & r_2 c_n & \cdots & r_n c_n \end{bmatrix},$$
(6)

$$\mathbf{b} = \begin{bmatrix} r_1 \left( \sum_{i=1}^{n-1} c_{vi} l_{vi} + C_L \right) + c_1 R_S \\ \vdots \\ r_n C_L + c_n \left( R_S + \sum_{i=1}^{n-1} r_{vi} l_{vi} \right) \end{bmatrix}^{\mathrm{T}}, \quad (7)$$

$$D = R_{S} \sum_{i=1}^{n-1} c_{vi} l_{vi} + C_{L} \sum_{i=1}^{n-1} r_{vi} l_{vi} + \frac{1}{2} \sum_{i=1}^{n-1} r_{vi} c_{vi} l_{vi}^{2} + R_{S} C_{L} .$$
(8)



Figure 2. Interplane interconnect consisting of *n* segments connecting two circuits located *m* planes apart.

Since (8) is a constant quantity, the optimization problem can be described as follows,

(P) min. 
$$T(\mathbf{l}) = 0.5 \mathbf{l}^{T} \mathbf{A} \mathbf{l} + \mathbf{b} \mathbf{l}$$
,

subject to (1) and (2).



Figure 3. Interplane interconnect model composed of a set of non-uniformly distributed *RC* segments.

As described by the following theorem, the primal problem (P) is not typically convex and therefore convex quadratic programming optimization techniques are not directly applicable.

*Theorem 1:* The primal optimization problem (P) is a convex one *iff* 

$$r_{i+1}c_i - r_i c_{i+1} > 0. (9)$$

*Proof:* **A** is a positive definite matrix if all subdeterminants are positive. By elementary row operations, the subdeterminants of **A** are positive *iff* (9) applies. If (9) applies **A** is positive definite and (P) is a convex optimization problem.

Certain transformations can be applied to convert (P) to a convex optimization problem [15]; the objective function, however is no longer quadratic. Alternatively, (P) can be cast as a geometric programming problem. Geometric programs include optimization problems for functions and inequalities of the following form,

$$g(\mathbf{y}) = \sum_{j=1}^{M} s_j y_1^{a_{1j}} y_2^{a_{2j}} \cdots y_n^{a_{nj}}, \qquad (10)$$

$$s_j y_1^{a_{1j}} y_2^{a_{2j}} \cdots y_n^{a_{nj}} \le 1$$
, (11)

where the variables  $y_j$ 's and coefficients  $s_j$ 's must be positive and the exponents  $a_{ij}$ 's are real numbers. Although equality constraints are not allowed in standard geometric problems, (P) can be solved as a generalized geometric program as described in [16]. In the following section, a heuristic that provides optimum or near-optimum solutions for (P) is presented in addition to an algorithm that yields smaller runtimes as compared to a geometric programming solver.

#### III. OPTIMUM VIA PLACEMENT HEURISTIC

In this section, a heuristic for optimally placing the throughsilicon vias in 3-D circuits is described and an algorithm for the timing-driven interplane via placement for two-terminal interconnects is presented. The key step in the heuristic is that the optimum placement of a via primarily depends on the size of the allowed regions (that are estimated or known after an initial placement) rather than the exact optimum location of the vias. To better explain this step, consider the interplane interconnect line shown in Figure 2 where the optimum location for via *i* that connects the interconnect segments *i* and *i*+1 is to be determined. With respect to this via, the critical point (*i.e.*,  $\frac{\partial T_{el}}{\partial x_i} = 0$ ) of the Elmore delay is [14]

$$x_{i} = - \begin{bmatrix} \frac{l_{vi}(r_{i}c_{vi} - r_{vi}c_{i+1} + r_{i+1}c_{i+1} - r_{i}c_{i+1})}{r_{i}c_{i} - 2r_{i}c_{i+1} + r_{i+1}c_{i+1}} + \\ \frac{R_{ui}(c_{i} - c_{i+1}) + \Delta x_{i}(r_{i}c_{i+1} - r_{i+1}c_{i+1}) + C_{di}(r_{i} - r_{i+1})}{r_{i}c_{i} - 2r_{i}c_{i+1} + r_{i+1}c_{i+1}} \end{bmatrix}, \quad (12)$$

where  $R_{ui}$  and  $C_{di}$  are the upstream resistance and downstream capacitance of the allowed region for via *i* as shown in Figure 2. As described in [14], the critical point in (12) can be either a maximum or a minimum delay point. The analysis in the rest of this section applies to the case where the critical point provided by (12) is a minimum delay point, that is,  $x_i = x_i^*$ . A similar analysis can be followed where (12) corresponds to a maximum delay point.

In (12), the optimum via location  $x_i^*$  is a monotonic function of  $R_{ui}$  and  $C_{di}$ . The sign of monotonicity depends upon the interconnect impedance parameters of the segments *i* and *i*+1 that is connected by via *i*. As the size of the allowed regions for all of the vias is given by (3), the minimum and maximum value of  $R_{ui}$  and  $C_{di}$  can be readily determined and the respective values of  $x_i^*$  for these extrema, namely  $x_{i\min}^*$  and  $x_{i\max}^*$ , are evaluated. Due to the monotonic dependence of  $x_i$  on  $R_{ui}$  and  $C_{di}$ , the optimum point for via *i*,  $x_i^*$ , lies within the range defined by  $x_{i\min}^*$  and  $x_{i\max}^*$ . The following cases are distinguished

*i)* if  $x_{i\max}^* \leq 0$ ,  $x_i^* = 0$ , and the optimum via location coincides with the lower bound of the region defined by (3) or, equivalently, the optimum length of segment *i* is equal to the lower limit of (2),  $l_{imin}$ .

*ii)* if  $x_{i\min}^* \ge \Delta x_i$ ,  $x_i^* = \Delta x_i$ , and the optimum via location coincides with the upper bound of the region defined by (3) or, equivalently, the optimum length of segment *i* is equal to the upper limit of (2),  $l_{imin} + \Delta x_i$ .

*iii)* if  $\Delta x_i \ge x_{i\min}^* \ge 0$  and  $\Delta x_i \ge x_{i\max}^* \ge 0$ , the bounded region as defined by (3) reduces to

$$0 \le x_{i\min}^* \le x_i \le x_{i\max}^* \le \Delta x_i . \tag{13}$$

In this case, the optimum length of the segment cannot be directly determined. However, by successively decreasing the range of values for  $x_i^*$  the optimum via location for segment *i* can be reached. The following example is used to demonstrate that the physical domain for  $x_i^*$  successively decreases to a single point that is the optimum via location. Consider the segments j, i, and kin Figure 2, where segments j and k are located upstream and downstream of segments *j* and *x* are rotated upstream and and maximum values of  $R_{uj}^0$ ,  $R_{ui}^0$ ,  $R_{uk}^0$ ,  $C_{dj}^0$ ,  $C_{di}^0$ , and  $C_{dk}^0$  are determined, where the superscript declares the number of iterations. Assume that  $x_{\min}^{*0}$  and  $x_{\max}^{*0}$  are obtained from (12) to satisfy (13) for all of the three segments i, j, and k. As the range of values for the optimum via location of segments i and k, decreases according to (13) the minimum (maximum) value of the upstream resistance and downstream capacitance of segment i increases (decreases), *i.e.*,  $R_{ui\min}^0 < R_{ui\min}^1$ ,  $C_{di\min}^0 < C_{di\min}^1$ ,  $R_{ui\,\text{max}}^1 < R_{ui\,\text{max}}^0$ , and  $C_{di\,\text{max}}^1 < C_{di\,\text{max}}^0$ . Due to the monotonicity of  $x_i^*$  on  $R_{ui}$  and  $C_{di}$   $x_{i\min}^{*0} < x_{i\min}^{*1}$  and  $x_{i\max}^{*1} < x_{i\max}^{*0}$ . Therefore the range of values for  $x_i^*$  also decreases and after several iterations, the optimum location for the corresponding via is determined.

*iv)* if  $x_{i\min}^* \leq 0$  and  $x_{i\max}^* \geq \Delta x_i$ , the optimum via location cannot be directly determined. Additionally, the bounding region cannot be reduced. Consequently, some loss of optimality occurs. This departure from the optimal, however, is insignificant. The variation between the extrema of the upstream resistance and downstream capacitance due to the relatively small size of the allowed regions for via placement is such that a significant variation between the values  $x_{i\min}^*$  and  $x_{i\max}^*$  does not occur for most interconnect instances. In addition, the aforementioned inequalities ( $x_{i\min}^* \leq 0$ ,  $x_{i\max}^* \geq \Delta x_i$ ) are usually satisfied where the size of the allowed region for a via is relatively small as compared to the size of the allowed regions for the remaining vias. A non-optimal placement for that interconnect segment does not considerably affect the overall delay of the line. Furthermore, there should be at least two vias for which an optimum location cannot be found. Indeed, if all of the vias but one are optimally placed, the exact values of the upstream (downstream) resistance (capacitance) for the non-optimized via have been obtained and, in turn, the optimum location for that via is determined by (12). Finally, the non-optimum placement of a via does not necessarily affect the optimum placement for the remaining vias. For example, any via placed according to the criteria described in (i) and (ii) is not affected by the placement of the remaining vias. Therefore, as it was pointed out in the beginning of this section the size of the allowed regions rather than the exact location of the vias is the key factor to determine the optimum via locations.

The aforementioned heuristic has been used to implement an algorithm that exhibits optimum or near-optimum via placement for interplane interconnects in 3-D ICs, with significantly lower runtimes as compared to a geometric programming solver. The pseudocode of the algorithm is illustrated in Figure 4. The input to the algorithm is an interplane interconnect where the minimum length of the segments and the size of the allowed regions are provided. In the first step, the arrays of the maximum and minimum upstream (downstream) resistance (capacitance) for every allowed region are determined. In the following steps, for each unprocessed via, the range of values for the optimum via location as given by (12) is evaluated. In step five these values are compared to the inequalities described in (i)-(iv). If an optimal via location is determined in this step, the via is marked as processed and the capacitance and resistance arrays are updated. If, after a number of iterations, there are non-optimal vias, in step 14 the vias are placed in the middle of the corresponding allowed regions and the algorithm terminates. Other criteria such as routing congestion can, alternatively, be applied to the place the non-optimized vias rather than placing these vias in the middle of the corresponding allowed regions. Note that there must be at least two non-optimized vias after the specified maximum number of iterations is reached. Additionally, additional tradeoffs can be made to search for the optimum location of the non-optimized vias, considering runtime with accuracy. High accuracy has been demonstrated as described in Section IV.

#### **IV. SIMULATION RESULTS**

In this section, the improvement in delay achieved by optimally placing the vias is demonstrated and the proposed optimum via placement algorithm (OVPA) is compared with a geometric programming solver. Interplane interconnects for various numbers of physical planes are analyzed. The impedance characteristics of the horizontal segments and vias are extracted for several interconnect structures using a commercial impedance extraction tool [17]. Based on the extracted impedances, the resistance and capacitance of the horizontal segments range from 5  $\Omega$ /mm to 25  $\Omega$ /mm and from 100 fF/mm to 300 fF/mm, respectively. Copper interconnect has been assumed. For all of the interconnect structures, the total length and minimum length of each horizontal segment, are randomly generated. For simplicity, all of the vias connect the segments of two adjacent physical planes and the size of the allowed regions is maintained the same for every via.

| Optin | num Via Placement Algorithm: $(\mathbf{l}_{\min}, \Delta \mathbf{x})$                      |  |  |  |  |  |
|-------|--------------------------------------------------------------------------------------------|--|--|--|--|--|
| 1.    | 1. Determine C <sub>dmin</sub> , C <sub>dmax</sub> , R <sub>umin</sub> , R <sub>umax</sub> |  |  |  |  |  |
| 2.    | while $\mathbf{s} \neq \emptyset$                                                          |  |  |  |  |  |
| 3     | ifiter < max_iter                                                                          |  |  |  |  |  |
| 4.    | $s_i \leftarrow an unprocessed via$                                                        |  |  |  |  |  |
| 5.    | obtain $x_{i\min}^*$ and $x_{i\max}^*$ from eq. (12)                                       |  |  |  |  |  |
| 6.    | check for the inequalities in $(i) - (iv)$                                                 |  |  |  |  |  |
| 7.    | if $s_i$ is optimized (cases i-ii)                                                         |  |  |  |  |  |
| 8.    | store optimum via location                                                                 |  |  |  |  |  |
| 9.    | <b>S ← S</b> - {s <sub>i</sub> }                                                           |  |  |  |  |  |
| 10.   | update $C_{dmin},\ C_{dmax},\ R_{umin},\ R_{umax}$                                         |  |  |  |  |  |
|       | elseif $\Delta x_i$ decreases (case iii)                                                   |  |  |  |  |  |
| 11.   | update l <sub>mini</sub>                                                                   |  |  |  |  |  |
| 12.   | update $C_{dmin}$ , $C_{dmax}$ , $R_{umin}$ , $R_{umax}$                                   |  |  |  |  |  |
|       | else (case iv)                                                                             |  |  |  |  |  |
| 13.   | go to step 3                                                                               |  |  |  |  |  |
|       | else (the non-optimized vias)                                                              |  |  |  |  |  |
| 14.   | place via in the middle of the allowed region                                              |  |  |  |  |  |
| 15.   | store via location                                                                         |  |  |  |  |  |
| 16.   | $\mathbf{S} \leftarrow \mathbf{S} - \{\mathbf{s}_{i}\}$                                    |  |  |  |  |  |
| 17.   | exit                                                                                       |  |  |  |  |  |

# Figure 4. Pseudocode of the proposed optimum via placement algorithm (OVPA).

SPICE 50% delay measurements are reported in Table 1. The delay of the line  $T_{I}$ , where the vias are placed in the middle of the allowed regions is listed in column 2, and the delay  $T_2$  listed in column 3 corresponds to the line delay for random via placement. The minimum interconnect delay  $T_{min}$  where the vias are optimally placed is depicted in column 4. The optimum via locations or, equivalently, the optimum length of the horizontal segments, is the output of the algorithm described in Section III. The improvement in delay as compared to the case where the vias are placed in the middle of the line is listed in column 5. The numbers in parentheses correspond to the delay improvement over the random placement. Note that the variation in delay improvement changes significantly for the listed instances even when the interconnect lengths are similar and the load capacitance and driver resistance are the same. This considerable variation shows the strong dependence of the line delay on the impedance characteristics of the segments of the line and supports the model of the interplane interconnect as a group of non-uniform segments. Additionally, depending upon the impedance characteristics of the line segments, placing the vias in the middle of the allowed regions is for some instances nearoptimum, explaining why the delay improvement is not significant for these instances. The same characteristic applies for those instances where a random placement is close to the optimum placement. Nevertheless, as listed in Table 1, an improvement of up to 32% is observed for relatively short interconnects, demonstrating that an optimum via placement can significantly enhance the performance of 3-D circuits (in addition to the primary benefit that originates from reduction in wirelength).

The algorithm presented in Section 4 is compared both in terms of accuracy and efficiency with two optimization solvers. The first solver YALMIP [18], is a generic optimization solver that supports geometric programming while GLOPTIPOLY [19] is an optimization solver for non-convex polynomial functions. Due to the excessive runtimes of GLOPTIPOLY (greater than three orders of magnitude as compared to YALMIP), however, only comparisons with YALMIP are reported. Optimization results are listed in Table 2.

As listed in columns 9 and 10 of Table 2, OVPA exhibits a high accuracy as compared to YALMIP, independent of the number of planes that comprise the 3-D interconnect, demonstrating that the proposed algorithm yields optimum solutions for most interconnect instances. In addition, for those cases where some of the vias are not optimally placed, the loss of optimality is insignificant (as previously discussed in Section 4). In column 8, the runtime ratio of YALMIP and the OVPA is listed. OVPA performs approximately two orders of magnitude faster than YALMIP.

As shown in Table 2 the delay savings from the optimum via placement strongly depend upon the size of the allowed regions. For example, doubling the size of the allowed regions for via placement the maximum delay improvement increases almost twofold. Consequently, efficient placement tools for 3-D circuits that generate sufficiently large allowed regions are required to achieve the highest delay improvement. In addition, as more interplane interconnects are routed, the allowed regions for the subsequent wires are reduced. Net ordering algorithms that determine which nets should be routed initially are therefore necessary.

A place and route methodology has recently been presented for 3-D circuits [1] where the interplane vias are considered as standard cells during initial placement. The via placement approach proposed in this paper can seamlessly be integrated into such a methodology to exploit the added delay benefits resulting from the optimum via placement where timing is the primary objective being satisfied.

#### V. CONCLUSIONS

Improvements in performance that can be achieved by optimally placing interplane vias in 3-D circuits is explored. Employing the distributed Elmore delay model, the task of optimally placing the vias is presented as a geometric programming problem. Considering the physical constraints, a heuristic is also proposed that provides accurate solutions. Based on this heuristic, a via placement algorithm that exhibits significantly smaller runtimes as compared to geometric programming solvers with negligible or no loss of optimality has also been implemented. Delay improvements of up to 16% and 32% are demonstrated where the vias are optimally placed as compared to via placement in the middle of the allowed regions and random placement, respectively. The proposed algorithm can be embedded in sophisticated place and route methodologies for 3-D circuits as a refinement step towards achieving greater performance.

TABLE I. Simulation results demonstrating the delay savings achieved by optimum via placement. The resistance and capacitance per unit length of the vias is  $r_{vi} = 6.7 \ \Omega/\text{mm}$  and  $c_{vi} = 6 \text{ pF/mm}$ , respectively. The length of the vias is  $l_{vi} = 20 \ \mu\text{m}$ . The driver resistance is  $R_S = 15 \ \Omega$  and the load capacitance is  $C_L = 100 \ \text{fF}$ . The size of the allowed regions is  $\Delta x_i = 200 \ \mu\text{m}$ .

| Length $T_I$ [ps] [mm] |       | $T_2 [ps] \qquad T_{min} [ps]$ |       | Improvement<br>[%] | n  |
|------------------------|-------|--------------------------------|-------|--------------------|----|
| 1.017                  | 12.35 | 12.64                          | 11.42 | 8.14 (10.68)       | 4  |
| 1.180                  | 13.37 | 14.42                          | 12.33 | 8.43 (16.95)       | 4  |
| 0.849                  | 11.00 | 11.71                          | 10.27 | 7.11 (14.02)       | 4  |
| 0.969                  | 13.52 | 14.96                          | 12.12 | 11.55 (23.43)      | 4  |
| 0.967                  | 12.38 | 12.59                          | 11.72 | 5.63 (7.42)        | 4  |
| 1.612                  | 18.54 | 19.85                          | 17.24 | 7.54 (15.14)       | 5  |
| 1.537                  | 20.80 | 19.47                          | 19.37 | 7.38 (0.52)        | 5  |
| 1.289                  | 17.78 | 18.43                          | 16.45 | 8.09 (12.04)       | 5  |
| 1.443                  | 18.77 | 19.54                          | 18.07 | 3.87 (8.14)        | 5  |
| 1.225                  | 16.97 | 18.33                          | 15.62 | 8.64 (17.35)       | 5  |
| 2.118                  | 30.52 | 34.81                          | 26.44 | 15.43 (31.66)      | 7  |
| 2.130                  | 27.92 | 27.32                          | 25.94 | 7.63 (5.32)        | 7  |
| 1.961                  | 28.49 | 30.67                          | 26.16 | 8.91 (17.24)       | 7  |
| 2.263                  | 35.58 | 40.11                          | 31.31 | 13.64 (28.11)      | 7  |
| 2.174                  | 32.31 | 30.34                          | 29.16 | 10.80 (4.05)       | 7  |
| 3.031                  | 43.43 | 42.15                          | 39.24 | 10.68 (7.42)       | 10 |
| 2.883                  | 42.62 | 48.36                          | 36.48 | 16.83 (32.57)      | 10 |
| 3.212                  | 46.80 | 52.13                          | 41.77 | 12.04 (24.80)      | 10 |
| 2.807                  | 44.14 | 46.67                          | 38.77 | 13.85 (20.38)      | 10 |
| 1.701                  | 65.90 | 62.29                          | 60.25 | 9.38 (3.39)        | 10 |

### VI. REFERENCES

- W. R. Davis *et al.*, "Demystifying 3D ICs: The Pros and Cons of Going Vertical," *IEEE Journal on Design and Test* of Computers, Vol. 22, No. 6, pp. 498-510, November/December 2005.
- [2] R. J. Gutmann et al., "Three-dimensional (3D) ICs: A Technology Platform for Integrated Systems and Opportunities for New Polymeric Adhesives," Proceedings of the Conference on Polymers and Adhesives in Microelectronics and Photonics, pp. 173-180, October 2001.
- [3] L. Xue, C. Liu, and S. Tiwari, "Multi-layers With Buried Structures (MLBS): An Approach to Three-dimensional Integration," *Proceedings of the IEEE International Conference on Silicon On Insulator*, pp. 117-118, October 2001.
- [4] A. Fan, A. Rahman, and R. Reif, "Copper Wafer Bonding," *Electrochemical and Solid-State Letters*, Vol. 10, No. 2, pp. 534-536, October 1999.
- [5] J. W. Joyner et al., "Impact of Three-dimensional Architectures on Interconnects in Gigascale Integration," *IEEE Transactions on Very Large Scale Integration (VLSI)* Systems, Vol. 9, No. 6, pp. 922-927, December 2000.
- [6] A. Rahman and R. Reif, "System-level Performance Evaluation of Three-dimensional Integrated Circuits," *IEEE*

Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 8, No. 6, pp. 671-678, December 2000.

- [7] D. Stroobandt and J. van Campenhout, "Estimating Interconnection Length in Three-dimensional Computer Systems," *IEICE Transactions on Information and Systems*, Vol. 80, No. 10, pp. 1024-1031, October 1997.
- [8] R. Zhang *et al.*, "Stochastic Modeling, Power Trends, and Performance Characterization of 3-D Circuits," *IEEE Transactions on Electron Devices*, Vol. 48, No. 4, pp. 638-652, April 2001.
- [9] B. Landman and R. Russo, "On a Pin vs. Block Relationship for Partitions of Logic Graphs," *IEEE Transactions on Computers*, Vol. 20, No. 12, pp. 1469-1479, December 1971.
- [10] C. C. Tong and C.-L. Wu, "Routing in a Three-dimensional Chip," *IEEE Transactions on Computers*, Vol. 44, No.1, pp. 106-117, January 1995.
- [11] A. Cohoon *et al.*, "Physical Layout for Three-Dimensional FPGAs," *Proceedings of the ACM/SIGDA Physical Design Workshop*, pp. 142-149, April 1996.
- [12] P. Benkart *et al.*, "Placement and Routing in 3D Integrated Circuits," *Journal on Design and Test of Computers*, Vol. 22, No. 6, pp. 520-531, November/December 2005.

- [13] J. Cong, J. Wei, and Y. Zhang, "A Thermal-Driven Floorplanning Algorithm for 3D ICs," *Proceeding of the IEEE/ACM Conference on Computer-Aided Design*, pp. 306-313, November 2004.
- [14] V. F. Pavlidis and E. G. Friedman, "Interconnect Delay Minimization through Interlayer Via Placement," *Proceedings of the ACM Great Lakes Symposium on VLSI*, pp. 20-25, April 2005.
- [15] J. G. Ecker, "Geometric Programming: Methods, Computations and Applications," *SIAM Review*, Vol. 22, No. 3, pp. 338-362, July 1980.
- [16] S. Boyd, S. J. Kim, L. Vandenberghe, and A. Hassibi, "A Tutorial on Geometric Programming," *Optimization and Engineering* (in press).
- [17] Metal User's Guide, www.oea.com
- [18] J. Löfberg, "YALMIP: A Toolbox for Modeling and Optimization in MATLAB," *Proceedings of the IEEE International Symposium on Computer-Aided Control Systems Design*, pp. 284-289, September 2004.
- [19] D. Henrion and J. B. Lasserre, "GloptiPoly: Global Optimization over Polynomials with Matlab and SeDuMi," *ACM Transactions on Mathematical Software*, Vol. 29, No. 2, pp. 165-194, June 2003.

| TABLE II. | Optimization results for various two-te | erminal interplane interconnects a | and number of physical planes, <i>n</i> . |
|-----------|-----------------------------------------|------------------------------------|-------------------------------------------|
|-----------|-----------------------------------------|------------------------------------|-------------------------------------------|

|    | Average<br>Interconnect<br>Length<br>[mm] | $\begin{array}{c} \text{connect} & \Delta x_i \text{'s} \\ \text{ngth} & [\mu m] \end{array}$ | Delay Improvement [%] |       | YALMIP/OVPA     | Deviation of OVPA<br>from optimum<br>solution [%] |                 |                       |       |      |           |
|----|-------------------------------------------|-----------------------------------------------------------------------------------------------|-----------------------|-------|-----------------|---------------------------------------------------|-----------------|-----------------------|-------|------|-----------|
| п  |                                           |                                                                                               | [µm]                  |       | ced in the ddle |                                                   | om Via<br>ement | Runtime ratio × times | avg   | max  | Instances |
|    |                                           |                                                                                               |                       |       | avg             | max                                               | avg             | max                   |       | uvg  | ших       |
| 4  | 0.996                                     | 100                                                                                           | 3.45                  | 9.87  | 5.37            | 20.05                                             | 95.5            | 0.0001                | 0.008 | 5000 |           |
| 4  | 1.302                                     | 200                                                                                           | 6.31                  | 22.76 | 9.87            | 46.91                                             | 241.6           | 0.0002                | 0.021 | 5000 |           |
| 4  | 1.600                                     | 300                                                                                           | 8.73                  | 26.85 | 12.58           | 55.91                                             | 111.4           | 0.0001                | 0.025 | 5000 |           |
| 5  | 1.277                                     | 100                                                                                           | 3.91                  | 10.84 | 5.57            | 22.04                                             | 93.1            | 0                     | 0.003 | 5000 |           |
| 5  | 1.684                                     | 200                                                                                           | 7.06                  | 19.65 | 10.09           | 41.28                                             | 226.4           | 0.0001                | 0.009 | 5000 |           |
| 5  | 2.076                                     | 300                                                                                           | 9.77                  | 26.74 | 14.27           | 55.45                                             | 90.1            | 0.0001                | 0.009 | 5000 |           |
| 7  | 1.840                                     | 100                                                                                           | 4.64                  | 12.81 | 6.21            | 25.16                                             | 84.6            | 0                     | 0.008 | 5000 |           |
| 7  | 2.440                                     | 200                                                                                           | 8.26                  | 23.77 | 11.07           | 47.56                                             | 89.0            | 0                     | 0.002 | 5000 |           |
| 10 | 2.678                                     | 100                                                                                           | 5.48                  | 12.74 | 6.90            | 25.72                                             | 78.9            | 0                     | 0.005 | 5000 |           |