## **On Optimal Physical Synthesis of Sleep Transistors**

Changbo Long, Jinjun Xiong and Lei He {longchb, jinjun, lhe}@ee.ucla.edu EE department, University of California, Los Angeles, CA, 90095

## ABSTRACT

Considering the voltage drop constraint over a distributed model for power/ground (P/G) network, we study the following two problems for physical synthesis of sleep transistors: the min-area sleep transistor insertion (and sizing) (TIS) problem with respect to a fixed P/G network, and the simultaneous sleep transistor insertion and P/G network sizing (TIPGS) problem to minimize the weighted area of sleep transistors and P/G network. We show that there may exist multiple sleep transistor insertion solutions that all lead to a same minimum area in the TIS and TIPGS problems. We develop optimal algorithms to TIS and TIPGS problems by modeling the circuit as a single current source, and then extend to the case modeling the circuit as distributed current sources. Compared with the best known approach, our algorithms achieve area reduction by up to 44.1% and 61.3% for TIS and TIPGS, respectively.

#### **Categories and Subject Descriptors**

B.7.2 [Design Aids]: Layout, Placement and routing, Verification.

#### **General Terms**

Algorithms, Design, Performance.

#### Keywords

Power-gating, Sleep transistors, physical design.

#### 1. INTRODUCTION

Leakage power has gained an increasing importance as the VLSI technology advances to the deep-submicron era.

Copyright 2004 ACM 1-58113-817-2/04/0004 ...\$5.00.

According to [1], leakage power is around 40% of the total power in a 3 GHz Pentium 4 processor. Sleep transistors (see Fig. 1) are effective to reduce leakage power, but they also introduce extra voltage drop with increased delay and reduced noise margin. Because the introduced voltage drop is determined by the size of sleep transistors, sleep transistors can be sized to limit voltage drop and minimize area.



Figure 1: Illustration for sleep transistors. They are turned off when the circuit is in the standby mode.

The key of sizing sleep transistors includes 1) characterization of switching current and 2) physical design of sleep transistors. Most existing work studies characterization of switching current. Some recent papers have also studied the synthesis of sleep transistors. In [2], the discharging pattern of the switching current is exploited to save sleep transistor area. In [3], circuits are divided into clusters and each cluster is connected to a sleep transistor. To reduce the size of sleep transistors, techniques such as bin-packing and set-partitioning have been employed to reduce the simultaneous switching current in the clusters. In [4], to take advantage of the discharge balancing property of switching current, a mesh of distributed sleep transistors is proposed to save the area of sleep transistors. In addition, [5] employs a distributed P/G model and proposes two design styles to layout sleep transistors. They are inserted between each row of the standard cells and P/G network in one style and form an external ring between all gates and external power supply pins in the other. However, all above work assume ideal or fixed P/G networks and there is no automatic method to simultaneously optimize sleep transistors and P/G networks.

In this paper, we develop automatic physical synthesis of sleep transistors with a distributed P/G model. Specifically, we study two problems: the sleep transistor insertion (and

<sup>\*</sup>This research is partially supported by NSF CAREER award CCR-0093273, a UC MICRO grant sponsored by Analog Devices, Fujitsu Laboratories of America, Intel and LSI Logic. and a grant from Intel. Address comments to lhe@ee.ucla.edu.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

ISPD'04, April 18-21, 2004, Phoenix, Arizona, USA.

sizing) (TIS) problem with fixed P/G network, and simultaneous sleep transistor insertion and P/G network sizing (TIPGS) problem that sizes both sleep transistors and P/G network wires. The rest parts of the paper are organized as follows. We present modeling and problem formulations in Section 2, and solve the *TIS* and *TIPGS* problems in Section 3 and 4, respectively. We present the experiment results in Section 5 and conclude the paper in Section 6. The proofs of all lemmas and theorems are included in a technical report [6].

## 2. MODELING AND PROBLEM FORMU-LATIONS

We summarize the notations frequently used in this paper in Table 1.

| $ ho_p$                   | sheet resistance of P/G network.                        |
|---------------------------|---------------------------------------------------------|
| $ ho_s$                   | effective sheet resistance of sleep transistors.        |
| $L_p$                     | length of P/G branches.                                 |
| $L_s$                     | channel length of sleep transistors.                    |
| $W_p$                     | width of P/G branches.                                  |
| $\hat{W_s}$               | channel width of sleep transistors.                     |
| $r_p$                     | resistance of P/G branches.                             |
| $r_s$                     | channel resistance of sleep transistors.                |
| TP                        | tapping points where gates connect to P/G network.      |
| $\overline{V}$            | upper bound of supply voltage drop. The default         |
|                           | value is 10% VDD.                                       |
| $\overline{V_p}$          | upper bound of voltage drop for P/G network.            |
| $\overline{V_s}$          | upper bound of voltage drop for sleep transistors.      |
| $C_{TP}$                  | cut-set of P/G branches disconnecting all gates from    |
|                           | power supply.                                           |
| $\overrightarrow{C_{TP}}$ | $C_{TP}$ with a uniform current direction.              |
| $A_p$                     | area of P/G network.                                    |
| $\hat{A_n^*}$             | optimal area of P/G network.                            |
| $A_s^P$                   | area of sleep transistors.                              |
| $A_s^*$                   | optimal area of sleep transistors.                      |
| TIS                       | min-area sleep transistor insertion and sizing problem. |
| TIPGS                     | simultaneous sleep transistor insertion and P/G         |
|                           | network sizing problem.                                 |
| SSN                       | single source network.                                  |
| MSN                       | multiple source network.                                |

Table 1: Summary of notations.

#### 2.1 Switching current model

The switching current of gates is time-variant and varies with respect to the input of the circuit. It has been modeled as time-invariant variable to reduce the complexity in [7– 9]. In this paper, we model the switching current as timeinvariant maximum current and will extend to time-variant current model in the future.

#### 2.2 P/G network model

P/G networks include power networks and ground networks. A power network can be transferred into a ground network by reversing the directions of currents. Therefore, in this paper we only consider the ground network without loss of generality.

The P/G network is modeled as an adjoint multi-port resistive network with one common-terminal, the ground(GND). The resistance of P/G branches is

$$r_p = \rho_p \cdot \frac{L_p}{W_p},\tag{1}$$

where  $\rho_p$ ,  $L_p$  and  $W_p$  are the sheet resistance, length, and width of P/G branches, respectively. We illustrate the modeling of P/G network in Fig. 2. As shown in the figure, gates are modeled as current sources and connect to the P/G network through tapping points (*TP*). P/G branches are modeled as resistors.



Figure 2: An example of P/G network modeling.

A resistive network can be represented as a graph  $\Gamma(\mathbf{V}, \mathbf{B})$ , where  $\mathbf{V}$  is the vertex set and  $\mathbf{B}$  is the branch set. Of particular interests are special subsets of  $\mathbf{B}$  called *cut-set* defined as follows.

DEFINITION 1. A cut-set of  $\Gamma(\mathbf{V}, \mathbf{B})$  is a set of branches  $\mathbf{C} \subseteq \mathbf{B}$ . Removing all branches in  $\mathbf{C}$  causes the network unconnected, but the removal of any proper subset of this set keeps the network connected. Among all cut-sets, those disconnecting all TP from power supply pins are defined as TP cut-set and denoted as  $C_{TP}$  (see Fig. 2 for an example).

#### 2.3 Sleep transistor insertion and sizing

We formulate the sleep transistor insertion problem as follows.

FORMULATION 1. Given a fixed P/G network  $\Gamma(\mathbf{V}, \mathbf{B})$ , the min-area sleep transistor insertion (and sizing) problem (TIS) finds a set of branches  $\mathbf{C} \subseteq \mathbf{B}$  to insert sleep transistors with minimum area such that all paths between TP and power pins are interrupted, and voltage drop constraints are satisfied.

THEOREM 1. The optimal solution to the TIS problem must be a  $C_{TP}$ .

A  $C_{TP}$  divides **V** into two disjointed subsets where all TP are in one set  $\mathbf{V}_1$ , and all external power pins in the other set  $\mathbf{V}_2$ . Although the *net* current should flow from  $\mathbf{V}_1$  to  $\mathbf{V}_2$ , the current directions in particular branches of  $C_{TP}$ , however, could be different. Intuitively, the non-uniform current directions in  $C_{TP}$  result in a larger sleep transistor area for the given voltage drop constraints. Therefore, we only consider a  $C_{TP}$  with the uniform current direction from  $\mathbf{V}_1$  to  $\mathbf{V}_2$ . This kind of  $C_{TP}$  is denoted as  $\overrightarrow{C_{TP}}$  in the following.

#### 2.4 Simultaneous sleep transistor insertion and P/G network sizing

Under a constant voltage drop constraint, increasing the area of sleep transistors leads to smaller voltage drop, which would allow us to save area on P/G network, or vice versa. In this sense, the area of P/G network and sleep transistors are exchangeable. This area exchangeability can be used to reduce the total chip area. For example, in a design with small number of metal layers, the routing area may be

the bottleneck to decide the size of the chip. In this case, budgeting a relatively large area to sleep transistors but a small routing area to P/G network can reduce the total chip area.

To provide a smooth trade-off between the area of P/G network and that of sleep transistors, we formulate the *simultaneous sleep transistor insertion and* P/G network sizing problem as follows:

FORMULATION 2. Simultaneous sleep transistor insertion and P/G network sizing (TIPGS): Given P/G network topology and voltage drop constraint, the TIPGS problem finds a  $\overrightarrow{C_{TP}}$  to insert sleep transistors and determines the size of sleep transistors and P/G branches such that  $\alpha A_p + \beta A_s$  is minimized, where  $\alpha$  and  $\beta$  are given constants, and  $A_p$  and  $A_s$  are the area of P/G network and sleep transistors, respectively.

## 3. TIS PROPERTIES AND ALGORITHMS

We first solve TIS on Single Source Network (SSN), where all gates are modeled as a single current source and then extend the solution to *Multiple Source Network* (MSN), where gates are modeled as distributed current sources.

#### **3.1** Single source network

SSN falls into the category of one-port two-terminal resistive network as shown in Fig. 3. The two terminals are TPand ground(GND). In this network, *driving-point impedance* is defined as

$$R = \frac{V}{I},$$

where V and I are the voltage and current between TP and GND, respectively. Regarding this network, TP is a single node and we have:



Figure 3: Illustration of SSN.

LEMMA 1. For an arbitrary  $\overrightarrow{C_{TP}} = \{c_1, c_2, ..., c_k\}$  in a one-port two-terminal network  $\Gamma(\mathbf{V}, \mathbf{B})$ , if the resistance of the resistor in each branch  $c_i$  increases by  $\Delta r_i > 0$ , we have

$$\frac{1}{\Delta R} \le \sum_{\overrightarrow{C_{TP}}} \frac{1}{\Delta r_i},\tag{2}$$

where  $\Delta R$  is the increase of the driving-point impedance.

LEMMA 2. For an arbitrary  $\overrightarrow{C_{TP}} = \{c_1, c_2, \ldots, c_k\}$ , if the current on P/G branch  $c_i$  is  $i_i$  and  $\sum_{\overrightarrow{C_{TP}}} 1/\Delta r_i$  is given, the following conditions minimize  $\Delta V$  on TP (the increase of voltage after increasing the resistance):

$$\frac{1/\Delta r_i}{\sum_{\overline{C_{TP}}} 1/\Delta r_i} = \frac{i_i}{I}.$$
(3)

LEMMA 3. All the sleep transistors have a same voltage drop in an optimal TIS solution.

Lemma 1, 2 and 3 reveal the following solution to TIS in SSN.

THEOREM 2. For any  $\overrightarrow{C_{TP}}$  in SSN, inserting sleep transistor into branch  $c_i \in \overrightarrow{C_{TP}}$  with area of

$$A_i = \rho_s \cdot L_s^2 \cdot \frac{i_i}{\overline{V} - V_p} \tag{4}$$

leads to an optimal solution for TIS, where  $i_i$  is the current in  $c_i$ ,  $\overline{V}$  is the voltage constraint on TP, and  $V_p$  is the voltage on TP before the insertion of sleep transistors.

THEOREM 3. Any  $\overrightarrow{C_{TP}}$  leads to a optimal solution of TIS with the same area.

Note that Theorem 2 and 3 solve TIS optimally and indicate that the optimal solution of TIS is not unique. This design freedom could be used to optimize for other design constraints such as routing congestion.

#### **3.2** Multiple source network



Figure 4: Illustration of MSN.

MSN belongs to m-terminal network as shown in Fig. 4, and there exist m-1 nodes in TP. Similar to Lemma 1, we have

HYPOTHESIS 1. For an arbitrary  $\overrightarrow{C_{TP}} = \{c_1, c_2, ..., c_k\}$ in an m-terminal network, if the resistance of the resistor in branch  $c_i$  increases by  $\Delta r_i > 0$ , then

$$\sum_{i=1}^{m-1} \frac{I_i}{\Delta V_i} \le \sum_{i=1}^k \frac{1}{\Delta r_i},\tag{5}$$

where  $I_i$  is the current source placed between terminal *i* and GND, and  $\Delta V_i$  is the increase of voltage at terminal *i*.

TIS of MSN can be solved based on Hypothesis 1. By Hypothesis 1, we have

$$\sum_{i=1}^{m-1} \frac{I_i}{\overline{V} - v_{p,i}} \le \sum_{i=1}^k \frac{1}{r_{s,i}},\tag{6}$$

where  $\overline{V}$  is the voltage drop constraint on TP,  $I_i$  is the current on  $TP_i$ ,  $v_{p,i}$  is the voltage on  $TP_i$  with no sleep transistors inserted, and  $r_{s,i}$  is the resistance of sleep transistors. Similar to Theorem 2 in SSN, we have

$$A_{s} \ge \rho_{s} \cdot L_{s}^{2} \cdot \sum_{i=1}^{m-1} \frac{I_{i}}{\overline{V} - v_{p,i}}.$$
(7)

The right-hand side of (7) is the lower bound on the area of sleep transistors in MSN. One solution to achieve the minimum area is to find a separable  $\overrightarrow{C_{TP}}$ , which is defined as follows.

DEFINITION 2.  $A \xrightarrow{\overrightarrow{C_{TP}}} is separable if it can be partitioned$ to m-1 subset  $\overrightarrow{C_{TP}}^{(1)}, \dots, \overrightarrow{C_{TP}}^{(m-1)}$  such that 1) For any  $1 \leq i, j \leq m-1, \ \overrightarrow{C_{TP}}^{(i)} \cap \overrightarrow{C_{TP}}^{(j)} = \Phi$ . 2) Each subset  $\overrightarrow{C_{TP}}^{(i)}$  is a  $\overrightarrow{C_{TP}}$  for  $TP_i$ .

One way to obtain a separable  $\overrightarrow{C_{TP}}$  is to use all P/G branches directly connected to a current source as  $\overrightarrow{C_{TP}}^{(i)}$ .

In summary, an algorithm is described in Fig. 5. Also, Hypothesis 1 will be verified experimentally in Section 5.

TIS algorithm for MSN1. Find a separable  $\overrightarrow{C_{TP}} = \overrightarrow{C_{TP}}^{(1)} \cup \cdots \cup \overrightarrow{C_{TP}}^{(m-1)}$ .
2. For each  $\overrightarrow{C_{TP}}^{(t)}$ , insert sleep transistor with  $A_i = \rho_s \cdot L_s^2 \cdot \frac{i_i}{\overline{V} - v_{p,t}}$ , where  $i_i$  is the current on  $c_i$  and  $v_{p,t}$  is the voltage on  $TP_t$  before inserting sleep transistors.



#### 4. TIPGS PROPERTIES AND ALGORITHMS

As in Section 3, we first solve TIPGS in SSN and then extend the solution to MSN in this section.

#### 4.1 Single source network

Let  $A_p$  be the area of the P/G network, we have

$$A_p = \sum_{\mathbf{B}} L_p \cdot W_p. \tag{8}$$

To solve the TIPGS problem for SSN, we introduce the following lemmas first.

LEMMA 4. In a min-area P/G network satisfying voltage drop constraint  $\overline{V}$  at tapping points, the product of the P/Garea  $A_p^*$  and  $\overline{V}$  is a constant. We define the constant product as

$$K_p^* = A_p^* \cdot \overline{V_p}. \tag{9}$$

Lemma 4 indicates that  $A_p^*$  is reversely proportional to  $\overline{V_p}$  and shows that the optimal sizing solution under a voltage drop constraint  $\overline{V_{p,1}}$  can be extended to another voltage drop constraint  $\overline{V_{p,2}}$  by scaling branches with the ratio of  $\overline{V_{p,1}}/\overline{V_{p,2}}$ . Similar to Lemma 4, we have the following lemma for sleep transistors.

LEMMA 5. For a given P/G network, we assume that sleep transistors inserted at an arbitrary  $\overrightarrow{C_{TP}}$  have a voltage drop equal to or below  $\overline{V_s}$ . The product of the minimum sleep transistor area  $A_s^*$  and  $\overline{V_s}$  is a constant. We define the constant product as

$$K_s^* = A_s^* \cdot \overline{V_s}.\tag{10}$$

Lemma 5 indicates the same property for sleep transistors as Lemma 4 for P/G network.

LEMMA 6. Given the voltage drop constraint on the TP in SSN as  $\overline{V}$ , we have

$$\alpha A_p + \beta A_s \ge \frac{(\sqrt{\alpha K_p^*} + \sqrt{\beta K_s^*})^2}{\overline{V}}.$$
 (11)

In other words, (11) provides a lower bound on the weighted area of P/G network and sleep transistors.

With a total voltage drop  $\overline{V}$  over P/G network and sleep transistors, we denote the voltage drop constraint on sleep transistors as  $\overline{V_s}$  and the voltage drop constraint on P/G network by removing sleep transistors as  $\overline{V_p}$ .

THEOREM 4. In an optimal TIPGS,  $\overline{V_s}$  and  $\overline{V_p}$  must be

$$\frac{\sqrt{\beta K_s^*}}{\sqrt{\alpha K_p^*} + \sqrt{\beta K_s^*}} \cdot \overline{V},\tag{12}$$

and

$$\frac{\sqrt{\alpha K_p^*}}{\sqrt{\alpha K_p^*} + \sqrt{\beta K_s^*}} \cdot \overline{V},\tag{13}$$

respectively.

Note that TP is a single node in TIPGS.

THEOREM 5. Inserting sleep transistors at any  $\overline{C_{TP}}$  leads to optimal TIPGS solutions with the same weighted sum of P/G network and sleep transistor area.

Theorem 4 is a necessary condition to minimize the weighted sum of P/G network and sleep transistor area. To make it sufficient, additionally we need to 1) optimally size P/G network to minimize  $A_p$  under the voltage drop constraint  $\overline{V_p}$  determined by (13) and 2) follow the solution of *TIS* to insert sleep transistors under the voltage drop constraint determined by (12).

#### 4.2 Multiple source network

Similar to SSN,  $K_p^*$  and  $K_s^*$  can be defined for MSN. Then, the counterpart of Theorem 6 is presented as follows.

HYPOTHESIS 2. Given the voltage drop constraint on TP in MSN as  $\overline{V}$ , we have

$$\alpha A_p + \beta A_s \ge \frac{(\sqrt{\alpha K_p^*} + \sqrt{\beta K_s^*})^2}{\overline{V}}.$$
 (14)

In other words, (14) provides a lower bound on the weighted area of P/G network and sleep transistors in MSN.

If Hypothesis 2 holds, Theorem 4 and 5 hold for MSN, too. Therefore, an TIPGS algorithm for MSN can be developed as in Fig. 6.

| <ol> <li>Determine V<sub>s</sub> and V<sub>p</sub> by (12) and (13), respectively.</li> <li>Size P/G network under constraint V<sub>p</sub> to minimize A<sub>p</sub>.</li> <li>Insert and size sleep transistors under constraint V<sub>s</sub> using TIS algorithm in Fig. 5.</li> </ol> | TIPGS algorithm for $MSN$                                                                                                                                                                                                           |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|                                                                                                                                                                                                                                                                                            | <ol> <li>Determine Vs and Vp by (12) and (13), respectively.</li> <li>Size P/G network under constraint Vp to minimize Ap.</li> <li>Insert and size sleep transistors under constraint Vs using TIS algorithm in Fig. 5.</li> </ol> |

#### Figure 6: TIS algorithm for MSN.

However, no algorithm has been proposed in the literature to optimally size P/G network (step 2 in Fig. 6). Nevertheless, we can construct the best algorithm to minimize  $\alpha A_p + \beta A_s$  based on the best known algorithm to size P/G network.

#### 5. EXPERIMENT

In this section, we first verify Hypothesis 1 and 2 by experiments, and then compare the *Hypo1-based TIS algorithm* in Fig. 5 and *Hypo2-based TIPGS algorithm* in Fig. 6 with alternative algorithms based on sequential linear programming.

#### 5.1 Verification of Hypothesis 1

For the purpose of verifying Hypothesis 1, we define effective area ratio (EAR) as

$$EAR = \left(\sum_{i=1}^{m-1} \frac{I_i}{\Delta V_i}\right) / \left(\sum_{i=1}^k \frac{1}{\Delta r_i}\right).$$
 (15)

where  $I_i$ ,  $\Delta V_i$ , and  $\Delta r_i$  are the same as in Hypothesis 1. If Hypothesis 1 holds, we have

$$EAR \le 1.$$
 (16)

To verify Hypothesis 1, we compute the EAR for nine mesh networks as shown in Table 2 under 100,000 random solutions. For each solution, the value of current sources , the  $\overrightarrow{C_{TP}}$ , and the size of sleep transistors are randomly chosen, and EAR is obtained by solving the networks with a linear solver integrated in SIS1.2 [10]. We report the computed EAR in column 4 of Table 2.

| 1                | 2         | 3          | 4    | 5      |
|------------------|-----------|------------|------|--------|
|                  |           |            | Max  | ĸ. EAR |
| Mesh             | # Node    | # Branches | TIS  | TIPGS  |
| $3 \times 3$     | 16        | 24         | 1.00 | 0.79   |
| $5 \times 5$     | 25        | 60         | 1.00 | 0.68   |
| $10 \times 10$   | 121       | 220        | 0.96 | 0.89   |
| $20 \times 20$   | 441       | 840        | 1.00 | 0.96   |
| $30 \times 30$   | 961       | 1,860      | 0.97 | 0.97   |
| $40 \times 40$   | 1,681     | 3,280      | 0.98 | 0.89   |
| $60 \times 60$   | 3,721     | 7,320      | 0.97 | 0.93   |
| $80 \times 80$   | $6,\!561$ | 12,960     | 0.97 | 1.00   |
| $100 \times 100$ | 10,201    | 20,200     | 0.96 | 0.96   |

Table 2: Random solutions  $(100,000 \times)$  to compute the maximum *EAR*.

According to column 4 of Table 2, it clearly shows that the maximum EAR values in all networks are equal to or less than 1. This means that the solution of TIS by the algorithm in Fig. 5 has the smallest area among all these 100,000 random solutions. This strongly indicates the correctness of Hypothesis 1.

## 5.2 Verification of Hypothesis 2

To verify Hypothesis 2, we define effective area ratio as

$$EAR = \left(\frac{(\sqrt{\alpha K_p^*} + \sqrt{\beta K_s^*})^2}{\overline{V}}\right) / (\alpha A_p + \beta A_s).$$
(17)

If Hypothesis 2 holds, we have

$$EAR \le 1.$$
 (18)

We compute the EAR for TIPGS in the same fashion as for TIS. For each circuit, we carry out  $100,000 \times$  random solutions to find the maximum EAR. However, in TIPGS $K_p^*$  and  $K_s^*$  are needed to compute EAR. According to Lemma 4,

$$K_p^* = A_p^* \cdot \overline{V_p}.$$
 (19)

Since  $A_p^*$  is unavailable in the experiments, we approximate  $K_p^*$  by

$$K_p^* = \min_{\sigma} (A_p \cdot \overline{V_p}), \qquad (20)$$

where S represents the set for all solutions.  $K_s^*$  is computed by

$$K_s^* = \rho_s \cdot L_s^2 \cdot \sum_{i=1}^{m-1} I_i.$$
 (21)

We reported the computed EAR in column 5 of Table 2. According to column 5 of Table 2, the maximum EAR is always less or equal to 1 among 100,000 random solutions for all networks. This clearly implies the correctness of Hypothesis 2.

# **5.3 Comparison between algorithms for** *TIS* **and** *TIPGS*

| Circuit | # Block | # GND | SLP-based | Hypo1-based   |
|---------|---------|-------|-----------|---------------|
|         |         |       | $A_s(\%)$ | $A_s(\%)$     |
| apte    | 9       | 2     | 0.18      | 0.14 (-22.2%) |
| xerox   | 9       | 4     | 0.28      | 0.17 (-29.3%) |
| hp      | 10      | 3     | 0.25      | 0.14 (-44.0%) |
| a3      | 25      | 3     | 0.21      | 0.13 (-38.1%) |
| ami     | 33      | 3     | 0.19      | 0.13 (-31.2%) |
| playout | 62      | 5     | 0.34      | 0.19 (-44.1%) |
| g2      | 241     | 4     | 0.15      | 0.10 (-33.3%) |

Table 3: Comparison between SLP-based andHypo1-based algorithm for TIS.

#### 5.3.1 Algorithms

We have revised the sequential linear programming algorithm proposed in [7] to solve TIPGS (denote as SLP-based algorithm) as a comparison base. The sequential linear programming algorithm in [7] is employed to size P/G network, where each branch of P/G network is modeled as a resistor. Because sleep transistors are also modeled as resistors, we are able to modify [7] to size both the P/G network and sleep transistors simultaneously (See [11] for details of the algorithm).

In fact, the *SLP-based algorithm* provides a comparison base for both *Hypo1-based algorithm* to solve *TIS* and *Hypo2based algorithm* to solve *TIPGS*. *Hypo1-based algorithm* follows the exact steps in Fig. 5. The Hypo2-based algorithm follows the steps in Fig. 6 but with minor modifications. Because there is no optimal algorithm available to minimize  $A_p$ , we employ the *SLP-based algorithm* to obtain the "optimal" P/G network under given voltage drop constraints.

For all algorithms in the experiments, we have chosen the same separable  $\overrightarrow{C_{TP}}$  that is directly adjacent to the tapping points. Theorem 3 and 5 indicate that all  $\overrightarrow{C_{TP}}$  have the same optimal value for both *TIS* and *TIPGS*, but experiment results have shown that this  $\overrightarrow{C_{TP}}$  produces a relatively good result for *SLP-base algorithm*. Therefore, the experiment setting is favorable to the *SLP-base algorithm*.

#### 5.3.2 Results

The *SLP-based*, *Hypo1-based*, and *Hypo2-based algorithm* have been applied to NCSU benchmarks [12]. Switching current is modeled as time-invariant and the current density is

|   |         |         |       |               |       | (04)         |                 |       |               |
|---|---------|---------|-------|---------------|-------|--------------|-----------------|-------|---------------|
|   | Circuit | # Block | # GND | SLP-based (%) |       |              | Hypo2-based (%) |       |               |
| ļ |         |         | Pin   | $A_p$         | $A_s$ | Weighted sum | $A_p$           | $A_s$ | Weighted sum  |
|   | apte    | 9       | 2     | 2.55          | 0.18  | 2.73         | 1.79            | 0.30  | 2.09(-23.4%)  |
|   | xerox   | 9       | 4     | 3.14          | 0.28  | 3.42         | 1.94            | 0.26  | 2.20 (-35.7%) |
|   | hp      | 10      | 3     | 2.31          | 0.25  | 2.56         | 1.20            | 0.31  | 1.51 (-41.0%) |
|   | a3      | 25      | 3     | 2.08          | 0.21  | 2.29         | 1.37            | 0.25  | 1.62 (-29.3%) |
|   | ami     | 33      | 3     | 1.88          | 0.19  | 2.07         | 0.90            | 0.19  | 1.09 (-47.3%) |
|   | playout | 62      | 5     | 4.96          | 0.34  | 5.30         | 4.19            | 0.38  | 4.57 (-13.8%) |
|   | g2      | 241     | 4     | 3.67          | 0.15  | 3.82         | 1.35            | 0.13  | 1.48 (-61.3%) |

Table 4: Comparison between SLP-based and Hypo2-based algorithm for TIPGS.

 $300mA/mm^2$ , which is similar to that of the Alpha microprocessor in [13]. We assume the P/G pitch as  $50\mu m$  and present  $A_p$  and  $A_s$  in the percentage of chip area.

To compare the *SLP-based algorithm* with the *Hypo1-base algorithm*, we first apply the *SLP-based algorithm* to find the size of P/G network branches and the size of sleep transistors. Then, we fix the size of P/G network branches and resize the sleep transistors by using the *Hypo1-base algorithm*. We compare the total area of sleep transistors obtained by the *SLP-based algorithm* and *Hypo1-base algorithm* in Table 3. For *TIS* problem, we found that the *Hypo1-base algorithm* and it can reduce the transistor area by up to 44.1%. As shown in Table 4, for *TIPGS* problem, the *Hypo2-base algorithm* reduces the total area significantly (up to 61.3%) with  $\alpha$  and  $\beta$  being set as 1.0.

#### 5.3.3 Discussion

It is observed in our experiment that SLP-based algorithm usually terminates when only one TP reaches the voltage drop constraint  $\overline{V}$ . The voltage drop slacks on other TP lead to extra P/G network and/or sleep transistor area. From Fig. 5 and 6, one can see that in Hypo1-based algorithm and Hypo2-base algorithm, the voltage drop on all TP are uniformly equal to the voltage drop constraint, which leads to significant area reduction.

#### 6. DISCUSSION AND CONCLUSION

Under a distributed P/G network model, we have studied the sleep transistor insertion (and sizing) problem (TIS)and simultaneous sleep transistor insertion and P/G sizing problem (TIPGS). We have developed effective algorithms to solve these two problems by revealing the optimal solutions to them. Compared with the best known approach using sequential linear programming, our algorithms reduce area by up to 44.1% and 61.3% for TIS and TIPGS, respectively. Our TIS and TIPGS algorithms are extremely efficiently too, as all steps are based on closed-form formulas. We have shown that there exist multiple optimal solutions to these problems, which offer design freedom to consider other design constraints such as routing congestion.

The time-invariant current model is assumed in this paper. In the future, we intend to extend our problem formulations and algorithms to time-variant current model.

#### 7. REFERENCES

 A.S.Grove, "Changing vectors of moore's law," Keynote speech, International Electron Devices Meeting, Dec. 2002.

- [2] J. Kao, S. Narendra, and A. Chandrakasan, "MTCMOS hierarchical sizing based on mutual exclusive discharge patterns," in *DAC*, 1998.
- [3] M. Anis, S. Areibi, and M. Elmasry, "Dynamic and leakage power reduction in MTCMOS circuits using an automated efficient gate clustering technique," in *DAC*, 2002.
- [4] C. Long and L. He, "Distributed sleep transistor network for power reduction," in *Proc. Design Automation Conf*, pp. 181–186, 2003.
- [5] S. Kosonocky, M. Irnmediato, P. Cottrell, T. Hook, R. Mann, and J. Brown, "Enhanced multi-threshold (mtcmos) circuits using variable well bias," in *Proc. Int. Symp. on Low Power Electronics and Design*, pp. 165–169, 2001.
- [6] C. Long, J. Xiong, and L. He, "On optimal physical synthesis of sleep transistors," in *Technical report*, *EE Dept.*, UCLA, 2003.
- [7] X. D. Tan and C. J. Shi, "Reliability-constrained area optimization of VLSI power/ground networks via sequence of linear programmings," in *Proc. Design Automation Conf*, pp. 78–83, 1999.
- [8] T. Mitsuhashi and E. Kuh, "Power and ground network topology optimization for cell based VLSIs," in *Proc. Design Automation Conf*, pp. 524–529, 1992.
- [9] S. X. D. Tan and C. J. Shi., "Efficient vlsi power/ground network sizing based on equivalent circuit modeling," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, March 2003.
- [10] E. M. Sentovich, K. J. Singh, L. Lavagno, and etc., "Sis: a system for sequential circuit synthesis," *Memorandum NO. UCB/ERL M92/41*, May 1992.
- [11] C. Long, J. Xiong, and L. He, "Automatic synthesis of power gating capable power/ground network," in *Technical report, EE Dept.*, UCLA, 2003.
- [12] http://www.cbl.ncsu.edu.
- [13] A. Jain, W. Anderson, T. Benninghoff, and D. e. Berucci, "A 1.2 ghz alpha microprocessor with 44.8 gb/s chip pin bandwidth," in *Proc. IEEE Int. Solid-State Circuits Conf.*, pp. 240–241, 2001.