# Physically Justifiable Die-Level Modeling of Spatial Variation in View of Systematic Across Wafer Variability

Lerong Cheng, Puneet Gupta, Member, IEEE, Costas J. Spanos, Fellow, IEEE, Kun Qian, Student Member, IEEE, and Lei He, Senior Member, IEEE

Abstract-Modeling spatial variation is important for statistical analysis. Most existing works model spatial variation as spatially correlated random variables. We discuss process origins of spatial variability, all of which indicate that spatial variation comes from deterministic across-wafer variation, and purely random spatial variation is not significant. We analytically study the impact of across-wafer variation and show how it gives an appearance of correlation. We have developed a new die-level variation model considering deterministic across-wafer variation and derived the range of conditions under which ignoring spatial variation altogether may be acceptable. Experimental results show that for statistical timing and leakage analysis, our model is within 2% and 5% error from exact simulation result, respectively, while the error of the existing distance-based spatial variation model is up to 6.5% and 17%, respectively. Moreover, our new model is also 6x faster than the spatial variation model for statistical timing analysis and 7x faster for statistical leakage analysis.

*Index Terms*—Leakage analysis, spatial correlation, SSTA, timing analysis, yield modeling.

## I. INTRODUCTION

With CMOS technology scaling, process variation has become a major concern for very large scale integration design. Modeling and analyzing process variation has attracted a lot of attention.

Several works focus on the analysis and modeling of process variation [1]–[16]. The simplest method models process variation as the sum of inter-die (global) variation and independent within-die (local random) variation [4]. Later, it was observed that within-die variation is spatially correlated and that the correlation depends on the distance between two within-die locations. The work in [1] models spatial variation as correlated random variables, and principal component analysis is applied

P. Gupta and L. He are with the Department of Electrical Engineering, University of California, Los Angeles, CA 90095 USA (e-mail puneetlerong@ee.ucla.edu; lhelerong@ee.ucla.edu).

C. J. Spanos and K. Qian are with the Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA 94720 USA (e-mail spanos@eecs.berkeley.edu; qiankun@eecs.berkeley.edu).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCAD.2010.2089568

to perform statistical timing analysis. In this model, a chip is divided into several grids and each grid has its own spatial variation. The spatial variations of different grids are correlated and the correlation coefficient depends on the distance between two grids. [2] focuses on the extraction of spatial correlation and it models the correlation coefficient as a function of distance. Several more complex spatial correlation models have been proposed in [17]–[26].

In contrast to the spatial correlation models, process-oriented modeling has concluded that within-die spatial variation is caused by deterministic across-wafer and across-field variation, while purely random within-die spatial variation is not significant [27]–[29]. However, in a practical design flow, designers do not know the within-wafer or within-field location of each die; therefore, we need to analyze the impact of across-wafer and across-field variation at the die scale. Since silicon measurements cited in this paper indicate that across-wafer variation is much more significant than across-field variation, we consider only across-wafer variation in this paper, but the approach is easily extended to account for across-field variation.

In this paper, we first analyze the impact of deterministic across-wafer variation on spatial correlation. When a quadratic across-wafer variation model is used as in [28], [30], and [31], we observe the following.

- 1) Different locations on the chip may have different means and variances. Such differences increase as the chip size increases.
- 2) When the chip size is small, the correlation coefficients for a certain Euclidean distance are within a narrow range. This explains why most existing works find that spatial correlation is a function of distance.
- 3) Within-die spatial variation is *NOT* spatially correlated when across-wafer systematic variation is removed.
- 4) Within-die spatial variation is *NOT* independent of inter-die variation.
- 5) If the chip size is small enough, the two-level inter-/within-die decomposition of process variation is still very accurate.

Based on our analysis, we propose three accurate and efficient spatial variation models<sup>1</sup> considering across-wafer

<sup>1</sup>The program and data of our proposed model can be downloaded at http://nanocad.ee.ucla.edu/Main/Stat.

Manuscript received February 4, 2010; revised June 7, 2010 and September 23, 2010; accepted September 28, 2010. Date of current version February 11, 2011. This work was supported in part by Integrated Modeling Process and Computation for Technology. This paper was recommended by Associate Editor D. Sylvester.

L. Cheng is with SanDisk Corporation, Milpitas, CA 95035 USA (e-mail: lerong@ee.ucla.edu).

variation. Experimental results show that our model is more accurate and efficient compared to the distance-based spatial variation model in [2]. Compared to the exact simulation, error of our model for statistical timing analysis is within 2% and the error for statistical leakage analysis is within 5%. On the other hand, the error of the distance-based spatial correlation model is up to 6.5% for statistical timing analysis and up to 17% for statistical leakage analysis. Moreover, our model is  $6 \times$  faster than the distance-based spatial correlation model for statistical timing analysis and  $7 \times$  faster for statistical leakage analysis.

The rest of this paper is organized as follows. Section II discusses the physical causes of across-wafer variation. Section III analyzes the impact of across-wafer variation at the die scale. Section IV discusses the case when the across-wafer variation is not a perfect parabola. Section V introduces the new variation models; the new models are applied to statistical timing analysis in Section VI and statistical leakage analysis in Section VII. Section VIII summarizes the advantages and disadvantages of different variation models. Section IX further discusses the case when the across-wafer variation is an arbitrary function, and finally Section X concludes this paper.

# II. PHYSICAL ORIGINS OF SPATIAL VARIATION

In silicon manufacturing, many steps cause non-uniformity in devices across the wafer. Interestingly, most of these processes, by the very nature of the equipment, follow a radially varying trend across the wafer. Most processes are "center-fed" or "edge-fed," with the boundary conditions at the edge of the wafer being substantially different. Moreover, wafers are often rotated to increase process uniformity across them, which further leads to radial behavior of non-uniformity. This is further exacerbated by the advent of single-wafer processing for 300 mm wafers.

For example, overlay error includes errors in the position and rotation of the wafer stage during exposure, wafer stage vibration, and the distortion of the wafer with respect to the exposure pattern [32]. Magnification and rotation components of overlay error increase from the center of the wafer outward.<sup>2</sup> During the chemical vapor deposition step, species depletion and temperature non-uniformity on the wafer at lower temperatures may cause thickness non-uniformity [33], [34]. The redeposition effect in physical vapor deposition [35] may cause non-uniformity of etch rate. Moreover, the center peak shape of the RF electric field distribution [36] also leads to a center peak shape of etch rate, and chamber wall conditions [37] also cause etch rate non-uniformity. In real processes, the wafers are rotated to improve uniformity. [35] and [37] showed that the etch rate varies radially across the wafer: the etch rate is high at the center of the wafer and decreases toward the edges. Post-exposure bake (PEB) temperatures are higher at the center of the wafer and decrease outward [38]. Similarly, other processes, ranging from resist coat to wafer deformation due to the vacuum chuck holding it, follow a bowl-shaped trend across the wafer. All these processes cause a systematic across-wafer variation in physical dimensions.

Across-wafer variation of gate length observed in several recent silicon measurements [28], [30], [31], [39] validates



Fig. 1. Ring oscillator frequency within a wafer. (a) Process 1. (b) Process 2.

our arguments. [40] also showed that ring oscillator frequency and leakage current decrease from the center to the edge of the wafer. Fig. 1 shows industrial data of ring oscillator frequency for wafers from two different industrial processes. Process 1 uses a 45 nm technology and process 2 uses a 65 nm technology. From the figure, we see that for both processes, ring oscillator frequency decreases from the center to the edge of the wafer. Moreover, it has also been shown that there is no spatial correlation for threshold voltage variation [27]. Therefore, the across-wafer frequency and leakage variation is mainly caused by gate length variation.

It has been shown that for process 1, the across-wafer frequency variation can be approximated as a quadratic function (a parabola) [40]. For process 2, the across-wafer variation is not a perfect parabola as it is for process 1. However, it follows a systematic trend in which the ring oscillator frequency decreases from the center to the edge of the wafer. Since much more measurement data is available for process 2 (more than 300 wafers) than for process 1, in the rest of this paper, all of our simulations and experiments are based on the measurement results of process 2.

Besides across-wafer variation, lithography-induced effects such as lens aberrations can lead to systematic across-field variation and across-die variation. Across-die variation can be modeled as a within-die deterministic mean shift and will not cause within-die spatial correlation. Moreover, silicon measurements cited in this paper indicate that across-wafer variation is much more significant than across-field and across-die variation (probably due to advancements in resolution enhancement and lithographic equipment). Hence, for simplicity, we consider only across-wafer variation in this paper.

# III. ANALYSIS OF WAFER LEVEL VARIATION AND SPATIAL CORRELATION

In this paper, a variation source V, such as  $L_{eff}$ , is modeled as

$$V = v_0 + v_c + v_p \tag{1}$$

where  $v_0$  is the nominal value,  $v_c$  is a systematic constant offset, and  $v_p$  is the uncertainty part of process variation. Since both  $v_0$  and  $v_c$  are constant, we may combine them as one constant term. The uncertainty term  $v_p$  is modeled as

$$v_p = v_{aw} + v_{d-d} + v_{ad} + v_{af} + v_r. \tag{2}$$

$v_{d-d}$ comprises inter-die random variation, inter-wafer variation, inter-lot variation, and the fitting error<sup>3</sup> of the quadratic fitting of across-wafer variation; $v_{af}$ and $v_{ad}$ are the across-field and across-die variation, respectively. As discussed in Section II, we consider

<sup>&</sup>lt;sup>2</sup>Overlay error can directly impact critical dimension in double patterning.

 $<sup>^{3}</sup>$ We assume that the fitting error is purely random, that is, it only introduces inter-die variation without affecting within-die variation. We further discuss the impact of fitting error in Section IV.



Fig. 2. PDF of across-wafer variation coefficients. (a) PDF of a and b. (b) PDF of c and d.

only across-wafer variation and ignore these two types of variations ( $v_{af}$  and  $v_{ad}$ ) in this paper;  $v_r$  is the random noise;  $v_{aw}$  is across-wafer variation, which is modeled as a quadratic function as in [3], [28], [30], and [31]

$$v_{aw}(x_w, y_w) = ax_w^2 + by_w^2 + cx_w + dy_w \tag{3}$$

where a, b, c, and d are coefficients obtained by fitting the measurement data from the industrial process shown in Fig. 1(b),<sup>4</sup> and $(x_w, y_w)$ is the across-wafer location. We obtain the coefficients of the above across-wafer variation model by fitting the ring oscillator delay measured on 348 wafers from 23 lots of the industrial 65 nm process. In this section, we assume that a, b, c, and d are fixed for a process. In practice, these coefficients may vary slightly from wafer to wafer or lot to lot. Fig. 2 illustrates the probability density function (PDF) of the fitting coefficients for the 348 wafers. From the figure, we find that the coefficients are distributed within 30% of the mean. The most accurate approach would be to model them as random variables; however, this would significantly increase the complexity of the variation model. For simplicity, in this paper, we assume the coefficients to be constant (using the mean value). This assumption introduces some error into the model, and we discuss how to reduce that error in Section IV. In the rest of this section, all simulations are based on this extracted model.
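As an illustration of how the coefficients in (3) could be extracted, the following is a minimal sketch of an ordinary least-squares fit using NumPy. It is not the authors' extraction flow: the function name `fit_across_wafer`, the wafer radius, the coefficient values, and the noise level are all hypothetical, and the data are synthetic rather than measured. The design matrix deliberately has no constant column, mirroring the absence of a constant term in (3).

```python
import numpy as np

def fit_across_wafer(xw, yw, v):
    """Least-squares fit of v ~ a*xw^2 + b*yw^2 + c*xw + d*yw, as in (3)."""
    design = np.column_stack([xw**2, yw**2, xw, yw])  # no constant column
    coeffs, *_ = np.linalg.lstsq(design, v, rcond=None)
    return coeffs  # array [a, b, c, d]

# Synthetic wafer data (hypothetical values, NOT the measured data).
rng = np.random.default_rng(0)
r_w = 150.0                                   # assumed wafer radius in mm
n = 2000                                      # number of measurement sites
radius = r_w * np.sqrt(rng.uniform(size=n))   # sites uniform over the wafer area
angle = rng.uniform(0.0, 2.0 * np.pi, size=n)
xw, yw = radius * np.cos(angle), radius * np.sin(angle)

true_coeffs = np.array([2e-6, 3e-6, 1e-5, -2e-5])   # assumed a, b, c, d
noise = rng.normal(0.0, 2e-4, size=n)               # stands in for random terms
v = true_coeffs @ np.vstack([xw**2, yw**2, xw, yw]) + noise

a, b, c, d = fit_across_wafer(xw, yw, v)
print(a, b, c, d)
```

With real data, one such fit per wafer would yield a population of coefficients whose spread could then be inspected, in the spirit of Fig. 2.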

Combining (2) and (3), we have

$$v_p(x_w, y_w) = ax_w^2 + by_w^2 + cx_w + dy_w + v_{d-d} + v_r. \tag{4}$$

In the rest of this section, we will analyze spatial variation based on the above model. Table I summarizes the mathematical notations used in this section. In the rest of this paper, we assume that inter-die random variation $v_{d-d}$ and within-die random variation $v_r$ are Gaussian random variables with zero mean (any nonzero mean can be lumped into the systematic offset $v_c$).<sup>5</sup>

## A. Variation of Mean and Variance with Location

Equation (4) provides a wafer-level variation model. In real design, however, only a die-level variation model can be applied; i.e., for a die whose center lies at wafer coordinates $(x_c, y_c)$, we want to know the variation at within-die location $(x, y)$ [assuming the coordinate of the center of the die to be (0, 0)]. To obtain the die-level variation, we have to obtain the across-wafer coordinate from the die location in the wafer $(x_c, y_c)$ and the within-die location $(x, y)$. In this paper, we assume that the chip coordinate aligns with the chip edges and the

TABLE I

#### NOTATIONS

| Symbols | Description | Units |
|---|---|---|
| | Across-wafer variation symbols | |
| $V$ | Variation source | 1 |
| $v_c$ | Constant systematic offset | 1 |
| $v_0$ | Nominal value | 1 |
| $v_p(x, y)$ | Variation of within-die location $(x, y)$ | 1 |
| $v_{aw}$ | Across-wafer variation (quadratic function) | 1 |
| $v_{d-d}$ | Inter-die random variation (zero mean Gaussian) | 1 |
| $v_r$ | Within-die random variation (zero mean Gaussian) | 1 |
| $\sigma_{d-d}^2$ | Variance of $v_{d-d}$ | 1 |
| $\sigma_r^2$ | Variance of $v_r$ | 1 |
| $a$, $b$ | Across-wafer variation coefficients | mm<sup>-2</sup> |
| $c$, $d$ | Across-wafer variation coefficients | mm<sup>-1</sup> |
| | Inter-die/spatial/within-die variation symbols | |
| $v_g$ | Inter-die variation | 1 |
| $v_s$ | Within-die spatial variation | 1 |
| $v_l$ | Within-die random variation | 1 |
| | Size/location symbols | |
| $r_w$ | Wafer radius | mm |
| $(l_x, l_y)$ | x and y dimension die size | mm |
| $(x_w, y_w)$ | Within-wafer location | mm |
| $(x_c, y_c)$ | Location of the center of the die in the wafer | mm |
| $(x, y)$ | Within-die location | mm |
| $\omega$ | Angle between the die and wafer coordinates | 1 |
| $(x', y')$ | Within-die location in wafer coordinate: $x' = x \cos \omega + y \sin \omega$, $y' = y \cos \omega - x \sin \omega$ | mm |
| $(l'_x, l'_y)$ | $l'_x = l_x \cos \omega + l_y \sin \omega$, $l'_y = l_y \cos \omega - l_x \sin \omega$ | mm |
| $(x'', y'')$ | $x'' = x'\sqrt{a/b} + c/(2\sqrt{ab})$, $y'' = y'\sqrt{b/a} + d/(2\sqrt{ab})$ | mm |
| $(l''_x, l''_y)$ | $l''_x = l'_x\sqrt{a/b} + c/(2\sqrt{ab})$, $l''_y = l'_y\sqrt{b/a} + d/(2\sqrt{ab})$ | mm |
| $r_{d\mu}$ | $r_{d\mu} = \sqrt{b{x''}^2 + a{y''}^2}$ | 1 |
| $r_{d\sigma}$ | $r_{d\sigma} = \sqrt{{x''}^2 + {y''}^2}$ | mm |
| $\delta$ | Euclidean distance between $(x_1'', y_1'')$ and $(x_2'', y_2'')$: $\delta = \sqrt{(x_1'' - x_2'')^2 + (y_1'' - y_2'')^2}$ | mm |
| $r''_m$ | $r''_m = \sqrt{{l''_x}^2/4 + {l''_y}^2/4}$ | mm |
| | Other symbols | |
| $k_0$ | $k_0 = r_w^2(a+b)/4 - c^2/(4a) - d^2/(4b)$ | 1 |
| $k_1$ | $k_1 = r_w^4 (a^2 + b^2)/16 - r_w^4 ab/24 + \sigma_{d-d}^2$ | 1 |
| $k_2$ | $k_2 = k_1 / (abr_w^2)$ | mm<sup>2</sup> |
| $\alpha$ | $\alpha = x_1'' x_2'' + y_1'' y_2''$ | mm<sup>2</sup> |
| $\beta$ | $\beta = \sigma_r^2 / (abr_w^2)$ | mm<sup>2</sup> |
| $s_0$ | $s_0 = \cos^2 \omega (al_x^2 + bl_y^2)/12 + \sin^2 \omega (bl_x^2 + al_y^2)/12$ | 1 |
| $s_1$ | $s_1 = s_0 + c^2/(4a) + d^2/(4b)$ | 1 |

Note: A unit of 1 means the quantity is dimensionless. In this paper, we assume that variation is normalized with respect to the nominal value; hence, variation has no unit.

wafer coordinate aligns with the major and minor axes of the across-wafer variation parabola.<sup>6</sup> Notice that in practice, the wafer coordinate and chip coordinate might not be aligned, as shown in Fig. 3, where $\omega$ is the angle between wafer coordinate and chip coordinate. We may convert die location $(x, y)$ to wafer coordinate $(x', y')$ by rotating coordinates, as shown in Table I. In this case, the within-wafer location of within-die location $(x, y)$ is calculated as

$$x_w = x_c + x' \qquad y_w = y_c + y'.$$

Then, the variation at location $(x, y)$ is calculated as

$$v_p(x, y) = a(x_c + x')^2 + b(y_c + y')^2 + c(x_c + x') + d(y_c + y') + v_{d-d} + v_r. \tag{5}$$
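The coordinate conversion and the deterministic part of (5) can be sketched in a few lines. This is only an illustrative helper under the rotation convention of Table I; the function names `die_to_wafer` and `systematic_variation` and the numeric arguments in the usage lines are hypothetical, and the random terms $v_{d-d}$ and $v_r$ are intentionally left out.

```python
import math

def die_to_wafer(x, y, xc, yc, omega):
    """Map within-die (x, y) to wafer coordinates for a die centered at (xc, yc)."""
    x_rot = x * math.cos(omega) + y * math.sin(omega)   # x' in Table I
    y_rot = y * math.cos(omega) - x * math.sin(omega)   # y' in Table I
    return xc + x_rot, yc + y_rot

def systematic_variation(x, y, xc, yc, omega, a, b, c, d):
    """Deterministic across-wafer part of (5); random terms are omitted."""
    xw, yw = die_to_wafer(x, y, xc, yc, omega)
    return a * xw**2 + b * yw**2 + c * xw + d * yw

# Hypothetical usage: a point 1 mm right and 2 mm up from the die center,
# for an unrotated die centered at (10 mm, 20 mm) on the wafer.
print(die_to_wafer(1.0, 2.0, 10.0, 20.0, 0.0))
print(systematic_variation(1.0, 2.0, 10.0, 20.0, 0.0, 1.0, 2.0, 3.0, 4.0))
```

Keeping the rotation in its own function makes it easy to evaluate the same die at many candidate wafer positions, which is what the die-level analysis that follows effectively does.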

In a real design flow, the die location in the wafer $(x_c, y_c)$ is not known to designers. We can convert the wafer-level systematic variation model to a die-level model by noting that dies are always distributed evenly in the wafer. Therefore, we may model $(x_c, y_c)$ as random variables that are uniformly distributed over the disk centered at (0, 0) with radius $r_w$ (the radius of the wafer). For simplicity, we convert rectangular coordinates to polar coordinates

$$x_c = \rho \cos \theta \qquad y_c = \rho \sin \theta \tag{6}$$

<sup>&</sup>lt;sup>4</sup>Since we look on the wafer mean as wafer to wafer random variation and the systematic offset is lumped in to constant term  $v_c$ , there is no constant term in the quadratic across-wafer variation model (lumped to wafer to wafer variation and systematic offset).

<sup>&</sup>lt;sup>5</sup>We assume inter-lot random, inter-wafer random, and inter-die random variation to be independent zero mean Gaussian random variables. Therefore,  $v_{d-d}$  is also a zero mean Gaussian random variable.

<sup>&</sup>lt;sup>6</sup>If we force the wafer coordinate and chip coordinate to be aligned, there will be a crossing term in the across-wafer variation model in (4), which makes the problem more complicated.



Fig. 3. Wafer coordinate and chip coordinate.

where $\rho$ and $\theta$ are independent random variables: $\rho$ follows a triangular distribution ranging from 0 to $r_w$, and $\theta$ follows a uniform distribution ranging from 0 to $2\pi$

$$\begin{split} PDF_{\rho}(\rho) &= 2\rho/r_{w}^{2} \qquad 0 \le \rho < r_{w} \\ PDF_{\theta}(\theta) &= 1/(2\pi) \qquad 0 \le \theta < 2\pi. \end{split} \tag{7}$$

With these PDFs, we can also obtain the low-order moments and joint moments of $x_c$ and $y_c$. Since $(x_c, y_c)$ is distributed over a symmetric area, the joint moment $E[x_c^m y_c^n] = 0$ when either *m* or *n* is odd. Therefore, we only need to consider the even-order moments and joint moments

$$\begin{split} E[x_c^2] &= E[y_c^2] = r_w^2/4 \\ E[x_c^4] &= E[y_c^4] = r_w^4/8 \\ E[x_c^2y_c^2] &= r_w^4/24. \end{split} \tag{8}$$
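The moments in (8) can be sanity-checked numerically. The sketch below, which assumes NumPy, draws die centers uniformly over a disk by inverting the CDF of the triangular radial PDF in (7) and compares sample moments against the closed forms. The function name `sample_die_centers`, the unit wafer radius, and the sample count are illustrative choices and not part of the original derivation.

```python
import numpy as np

def sample_die_centers(r_w, n, rng):
    """Draw n die centers uniformly over a disk of radius r_w."""
    rho = r_w * np.sqrt(rng.uniform(size=n))        # inverse CDF of 2*rho/r_w^2
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)   # uniform angle
    return rho * np.cos(theta), rho * np.sin(theta)

rng = np.random.default_rng(1)
r_w = 1.0                                           # assumed unit wafer radius
xc, yc = sample_die_centers(r_w, 400_000, rng)

# Each entry pairs a sampled moment with its closed form from (8).
checks = {
    "E[xc^2]": (np.mean(xc**2), r_w**2 / 4),
    "E[xc^4]": (np.mean(xc**4), r_w**4 / 8),
    "E[xc^2 yc^2]": (np.mean(xc**2 * yc**2), r_w**4 / 24),
    "E[xc yc]": (np.mean(xc * yc), 0.0),            # an odd joint moment
}
for name, (estimate, expected) in checks.items():
    print(f"{name}: sampled {estimate:.5f}, closed form {expected:.5f}")
```

The sampled values should agree with the closed forms up to Monte Carlo noise, and the odd joint moment should be close to zero, consistent with the symmetry argument.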

The detailed derivation of (8) is in Appendix A. In this case, the variation at location $(x, y)$, $v_p(x, y)$, is expressed as a function of four random variables $x_c$, $y_c$, $v_{d-d}$, and $v_r$. Then, the mean of $v_p(x, y)$ is calculated as

$$\mu_{v_p}(x, y) = E[v_{aw}(x_c + x', y_c + y')] + E[v_{d-d}] + E[v_r]. \tag{9}$$

As discussed above, $v_{d-d}$ and $v_r$ are zero mean, and $v_{aw}(x_c + x', y_c + y')$ is a quadratic function of $x_c$ and $y_c$; therefore, $E[v_{aw}(x_c + x', y_c + y')]$ can be obtained from the moments and joint moments of $x_c$ and $y_c$ shown in (8)

$$\mu_{v_p}(x, y) = k_0 + r_{d\mu}^2 \tag{10}$$

where $r_{d\mu}$ and $k_0$ are defined in Table I. In a way similar to the mean calculation, we may also calculate the variance of $v_p(x, y)$

$$\sigma_{v_p}^2(x, y) = k_1 + \sigma_r^2 + abr_w^2 r_{d\sigma}^2 \tag{11}$$

where $k_1$ and $r_{d\sigma}$ are defined in Table I.
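To make (10) and (11) concrete, the following sketch compares the closed forms against a Monte Carlo evaluation of (5) with die centers drawn uniformly over the wafer. It assumes NumPy and positive coefficients a and b (so that the square roots in Table I are real). All numeric values, including the coefficients, the wafer radius, and the variances, are hypothetical and chosen only for illustration; they are not the fitted values from the measurement data.

```python
import numpy as np

def analytic_mean_var(x, y, omega, a, b, c, d, r_w, var_dd, var_r):
    """Closed-form mean (10) and variance (11) at within-die location (x, y)."""
    xp = x * np.cos(omega) + y * np.sin(omega)             # x' from Table I
    yp = y * np.cos(omega) - x * np.sin(omega)             # y' from Table I
    xpp = xp * np.sqrt(a / b) + c / (2 * np.sqrt(a * b))   # x''
    ypp = yp * np.sqrt(b / a) + d / (2 * np.sqrt(a * b))   # y''
    k0 = r_w**2 * (a + b) / 4 - c**2 / (4 * a) - d**2 / (4 * b)
    k1 = r_w**4 * (a**2 + b**2) / 16 - r_w**4 * a * b / 24 + var_dd
    mean = k0 + b * xpp**2 + a * ypp**2                    # k0 + r_dmu^2
    var = k1 + var_r + a * b * r_w**2 * (xpp**2 + ypp**2)  # uses r_dsigma^2
    return mean, var

def monte_carlo_mean_var(x, y, omega, a, b, c, d, r_w, var_dd, var_r, n, rng):
    """Sample (5) with die centers uniform over a wafer of radius r_w."""
    rho = r_w * np.sqrt(rng.uniform(size=n))
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    xw = rho * np.cos(theta) + x * np.cos(omega) + y * np.sin(omega)
    yw = rho * np.sin(theta) + y * np.cos(omega) - x * np.sin(omega)
    v = (a * xw**2 + b * yw**2 + c * xw + d * yw
         + rng.normal(0.0, np.sqrt(var_dd), n)             # inter-die random term
         + rng.normal(0.0, np.sqrt(var_r), n))             # within-die random term
    return v.mean(), v.var()

# Hypothetical parameters (mm, mm^-2, mm^-1); not the fitted process values.
params = dict(x=5.0, y=-3.0, omega=0.2, a=2e-6, b=3e-6, c=1e-5, d=-2e-5,
              r_w=150.0, var_dd=1e-4, var_r=1e-4)
mean_cf, var_cf = analytic_mean_var(**params)
mean_mc, var_mc = monte_carlo_mean_var(**params, n=500_000,
                                       rng=np.random.default_rng(2))
print(mean_cf, mean_mc, var_cf, var_mc)
```

Sweeping `x` and `y` over the die in this sketch is one way to reproduce the qualitative trend that mean and variance depend only weakly on the within-die location when the die is much smaller than the wafer.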

The detailed derivation of (10) and (11) is in Appendix B. From (10) and (11), it is interesting to note that different within-die locations may have different means and variances.<sup>7</sup> The location  $(x_0, y_0)$  having the smallest mean and variance is given by letting x'' = 0 and y'' = 0

$$x'' = 0 \implies x_0 = -\frac{c \cos \omega}{2a} + \frac{d \sin \omega}{2b}$$

$$y'' = 0 \implies y_0 = -\frac{d \cos \omega}{2b} - \frac{c \sin \omega}{2a}.$$

The locations farther away from  $(x_0, y_0)$  will have larger mean and variance. Fig. 4 illustrates the mean and variance for different  $r_{d\mu}$  (or  $r_{d\sigma}$ ) obtained from our proposed model as shown in (5). From the figure, we find that the mean and variance differ for different on chip locations, but the



Fig. 4. (a)  $\mu$  change for different  $r_{d\mu}$ . (b)  $\sigma^2$  change for different  $r_{d\sigma}$ .

difference is very small. Especially for mean, the difference is less than 1%. Therefore, in the real measurement data, the location dependence of mean and variance is not obvious because a very small noise will overwhelm the difference.

## B. Appearance of Spatial Correlation

Besides mean and variance, we are also interested in the covariance between two locations  $(x_1, y_1)$  and  $(x_2, y_2)$ . Similar to the calculation of mean and variance, covariance is calculated as

$$Cov = k_1 + abr_w^2 \alpha.$$

Knowing the variance and covariance calculated above, we may obtain the correlation coefficient as

$$\rho = \sqrt{\frac{k_2^2 + 2k_2\alpha + \alpha^2}{(k_2 + \beta)^2 + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)(k_2 + \beta) + r_{d\sigma 1}^2 r_{d\sigma 2}^2}} \tag{12}$$

where  $\alpha$ ,  $\beta$ , and  $k_2$  are defined in Table I. The detailed derivation of covariance and correlation coefficient is in Appendix C. From (12), we obtain the upper bound and lower bound of the correlation coefficient for a certain Euclidean distance

$$\begin{split} \rho &\leq \rho_u = \sqrt{1 - \frac{\delta^2 k_2 + \delta^2 \beta / 2 + 2\beta k_2 + \beta^2}{(k_2 + \beta)^2 + 2{r''_m}^2 (k_2 + \beta) + {r''_m}^4}} \\ \rho &\geq \rho_l = \sqrt{1 - \frac{\delta^2 (k_2 - {r''_m}^2 / 2 + \delta^2 / 4) + \beta (\beta + 2k_2 + 2{r''_m}^2) + {r''_m}^4}{(k_2 + \beta)^2 + \delta^2 (k_2 + \beta) / 2 + \delta^4 / 16}} \end{split}$$

where  $\delta$ ,  $l''_x$ ,  $l''_y$ , and  $r''_m$  are defined in Table I. From the upper bound and lower bound, we may also calculate the range of correlation coefficient

$$\rho_u - \rho_l \le \sqrt{4{r''_m}^2/({r''_m}^2 + k_2 + \beta)}$$

The derivation of the upper bound, lower bound, and range of the correlation coefficient is in Appendix D. Because the wafer is usually much larger than the die, that is, $k_2 \gg {r''_m}^2$, we have $\rho_u - \rho_l \ll 1$. In other words, the range of the correlation coefficient for a given distance is very narrow. Moreover, from the above equation, we also find that when the variances of the inter-die random and within-die random variation increase, the range decreases. This explains why most existing works [2], [17] find that spatial correlation is a function of distance.

Fig. 5(a) illustrates the exact data for 40 locations, together with the upper bound and the lower bound obtained from our proposed model as shown in (5). From the figure, we find that the range of $\rho$ for a given distance is very narrow. Although the correlation coefficient is within a narrow range, the covariance is not, as shown in Fig. 5(b). This is because the variance differs across the die.
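The appearance of distance-dependent correlation can be reproduced with a small Monte Carlo sketch. It evaluates a quadratic across-wafer function plus die-to-die and local random terms at fixed within-die locations, with the die center drawn at random on the wafer. The coefficient values, the noise levels, and the uniform distribution of die centers are illustrative assumptions (the PDFs in (6) and (7) are not reproduced here), and `sample_die_variation` is a hypothetical helper name.

```python
import numpy as np

def sample_die_variation(points, n_dies, rng, a=1e-4, b=1e-4, c=0.0, d=0.0,
                         r_w=150.0, sigma_dd=0.2, sigma_r=0.1):
    """Sample a quadratic across-wafer model at fixed within-die points.

    points : (n_points, 2) array of within-die offsets (x', y') in mm.
    Returns an (n_dies, n_points) array, one row per simulated die.
    All default parameter values are hypothetical.
    """
    points = np.asarray(points, dtype=float)
    # Assumption: die centers are uniformly distributed over the wafer disk.
    radius = r_w * np.sqrt(rng.uniform(size=n_dies))
    angle = rng.uniform(0.0, 2.0 * np.pi, size=n_dies)
    xc = (radius * np.cos(angle))[:, None]
    yc = (radius * np.sin(angle))[:, None]
    # Wafer-level coordinates of every point on every die.
    xw = xc + points[:, 0]
    yw = yc + points[:, 1]
    systematic = a * xw**2 + b * yw**2 + c * xw + d * yw
    v_dd = rng.normal(0.0, sigma_dd, size=(n_dies, 1))     # one value per die
    v_r = rng.normal(0.0, sigma_r, size=systematic.shape)  # one value per location
    return systematic + v_dd + v_r

rng = np.random.default_rng(0)
# Three locations on a 20 mm die: a reference, a nearby point, a distant point.
points = np.array([[-9.0, -9.0], [-7.0, -9.0], [9.0, 9.0]])
samples = sample_die_variation(points, n_dies=20000, rng=rng)
rho = np.corrcoef(samples, rowvar=False)
rho_near, rho_far = rho[0, 1], rho[0, 2]
print(f"distance 2.0 mm: rho = {rho_near:.3f}")
print(f"distance {np.hypot(18, 18):.1f} mm: rho = {rho_far:.3f}")
```

No spatially correlated random field is sampled anywhere in this sketch, yet the estimated correlation coefficient is high and falls as the distance between the two locations grows.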

<sup>&</sup>lt;sup>7</sup>Such difference is caused by the chip-level nonlinearity of the across-wafer variation function [we assume quadratic function as in (3)]. If the the across-wafer variation function is linear at chip level, for example, a piecewise linear function with piece size larger than chip size, the mean and variance will be the same for all locations of a die.



Fig. 5. (a) Apparent spatial correlation. (b) Covariance as a function of distance.



Fig. 6. Correlation coefficient for within-die spatial variation after inter-die variation is removed.

Fig. 6<sup>8</sup> shows the correlation coefficient for within-die variation after subtracting the mean variation of the die (mainly caused by across-wafer variation). In the figure, the correlation coefficients are obtained from our proposed model as shown in (5).

We observe that the within-die spatial variation is almost *NOT* spatially correlated, as empirically observed in [28], [30], and [31]. This further validates that the spatial variation is caused by systematic across-wafer variation.

## C. Dependence Between Inter-Die and Within-Die Variation

In most existing variation models, process variation is decomposed into inter-die, within-die spatial, and within-die random variation

$$v_p = v_g + v_s + v_l \tag{13}$$

where $v_g$ is the inter-die variation, $v_s$ is the within-die spatial variation, and $v_l$ is the within-die random variation. Usually, $v_g$ is modeled as the variation of the chip mean, $v_s$ is the residual of across-wafer variation after subtracting the inter-die components, and $v_l$ is the pure random local variation. $v_g$, $v_s$, and $v_l$ are assumed to be independent.

With the variation model in (5), we may also calculate the inter-die, within-die spatial, and within-die random variation. Within-die random variation is the local random variation $v_l = v_r$. Inter-die and spatial variation are induced by the die-to-die variation and the across-wafer variation. Inter-die variation is calculated as the variation of the chip mean

$$v_g = \frac{1}{l_x l_y} \iint_{\substack{|x| < l_x/2 \\ |y| < l_y/2}} v_p(x, y) dx dy$$
$$= a x_c^2 + b y_c^2 + c x_c + d y_c + v_{d-d} + s_0$$

where $s_0$ is defined in Table I.

Fig. 7. Approximating across-wafer variation. Note: we assume square chips and chip size means edge length in mm. (a) Piecewise constant. (b) SNR versus chip size.

Within-die spatial variation is calculated as the residual of across-wafer variation after subtracting the chip mean

$$\begin{split} v_s(x, y) &= v_p(x, y) - v_g - v_l \\ &= r_{d\mu}^2 + 2ax'x_c + 2by'y_c - s_1 \end{split} \tag{14}$$

where  $s_1$  is defined in Table I. The derivation of the above equation is shown in Appendix E. From the above equations, we find that both inter-die and within-die spatial variations are functions of random variables  $x_c$  and  $y_c$ . Hence, we may not decompose process variation into independent inter-die and within-die spatial variation.
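The dependence of the chip mean on the die location can be checked numerically. The sketch below averages the deterministic quadratic term over one die and compares the result with the closed form obtained by integrating that term directly. The constant offset `(a*lx**2 + b*ly**2)/12` is derived here from the integral and is only assumed to play the role of $s_0$, since Table I is not reproduced in this section. The random terms are omitted, and all numeric values and function names are hypothetical.

```python
import numpy as np

def chip_mean_numeric(xc, yc, a, b, c, d, lx, ly, n=401):
    """Average the deterministic across-wafer term over one die (midpoint grid)."""
    xs = (np.arange(n) + 0.5) / n * lx - lx / 2
    ys = (np.arange(n) + 0.5) / n * ly - ly / 2
    X, Y = np.meshgrid(xs, ys)
    v = a * (xc + X)**2 + b * (yc + Y)**2 + c * (xc + X) + d * (yc + Y)
    return v.mean()

def chip_mean_closed_form(xc, yc, a, b, c, d, lx, ly):
    """Closed form of the same average, obtained by direct integration."""
    # The mean of x'^2 over [-lx/2, lx/2] is lx^2 / 12 (same for y').
    offset = (a * lx**2 + b * ly**2) / 12.0
    return a * xc**2 + b * yc**2 + c * xc + d * yc + offset

# Hypothetical coefficients and a 20 mm x 20 mm die.
params = dict(a=1e-4, b=1e-4, c=1e-3, d=-2e-3, lx=20.0, ly=20.0)
for xc, yc in [(0.0, 0.0), (50.0, -30.0), (120.0, 40.0)]:
    numeric = chip_mean_numeric(xc, yc, **params)
    closed = chip_mean_closed_form(xc, yc, **params)
    print(f"die at ({xc:6.1f}, {yc:6.1f}) mm: numeric = {numeric:.6f}, closed form = {closed:.6f}")
```

Both the chip mean and the within-die slope terms $2ax'x_c$ and $2by'y_c$ in (14) change with $(x_c, y_c)$, which is why the two components cannot be treated as independent.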

## D. When Can Spatial Variation Be Ignored?

In this section, we analyze the accuracy of the simple two-level inter-/within-die variation model for different chip sizes. If we only consider inter-/within-die variation, we may lump the across-wafer variation into inter-die variation, that is, approximate the across-wafer variation as a piecewise constant function, as shown in Fig. 7(a). To evaluate the impact of the approximation error, we may treat such approximation error as noise and the process variation as signal; and then evaluate the signal-to-noise ratio. In order to do this, we calculate the mean square approximation error and the total variance of variation. The signal-to-noise ratio when ignoring the spatial variation is given as

$$SNR = \sigma_{total}^{2} / MSE \approx \frac{6abr_{w}^{4} + 6(c+d)r_{w}^{2} + \sigma_{M}^{2} + \sigma_{R}^{2}}{abr_{w}^{2}(l_{x}^{2} + l_{y}^{2}) + 2(c+d)l_{x}l_{y}}$$

It can be seen that the MSE depends on chip size. When the chip size is small, the MSE is small. This is because we approximate the across-wafer variation as a piecewise constant function with small steps, so the approximation is accurate. Fig. 7(b) illustrates the SNR for different chip sizes. As expected, the SNR decreases when the chip size increases. We also observe that when the chip size ($l_x$ and $l_y$) is smaller than 1 cm, the SNR is up to 100. This means that the two-level inter-/within-die variation model is accurate for such chips.
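The trend in Fig. 7(b) can be sketched by evaluating the approximate SNR expression above for several die sizes. The function below transcribes that expression term by term. The coefficient and variance values are hypothetical, and `snr_two_level` is an illustrative name rather than anything from the original implementation.

```python
def snr_two_level(a, b, c, d, r_w, l_x, l_y, sigma_m2, sigma_r2):
    """Approximate SNR when across-wafer variation is lumped into inter-die variation."""
    numerator = 6 * a * b * r_w**4 + 6 * (c + d) * r_w**2 + sigma_m2 + sigma_r2
    denominator = a * b * r_w**2 * (l_x**2 + l_y**2) + 2 * (c + d) * l_x * l_y
    return numerator / denominator

# Hypothetical parameters: lengths in mm, 150 mm wafer radius, square dies.
coeffs = dict(a=1e-4, b=1e-4, c=0.0, d=0.0, r_w=150.0, sigma_m2=0.04, sigma_r2=0.01)
snr_by_size = {edge: snr_two_level(l_x=edge, l_y=edge, **coeffs) for edge in (3.0, 6.0, 10.0, 20.0)}
for edge, snr in snr_by_size.items():
    print(f"chip edge {edge:4.1f} mm: SNR = {snr:.1f}")
```

With these assumed values the SNR falls monotonically as the die edge grows, matching the qualitative behavior described in the text. The absolute numbers depend entirely on the assumed coefficients.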

## IV. GENERAL ACROSS-WAFER VARIATION MODEL

In the previous section, we assumed that the across-wafer variation is a quadratic function as shown in (5). In practice, across-wafer variation may not be an exact parabola. Moreover, the across-wafer variation will be slightly different for different wafers. Therefore, there will be some fitting residual after subtracting the across-wafer parabola

$$v(x_w, y_w) = v_p(x_w, y_w) + v_e(x_w, y_w) \tag{15}$$

where $v_p$ is the quadratic across-wafer variation model as shown in (5) and $v_e$ is the fitting residual.

<sup>8</sup>In the figure, the correlation coefficient can be a negative number when distance is large. This is because after subtracting the mean, when the within-die variation of one corner increases, the within-die variation of the opposite corner must decrease. That means the within-die variations of opposite corners are negatively correlated. Moreover, even when two locations are very close, if they lie on opposite sides of the center, their correlation is still near zero.

Fig. 8. Ring oscillator frequency within a wafer. (a) Original delay variation. (b) Residual after subtracting (5). (c) Residual after subtracting (16).

In the previous section, we assumed that the fitting residual is lumped into inter-die random variation. However, the fitting residual contains not only inter-die random variation but also a systematic trend of within-die variation. Fig. 8(a) illustrates the original delay variation across the wafer, and Fig. 8(b) illustrates the fitting residual of a wafer delay variation after subtracting the quadratic across-wafer variation function. From the figure, we find that after removing the quadratic across-wafer variation, the scale of process variation reduces dramatically. From the die point of view, such residual will also introduce some spatial correlation. Fig. 9(a) illustrates the correlation coefficients of the fitting residual for different distances. We find that the correlation coefficient is still high (>0.4) for the fitting residual, but the correlation coefficients are no longer within a narrow band for a given distance. From Fig. 8(b), we also find that the spatial frequencies of the fitting residual are low. Therefore, from the die point of view, the fitting residual can be approximated by the first-order Taylor expansion

$$\begin{split} v_e(x', y') &= v_e(x_c + x', y_c + y') \approx v_e(x_c, y_c) + s_x x' + s_y y' \\ s_x &= \left. \frac{\partial v_e(x_w, y_w)}{\partial x_w} \right|_{x_w = x_c,\, y_w = y_c} \\ s_y &= \left. \frac{\partial v_e(x_w, y_w)}{\partial y_w} \right|_{x_w = x_c,\, y_w = y_c} \end{split} \tag{16}$$

where $v_e(x', y')$ is the fitting residual at die location (x', y'). In the above model, the term $v_e(x_c, y_c)$ is the same for the whole chip but differs from chip to chip, so we may lump it into die-to-die variation. Since the impact of the fitting residual differs from chip to chip, $s_x$ and $s_y$ also vary from chip to chip, and we may model them as random variables. Fig. 10 illustrates the distribution of $s_x$ and $s_y$ obtained from measurement data of process 2. To obtain samples of $s_x$ and $s_y$, we remove the quadratic wafer-level spatial pattern for each wafer, and then fit the linear model in (16) for each die. From the figure, we find that both $s_x$ and $s_y$ follow a Gaussian distribution. Moreover, the correlation between $s_x$ and $s_y$ is very weak ($\rho < 0.1$). Therefore, in this paper, we assume that $s_x$ and $s_y$ are uncorrelated Gaussian random variables.
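The per-die slope extraction described above can be sketched as an ordinary least-squares plane fit. The helper below takes the residual values measured at within-die locations of one die (after the quadratic wafer-level pattern has been removed) and returns the constant term and the two slopes of (16). The function name and the synthetic data are assumptions for illustration. The original fitting code is not given in the paper.

```python
import numpy as np

def fit_residual_slopes(x, y, residual):
    """Least-squares fit of residual ~ v0 + s_x * x' + s_y * y' for one die."""
    design = np.column_stack([np.ones_like(x), x, y])
    coef, *_ = np.linalg.lstsq(design, residual, rcond=None)
    v0, s_x, s_y = coef
    return v0, s_x, s_y

# Synthetic residual on a 5 x 5 grid of within-die sites (hypothetical values).
rng = np.random.default_rng(1)
gx, gy = np.meshgrid(np.linspace(-10, 10, 5), np.linspace(-10, 10, 5))
x, y = gx.ravel(), gy.ravel()
residual = 0.3 + 0.02 * x - 0.01 * y + rng.normal(0.0, 1e-3, size=x.size)

v0, s_x, s_y = fit_residual_slopes(x, y, residual)
print(f"v0 = {v0:.4f}, s_x = {s_x:.4f}, s_y = {s_y:.4f}")
```

Repeating this fit for every die yields the samples of $s_x$ and $s_y$ whose histograms would correspond to Fig. 10.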

Notice that when we model the fitting residual as a linear within-die variation trend, two more random variables $s_x$ and $s_y$ are introduced. This makes the variation model more complicated. When the across-wafer variation is close to a perfect parabola and the fitting residual is not significant, we may just lump the fitting residual into inter-die and random variation.

In addition, we also observe that after subtracting the model of fitting residual in (16), the remaining variation is almost uncorrelated, as illustrated in Figs. 8(c) and 9(b).



Fig. 9. (a) Correlation coefficient after subtracting (5). (b) Correlation coefficient after subtracting (16).



Fig. 10. (a) PDF of  $s_x$ . (b) PDF of  $s_y$ .

Combining (16) and (5), we obtain a general die-level across-wafer variation model

$$v_p(x, y) = a(x_c + x')^2 + b(y_c + y')^2 + c(x_c + x') + d(y_c + y') + s_x x' + s_y y' + v_{d-d} + m_r(x, y). \tag{17}$$

## V. MODELING SPATIAL VARIABILITY

As discussed in Section I, spatial variation largely comes from the deterministic across-wafer variation. Hence, modeling the within-die variation as spatially correlated random variables is not accurate, as discussed in Section III.

In this section, we introduce three new spatial variation models considering across-wafer variation:

- 1) slope augmented across-wafer variation model (*SAAW*);
- 2) quadratic across-wafer variation model (*QAW*);
- 3) location dependent across-wafer variation model (*LDAW*).

In the rest of this section, we will discuss these models in detail.

## A. Slope Augmented Across-Wafer Model

Equation (17) calculates the variation for a given location (x, y). In the equation, the die locations within the wafer  $(x_c, y_c)$  are modeled as random variables and their PDF is shown in (6) and (7). Equation (17) provides a new spatial variation model. We refer to the new model as slope augmented across-wafer variation model (*SAAW*).

Notice that in the *SAAW* model, there are only six random variables: inter-die random variation $v_{d-d}$, within-die random variation $v_r$, die location within the wafer $x_c$ and $y_c$, and slopes of the fitting residual $s_x$ and $s_y$. In the traditional distance-based spatial variation model, however, the number of spatial variation sources depends on the number of grids, so a larger chip needs more variables. Therefore, our new model not only models the across-wafer variation accurately but also is more efficient than the traditional spatial correlation model.

## B. Quadratic Across-Wafer Model

In our original variation model in [41], we model the across-wafer variation as a quadratic function, as shown in (5), without modeling the fitting residual. In this case, there are only four random variables, $x_c$, $y_c$, $v_{d-d}$, and $v_r$. We refer to this model as the quadratic across-wafer variation model (*QAW*). Since *QAW* does not consider the fitting residual, it is not as accurate as *SAAW*. As discussed in Section IV, when the across-wafer variation is close to a perfect parabola and the fitting residual is not significant,<sup>9</sup> we may just lump the fitting residual into inter-die random variation and simplify *SAAW* to *QAW*.

## C. Location Dependent Across-Wafer Model

As discussed in Section III-D, when die size is small enough, applying the two-level inter-/within-die variation model does not introduce much error. However, the inter-/within-die variation model still does not consider the mean and variance difference at different locations of a chip, as discussed in Section III-A. To further improve the accuracy of the inter-/within-die variation model, we may account for this difference

$$v(x, y) \approx v'_{d} + \mu_{v_{n}}(x, y) + \sigma_{v_{n}}(x, y)v'_{r}(x, y) \tag{18}$$

where $v'_d$ is the inter-die variation including inter-lot random, inter-wafer random, inter-die random, and across-wafer variation, $v'_r(x, y)$ is the within-die variation including within-die random variation and the residual of across-wafer variation, and $\mu_{v_n}(x, y)$ and $\sigma_{v_n}(x, y)$ are the location-dependent mean and standard deviation at different locations of a chip, which can be calculated from (10) and (11). We refer to the above model as the location dependent across-wafer variation model (*LDAW*). *LDAW* is a further simplification of *QAW*: it lumps the across-wafer variation into inter-die and within-die variation. Inter-die variation is modeled as the chip mean, as discussed in Section III-C, and the residual is lumped into within-die variation. In this paper, we assume that $v'_d$ is a zero-mean Gaussian random variable whose variance is obtained from measurement. We also assume $v'_r(x, y)$ to have a standard normal distribution. In this case, the within-die variation, $\sigma_{v_n}(x, y)v'_r(x, y)$, is a zero-mean Gaussian random variable whose variance is $\sigma_{v_n}^2(x, y)$, which is determined by the within-die location. Moreover, in this model, the mean of v(x, y) is $\mu_{v_n}(x, y)$, which is also location dependent. Notice that in (18), $\mu_{v_n}(x, y)$ and $\sigma_{v_n}(x, y)$ are deterministic values for a given within-die location (x, y). Therefore, the *LDAW* model has only two random variables, $v'_d$ and $v'_r$, which is the same as the two-level inter-/within-die model. Hence, compared to the inter-/within-die model, *LDAW* has similar efficiency but higher accuracy because *LDAW* considers the mean and variance difference across the chip while the inter-/within-die model does not.
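A minimal sampling sketch of (18) at a single within-die location is given below. The location-dependent mean and standard deviation are passed in as plain numbers because (10) and (11) are not reproduced in this section. Their values, the inter-die standard deviation, and the name `sample_ldaw` are hypothetical.

```python
import numpy as np

def sample_ldaw(mu_xy, sigma_xy, sigma_d, n, rng):
    """Draw n samples of v(x, y) = v_d' + mu(x, y) + sigma(x, y) * v_r'."""
    v_d = rng.normal(0.0, sigma_d, size=n)   # zero-mean Gaussian inter-die term
    v_r = rng.standard_normal(n)             # standard normal within-die term
    return v_d + mu_xy + sigma_xy * v_r

rng = np.random.default_rng(2)
# Hypothetical values at one location (x, y) of the die.
mu_xy, sigma_xy, sigma_d = 0.05, 0.10, 0.20
v = sample_ldaw(mu_xy, sigma_xy, sigma_d, n=200_000, rng=rng)
print(f"sample mean = {v.mean():.4f}, sample variance = {v.var():.4f}")
print(f"model mean  = {mu_xy:.4f}, model variance  = {sigma_d**2 + sigma_xy**2:.4f}")
```

Only two random draws are needed per sample, regardless of chip size. The location enters through the two deterministic inputs.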

## VI. APPLICATION TO STATISTICAL TIMING ANALYSIS

In this section, we apply our across-wafer variation model to statistical static timing analysis.

## A. Delay Model

In statistical timing analysis, cell delay is usually approximated as a linear function of variation sources

$$D = D_0 + A_s V^T \tag{19}$$

where $D_0$ is the nominal cell delay, $A_s = (a_{s1}, a_{s2}, \ldots, a_{sn})$ is the vector of linear sensitivity coefficients, and $V = (V_1, V_2, \ldots, V_n)$ is the vector of variation sources.

<sup>9</sup>For example, process 1 as discussed in Section II.

For *SAAW* and *QAW*, since each variation source is a quadratic function of random variables, as shown in (17) and (5), the gate delay is a second-order function of random variables

$$D = p_2(RV)$$

where $p_i(\cdot)$ is an *i*th-order polynomial function and $RV = (rv_1, rv_2, ..., rv_n)$ is the vector of random variables (such as $x_c$ and $y_c$). In this case, the quadratic SSTA flow in [42] can be applied to estimate chip delay variation. For *LDAW*, each variation source is a linear function of random variables, as shown in (18), so the cell delay is also a linear function of random variables

$$D = p_1(RV).$$

In this case, the linear SSTA flow in [42] can be applied to estimate chip delay variation.

To model cell delay more accurately, a quadratic cell delay model [42] can be used

$$D = D_0 + A_s V^T + V B_s V^T \tag{20}$$

where $B_s = (b_{sij})$ is the matrix of second-order sensitivity coefficients. In this case, for *LDAW*, the cell delay is a quadratic function of random variables. Therefore, quadratic SSTA can be applied to estimate chip delay. However, for *SAAW* and *QAW*, since each variation source is a quadratic function of random variables, the cell delay becomes a fourth-order function of random variables

$$D = p_4(RV).$$
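The growth in polynomial order can be illustrated with a one-variable sketch. A single variation source is written as a quadratic polynomial in one random variable and substituted into the linear model (19) and the quadratic model (20). The coefficients are hypothetical, and reducing the problem to one source and one variable is a simplification made only for illustration.

```python
from numpy.polynomial import Polynomial

# One variation source as a quadratic function of a single random variable t
# (hypothetical coefficients, ascending powers of t).
V = Polynomial([0.1, 0.5, 0.02])

D0, a_s, b_s = 10.0, 2.0, 0.3             # hypothetical delay sensitivities
D_linear = D0 + a_s * V                   # linear cell delay model, as in (19)
D_quadratic = D0 + a_s * V + b_s * V**2   # quadratic cell delay model, as in (20)

print(D_linear.degree(), D_quadratic.degree())  # prints "2 4"
```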

Handling such a high-order delay variation function is complicated. In this case, the moment matching technique in [42] is applied to approximate the fourth-order function by a quadratic function, matching the first two orders of moments and joint moments

$$p_2(RV) \approx p_4(RV)$$

$$E[p_2(RV)] = E[p_4(RV)]$$

$$E[rv_i \cdot p_2(RV)] = E[rv_i \cdot p_4(RV)]$$

$$E[rv_i^2 \cdot p_2(RV)] = E[rv_i^2 \cdot p_4(RV)]$$

$$E[rv_i rv_j \cdot p_2(RV)] = E[rv_i rv_j \cdot p_4(RV)].$$
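For a single standard Gaussian random variable, the matching conditions above reduce to a 3 x 3 linear system in the coefficients of $p_2$. The sketch below solves that system. It is a one-variable illustration under the assumption of a standard normal variable, not the multi-variable procedure of [42], and the function names are hypothetical.

```python
import numpy as np

def gaussian_moment(n):
    """E[x^n] for a standard normal x: 0 for odd n, (n - 1)!! for even n."""
    if n % 2 == 1:
        return 0.0
    return float(np.prod(np.arange(n - 1, 0, -2)))

def match_quadratic(p4_coeffs):
    """Coefficients (ascending powers) of the quadratic matched to a quartic."""
    p4 = np.asarray(p4_coeffs, dtype=float)
    # Row i enforces E[x^i * p2(x)] = E[x^i * p4(x)] for i = 0, 1, 2.
    lhs = np.array([[gaussian_moment(i + j) for j in range(3)] for i in range(3)])
    rhs = np.array([sum(ck * gaussian_moment(i + k) for k, ck in enumerate(p4))
                    for i in range(3)])
    return np.linalg.solve(lhs, rhs)

# Example: approximate p4(x) = x^4 by a quadratic.
q = match_quadratic([0.0, 0.0, 0.0, 0.0, 1.0])
print(q)
```

For $p_4(x) = x^4$ the three conditions read $q_0 + q_2 = 3$, $q_1 = 0$, and $q_0 + 3q_2 = 15$, which give $p_2(x) = 6x^2 - 3$.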

With the above approximation, quadratic SSTA can be applied. Notice that moment matching approximation is performed only once for all cells and does not increase the run time of SSTA.

Moreover, it was shown in [42] that ignoring crossing terms of quadratic cell delay function (*semi-quadratic* delay model) significantly improves the run time of SSTA without affecting accuracy too much. Therefore, to improve efficiency, we may ignore the crossing terms when we perform quadratic SSTA (*semi-quadratic* SSTA).

## B. Experimental Result

We have implemented the nonlinear SSTA in [42] with different spatial variation models in C++. To verify the efficiency and accuracy, three comparison cases are defined: 1) Monte Carlo simulation with the exact deterministic across-wafer variation model,<sup>10</sup> which is the golden case for comparison; 2) the distance-based spatial correlation model from [2], which is referred to as the spatial correlation model (SPC); and 3) the two-level inter-/within-die variation model, which is referred to as the inter-/within-die variation model (IW).

## TABLE II

Delay Percentage Error for Different Variation Models

| Benchmark | Delay model | Exact |      |      | SAAW Quad |      |      |     | SAAW S-Quad |      |      |     | QAW Quad |      |      |     | QAW S-Quad |      |      |    | LDAW\* |      |      |    | SPC\* |      |      |      | IW\* |       |       |    |
|-----------|-------|------|------|------|------|------|------|-----|------|------|-------|-----|------|------|------|-----|------|------|-------|----|------|------|-------|----|------|------|-------|------|------|-------|-------|----|
|           |       | μ    | σ    | 95%  | μ    | σ    | 95%  | T   | μ    | σ    | 95%   | T   | μ    | σ    | 95%  | T   | μ    | σ    | 95%   | T  | μ    | σ    | 95%   | T  | μ    | σ    | 95%   | T    | μ    | σ     | 95%   | T  |
| c1908     | Quad  | 17.6 | 2.27 | 24.5 | -0.4 | -0.9 | -1.0 | 146 | -0.9 | -1.7 | -1.7  | 27  | -0.8 | -1.9 | -2.3 | 54  | -1.3 | -2.5 | -3.2  | 19 | -2.1 | -7.5 | -6.9  | 9  | -1.5 | -4.2 | -3.8  | 1450 | -2.6 | -10.2 | -8.9  | 8  |
|           | Lin   | -    | -    | -    | -0.8 | -1.5 | -1.4 | 150 | -1.4 | -1.8 | -2.0  | 26  | -0.9 | -3.6 | -3.1 | 53  | -1.2 | -3.4 | -3.9  | 18 | -2.6 | -7.5 | -8.1  | 10 | -2.1 | -4.4 | -4.0  | 135  | -3.0 | -11.5 | -10.3 | 10 |
| c3540     | Quad  | 25.7 | 3.43 | 34.5 | +0.4 | +0.9 | +0.7 | 212 | -0.4 | -1.3 | -1.1  | 36  | +0.4 | -1.8 | -1.2 | 76  | -0.9 | -2.1 | -1.9  | 25 | +0.4 | -5.8 | -4.6  | 13 | -1.2 | -4.8 | -4.0  | 4210 | -1.4 | -7.3  | -6.5  | 12 |
|           | Lin   | -    | -    | -    | -0.6 | -1.2 | -1.2 | 209 | -1.1 | -1.9 | -1.6  | 35  | -0.9 | -3.6 | -3.1 | 77  | -1.2 | -5.1 | -3.9  | 27 | -1.8 | -6.5 | -6.0  | 9  | -2.0 | -6.5 | -5.7  | 202  | -2.9 | -9.3  | -8.8  | 10 |
| c7552     | Quad  | 48.9 | 6.47 | 64.7 | -0.6 | +0.3 | +0.2 | 435 | -0.8 | -0.2 | -0.9  | 67  | -0.8 | -1.6 | -1.4 | 115 | -1.6 | -1.5 | -1.7  | 48 | -2.7 | -3.6 | -4.0  | 20 | -1.0 | -2.3 | -2.9  | 8182 | -2.1 | -6.7  | -6.5  | 22 |
|           | Lin   | -    | -    | -    | -0.6 | -0.5 | -0.6 | 430 | -1.5 | -1.4 | -1.6  | 101 | -1.1 | -1.3 | -1.4 | 109 | -1.9 | -2.2 | -2.8  | 79 | -3.3 | -4.6 | -4.9  | 16 | -3.3 | -3.5 | -4.3  | 433  | -3.9 | -8.9  | -8.2  | 15 |

Note. The µ, σ, and 95-percentile point for exact simulation are in ns. Run time (T) is in ms. \* For LDAW, SPC, and IW, linear SSTA is applied when assuming the linear cell delay model.

We apply all the above methods to the ISCAS85 suite of benchmarks in predictive technology model (PTM) 45 nm technology [43]. We assume random placement for the ISCAS85 circuits. Since process variation has a smaller impact on interconnect delay than on logic cell delay, we only consider logic cell delay when calculating the full chip delay variation. In the experiment, we consider the gate length variation obtained by minimum square error fitting of the model in (17) to ring oscillator delay measurements from an industrial 65 nm process (process 2 as discussed in Section II). We obtain the across-wafer coefficients a, b, c, and d, the fitting residual coefficients $s_x$ and $s_y$, and the standard deviations of random inter-wafer, inter-die, and within-die variation as percentages of the nominal value. We then assume that these percentages are the same at the 45 nm and 65 nm technology nodes. To obtain the across-wafer coefficients a, b, c, and d, we fit a quadratic function to the across-wafer variation of each wafer and use the average coefficients over all wafers in our experiment. To obtain the fitting residual coefficients $s_x$ and $s_y$, we first compute the slopes of the fitting residual for each chip, and then calculate the mean and variance of $s_x$ and $s_y$ over all chips. In the experiment, we assume $s_x$ and $s_y$ to be Gaussian random variables with mean and variance obtained from the measurement data.

<sup>10</sup>In the simulation, each wafer may have a different across-wafer variation, which is obtained from the measurement data of process 2. We have simulated 318 wafers corresponding to the 318 measured wafers.

1) Full Chip Delay: In the experiment, we assume that the chip size is $2 \text{ cm} \times 2 \text{ cm}$ and the wafer radius is 15 cm. Since the ISCAS85 benchmarks are very small, the impact of spatial variation on delay is not significant within the circuit. To expose this impact, we assume the benchmarks are stretched over a 2 cm $\times$ 2 cm chip. For the SPC model, we divide the chip into $10 \times 10 = 100$ grids. Table II lists the percentage error of the mean ($\mu$), standard deviation ($\sigma$), and 95-percentile point (95%), together with the run time (T), for the different variation models. In the table, we also compare the results of using the quadratic cell delay model (Quad) and the linear cell delay model (Lin). We use only the quadratic cell delay model for the golden case simulation (exact), and the error of each variation model is calculated relative to the golden case. For SAAW and QAW, we also compare the results of applying quadratic SSTA with crossing terms (SAAW Quad and QAW Quad) and applying SSTA without crossing terms (SAAW S-Quad and QAW S-Quad). From the table, we have the following observations.

- 1) Compared to full quadratic SSTA, semi-quadratic SSTA (SSTA without crossing terms) achieves up to  $8 \times$ speedup with less than 1% accuracy loss.
- 2) SAAW is more accurate than QAW. This is because the fitting residual is significant for the measurement data; QAW ignores the fitting residual and hence introduces more error.
- 3) The error of SAAW using semi-quadratic SSTA is within 2% while the error of spatial correlation model is up to 6.5%.
- 4) Compared to quadratic cell delay model, linear cell delay has less than 2% accuracy loss. This is because in our experiment, the cell delay variation is well approximated by a linear function.
- 5) For linear cell delay model, SAAW achieves about  $6 \times$ speedup compared to SPC. This is because there are 100 grids in the spatial correlation model, resulting in 37 spatial random variables,<sup>11</sup> while SAAW has only six random variables.
- 6) LDAW and IW are very efficient. However, both models have much larger error than the others because both models ignore correlation. LDAW is still more accurate than IW without a run time penalty.

Since linear cell delay model and semi-quadratic SSTA are accurate, we assume linear cell delay and apply semi-quadratic SSTA for all experiments. Moreover, since SAAW is more accurate than QAW with only a small run time overhead, we do not consider the QAW model in the following experiments.
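The variable count of a grid-based spatial correlation model, as referred to in observation 5), can be sketched with a principal component truncation on a 10 x 10 grid. The exponential correlation function, the correlation length, and the 95% retained-variance threshold below are assumptions made for illustration only. The correlation model of [2] and the truncation rule used in the experiment are not specified in this section, so the resulting count will generally differ from the 37 components reported here.

```python
import numpy as np

def count_principal_components(n_side=10, chip_size=20.0, corr_length=10.0, keep=0.95):
    """Number of principal components needed to retain `keep` of the total variance."""
    # Grid cell centers on a square chip (lengths in mm).
    centers = (np.arange(n_side) + 0.5) * chip_size / n_side
    gx, gy = np.meshgrid(centers, centers)
    pts = np.column_stack([gx.ravel(), gy.ravel()])
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    corr = np.exp(-dist / corr_length)          # assumed distance-based correlation
    eigvals = np.linalg.eigvalsh(corr)[::-1]    # eigenvalues in descending order
    explained = np.cumsum(eigvals) / eigvals.sum()
    # First index at which the retained fraction reaches `keep`, plus one.
    return int(np.searchsorted(explained, keep) + 1)

n_components = count_principal_components()
print(f"{n_components} of 100 grid variables kept")
```

Whatever the exact count, the number of retained variables grows with the number of grid cells, whereas SAAW keeps six random variables independent of chip size.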

In the above experiment, we only consider big chips. As discussed in Section III-D, when the chip size is small, the impact of across-wafer variation at the die level is not significant. To verify this, we perform delay estimation of the ISCAS85 benchmarks stretched over chips of different sizes. Table III shows the percentage error for different models with different chip sizes. From the table, we find that when the chip size is small, LDAW and IW are accurate. Considering that LDAW has similar

<sup>&</sup>lt;sup>11</sup>There are 100 correlated spatial random variables, we apply PCA to truncate some insignificant principle components and there remains 37 significant principle components.

## TABLE III

Percentage Error for ISCAS85 Benchmark Stretching on a Chip with Different Chip Size

| Benchmark | Chip size |      | Exact |      |      | SAAW |      |      | LDAW |      |      | SPC  |      |      | IW   |      |
|--------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|
|        |      | μ    | σ    | 95%  | μ    | σ    | 95%  | μ    | σ    | 95%  | μ    | σ    | 95%  | μ    | σ    | 95%  |
| c1908  | 10   | 17.4 | 2.24 | 24.2 | -1.2 | -1.8 | -1.9 | -2.2 | -5.5 | -5.9 | -1.7 | -3.5 | -3.3 | -2.5 | -6.8 | -7.1 |
|        | 6    | 17.5 | 2.23 | 24.3 | -1.1 | -1.5 | -1.6 | -2.2 | -4.1 | -4.2 | -1.2 | -2.6 | -2.4 | -2.2 | -4.5 | -4.3 |
|        | 3    | 17.6 | 2.22 | 24.4 | -0.9 | -1.2 | -1.1 | -1.1 | -1.8 | -1.6 | -1.0 | -1.5 | -1.5 | -1.9 | -2.1 | -2.5 |
| c3540  | 10   | 25.6 | 3.42 | 34.6 | -1.0 | -2.2 | -1.6 | -1.4 | -6.2 | -4.9 | -1.8 | -5.4 | -4.8 | -2.2 | -7.5 | -6.9 |
|        | 6    | 25.8 | 3.45 | 34.3 | -0.8 | -1.2 | -1.0 | -1.2 | -3.8 | -3.3 | -1.5 | -3.4 | -3.0 | -1.3 | -4.2 | -3.7 |
|        | 3    | 25.8 | 3.44 | 34.4 | -0.5 | -1.0 | -1.0 | -1.1 | -2.1 | -2.0 | -1.0 | -1.9 | -1.9 | -1.1 | -2.2 | -2.3 |
| c7552  | 10   | 48.9 | 6.47 | 64.7 | -1.2 | -1.1 | -1.4 | -2.8 | -3.5 | -3.7 | -2.5 | -2.2 | -2.7 | -3.2 | -5.6 | -6.3 |
|        | 6    | 48.9 | 6.47 | 64.7 | -1.0 | -1.1 | -1.3 | -1.4 | -2.5 | -2.3 | -2.2 | -2.1 | -2.5 | -2.9 | -3.1 | -3.3 |
|        | 3    | 48.9 | 6.47 | 64.7 | -0.6 | -0.9 | -1.0 | -1.0 | -1.1 | -1.2 | -0.9 | -1.3 | -1.4 | -1.3 | -1.4 | -1.6 |

Note. We assume square chips and chip size means edge length in mm. The exact delay values are in ns.

## TABLE IV

Delay Percentage Error at Different Locations in a $2 \text{ cm} \times 2 \text{ cm}$ Chip

| Benchmark | Location |      | Exact |      |       | LDAW |      |       | IW   |      |
|-----------|----------|------|-------|------|-------|------|------|-------|------|------|
|           |          | μ    | σ     | 95%  | $\mu$ | σ    | 95%  | $\mu$ | σ    | 95%  |
| c3540     | C        | 25.4 | 3.29  | 32.2 | +0.4  | +0.3 | +0.3 | +1.1  | +2.4 | +2.8 |
|           | LL       | 24.8 | 3.22  | 31.9 | +0.8  | +0.6 | +0.3 | +3.5  | +1.4 | +1.9 |
|           | LR       | 26.2 | 3.35  | 33.1 | -0.7  | +0.6 | -0.6 | -1.9  | -2.4 | -1.8 |
|           | UL       | 26.5 | 3.36  | 33.3 | -0.2  | +0.3 | +0.3 | -3.3  | -3.0 | -4.4 |
|           | UR       | 27.1 | 3.41  | 34.1 | -0.3  | -0.3 | -0.6 | -6.1  | -4.1 | -5.2 |
| c7552     | C        | 48.2 | 6.37  | 60.2 | +0.8  | +0.3 | +0.2 | +1.0  | +0.9 | +0.9 |
|           | LL       | 47.2 | 6.11  | 58.5 | +0.4  | +0.3 | +0.7 | +3.6  | +4.6 | +3.1 |
|           | LR       | 49.4 | 6.51  | 62.3 | -0.2  | +0.3 | +0.3 | -0.8  | -1.9 | -3.0 |
|           | UL       | 49.5 | 6.65  | 63.1 | -0.2  | +0.1 | +0.1 | -1.0  | -4.0 | -4.1 |
|           | UR       | 50.1 | 6.91  | 65.3 | -0.4  | +0.3 | +0.1 | -1.0  | -7.4 | -7.4 |

Note. The exact delay values are in ns.

## TABLE V

Delay Comparison for ISCAS85 Benchmark in 1 cm  $\times$  1 cm Chip

| Benchmark | Location |       | Exact |      |       | LDAW |      |       | IW   |      |
|-----------|----------|-------|-------|------|-------|------|------|-------|------|------|
|           |          | $\mu$ | σ     | 95%  | $\mu$ | σ    | 95%  | $\mu$ | σ    | 95%  |
| c3540     | C        | 25.4  | 3.26  | 31.9 | +0.5  | +0.4 | +0.2 | +1.3  | +1.6 | +1.9 |
|           | LL       | 25.0  | 3.25  | 31.6 | +0.5  | +0.3 | +0.3 | +2.4  | +1.3 | +2.4 |
|           | LR       | 25.8  | 3.30  | 32.4 | -0.4  | -0.4 | -0.4 | -1.1  | -0.9 | -0.8 |
|           | UL       | 26.1  | 3.32  | 32.6 | -0.4  | +0.3 | -0.2 | -2.0  | -1.5 | -1.4 |
|           | UR       | 26.2  | 3.34  | 33.0 | -0.3  | -0.3 | -0.6 | -2.6  | -1.8 | -2.2 |
| c7552     | C        | 48.6  | 6.38  | 60.4 | +0.2  | -0.1 | -0.2 | +1.0  | +0.5 | +0.6 |
|           | LL       | 48.2  | 6.35  | 60.3 | -0.2  | -0.2 | +0.2 | +1.7  | +1.0 | +1.1 |
|           | LR       | 48.9  | 6.42  | 60.4 | +0.2  | -0.2 | +0.2 | -0.2  | -0.7 | -0.5 |
|           | UL       | 49.1  | 6.43  | 61.0 | -0.2  | -0.2 | -0.2 | -0.9  | +1.0 | -1.3 |
|           | UR       | 49.2  | 6.45  | 61.2 | -0.3  | -0.4 | -0.5 | -1.1  | -1.4 | -1.8 |

Note. The exact delay values are in ns.

run time but is more accurate (although when chip size is small, the accuracy improvement is limited) compared to *IW*, *LDAW* is always better than *IW*.

2) Delay of Blocks on Different Locations on a Chip: The above experiment assumes that the benchmarks are stretched over a chip. However, in a real design, especially for big chips, the design is separated into several blocks and each block occupies only a small region of the chip. In this case, the critical path lies within a small region instead of spanning the whole chip. As discussed in Section III-A, different chip locations may have different mean and variance. Therefore, when a block is placed at different locations of a chip, its delay variation may be different. To show this effect, we assume that the ISCAS85 benchmark circuit is placed (not stretched) in different locations of a chip: center (C), lower left corner

## TABLE VI

Delay Comparison for ISCAS85 Benchmark in 6 mm  $\times$  6 mm Chip

| Benchmark | Location |       | Exact |      |       | LDAW |      |       | IW   |      |
|-----------|----------|-------|-------|------|-------|------|------|-------|------|------|
|           |          | $\mu$ | σ     | 95%  | $\mu$ | σ    | 95%  | $\mu$ | σ    | 95%  |
| c3540     | C        | 25.5  | 3.27  | 32.0 | +0.4  | +0.2 | +0.3 | +0.9  | +0.7 | +1.4 |
|           | LL       | 25.1  | 3.25  | 31.7 | +0.5  | +0.3 | +0.3 | +2.4  | +1.3 | +2.4 |
|           | LR       | 26.0  | 3.32  | 32.6 | -0.4  | -0.4 | -0.4 | -1.1  | -0.9 | -0.8 |
|           | UL       | 26.2  | 3.34  | 32.8 | -0.4  | +0.3 | -0.2 | -2.0  | -1.5 | -1.4 |
|           | UR       | 26.3  | 3.35  | 33.1 | -0.3  | -0.3 | -0.6 | -2.6  | -1.8 | -2.2 |
| c7552     | C        | 48.4  | 6.37  | 60.1 | +0.2  | -0.1 | -0.2 | +1.0  | +0.5 | +0.6 |
|           | LL       | 48.1  | 6.34  | 59.9 | -0.2  | -0.2 | +0.2 | +1.7  | +1.0 | +1.1 |
|           | LR       | 49.0  | 6.43  | 60.7 | +0.2  | -0.2 | +0.2 | -0.2  | -0.7 | -0.5 |
|           | UL       | 49.3  | 6.46  | 61.3 | -0.2  | -0.2 | -0.2 | -0.9  | +1.0 | -1.3 |
|           | UR       | 49.4  | 6.48  | 61.5 | -0.3  | -0.4 | -0.5 | -1.1  | -1.4 | -1.8 |

Note. The exact delay values are in ns.

## TABLE VII

Delay Comparison for ISCAS85 Benchmark in 3 mm  $\times$  3 mm Chip

| Benchmark | Location |       | Exact |      |       | LDAW |      |       | IW   |      |
|-----------|----------|-------|-------|------|-------|------|------|-------|------|------|
|           |          | $\mu$ | σ     | 95%  | $\mu$ | σ    | 95%  | $\mu$ | σ    | 95%  |
| c3540     | C        | 25.6  | 3.28  | 32.2 | +0.4  | +0.2 | +0.3 | +0.5  | +0.3 | +0.8 |
|           | LL       | 25.4  | 3.26  | 32.0 | +0.5  | +0.6 | +0.3 | +1.2  | +1.0 | +1.2 |
|           | LR       | 25.8  | 3.31  | 32.4 | -0.2  | -0.4 | -0.6 | -0.5  | -0.7 | -0.7 |
|           | UL       | 25.9  | 3.22  | 32.5 | -0.2  | +0.3 | -0.2 | -1.0  | -1.1 | -0.9 |
|           | UR       | 26.0  | 3.31  | 32.7 | -0.2  | -0.2 | -0.3 | -0.7  | -0.8 | -1.1 |
| c7552     | C        | 48.6  | 6.38  | 60.3 | +0.2  | -0.3 | +0.7 | +0.2  | +0.5 | +1.1 |
|           | LL       | 48.4  | 6.37  | 60.1 | -0.2  | 0.1  | +0.2 | +0.4  | +0.3 | +0.7 |
|           | LR       | 48.8  | 6.42  | 60.6 | +0.2  | -0.3 | +0.3 | -0.6  | -0.9 | -0.5 |
|           | UL       | 48.9  | 6.43  | 60.9 | -0.2  | +0.2 | -0.2 | -0.4  | +0.0 | -0.6 |
|           | UR       | 49.2  | 6.45  | 61.2 | -0.2  | -0.2 | -0.3 | -0.7  | -0.8 | -1.1 |

Note. The exact delay values are in ns.

(LL), lower right corner (LR), upper left corner (UL), and upper right corner (UR), and then calculate how the delay variation changes with location. Since the ISCAS85 benchmarks are very small, the impact of spatial variation on delay within a circuit is not significant. Therefore, in this experiment, we compare only two models, *LDAW* and *IW*.<sup>12</sup>

Table IV compares the percentage error of *LDAW* and *IW* for ISCAS85 benchmarks placed at different locations of a 2 cm  $\times$  2 cm chip. From the table, we find that *LDAW* is within 1% of the exact simulation, while the error of *IW* is up to 8%. This is because *LDAW* correctly predicts a different mean and variance for each location, as discussed in Section V, while *IW* can only give the same mean and variance for all locations.

Tables V–VII show the percentage error of *LDAW* and *IW* for ISCAS85 benchmarks placed at different locations of a  $1 \text{ cm} \times 1 \text{ cm}$ ,  $6 \text{ mm} \times 6 \text{ mm}$ , and  $3 \text{ mm} \times 3 \text{ mm}$  chip, respectively. From the tables, we find that the error of *IW* becomes smaller as the chip size decreases.

## VII. APPLICATION TO STATISTICAL LEAKAGE ANALYSIS

Besides SSTA, we also apply our variation model to statistical leakage power analysis. Usually, cell leakage power variation is modeled as an exponential function of the variation sources

$$P_{leak} = P_0 \cdot e^{\sum c_i V_i} \tag{21}$$

<sup>12</sup>When the circuit is in a small region, *SAAW* and *QAW* give results similar to *LDAW*, and *SPC* gives results similar to *IW*.

## TABLE VIII

Leakage Error Percentage for Different Models in 2 cm  $\times$  2 cm Chip

| Benchmark |       | Exact |      |       | *SAAW* |      |      |       | *QAW* |      |      |       | *LDAW* |       |     |       | *SPC* |       |     |       | *IW*  |       |     |
|-----------|-------|-------|------|-------|------|------|------|-------|------|------|------|-------|-------|-------|-----|-------|-------|-------|-----|-------|-------|-------|-----|
|           | $\mu$ | σ     | 95%  | $\mu$ | σ    | 95%  | T    | $\mu$ | σ    | 95%  | T    | $\mu$ | σ     | 95%   | T   | $\mu$ | σ     | 95%   | T   | $\mu$ | σ     | 95%   | T   |
| c1355     | 62.1  | 14.5  | 92.5 | +1.5  | +3.3 | +3.2 | 16.3 | +3.4  | +6.5 | +5.4 | 9.8  | -5.5  | -15.6 | -16.9 | 2.5 | +5.3  | +10.4 | +12.2 | 123 | -7.9  | -19.6 | -20.9 | 2.7 |
| c1908     | 95.6  | 20.3  | 144  | +0.9  | +4.4 | +3.7 | 15.5 | +2.3  | +7.8 | +6.3 | 9.7  | -6.5  | -17.8 | -19.6 | 2.6 | +5.7  | +14.8 | +14.3 | 122 | -8.6  | -19.2 | -23.5 | 2.9 |
| c2670     | 131   | 22.9  | 181  | +1.4  | +2.7 | +1.7 | 15.7 | +2.9  | +8.5 | +4.7 | 9.5  | -7.9  | -16.9 | -22.0 | 2.8 | +6.8  | +12.2 | +9.4  | 122 | -9.2  | -20.3 | -25.5 | 2.4 |
| c3540     | 201   | 37.4  | 282  | +1.5  | +2.3 | +1.8 | 15.4 | +3.1  | +5.8 | +4.4 | 10.4 | -5.6  | -16.5 | -20.2 | 2.6 | +4.9  | +11.2 | +8.2  | 123 | -8.3  | -18.2 | -23.5 | 2.7 |
| c7552     | 403   | 73.2  | 562  | +1.6  | +2.7 | +1.9 | 15.3 | +3.7  | +6.0 | +5.0 | 10.1 | -7.3  | -12.6 | -16.9 | 2.6 | +7.1  | +13.8 | +10.7 | 122 | -9.2  | -20.3 | -24.5 | 2.5 |

Note. The exact values are in mW. Run time (T) is in s.

## TABLE IX

LEAKAGE ERROR FOR DIFFERENT VARIATION MODEL ON DIFFERENT SIZE CHIPS

| Benchmark | Chip size (mm) |      | Exact |      |      | *SAAW* |      |      | *LDAW* |       |      | *SPC* |      |      | *IW*  |       |
|--------|------|------|-------|------|------|------|----------|------|-------|-------|------|------|------|------|-------|-------|
|        |      | μ    | σ     | 95%  | μ    | σ    | 95%      | μ    | σ     | 95%   | μ    | σ    | 95%  | μ    | σ     | 95%   |
| c1355  | 10   | 15.4 | 3.52  | 23.2 | +1.2 | +3.1 | +3.0     | -4.7 | -10.6 | -11.2 | +4.8 | +7.5 | +9.2 | -5.4 | -12.3 | -13.8 |
|        | 6    | 5.92 | 1.62  | 10.3 | +1.0 | +2.7 | +2.9     | -3.5 | -7.5  | -8.3  | +2.7 | +6.2 | +6.7 | -3.9 | -8.1  | -9.2  |
|        | 3    | 1.48 | 0.40  | 2.58 | +0.6 | +1.8 | +2.0     | -1.8 | -3.5  | -3.7  | +1.6 | +2.9 | +3.1 | -2.0 | -3.5  | -4.0  |
| c1908  | 10   | 23.9 | 5.7   | 36.1 | +1.0 | +2.7 | +2.9     | -4.2 | -12.3 | -14.1 | +3.7 | +8.5 | +9.2 | -5.5 | -13.9 | -15.1 |
|        | 6    | 10.6 | 2.25  | 16.1 | +0.9 | +2.0 | +2.1     | -3.4 | -7.3  | -8.2  | +1.9 | +5.6 | +6.8 | -3.5 | -7.3  | -9.0  |
|        | 3    | 2.65 | 0.57  | 4.03 | +1.0 | +2.2 | +2.2     | -1.3 | -3.5  | -4.0  | +1.2 | +2.9 | +3.1 | -1.9 | -4.0  | -4.4  |
| c2670  | 10   | 32.8 | 5.72  | 45.2 | +1.2 | +2.8 | +2.9     | -5.3 | -11.1 | -14.0 | +4.2 | +8.2 | +9.1 | -6.5 | -12.3 | -17.1 |
|        | 6    | 14.6 | 2.55  | 20.1 | +1.0 | +1.8 | +1.9     | -3.5 | -6.2  | -7.1  | +2.4 | +4.3 | +6.0 | -4.3 | -7.1  | -8.3  |
|        | 3    | 3.65 | 0.65  | 5.03 | +0.8 | +1.4 | +1.7     | -1.9 | -3.2  | -3.7  | +2.0 | +3.6 | +3.5 | -2.2 | -4.5  | -5.1  |
| c3540  | 10   | 50.5 | 9.37  | 71.1 | +1.2 | +1.8 | +2.1     | -3.9 | -7.8  | -10.5 | +2.9 | +5.3 | +6.2 | -5.0 | -9.6  | -14.5 |
|        | 6    | 22.3 | 4.16  | 30.2 | +1.0 | +1.4 | +1.7     | -2.3 | -4.5  | -5.5  | +1.8 | +3.5 | +3.6 | -3.4 | -6.1  | -8.5  |
|        | 3    | 5.58 | 1.05  | 7.55 | +0.9 | +1.2 | +1.4     | -1.2 | -3.1  | -3.0  | +1.0 | +1.4 | +2.0 | -1.4 | -3.4  | -3.9  |
| c7552  | 10   | 102  | 18.5  | 141  | +1.4 | +2.3 | +2.4     | -4.6 | -7.3  | -9.9  | +4.0 | +6.2 | +8.6 | -6.3 | -8.2  | -12.5 |
|        | 6    | 45.0 | 8.19  | 62.6 | +1.2 | +1.9 | +1.9     | -3.0 | -5.2  | -7.3  | +2.7 | +4.2 | +5.1 | -3.5 | -6.0  | -8.2  |
|        | 3    | 11.4 | 2.06  | 15.8 | +0.9 | +0.7 | +1.3     | -1.2 | -1.8  | -2.0  | +1.5 | +1.6 | +1.6 | -2.3 | -2.9  | -3.4  |

Note. Exact values are in mW

where  $P_0$  is the nominal leakage power and  $c_i$ s are sensitivity coefficients. The full chip leakage power is calculated as the sum of leakage power of all cells

$$P_{chip} = \sum_{i \in Cell} P_{i,leak} \tag{22}$$

where *Cell* is the set of all cells in the chip and  $P_{i,leak}$  is the leakage power of the *i*th cell. Since each variation source is a quadratic function as in (17), the cell leakage power is an exponential of a quadratic function of random variables. Because these random variables may be non-Gaussian, there is no closed-form expression for the full-chip leakage power. Therefore, in this paper, we apply Monte-Carlo simulation to obtain the full-chip leakage power variation.
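As a minimal sketch of how (21) and (22) can be evaluated by Monte-Carlo simulation, the following Python fragment sums the per-cell exponential leakage over all cells for a batch of samples. It is an illustration only and not the MATLAB implementation used in our experiments: the function names, the array layout, and the Gaussian toy variation (one shared term plus one independent term per cell) are assumptions made for the example.

```python
import numpy as np

def chip_leakage_samples(p0, sens, v):
    """Evaluate (21) per cell and (22) per chip for a batch of samples.

    p0   : (n_cells,)                   nominal leakage P_0 of each cell
    sens : (n_cells, n_src)             sensitivity coefficients c_i
    v    : (n_samples, n_cells, n_src)  sampled variation sources V_i
    """
    exponent = np.einsum("ck,sck->sc", sens, v)   # sum_i c_i * V_i for each cell
    return (p0 * np.exp(exponent)).sum(axis=1)    # sum over cells, as in (22)

def summarize(p_chip):
    """Mean, standard deviation, and 95th percentile of the chip leakage."""
    return p_chip.mean(), p_chip.std(), np.percentile(p_chip, 95)

# Hypothetical setup: 50 identical cells and a single variation source.
rng = np.random.default_rng(0)
n_samples, n_cells, n_src = 2000, 50, 1
p0 = np.ones(n_cells)
sens = np.full((n_cells, n_src), 0.5)
v_shared = rng.normal(0.0, 1.0, size=(n_samples, 1, n_src))        # common to all cells
v_local = rng.normal(0.0, 0.5, size=(n_samples, n_cells, n_src))   # independent per cell
p_chip = chip_leakage_samples(p0, sens, v_shared + v_local)
mu, sigma, p95 = summarize(p_chip)
print(f"mean={mu:.2f} sigma={sigma:.2f} 95%={p95:.2f}")
```

To reproduce a particular variation model, only the construction of `v` would change (for example, a quadratic function of sampled die-center coordinates); the summation itself stays the same.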

We have implemented leakage variation analysis with the different models in MATLAB. In the experiment, we use the same settings and comparison cases as the SSTA experiment in Section VI. For each variation model, we run a Monte-Carlo simulation with 100 000 samples to obtain the full-chip leakage power. For the leakage analysis, we assume that 900 copies of an ISCAS85 benchmark circuit are placed in a  $30 \times 30$  array on a 2 cm×2 cm chip. Table VIII compares the leakage variation for the ISCAS85 benchmarks. From the table, we observe the following.

- 1) The error of *SAAW* is within 5%, while the error of *SPC* is up to 17%. Moreover, *SAAW* is  $7 \times$  faster than *SPC* because *SAAW* has fewer random variables.
- 2) *SAAW* is more accurate than *QAW*, but is about 50% slower.
- 3) Neither *LDAW* nor *IW* is accurate. This is because both models ignore correlation and hence underestimate the leakage power variation.

Similar to the SSTA experiment, we also perform leakage estimation for chips of different sizes:  $1 \text{ cm} \times 1 \text{ cm}$ ,  $6 \text{ mm} \times 6 \text{ mm}$ , and  $3 \text{ mm} \times 3 \text{ mm}$ . For the  $1 \text{ cm} \times 1 \text{ cm}$ chip, we assume that 225 copies of an ISCAS85 benchmark circuit are placed in a  $15 \times 15$  array; for the  $6 \text{ mm} \times 6 \text{ mm}$ chip, 100 copies are placed in a  $10 \times 10$  array; and for the  $3 \text{ mm} \times 3 \text{ mm}$ chip, 25 copies are placed in a  $5 \times 5$  array. Table IX shows the error percentage of the different models for the different chip sizes. From the table, we find that the error of *LDAW*, *SPC*, and *IW* decreases as the chip size becomes smaller, as expected.

## VIII. SUMMARY OF DIFFERENT MODELS

In the previous sections, we compared the accuracy and efficiency of different models. Table X summarizes the advantages and disadvantages of our proposed spatial variation models (*SAAW*, *QAW*, and *LDAW*) and of the traditional variation models (*SPC* and *IW*). Our proposed across-wafer variation models model the across-wafer variation exactly, and their number of random variables does not depend on chip size. Therefore, they are accurate and efficient. *SAAW* has six random variables and can be applied to any across-wafer variation model. *QAW* has four random variables and hence is more efficient than *SAAW*; however, it can be applied only when the across-wafer variation is a perfect parabola. *LDAW* is the most efficient, but it ignores correlation and works only for small chips. Moreover, *SAAW* and *QAW* require knowledge of the across-wafer variation, so one needs to track the die locations within the wafer to build the model.

On the other hand, the traditional variation models (as well as *LDAW*) require only measurements on a die, without tracking die locations. Therefore, they are somewhat easier to build. However, such models are not as accurate as our proposed models. Moreover, since the number of random variables of *SPC* depends on the number of grids, it is not as efficient as our proposed models.

## IX. ARBITRARY ACROSS-WAFER VARIATION FUNCTION

In this paper, we assume across-wafer variation to be a parabola. In most cases, our proposed *SAAW* model is good enough to model the across-wafer variation at die level.

## TABLE X

SUMMARY OF DIFFERENT VARIATION MODELS

| Model Type          | Advantages          | Disadvantages                | Model       | # of RVs              | Case to Apply                                   |
|---------------------|---------------------|------------------------------|-------------|-----------------------|-------------------------------------------------|
| Across-wafer models | Accurate, efficient | Need die tracking to extract | *SAAW* (17) | 6                     | Large chip, non-parabola across-wafer variation |
|                     |                     |                              | *QAW* (5)   | 4                     | Large chip, parabola across-wafer variation     |
|                     |                     |                              | *LDAW* (18) | 2                     | Small chip                                      |
| Traditional models  | Easy to extract     | Not accurate                 | *SPC*       | Depends on # of grids | Large chip                                      |
|                     |                     |                              | *IW*        | 2                     | Small chip                                      |

However, there are some special cases where the across wafer variation is an arbitrary function as follows:

$$v_p = f(x_w, y_w) + v_{d-d} + v_r.$$

In this case, the statistical characteristics such as mean, variance, covariance, and correlation coefficient depend on the function $f$. In most cases, closed-form formulae for these statistical characteristics may not be available. However, we may still apply a method similar to that in Section III and model the across-wafer location of a die as random variables

$$v_p(x, y) = f(x_c + x, y_c + y) + v_{d-d} + v_r.$$

When we know the function $f$, either in closed form or as a numerical lookup table, we may perform Monte-Carlo simulation on the above formula for statistical analysis. However, since no closed form is available in this case, we cannot perform analytical statistical analysis, such as analytical SSTA or statistical leakage analysis.
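The Monte-Carlo procedure for an arbitrary $f$ can be sketched as follows. This is a minimal illustration under stated assumptions: die centers are drawn uniformly over a circular wafer of radius `r_w` (the distribution used in Appendix A), die rotation and wafer-edge effects are ignored, $v_{d-d}$ and $v_r$ are taken as zero-mean Gaussian, and the function names and the example profile `f` are hypothetical.

```python
import numpy as np

def sample_die_centers(n, r_w, rng):
    """Die centers (x_c, y_c) uniform over a disk of radius r_w."""
    rho = r_w * np.sqrt(rng.random(n))      # radial density 2*rho / r_w**2
    theta = 2.0 * np.pi * rng.random(n)
    return rho * np.cos(theta), rho * np.sin(theta)

def sample_vp(f, xy, n, r_w, sigma_dd, sigma_r, rng):
    """Draw n samples of v_p at each on-die site in xy (shape (m, 2))."""
    xc, yc = sample_die_centers(n, r_w, rng)
    x, y = xy[:, 0], xy[:, 1]
    v_aw = f(xc[:, None] + x[None, :], yc[:, None] + y[None, :])  # f(x_c + x, y_c + y)
    v_dd = rng.normal(0.0, sigma_dd, size=(n, 1))                 # shared by all sites of a die
    v_r = rng.normal(0.0, sigma_r, size=v_aw.shape)               # independent per site
    return v_aw + v_dd + v_r

# Hypothetical non-parabolic profile: a bowl plus a ripple (arbitrary units).
f = lambda xw, yw: 1e-4 * (xw**2 + yw**2) + 0.05 * np.cos(xw / 20.0)
rng = np.random.default_rng(0)
sites = np.array([[-5.0, -5.0], [5.0, -5.0], [0.0, 0.0], [5.0, 5.0]])
samples = sample_vp(f, sites, 20000, 150.0, 0.1, 0.1, rng)
mean, std = samples.mean(axis=0), samples.std(axis=0)   # per-site statistics
corr = np.corrcoef(samples, rowvar=False)               # site-to-site correlation
```

A lookup-table $f$ would fit the same interface by wrapping an interpolator in a function of `(xw, yw)`.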

## X. CONCLUSION

In this paper, we analytically studied the impact of systematic across-wafer variation on within-die spatial variation. For simplicity, we first assumed that the across-wafer variation is a quadratic function. First, we observed that different locations on a chip may have different means and variances, and that this difference becomes more significant as chip size increases. Second, we found that spatial correlation is visible only when the systematic across-wafer variation is not taken into account. When it is taken into account, within-die random variability does not exhibit a strong or useful pattern of spatial correlation. We exploited these observations to create a much more accurate and efficient model for performance variability prediction. Third, we found that the within-die spatial variation is not independent of the inter-die variation. However, when the chip size is small enough, this dependence is weak and the across-wafer variation can be lumped into the inter-die variation. In this case, the two-level inter-/within-die variation model is still accurate. We further considered the case in which the across-wafer variation is not a perfect quadratic function. Based on the above analysis, we proposed accurate and efficient variation models for deterministic across-wafer variation. We further applied our new variation models to two applications: statistical static timing analysis and statistical leakage analysis. Experimental results showed that, compared to the distance-based spatial variation model, our new model reduces the error from 6.5% to 2% for statistical timing analysis and from 17% to 5% for statistical leakage analysis. Our model also improves the run time by  $6 \times$  for statistical timing analysis and by  $7 \times$  for statistical leakage analysis.

## APPENDIX A MOMENTS OF $x_c$ AND $y_c$

1) 2nd and 4th Order Moments of  $x_c$  and  $y_c$ :

$$\begin{aligned} E[x_c^2] &= E[\rho^2]E[\cos^2\theta] = \int_0^{r_w} \rho^2 \cdot \frac{2\rho}{r_w^2}\, d\rho \int_0^{2\pi} \frac{\cos^2\theta}{2\pi}\, d\theta \\ &= \left.\frac{\rho^4}{2r_w^2}\right|_0^{r_w} \cdot \left.\frac{\theta + \sin\theta\cos\theta}{4\pi}\right|_0^{2\pi} = r_w^2/4 \\ E[x_c^4] &= E[\rho^4]E[\cos^4\theta] = \int_0^{r_w} \rho^4 \cdot \frac{2\rho}{r_w^2}\, d\rho \int_0^{2\pi} \frac{\cos^4\theta}{2\pi}\, d\theta \\ &= \left.\frac{\rho^6}{3r_w^2}\right|_0^{r_w} \cdot \left.\frac{12\theta + 8\sin(2\theta) + \sin(4\theta)}{64\pi}\right|_0^{2\pi} = r_w^4/8. \end{aligned}$$

Since  $x_c$  and  $y_c$  are symmetric, we have

$$E[y_c^2] = E[x_c^2] = r_w^2/4, \qquad E[y_c^4] = E[x_c^4] = r_w^4/8.$$

2) Joint Moment of  $x_c$  and  $y_c$ :

$$\begin{split} E[x_c^2 y_c^2] &= E[\rho^4] E[\cos^2 \theta \sin^2 \theta] \\ &= \int_0^{r_w} \rho^4 \cdot 2\rho / r_w^2 d\rho \int_0^{2\pi} \cos^2(\theta) \sin^2(\theta) / 2\pi d\theta \\ &= (\rho^6 / 3r_w^2) |_0^{r_w} \cdot (4\theta - \sin(4\theta)) |_0^{2\pi} / 64\pi \\ &= r_w^4 / 24. \end{split}$$
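The three moments above can be checked numerically. The following sketch draws die centers uniformly over a disk of radius $r_w$, using the same densities $2\rho/r_w^2$ and $1/2\pi$ as in the integrals, and compares the sample moments with the closed forms. The function name, sample count, and the value of `r_w` are arbitrary choices for the illustration.

```python
import numpy as np

def disk_moments(r_w, n=200_000, seed=0):
    """Sample moments of (x_c, y_c) drawn uniformly over a disk of radius r_w."""
    rng = np.random.default_rng(seed)
    rho = r_w * np.sqrt(rng.random(n))      # inverse-CDF sampling of density 2*rho/r_w**2
    theta = 2.0 * np.pi * rng.random(n)     # uniform angle
    xc, yc = rho * np.cos(theta), rho * np.sin(theta)
    return {
        "x2": np.mean(xc**2),
        "x4": np.mean(xc**4),
        "x2y2": np.mean(xc**2 * yc**2),
    }

r_w = 150.0
estimated = disk_moments(r_w)
closed_form = {"x2": r_w**2 / 4, "x4": r_w**4 / 8, "x2y2": r_w**4 / 24}
ratios = {key: estimated[key] / closed_form[key] for key in closed_form}
print(ratios)   # each ratio should be close to 1
```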

## APPENDIX B MEAN AND VARIANCE OF $v_p$

We first express  $v_{aw}$  as

$$\begin{aligned} v_{aw}(x_c + x', y_c + y') &= a(x_c + x')^2 + b(y_c + y')^2 + c(x_c + x') + d(y_c + y') \\ &= a(x_c + x''\sqrt{b/a})^2 + b(y_c + y''\sqrt{a/b})^2 - c^2/4a - d^2/4b. \end{aligned} \tag{23}$$

1) Mean of  $v_p$ : We first compute the mean of  $v_{aw}$ . Note that we only need to consider the even-order moments and joint moments of  $x_c$  and  $y_c$ , as discussed in Section III

$$\begin{split} \mu_{v_{aw}} &= E[v_{aw}(x_c + x', y_c + y')] \\ &= E[a(x_c + x''\sqrt{b/a})^2 + b(y_c + y''\sqrt{a/b})^2] - c^2/4a - d^2/4b \\ &= aE[x_c^2] + bE[y_c^2] + bx''^2 + ay''^2 - c^2/4a - d^2/4b \\ &= r_w^2(a+b)/4 - c^2/4a - d^2/4b + bx''^2 + ay''^2 = k_0 + r_{d\mu}^2. \end{split}$$

Since we assume that  $v_{d-d}$  and  $v_r$  have zero mean, the mean of  $v_p$  is

$$\mu_{v_p} = \mu_{aw} + \mu_{d-d} + \mu_r = k_0 + r_{d\mu}^2.$$

2) Variance of  $v_p$ : We first compute the variance of  $v_{aw}$ 

$$\begin{aligned} \sigma_{v_{aw}}^{2} &= E[v_{aw}^{2}(x_{c} + x', y_{c} + y')] - E^{2}[v_{aw}(x_{c} + x', y_{c} + y')] \\ &= a^{2}E[x_{c}^{4}] + 4abx''^{2}E[x_{c}^{2}] + b^{2}E[y_{c}^{4}] + 4aby''^{2}E[y_{c}^{2}] \\ &\quad + 2abE[x_{c}^{2}y_{c}^{2}] - (aE[x_{c}^{2}] + bE[y_{c}^{2}])^{2} \\ &= r_{w}^{4}(a^{2} + b^{2})/16 - r_{w}^{4}ab/24 + abr_{w}^{2}(x''^{2} + y''^{2}). \end{aligned} \tag{24}$$

Then the variance of  $v_p$  is

$$\begin{aligned} \sigma_{v_p}^2 &= \sigma_{v_{aw}}^2 + \sigma_{v_{d-d}}^2 + \sigma_{v_r}^2 \\ &= r_w^4 (a^2 + b^2) / 16 - r_w^4 ab / 24 + \sigma_{d-d}^2 + \sigma_r^2 + ab r_w^2 (x''^2 + y''^2) \\ &= k_1 + \sigma_r^2 + ab r_w^2 r_{d\sigma}^2. \end{aligned}$$
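As a sanity check of the mean and of (24), the closed forms can be compared against a direct simulation of $v_{aw}$ at a fixed on-die offset $(x', y')$. The sketch below assumes $a, b > 0$ so that $\sqrt{b/a}$ is real, and takes $x'' = (x' + c/2a)\sqrt{a/b}$ and $y'' = (y' + d/2b)\sqrt{b/a}$, which is what completing the square in (23) implies. The parameter values and function names are hypothetical; $v_{d-d}$ and $v_r$ are omitted because they only add $\sigma_{d-d}^2 + \sigma_r^2$ to the variance.

```python
import numpy as np

def closed_form_stats(a, b, c, d, r_w, xp, yp):
    """Mean and variance of v_aw from Appendix B (requires a, b > 0)."""
    x2 = (xp + c / (2 * a)) * np.sqrt(a / b)    # x'' implied by (23)
    y2 = (yp + d / (2 * b)) * np.sqrt(b / a)    # y'' implied by (23)
    mean = (r_w**2 * (a + b) / 4 - c**2 / (4 * a) - d**2 / (4 * b)
            + b * x2**2 + a * y2**2)
    var = (r_w**4 * (a**2 + b**2) / 16 - r_w**4 * a * b / 24
           + a * b * r_w**2 * (x2**2 + y2**2))  # eq. (24)
    return mean, var

def monte_carlo_stats(a, b, c, d, r_w, xp, yp, n=200_000, seed=0):
    """Sample v_aw with the die center uniform over the wafer disk."""
    rng = np.random.default_rng(seed)
    rho = r_w * np.sqrt(rng.random(n))
    theta = 2.0 * np.pi * rng.random(n)
    xw = rho * np.cos(theta) + xp               # x_c + x'
    yw = rho * np.sin(theta) + yp               # y_c + y'
    v = a * xw**2 + b * yw**2 + c * xw + d * yw
    return v.mean(), v.var()

params = dict(a=1e-4, b=2e-4, c=1e-3, d=-2e-3, r_w=150.0, xp=5.0, yp=-3.0)
cf_mean, cf_var = closed_form_stats(**params)
mc_mean, mc_var = monte_carlo_stats(**params)
print(cf_mean, mc_mean, cf_var, mc_var)
```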

## APPENDIX C COVARIANCE AND CORRELATION COEFFICIENT BETWEEN  $v_p(x_1, y_1)$  AND  $v_p(x_2, y_2)$

1) Covariance Between  $v_p(x_1, y_1)$  and  $v_p(x_2, y_2)$ : We first compute the covariance between  $v_{aw}(x_1, y_1)$  and  $v_{aw}(x_2, y_2)$

$$\begin{aligned} \mathrm{cov}_{aw} &= E[v_{aw}(x_c + x_1', y_c + y_1') \cdot v_{aw}(x_c + x_2', y_c + y_2')] \\ &\quad - E[v_{aw}(x_c + x_1', y_c + y_1')] \cdot E[v_{aw}(x_c + x_2', y_c + y_2')] \\ &= a^2 E[x_c^4] + 4abx_1''x_2''E[x_c^2] + b^2 E[y_c^4] + 4aby_1''y_2''E[y_c^2] \\ &\quad + 2abE[x_c^2y_c^2] - (aE[x_c^2] + bE[y_c^2])^2 \\ &= r_w^4(a^2 + b^2)/16 - r_w^4ab/24 + abr_w^2(x_1''x_2'' + y_1''y_2''). \end{aligned}$$

Setting the two locations equal recovers the variance in (24). Since all devices on the same chip share the same inter-die variation and the within-die random variation is independent for different devices, the covariance between  $v_p(x_1, y_1)$  and  $v_p(x_2, y_2)$  is

$$\begin{aligned} \mathrm{cov} &= \mathrm{cov}_{aw} + \sigma_{d-d}^{2} \\ &= r_{w}^{4}(a^{2} + b^{2})/16 - r_{w}^{4}ab/24 + \sigma_{d-d}^{2} + abr_{w}^{2}(x_{1}''x_{2}'' + y_{1}''y_{2}'') \\ &= k_{1} + abr_{w}^{2}\alpha. \end{aligned}$$

2) Correlation Coefficient Between  $v_p(x_1, y_1)$  and  $v_p(x_2, y_2)$ :

$$\begin{aligned} \rho &= \frac{\mathrm{cov}}{\sigma_{1}\sigma_{2}} = \sqrt{\frac{\mathrm{cov}^{2}}{\sigma_{1}^{2}\sigma_{2}^{2}}} = \sqrt{\frac{(k_{1}+abr_{w}^{2}\alpha)^{2}}{(k_{1}+\sigma_{r}^{2}+abr_{w}^{2}r_{d\sigma 1}^{2})(k_{1}+\sigma_{r}^{2}+abr_{w}^{2}r_{d\sigma 2}^{2})}} \\ &= \sqrt{\frac{(k_{1}/(abr_{w}^{2})+\alpha)^{2}}{(k_{1}/(abr_{w}^{2})+\sigma_{r}^{2}/(abr_{w}^{2})+r_{d\sigma 1}^{2})(k_{1}/(abr_{w}^{2})+\sigma_{r}^{2}/(abr_{w}^{2})+r_{d\sigma 2}^{2})}} \\ &= \sqrt{\frac{(k_{2}+\alpha)^{2}}{(k_{2}+\beta+r_{d\sigma 1}^{2})(k_{2}+\beta+r_{d\sigma 2}^{2})}} \\ &= \sqrt{\frac{k_{2}^{2}+2k_{2}\alpha+\alpha^{2}}{(k_{2}+\beta)^{2}+(r_{d\sigma 1}^{2}+r_{d\sigma 2}^{2})(k_{2}+\beta)+r_{d\sigma 1}^{2}r_{d\sigma 2}^{2}}}. \end{aligned}$$

## APPENDIX D UPPER BOUND AND LOWER BOUND OF $\rho$

1) Upper Bound of  $\rho$ :

$$\begin{aligned} \rho &= \sqrt{\frac{k_2^2 + 2k_2\alpha + \alpha^2}{(k_2 + \beta)^2 + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)(k_2 + \beta) + r_{d\sigma 1}^2 r_{d\sigma 2}^2}} \\ &= \sqrt{1 - \frac{2k_2\beta + \beta^2 + \delta^2 k_2 + (x_1''y_2'' - x_2''y_1'')^2 + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)\beta}{(k_2 + \beta)^2 + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)(k_2 + \beta) + r_{d\sigma 1}^2 r_{d\sigma 2}^2}}. \end{aligned} \tag{25}$$

In the above equation,  $\rho$  is represented in a form of  $\sqrt{1-\zeta/\eta}$ . To obtain the upper bound, we increase the denominator  $\eta$  and reduce numerator  $\zeta$ . Considering that

$$(x_1''y_2'' - x_2''y_1'')^2 \ge 0$$

$$r_{d\sigma 1}^2 + r_{d\sigma 2}^2 = x_1''^2 + y_1''^2 + x_2''^2 + y_2''^2 \ge ((x_1'' - x_2'')^2 + (y_1'' - y_2'')^2)/2 = \delta^2/2$$

and

$$\begin{aligned} & -l_x''/2 \leq x'' \leq l_x''/2, \quad -l_y''/2 \leq y'' \leq l_y''/2 \\ \Rightarrow\ & r_{d\sigma}^{2} = x''^{2} + y''^{2} \leq l_x''^{2}/4 + l_y''^{2}/4 = r_m''^{2} \\ \Rightarrow\ & r_{d\sigma 1}^{2} + r_{d\sigma 2}^{2} \leq 2r_m''^{2}, \quad r_{d\sigma 1}^{2}r_{d\sigma 2}^{2} \leq r_m''^{4}. \end{aligned}$$

Replacing  $(x_1''y_2'' - x_2''y_1'')^2$  with 0 and  $(r_{d\sigma 1}^2 + r_{d\sigma 2}^2)$  with  $\delta^2/2$  in the numerator, and replacing  $(r_{d\sigma 1}^2 + r_{d\sigma 2}^2)$  with  $2r_m''^2$  and  $r_{d\sigma 1}^2 r_{d\sigma 2}^2$  with  $r_m''^4$  in the denominator, we have

$$\rho \le \sqrt{1 - \frac{\delta^2 k_2 + \delta^2 \beta / 2 + 2\beta k_2 + \beta^2}{(k_2 + \beta)^2 + 2r_m''^2 (k_2 + \beta) + r_m''^4}}.$$
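The upper bound can be illustrated numerically. The sketch below evaluates $\rho$ from the first line of (25) and the bound above for random pairs of locations inside a die. It assumes $\alpha = x_1''x_2'' + y_1''y_2''$ and $\delta^2 = (x_1''-x_2'')^2 + (y_1''-y_2'')^2$, which is what the identity $r_{d\sigma 1}^2 r_{d\sigma 2}^2 - \alpha^2 = (x_1''y_2'' - x_2''y_1'')^2$ used in (25) requires; the values of $k_2$, $\beta$, and the die size are hypothetical.

```python
import numpy as np

def rho_exact(k2, beta, p1, p2):
    """Correlation coefficient from the first line of (25); p1, p2 have shape (n, 2)."""
    alpha = np.sum(p1 * p2, axis=1)           # assumed: inner product of the two locations
    r1 = np.sum(p1**2, axis=1)                # r_{d sigma 1}^2
    r2 = np.sum(p2**2, axis=1)                # r_{d sigma 2}^2
    num = k2**2 + 2 * k2 * alpha + alpha**2
    den = (k2 + beta)**2 + (r1 + r2) * (k2 + beta) + r1 * r2
    return np.sqrt(num / den)

def rho_upper(k2, beta, p1, p2, lx, ly):
    """Upper bound of rho as a function of the distance delta."""
    delta2 = np.sum((p1 - p2)**2, axis=1)     # delta^2
    rm2 = lx**2 / 4 + ly**2 / 4               # r_m''^2
    num = delta2 * k2 + delta2 * beta / 2 + 2 * beta * k2 + beta**2
    den = (k2 + beta)**2 + 2 * rm2 * (k2 + beta) + rm2**2
    return np.sqrt(1 - num / den)

rng = np.random.default_rng(0)
lx, ly, k2, beta = 2.0, 1.0, 500.0, 3.0       # hypothetical values with k2 >> r_m''^2
half = np.array([lx / 2, ly / 2])
p1 = rng.uniform(-half, half, size=(10000, 2))
p2 = rng.uniform(-half, half, size=(10000, 2))
rho = rho_exact(k2, beta, p1, p2)
rho_u = rho_upper(k2, beta, p1, p2, lx, ly)
bound_holds = bool(np.all(rho <= rho_u + 1e-12))
```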

2) Lower Bound of  $\rho$ : As shown in (25),  $\rho$  has the form  $\sqrt{1-\zeta/\eta}$ . Since  $\zeta/\eta$  is between 0 and 1, adding the same non-negative value to both  $\zeta$  and  $\eta$  increases  $\zeta/\eta$  and hence reduces  $\rho$ . Therefore, to obtain the lower bound, we first add the non-negative value  $(r_{d\sigma 1}^2 - r_{d\sigma 2}^2)^2/4$  to both numerator and denominator

$$\begin{aligned} \rho &\geq \sqrt{1 - \frac{2k_2\beta + \beta^2 + \delta^2 k_2 + (x_1''y_2'' - x_2''y_1'')^2 + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)\beta + (r_{d\sigma 1}^2 - r_{d\sigma 2}^2)^2/4}{(k_2 + \beta)^2 + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)(k_2 + \beta) + r_{d\sigma 1}^2 r_{d\sigma 2}^2 + (r_{d\sigma 1}^2 - r_{d\sigma 2}^2)^2/4}} \\ &= \sqrt{1 - \frac{2k_2\beta + \beta^2 + \delta^2 k_2 + (x_1''y_2'' - x_2''y_1'')^2 + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)\beta + (r_{d\sigma 1}^2 - r_{d\sigma 2}^2)^2/4}{(k_2 + \beta)^2 + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)(k_2 + \beta) + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)^2/4}} \\ &= \sqrt{1 - \frac{2k_2\beta + \beta^2 + \delta^2 k_2 + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2 - \delta^2/2)^2/4 + 3\delta^4/16 + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)\beta}{(k_2 + \beta)^2 + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)(k_2 + \beta) + (r_{d\sigma 1}^2 + r_{d\sigma 2}^2)^2/4}}. \end{aligned}$$

Considering that

$$2r_m''^{2} \ge r_{d\sigma 1}^{2} + r_{d\sigma 2}^{2} \ge \delta^{2}/2 \Rightarrow (2r_m''^{2} - \delta^{2}/2)^{2} \ge (r_{d\sigma 1}^{2} + r_{d\sigma 2}^{2} - \delta^{2}/2)^{2}$$

and, similar to the upper bound proof, replacing  $(r_{d\sigma 1}^2 + r_{d\sigma 2}^2 - \delta^2/2)^2$  with  $(2r_m''^2 - \delta^2/2)^2$  and  $(r_{d\sigma 1}^2 + r_{d\sigma 2}^2)$  with  $2r_m''^2$  in the numerator, and replacing  $(r_{d\sigma 1}^2 + r_{d\sigma 2}^2)$  with  $\delta^2/2$  in the denominator, we have

$$\begin{aligned} \rho &\geq \sqrt{1 - \frac{2k_2\beta + \beta^2 + \delta^2 k_2 + (2r_m''^2 - \delta^2/2)^2 / 4 + 3\delta^4 / 16 + 2r_m''^2 \beta}{(k_2 + \beta)^2 + \delta^2 (k_2 + \beta) / 2 + \delta^4 / 16}} \\ &= \sqrt{1 - \frac{\delta^2 (k_2 - r_m''^2 / 2 + \delta^2 / 4) + \beta (\beta + 2k_2 + 2r_m''^2) + r_m''^4}{(k_2 + \beta)^2 + \delta^2 (k_2 + \beta) / 2 + \delta^4 / 16}}. \end{aligned} \quad \Box$$

3) *Range of*  $\rho$ : To obtain the range of  $\rho$ , similar to the proof of the lower bound, we add the non-negative value  $2r_m''^2(k_2+\beta) + r_m''^4 - \delta^2(k_2+\beta)/2 - \delta^4/16$  to both numerator and denominator of the lower bound  $\rho_l$  to obtain a looser lower bound  $\rho_l' \le \rho_l$

$$\rho_l' = \sqrt{1-\frac{\delta^2(k_2/2-r_m''^2/2+3\delta^2/16-\beta/2)+\beta(\beta+2k_2+2r_m''^2)+2r_m''^4+2r_m''^2(k_2+\beta)}{(k_2+\beta)^2+2r_m''^2(k_2+\beta)+r_m''^4}}.$$

Then, the range of  $\rho$  can be bounded as

$$\begin{aligned} \rho_{u} - \rho_{l} &= \sqrt{(\rho_{u} - \rho_{l})^{2}} \leq \sqrt{(\rho_{u} - \rho_{l})(\rho_{u} + \rho_{l})} = \sqrt{\rho_{u}^{2} - \rho_{l}^{2}} \\ &\leq \sqrt{\rho_{u}^{2} - \rho_{l}'^{2}} = \sqrt{\frac{k_{2}(2r_m''^{2} - \delta^{2}/2) + \beta(4r_m''^{2} - \delta^{2}) + 3\delta^{4}/16 + 2r_m''^{4} - \delta^{2}r_m''^{2}/2}{(k_{2} + \beta + r_m''^{2})^{2}}} \\ &\leq \sqrt{\frac{2k_{2}r_m''^{2} + 4\beta r_m''^{2} + 5r_m''^{4}}{(k_{2} + \beta + r_m''^{2})^{2}}}. \end{aligned}$$

Since the wafer size is usually much larger than the die size, that is,  $k_2 \gg r_m''^2$ , we have  $2k_2r_m''^2 \gg r_m''^4$ . Therefore, we have

$$\rho_{u} - \rho_{l} \leq \sqrt{\frac{2k_{2}r_m''^{2} + 4\beta r_m''^{2} + 2k_{2}r_m''^{2} + 4r_m''^{4}}{(k_{2} + \beta + r_m''^{2})^{2}}} = \sqrt{\frac{4r_m''^{2}}{k_{2} + \beta + r_m''^{2}}}.$$

## APPENDIX E COMPUTATION OF  $v_g$  AND  $v_s$

1) Computation of  $v_g$ :

$$\begin{aligned} v_g &= \frac{1}{l_x l_y} \iint_{\substack{|x| < l_x/2 \\ |y| < l_y/2}} v_p(x, y)\, dx\, dy \\ &= a x_c^2 + b y_c^2 + c x_c + d y_c + v_{d-d} \\ &\quad + \frac{1}{l_x l_y} \iint_{\substack{|x| < l_x/2 \\ |y| < l_y/2}} \big( 2(a x_c \cos \omega - b y_c \sin \omega) x + 2(a x_c \sin \omega + b y_c \cos \omega) y \\ &\qquad + (a\cos^2\omega + b\sin^2\omega)x^2 + (a\sin^2\omega + b\cos^2\omega)y^2 \\ &\qquad + 2(a - b)\cos\omega\sin\omega\, xy + v_r(x, y) \big)\, dx\, dy. \end{aligned}$$

Since the integration region is symmetric, the integration of odd order moments and joint moments of x and y is zero. Moreover, notice that  $v_r(x, y)$  is zero mean. Then, we have

$$\begin{aligned} v_{g} &= ax_{c}^{2} + by_{c}^{2} + cx_{c} + dy_{c} + v_{d-d} + ((a\cos^{2}\omega + b\sin^{2}\omega)l_{x}^{2} + (a\sin^{2}\omega + b\cos^{2}\omega)l_{y}^{2})/12 \\ &= ax_{c}^{2} + by_{c}^{2} + cx_{c} + dy_{c} + v_{d-d} + \cos^{2}\omega(al_{x}^{2} + bl_{y}^{2})/12 + \sin^{2}\omega(bl_{x}^{2} + al_{y}^{2})/12 \\ &= ax_{c}^{2} + by_{c}^{2} + cx_{c} + dy_{c} + v_{d-d} + s_{0}. \end{aligned}$$
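The closed form of $s_0$ can be checked by averaging the across-wafer term over the die numerically. The sketch below uses a midpoint rule on a rectangular grid and assumes the rotation convention $x' = x\cos\omega + y\sin\omega$, $y' = -x\sin\omega + y\cos\omega$, which reproduces the coefficients of the integrand above; $v_{d-d}$ and $v_r$ are set to zero, and all parameter values and function names are hypothetical.

```python
import numpy as np

def s0_closed_form(a, b, lx, ly, omega):
    """Constant offset s_0 of the die-level average."""
    cos2, sin2 = np.cos(omega)**2, np.sin(omega)**2
    return ((a * cos2 + b * sin2) * lx**2 + (a * sin2 + b * cos2) * ly**2) / 12

def die_average_numeric(a, b, c, d, xc, yc, lx, ly, omega, n=400):
    """Midpoint-rule average of v_aw over a die of size lx x ly rotated by omega."""
    x = ((np.arange(n) + 0.5) / n - 0.5) * lx
    y = ((np.arange(n) + 0.5) / n - 0.5) * ly
    X, Y = np.meshgrid(x, y, indexing="ij")
    xp = X * np.cos(omega) + Y * np.sin(omega)      # assumed rotation convention
    yp = -X * np.sin(omega) + Y * np.cos(omega)
    v = a * (xc + xp)**2 + b * (yc + yp)**2 + c * (xc + xp) + d * (yc + yp)
    return v.mean()

a, b, c, d = 1e-4, 2e-4, 1e-3, -2e-3
xc, yc, lx, ly, omega = 40.0, -25.0, 10.0, 6.0, 0.3
vg_closed = a * xc**2 + b * yc**2 + c * xc + d * yc + s0_closed_form(a, b, lx, ly, omega)
vg_numeric = die_average_numeric(a, b, c, d, xc, yc, lx, ly, omega)
print(vg_closed, vg_numeric)
```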

2) Computation of  $v_s$ :

$$\begin{aligned} v_{s} &= v_{aw}(x, y) + v_{d-d} - v_{g} \\ &= a(x_{c} + x')^{2} + b(y_{c} + y')^{2} + c(x_{c} + x') + d(y_{c} + y') + v_{d-d} \\ &\quad - (ax_{c}^{2} + by_{c}^{2} + cx_{c} + dy_{c} + v_{d-d} + s_{0}) \\ &= 2ax'x_{c} + 2by'y_{c} + ax'^{2} + by'^{2} + cx' + dy' - s_{0} \\ &= 2ax'x_{c} + 2by'y_{c} + bx''^{2} + ay''^{2} - s_{0} - c^{2}/4a - d^{2}/4b \\ &= r_{d\mu}^{2} + 2ax'x_{c} + 2by'y_{c} - s_{1}. \end{aligned}$$

## REFERENCES

- H. Chang and S. S. Sapatnekar, "Statistical timing analysis considering spatial correlations using a single PERT-like traversal," in *Proc. Int. Conf. Comput.-Aided Design*, Nov. 2003, pp. 621–625.
- [2] J. Xiong, V. Zolotov, and L. He, "Robust extraction of spatial correlation," in *Proc. Int. Symp. Phys. Design*, Apr. 2006, pp. 2–9.
- [3] D. Boning, J. Chung, D. Ouma, and R. Divecha, "Spatial variation in semiconductor processes: Modeling for control," in *Proc. Process Control Diagnostics Modeling Semiconductor Manuf. II Electrochem. Soc. Meeting*, May 1997, pp. 21–30.
- [4] S. Borkar, T. Karnik, S. Narendra, J. Tschanz, A. Keshavarzi, and V. De, "Parameter variations and impact on circuits and microarchitecture," in *Proc. Design Autom. Conf.*, Jun. 2003, pp. 338–342.
- [5] T. Jing, Y. Hu, Z. Feng, X.-L. Hong, X. Hu, and G. Yan, "A full-scale solution to the rectilinear obstacle-avoiding Steiner problem," *Integr. VLSI J.*, vol. 41, no. 3, pp. 413–425, May 2008.
- [6] F. Gong, W. Yu, Z. Wang, Z. Yu, and C. Yan, "Efficient techniques for 3-D impedance extraction using mixed boundary element method," in *Proc. Asia South Pacific Design Autom. Conf.*, 2008, pp. 158–163.
- [7] Z. Cao, T. Jing, Y. Hu, Y. Shi, X. Hong, X. Hu, and G. Yan, "DraXRouter: Global routing in x-architecture with dynamic resource assignment," in *Proc. Asia South Pacific Design Autom. Conf.*, 2006, pp. 618–623.

- [8] Y. Hu, Z. Feng, T. Jing, X. Hong, Y. Yang, G. Yu, X. Hu, and G. Yan, "Forst: A 3-step heuristic for obstacle-avoiding rectilinear Steiner minimal tree construction," *J. Inf. Comput. Sci.*, vol. 1, no. 3, pp. 107–116, 2004.
- [9] N. Guan, Z. Gu, Q. Deng, W. Xu, and G. Yu, "Schedulability analysis of preemptive and nonpreemptive EDF on partial runtime-reconfigurable FPGAs," ACM Trans. Des. Autom. Electron. Syst., vol. 13, no. 4, p. 56, 2008.
- [10] F. Ren and D. Markovic, "True energy-performance analysis of the MTJ-based logic-in-memory architecture (1-bit full adder)," *IEEE Trans. Electron Devices*, vol. 57, no. 5, pp. 1023–1028, May 2010.
- [11] Y. Shi, B. Xie, and Y. Mao, "Circuit simulation method in mathematical modeling," *Chin. J. Eng. Math.*, vol. 21, pp. 43–48, Jul. 2004.
- [12] Z. Feng, Y. Hu, T. Jing, X. Hong, X. Hu, and G. Yan, "An o(nlogn) algorithm for obstacle-avoiding routing tree construction in the λ-geometry plane," in *Proc. Int. Symp. Phys. Design*, 2006, pp. 48–55.
- [13] Y. Hu, Z. Feng, T. Jing, X. Hong, Y. Yang, G. Yu, X. Hu, and G. Yan, "A 3-step heuristic for obstacle-avoiding rectilinear Steiner minimal tree construction," in *Proc. Int. Symp. Comput. Inf.*, 2004, pp. 107–116.
- [14] J. Xiong, Y. Shi, V. Zolotov, and C. Visweswariah, "Statistical multilayer process space coverage for at-speed test," in *Proc. Design Autom. Conf.*, 2009, pp. 340–345.
- [15] S. Zeng, W. Yu, F. Gong, X. Hong, J. Shi, Z. Wang, and C.-K. Cheng, "Efficient frequency-dependent reluctance extraction for largescale power/ground grid," in *Proc. Int. Solid-State Integr.-Circuit Technol.*, 2008, pp. 2292–2295.
- [16] Y. Hu, T. Jing, X. Hong, Z. Feng, X. Hu, and G. Yan, "An efficient rectilinear Steiner minimum tree algorithm based on ant colony optimization," in *Proc. Int. Conf. Commun. Circuits Syst.*, 2004, pp. 1276– 1280.
- [17] K. Chopra, N. Shenoy, and D. Blaauw, "Variogram based robust extraction of process variation," in *Proc. ACM/IEEE Int. Workshop Timing Issues*, Feb. 2007, pp. 86–91.
- [18] T. Jing, Z. Feng, Y. Hu, X. Hong, X. Hu, and G. Yan, "λ-oat: λ-geometry obstacle-avoiding tree construction with O(n log n) complexity," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 26, no. 11, pp. 2073–2079, Nov. 2007.
- [19] K. Xu, W. Xu, J. Shen, and X. Xu, "Task scheduling algorithm based on dual-Vdd dynamic reconfigurable FPGA," *J. Zhejiang Univ.*, vol. 45, pp. 52–58, Feb. 2010.
- [20] F. Gong, W. Yu, C. Yan, and Z. Wang, "An algorithm based on mixed boundary element integral formulations for extracting frequencydependent impedances of 3-D VLSI Interconnects," *J. Comput.-Aided Des. Comput. Graph.*, vol. 19, pp. 60–69, Oct. 2007.
- [21] C. Liu, J. Su, and Y. Shi, "Temperature aware routing synthesis considering spatiotemporal hotspot," in *Proc. IEEE Int. Conf. Comput. Des.*, Oct. 2008, pp. 107–113.
- [22] Y. Hu, T. Jing, Z. Feng, X. Hong, X. Hu, and G. Yan, "Aco-Steiner: Ant colony optimization based rectilinear Steiner minimal tree algorithm," *J. Comput. Sci. Technol.*, vol. 21, no. 1, pp. 147–152, 2006.
- [23] W. Xu, K. Xu, and X. Xu, "A novel placement algorithm for symmetrical FPGAs," in *Proc. Int. Conf. ASIC*, 2007, pp. 1281–1284.
- [24] J. Xiong, Y. Shi, V. Zolotov, and C. Visweswariah, "Pre-ATPG path selection for near optimal post-ATPG process space coverage," in *Proc. Int. Conf. Comput.-Aided Des.*, Nov. 2009, pp. 89–96.
- [25] Y. Hu, T. Jing, X. Hong, Z. Feng, X. Hu, and G. Yan, "An-oarsman: Obstacle-avoiding routing tree construction with good length performance," in *Proc. Asia South Pacific Design Autom. Conf.*, Jan. 2005, pp. 7–12.
- [26] M. Rofouei, W. Xu, and M. Sarrafzadeh, "Computing with uncertainty in a smart textile surface for object recognition," in *Proc. Int. Conf. Multisensor Fusion Integr. Intell. Syst.*, 2010, pp. 256–261.
- [27] W. Zhao, Y. Cao, F. Liu, K. Agarwal, D. Acharyya, S. Nassif, and K. Nowka, "Rigorous extraction of process variations for 65 nm CMOS design," in *Proc. Solid State Device Research Conf.*, Sep. 2007, pp. 89–92.
- [28] P. Friedberg, W. Cheung, and C. J. Spanos, "Spatial modeling of micronscale gate length variation," *Proc. SPIE*, vol. 6155, pp. 125–129, Mar. 2006.
- [29] N. Drego, A. Chandrakasan, and D. Boning, "Lack of spatial correlation in MOSFET threshold voltage variation and implications for voltage scaling," *IEEE Trans. Semiconductor Manuf.*, vol. 22, no. 2, pp. 245–255, May 2009.
- [30] Q. Zhang, C. Tang, J. Cain, A. Hui, T. Hsieh, N. Maccrae, B. Singh, K. Poolla, and C. J. Spanos, "Across-wafer CD uniformity control through lithography and etch process: Experimental verification," *Proc. SPIE*, vol. 6518, pp. 167–170, Apr. 2007.

- [31] K. Qian and C. J. Spanos, "A comprehensive model of process variability for statistical timing optimization," *Proc. SPIE*, vol. 6925, pp. 178–182, 2008.
- [32] J. R. Sheats, *Microlithography: Science and Technology*. Boca Raton, FL: CRC Press, 1998.
- [33] J. Sali, S. Patil, S. Jadkar, and M. Takwale, "Hot-wire CVD growth simulation for thickness uniformity," in *Proc. Int. Conf. Cat-CVD Process*, 2001, pp. 92–98.
- [34] S. Sakai, M. Ogino, R. Shimizu, and Y. Shimogaki, "Deposition uniformity control in a commercial scale HTO-CVD reactor," in *Proc. Mater. Res. Soc. Symp.*, 2007, pp. 121–125.
- [35] J. Brcka and R. L. Robison, "Wafer redeposition impact on etch rate uniformity in IPVD system," *IEEE Trans. Plasma Sci.*, vol. 35, no. 1, pp. 74–82, Feb. 2007.
- [36] T. W. Kim and E. S. Aydil, "Investigation of etch rate uniformity of 60 MHz plasma etching equipment," *Japan. J. Appl. Phys.*, vol. 40, pp. 6613–6618, Nov. 2001.
- [37] T. W. Kim and E. S. Aydil, "Effects of chamber wall conditions on Cl concentration and Si etch rate uniformity in plasma etching reactors," *J. Electrochem. Soc.*, vol. 150, no. 7, pp. G418–G427, Jun. 2003.
- [38] Q. Zhang, K. Poolla, and C. Spanos, "One step forward from run-to-run critical dimension control: Across-wafer level critical dimension control through lithography and etch process," *J. Process Control*, vol. 18, no. 10, pp. 937–945, 2008.
- [39] B. Stine, D. Boning, and J. Chung, "Analysis and decomposition of spatial variation in integrated circuit processes and devices," *IEEE Trans. Semiconductor Manuf.*, vol. 10, no. 1, pp. 24–41, Feb. 1997.
- [40] K. Qian and C. J. Spanos, "Hierarchical modeling of spatial variability of a 45 nm test chip," *Proc. SPIE*, vol. 7275, pp. 234–238, Mar. 2009.
- [41] L. Cheng, P. Gupta, C. Spanos, K. Qian, and L. He, "Physically justifiable die-level modeling of spatial variation in view of systematic across wafer variability," in *Proc. Design Autom. Conf.*, Jul. 2009, pp. 104–109.
- [42] L. Cheng, J. Xiong, and L. He, "Non-Gaussian statistical timing analysis using second-order polynomial fitting," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 28, no. 1, pp. 130–140, Jan. 2009.
- [43] Predictive Technology Model [Online]. Available: http://www.eas.asu.edu/ptm



**Lerong Cheng** received the B.S. degree in electronics and communication engineering from Zhongshan University, Guangzhou, China, in 2001, the M.S. degree in electrical and computer engineering from Portland State University, Portland, OR, in 2003, and the Ph.D. degree in electrical engineering from the University of California, Los Angeles, in 2009.

He is currently a Computer-Aided Design Engineer with SanDisk Corporation, Milpitas, CA. His current research interests include computer-aided design of very large scale integration circuits and systems, programmable fabrics, low-power and high-performance designs, and statistical timing analysis.



**Puneet Gupta** (S'03–M'07) received the B.Tech. degree in electrical engineering from the Indian Institute of Technology Delhi, New Delhi, India, in 2000, and the Ph.D. degree from the University of California, San Diego, in 2007.

He is currently a Faculty Member with the Department of Electrical Engineering, University of California, Los Angeles. He co-founded Blaze DFM, Inc. (acquired by Tela, Inc.), Sunnyvale, CA, in 2004, and served as its Product Architect until 2007. He has authored over 60 papers, ten U.S. patents, and a book chapter. His research has focused on building high-value bridges across application-architecture-implementation-fabrication interfaces for lowered cost and power, increased yield, and improved predictability of integrated circuits and systems. Dr. Gupta is a recipient of the NSF CAREER Award, the ACM/SIGDA Outstanding New Faculty Award, the European Design Automation Association Outstanding Dissertation Award, and the IBM Ph.D. fellowship. He has given tutorial talks at DAC, ICCAD, the International VLSI Design Conference, and the SPIE Advanced Lithography Symposium. He has served on the technical program committees of DAC, ICCAD, ASPDAC, ISQED, ICCD, SLIP, and VLSI Design. He served as the Program Chair of the IEEE DFM&Y Workshop in 2009 and 2010.



**Costas J. Spanos** (F'00) received the Electrical Engineering Diploma from the National Technical University of Athens, Athens, Greece, in 1980, and the M.S. and Ph.D. degrees in electrical and computer engineering from Carnegie Mellon University, Pittsburgh, PA, in 1981 and 1985, respectively.

From 1985 to 1988, he was with the Advanced Computer-Aided Design Group, Digital Equipment Corporation, Maynard, MA, where he worked on the statistical characterization, simulation, and diagnosis of very large scale integration processes. In 1988, he joined the faculty at the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley (UC Berkeley), where he is currently a Professor. He was the Director of the Berkeley Microfabrication Laboratory, Berkeley, from 1994 to 2000, the Director of the Electronics Research Laboratory, UC Berkeley, from 2004 to 2005, and the Associate Dean for Research with the College of Engineering from 2004 to 2008. He is currently the Department Chair with the Department of Electrical Engineering and Computer Sciences (EECS), UC Berkeley. He has published more than 200 refereed articles and has co-authored a textbook in semiconductor manufacturing. He is working toward the deployment of information technology and statistical data mining techniques for energy efficiency applications. His current research interests include the application of statistical analysis in the design and fabrication of integrated circuits, and the development and deployment of novel sensors and computer-aided techniques in semiconductor manufacturing.

Dr. Spanos has served on the technical committees of numerous conferences, and was the Editor of the IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING from 1991 to 1994. He has received several Best Paper Awards. In 2000, he was elected a Fellow of the Institute of Electrical and Electronics Engineers for contributions and leadership in semiconductor manufacturing. In 2009, he was appointed the Andrew S. Grove Distinguished Professor with the Department of EECS.



**Kun Qian** (S'06) received the B.S. degree from the Department of Microelectronics, Peking University, Beijing, China, in 2005. In 2005, he began graduate study with the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, where he is currently working toward the Ph.D. degree.

His current research interests include characterization, analysis and modeling of semiconductor device and circuit variability, statistical compact model extraction, and yield predictions.



**Lei He** (M'99–SM'08) received the Ph.D. degree in computer science from the University of California, Los Angeles (UCLA), in 1999.

He is currently a Professor with the Department of Electrical Engineering, UCLA. He was a Faculty Member with the University of Wisconsin, Madison, from 1999 to 2002. He held visiting or consulting positions with Cadence, Santa Clara, CA, Empyrean Soft, Santa Clara, Hewlett-Packard, Santa Clara, Intel, Santa Clara, and Synopsys. He was a Technical Advisory Board Member for Apache Design Solutions and Rio Design Automation. He has published one book and over 200 technical papers. His current research interests include modeling and simulation, very large scale integration circuits and systems, and cyber physical systems.

Dr. He has earned twelve Best Paper nominations, mainly from the Design Automation Conference and the International Conference on Computer-Aided Design, and five Best Paper or Best Contribution Awards, including the ACM Transactions on Design Automation of Electronic Systems 2010 Best Paper Award.