Handbook of algorithms for physical design automation part 39 pps

[16], the use of interleaved median improvement during quadratic force-directed placement was shown to improve wirelength by 10–15 percent when measured in terms of HPWL.. Rather than us

Trang 1

2

3

4, 5 6

1

(b)

FIGURE 18.7 Illustration of median calculation for a cell C connected to three nets In (a), the original

placement of cells is shown In (b), the median or optimal range of(x, y) values for cell C is shown Six x- and y-positions are used for the median computation because three nets are involved Note that two-pin nets degenerate to a single point A larger set of position for cell C can be computed and is best done by expanding

the median rectangle outward according to the points used in the median computation and corresponds to the concept of-neighborhoods described by Goto in Ref [23].

Median improvement was implemented within the force-directed placer,FDP Specifically, mul-tiple passes of median improvement are performed as cell overlap is reduced Because median improvement attempts to reposition each cell within its median rectangle, the use of median improve-ment can reintroduce cell overlap into the placeimprove-ment To alleviate the cell overlap,FDPattempts to carefully monitor the distribution of cell area when placing a cell inside its median rectangle, it

is positioned such that a minimum of overlap is reintroduced Further, if at any point during the algorithm too much cell overlap is reintroduced, the algorithm is terminated

In Ref [16], it was empirically observed that the use of median improvement is most effective near the beginning of placement when a large amount of cell overlap is prevalent and less effective toward the end of placement To this end, the median rectangle used inFDPis enlarged when it is discovered

Trang 2

Median rectangle

5, 6 4

3 2 1

4, 5 3

2 1 6

(a)

Extended rectangle for HPWL and overlap minimization

Median rectangle

5, 6 4

3 2 1

1 2

3

4, 5 6

(b)

FIGURE 18.8 Rectangles used inFDPbased on Goto’s idea to improve wirelength In (a) the median or optimal range for replacing a cell In (b) the extended range for replacing the cell that improves wirelength while providing more placement flexibility in order to avoid the reintroduction of overlap

that the algorithm is reintroducing too much cell overlap—this extension of the search rectangle for repositioning a cell is similar to the notion of an-neighborhood described by Goto The rectangles

considered by the median improvement algorithm used inFDPare illustrated in Figure 18.8 Finally, the median improvement heuristic is used in yet another way inFDP Specifically, at each iteration, median improvement is applied to determine a new position for each cell However, the cell positions are not updated Rather, the cell positions obtained by calling median improvement are used to compute an additional force on each cell It was found, inFDP, that this additional force can be used

to deflect the constant forces and lead to an improved overall quality of placement In Ref [16], the use of interleaved median improvement during quadratic force-directed placement was shown to improve wirelength by 10–15 percent when measured in terms of HPWL

Another interleaved optimization is the iterative local refinement used inFastPlace In this approach, a regular bin structure is imposed over the placement area to estimate the current utilization

of a placement region The netlist is traversed and the source bin for each cell is determined Cells are then moved from source to target bins based on both the amount of wirelength improvement and the target bin’s utilization For every cell present in a bin, four scores are computed corresponding to

Trang 3

Score

Score N

E W

S

3 2

1 1

2 3

FIGURE 18.9 Illustration of iterative local refinement used inFastPlace A regular bin structure is imposed over the placement area Cells in source bins are tentatively moved from source bins in the north, east, south, and west directions into adjacent target bins Scores for each of the four moves are calculated based on reduction in wirelength and change in bin utilization The best movement is chosen unless all scores are negative, in which

case the cell is left in its source bin (From Viswanathan, N and Chu, C.C.-N., IEEE Trans CAD 24, 5, 722,

2005 Copyright IEEE 2005 With permission.)

the four possible movements of a cell For computing scores, it is assumed that a cell moves from its current position in the source bin to the same position in each target bin that is adjacent to the source bin That is, cells are tentatively moved by one bin width (or bin height) Each score is a weighted summation of two components, namely the resulting wirelength reduction and resulting utilization

of the source and target bins Because this refinement scheme is used primarily to reduce wirelength, the first term of the scoring function is more heavily weighted If, for any cell, the four computed scores are all negative, the cell is kept in its source bin This refinement strategy is illustrated in Figure 18.9 Several iterations of iterative local refinement are performed until there is no significant improvement in wirelength By not using iterative local refinement in Ref [24], a reduction of 32.2 percent in total runtime was observed, but final wirelengths were 15.1 percent worse Further, the wirelength increase was more prominent as circuit size increased Thus, the iterative local refinement

is significant in improving the final quality of result

Netlist clustering is an attractive means of improving the runtime and quality of placements pro-duced by force-directed methods Clustering coarsens a netlist by merging cells together to form larger groups of cells, or clusters, with the hyperedges adjusted to reflect the possible absorption

of circuit connections into clusters Placement is performed on the coarsened netlist and, as the algorithm progresses, netlists are repeatedly uncoarsened and placed Unclustering and reclustering

is sometimes performed at intermediate steps during placement to allow the placer to escape from earlier bad clustering decisions [16,25] After placement, detailed improvement is usually performed

on the flat netlist to improve final results

Clustering methods have been used successfully in a number of force-based methods [20,24, 26–29], including many of those methods described in this chapter Clustering has been shown to significantly improve the runtime and quality of placement results Much of this improvement stems from the fact that clustering helps to keep tightly connected cells together and prevents placement algorithms from being trapped in local minima For additional information on clustering, we refer the reader to the cited works and Chapter 7 For additional information on the use of clustering to improve placement, we refer the reader to Chapter 19

Trang 4

18.5 NONQUADRATIC, CONTINUOUS METHODS

As previously mentioned, force-directed placers tend to rely on quadratic optimization and inter-leaved improvement heuristics that directly minimize an approximation of wirelength Although the quadratic wirelength can be linearized using, for example, reweighting [30] and function regular-ization [31], these schemes require more computational effort when compared to simple quadratic optimization with interleaved improvement Further, the linearization of quadratic wirelength still requires the conversion of the circuit hypergraph to its weighted graph representation, which serves

to further abstract the wirelength model

Rather than using a quadratic wirelength objective and a hypergraph-to-graph transformation, several placers includingAPlace [32],mPL [28], and LSD[24] work directly with the circuit hypergraph and attempt to simultaneously minimize a linear wirelength estimate and distribute cells throughout the placement area These methods implement an objective function that consists, in part,

of minimizing a metric of quality such as wirelength and a measure of infeasibility such as overlap This can be encapsulated in the generic form given by

whereβ is an adjustable parameter that represents a trade-off between the quality of result and the

amount of cell overlap during any point of the placement method Typically, the trade-off con-stant, β, is set close to 1 early in placement (where the focus is on placement quality), and is

reduced throughout the placement method to encourage the distribution of cells throughout the placement area

Placement methods that work directly with the circuit hypergraph, upon first glance, do not appear to be force-directed methods At a minimum, however, they are similar in that these methods reduce cell overlap without partitioning We shall see that there are additional similarities

LSD[24] performs placement with an objective function similar to Equation 18.26 and works with the circuit netlist directly to minimize HPWL However, it still relies on force-directed methods that use constant additional forces such asKraftwerkandFDP

Specifically, each placement iteration ofLSDworks as follows: Given a placement, additional constant forces are computed to both stabilize and perturb the given placement that is identical to KraftwerkandFDP The particular weights for the perturbing forces are less significant in LSD and the forces are normalized to unity Subsequently, a QP identical to Equation 18.5 is solved and yields a new placement of cells However, unlikeKraftwerkandFDP—which consider this to

be the new placement—LSDconsiders this placement as only a suggested placement of cells Cells are not actually moved to these new positions Rather, the new cell positions are subtracted from their original positions to yield a suggested search direction Movement in this suggested direction reduces cell overlap while accounting for quadratic wirelength

Within a placement iteration,LSDemploys a median improvement heuristic The initial place-ment provided to the heuristic is the original placeplace-ment and not that obtained from the QP Similar

to the QP, the median improvement heuristic returns a new placement of cells However, this new placement is aimed at reducing the HPWL of the circuit netlist directly, and does not take into account quadratic wirelength or cell overlap This new placement, however, is not used as a placement— similar to the placement from the QP, the new cell positions from median improvement are subtracted from their original positions to yield another suggested search direction This search direction aims

to minimize HPWL

Finally, in each placement iteration,LSDperforms a two-dimensional line search within the cone produced by the two search directions computed from the QP and application of median improvement This search cone is illustrated in Figure 18.10 For each position sampled by the

Trang 5

Suggested position from

a median improvement iteration Cell i

Suggested position from a Kraftwerk iteration

Search region to trade off overlap and wirelength reductions for cell i

FIGURE 18.10 Illustration of the search cone used inLSD For each cell, a new placement is computed along the lines ofKraftwerk, which yields a search direction that tends to reduce cell overlap Similarly, for each cell, a call to median improvement yields a new placement that tends to reduce HPWL directly.LSDuses a line search to explore the region between these two search directions to find the placement of cells that best serves

to trade off HPWL and cell overlap according to Equation 18.26

two-dimensional search, the placement quality and placement overlap are assessed according to the normalized function given by

score= β × New_HPWL

Orig_HPWL+ (1 − β) × New_Overlap

where β controls the preference between HPWL and cell overlap The values New_HPWL and Old_HPWLrepresent the HPWL of the current placement being considered by the line search and the original placement, respectively Similarly, New_OverlapandOld_Overlaprepresent a measure of the amount of cell overlap for the current placement being considered by the line search and the original placement, respectively Once all placements have been tested by the line search, the placement with the best normalized score is selected and cell positions are updated The value of

β is a control parameter, which is initially set close to unity to encourage wirelength improvement

and is slowly lowered as cells are spread to accelerate convergence Placement terminates once there exists relatively little cell overlap, as in Ref [16]

The line search offers a significant advantage overKraftwerk-like methods [4,16,24], because

it implements an easily tunable objective function that can be geared toward speed (by encouraging faster spreading) or toward quality (by preferring lower HPWL) The line search can be also be extended to account for additional objectives simply by computing new forces and modifying the objective function and line search accordingly Nevertheless, the similarities betweenLSDand more traditional force-directed placers are clear

18.5.2 APLACE AND THELOG-SUM-EXPAPPROXIMATION

In the patent by Naylor et al [33], the HPWL of a hyperedge is approximated using a log-sum-exp formula, given by

HPWLλ (e) = α

⎡

⎣ln

⎛

vj∈ei

e xj α

⎞

⎠ + ln

⎛

vj ∈ei

e −xj α

⎞

⎠ + ln

⎛

vj ∈ei

e yj α

⎞

⎠ + ln

⎛

vj ∈ei

e −yj α

⎞

⎠

⎤

⎦ (18.28)

Trang 6

whereα is defined as a smoothing parameter The smaller the value of α, the more accurate the

approximation to Equation 18.1 However,α cannot be chosen to be too small because of machine

precision and numerical stability In effect, the use of the log-sum-exp formula picks the dominant cell positions to approximate the exact HPWL for each edge as specified in Equation 18.1 Despite its use of transcendental functions, the approximation in Equation 18.28 is both differentiable and strictly convex, which makes it fairly simple to minimize

To spread cells, it is desirable to augment the log-sum-exp form with a penalty function that penalizes the uneven distribution of cells To this end, based on the patent in Ref [33],APlace [32,34,35] imposes a grid on the placement area and attempts to equalize the total cell area in every grid bin The straightforward penalty for an uneven cell distribution is given by

b

where A (b) is the cell area in bin b This penalty is neither smooth nor differentiable and is difficult

to optimize.APlaceapproximates the total cell area in each grid bin by area potentials for each cell The area potential uses a bell-shaped function, as shown in Figure 18.11, to model the effect of

a cell’s area on nearby grid bins It is described by the equation given by

Potential(c, b) = α(c) · f (|c x − b x |) · f (|c y − b y |) (18.30)

for grid bin b with center (x b , y b ), cell c with center (x c , y c ), and f (·) representing the bell-shaped

function Here,α(c) is a proportionality factor used to ensure that the sum of the potentials for a cell

equals the cell’s area That is,

b

Potential(c, b) = Area(c) ∀ c ∈ V

In Equation 18.30 and illustrated in Figure 18.11, the bell-shaped function is given by

p (d) =

1−2d2

r2 , 0≤ d ≤ r

2

2(d−r)2

p(d)

1 − 2d2 /r2

2(r− ) 2 /r2

r

FIGURE 18.11 Bell-shaped penalty function that is used to remove overlap between cells r controls the

range of interaction (the radius) of any given cell’s potential In standard cell placement, the value of r can be set constant, but in mixed-size placement, it is typically adjusted on a per-cell basis (with larger values of r employed for larger cells) (From Kaling, A.B and Wang, Q., IEEE Trans CAD 24, 5, 734, 2005 Copyright

2005 With permission.)

Trang 7

where r represents the radius of the cells’ potentials The use of piecewise quadratic functions makes

the potential function simple to differentiate Given the notion of cell potential, the expected potential

of a grid bin b is one in which the total cell area is evenly distributed over all grid bins That is,

Expected potential(b) = Total cell area

Thus, to minimize cell overlap, it suffices to minimize the difference of the potential of cells within each bin and the corresponding expected potential of each bin using a penalty function given by

Penalty=

b

cell c

Potential(c, b) − Expected potential(b)

2

(18.33)

which is smooth and differentiable because of the selection of the bell-shaped potential function

InAPlace, the penalty term in Equation 18.33 is combined with the log-sum-exp approximation

to wirelength in Equation 18.28 to arrive at a linearly weighted objective function that represents a trade-off in linear wirelength minimization and the quadratic overlap penalty This objective function

is given by

In Equation 18.34, the constantζ controls the weight associated with wirelength minimization, while

ω is used to weight the overlap removal Too large a value of ω can cause cells to spread hastily and

lead to poor wirelength; too large a value ofζ can contract cells together and prevent them from

spreading out To counteract these effects,APlace keeps the value ofω fixed, and sets ζ to be

large in the beginning; as the solver slows down (or as a solution appears),ζ is divided by two The

equation is solved repeatedly (and the balance of the weight tipped toward the penalty objective) until cells are spread evenly across the placement area Because of its smooth and differentiable nature, this objective function can be solved efficiently using the Polak–Ribierre method [10] To address runtime performance, the placement grid is initially made very coarse, which leads to cells spreading more quickly early on A progressively finer grid is used as cells spread to ensure a more even distribution of area

The bell-shaped function previously described is most applicable to standard cells, which are roughly the same size A modification to the bell-shaped penalty term is described in Ref [34]

to allow for the placement of larger macrocells like those found in mixed-size circuits In this modification, the scope of the area potential is extended according to the block size so that a larger

block has a nonzero potential with respect to nearby grid bins Given a module v with width w v,

located in bin b, the scope of the module’s x-potential is given by wv

2 + 2w b That is to say, every grid bin within a horizontal distance ofwv

2 + 2w b from the module’s center has a nonzero x-potential contribution from the module Consequently, the bell-shaped potential of a cell v and grid bin b become (x-direction only)

p vx (d) =

⎧

⎨

⎩

1− α · d2, 0≤ d ≤ w v

2 + w b β(d − w v

2 − 2w b ), w v

2 + w b ≤ d ≤ w v

2 + 2w b

(18.35)

whereα = 4(wv+4wb)

wv +2wb andβ = 2(wv+4wb)

wb The function is formulated in this fashion so that it is smooth

when d x= wv

2 + w b (A similar formula is employed in the y-direction.)

One of the benefits of the formulation employed inAPlaceis its extensibility In Ref [32], geometric constraints are considered as additional penalty terms For example, to handle alignment

constraints (in the x-dimension), a penalty function such as

i ∈|VH | (x i − ¯x)2can be added

Trang 8

Numerous additional details are presented in Ref [29] to improve the overall quality and per-formance ofAPlace One improvement to quality stems from the use of multilevel clustering An adaptive grid size is also described in which the coarseness of the grid is modified based on average cluster size, as this was found to lead to better wirelengths with reduced runtime Several other implementation-specific details are mentioned therein, and we refer interested readers to Ref [29] for more information

APlacedoes not share a direct analogy with the concept of a force—its relationship to other force-directed methods is limited to the removal of cell overlap without the need to partition the placement area and, perhaps, its use of the conjugate gradient method for minimization It is rea-sonable to interpret the gradient of the objective function used inAPlaceas a force that specifies

a direction for cell movements

18.5.3 MPLANDITSGENERALIZATION OFFORCE-DIRECTEDPLACEMENT

Like APlace, mPL [28] works directly with the circuit hypergraph and minimizes wirelength through the use of the log-sum-exp form in Equation 18.28 The log-sum-exp form was chosen from among two other objective function candidates: the first was the quadratic approximation to

wirelength given in Equation 18.2 and the second was the L p-norm approximation [15] given by

HPWLLp=

e ∈EH

⎡

⎢

⎛

vk∈e

x k p

⎞

⎠

1

−

⎛

vk ∈e

x k −p

⎞

⎠

− 1p

+

⎛

vk ∈e

y k p

⎞

⎠

1

−

⎛

vk ∈e

y k −p

⎞

⎠

− 1p⎤

In the L p -norm, the first and second terms tend to max x k and min x k as p tends to infinity [28], which

results in a tight approximation of the HPWL similar to the log-sum-exp form However, experimental evidence presented in Ref [28] suggests that the log-sum-exp form offers HPWL results, which are 3

and 61 percent better than the L p (p = 32) and quadratic approximations, respectively Moreover, the log-sum-exp approximation was solvable 67 percent faster than the L p-norm but 23 percent slower than the quadratic model, and was therefore deemed to offer the best balance of runtime and quality UnlikeAPlace, which uses the bell-shaped function to spread cells evenly in localized regions, mPLspreads cells globally via the Helmoltz equation (which is similar to the Poisson equation used

inKraftwerkand other placers).mPLimposes a grid structure over top of the placement region

For every bin b i,j in the grid, its density d i,j is computed at each placement iteration New positions for cells are determined by solving the optimization problem given by

min

⎧

⎨

⎩W =

e ∈EH

HPWLλ (e)|d i,j = K ∀ bins b i,j

⎫

⎬

where K is a target density (K can be specified differently for each b i,jin situations when a nonuniform distribution of cells is desired [28].) This optimization problem is difficult to solve because of the nondifferentiability of the constraints Thus,mPL uses the Helmholtz equation to arrive at a

continuous density representation of D= (d i,j ).

The solution to the Helmholtz equation (with boundary conditions) can be used to model diffusion processes, and thus makes an ideal candidate for modeling the spreading of cells in a two-dimensional grid (The Poisson equation can be considered a special case of the Helmholtz equation.) Applied to the density distribution, the Helmholtz equation can be rewritten as

∂2φ(x, y)

∂x2 +∂2φ(x, y)

∂y2 − φ(x, y) = d(x, y), (x, y) ∈ R

∂φ

(18.38)

Trang 9

 > 0

v is an outer unit normal

R represents the placement region, and d (x, y) represents the continuous density function

The boundary conditions, encapsulated in the term∂φ ∂v = 0, specify that forces pointing outside

of the placement region be set to zero (i.e., Neumann boundary conditions)—this is a key difference with the Poisson method used inKraftwerk, which assumes that forces become zero at infinity

Because the solution of Equation 18.38 gains two more derivatives than d (x, y), φ is a smoothed version of the density function [28] To solve this problem using the densities d i,j, the problem is first discretized using the finite difference method [36] (while employing the Neumann boundary conditions) Ifφ i,jrepresents the value ofφ at the center of bin b i,j , and h x , h yrepresent the width and height of a bin, the discrete approximation to Equation 18.38 can be expressed as

φ i +1,j − 2φ i,j + φ i −1,j

h2 +φ i,j+1− 2φ i,j + φ i,j−1

for all 1≤ i ≤ m and 1 ≤ j ≤ n, subject to

φ 0,j = φ 1,j, ∀1 ≤ j ≤ n

φ m +1,j = φ m,j, ∀1 ≤ j ≤ n

φ i,0 = φ i,1, ∀1 ≤ i ≤ m

φ i,n+1= φ i,n, ∀1 ≤ i ≤ m

(18.40)

Once this linear system of equations is solved forφ, the optimization problem Equation 18.37 can

be reexpressed in terms ofφ yielding the problem given by

min

⎧

⎨

⎩W =

e ∈EH

HPWLλ (e) | φ i,j = ˆK ∀ bins b i,j

⎫

⎬

where ˆK is a scaled constant representation of K This optimization problem is solved using Uzawa’s

algorithm [37] One iteration of Uzawa’s algorithm is given by

W k+1+

i,j

λ k i,j φ i,j = 0

λ k+1

i,j = λ k i,j + α(φ i,j − ˆK)

(18.42)

where

λ k is the Lagrange multiplier at the kth iteration

α is a parameter to control convergence

Recall that the wirelength portion of the objective W is a function of the cell positions in each iteration Thus, the term W k+1is a function of the positions of cells in x and y in iteration k+ 1 The gradient ofφ i,jcan be approximated using the difference scheme

xk φ i,j =φ i,j+1− φ i,j

yk φ i,j =φ i +1,j − φ i,j

Trang 10

In Ref [23], one iteration of aKraftwerkplacement iteration is shown to be related to an iteration of Uzawa’s algorithm Given the force equation used byKraftwerkin Equation 18.5, a change of variable names shows that the incremental change in cell positions for a quadratic system can be reexpressed as

xk+1

yk+1

+

px py

+ τ k

fk

In this equation, the quadratic wirelength approximation is used in place of the log-sum-exp

model for W in Equations 18.37 and 18.41 The values of C, p x, and py are derived fromW.

The scalar τ k controls the weighting in each iteration, while the forces fx and fy are computed

based on the placement in the kth iteration In Kraftwerk, this equation is iteratively solved until cells are well spread across the placement area This equation is a special case of the Uzawa iteration in Equation 18.42, achieved by fixingλ k

ij = τ k Theλ k values are known to be the Lagrange multipliers of Equation 18.41, and must be large enough to spread cells but small enough to achieve convergence Whereas force weighting is an issue inKraftwerk,mPLoffers the possibility of dynamically adjusting the weights for all forces individually at each iteration

by updating the Lagrangian multipliers Consequently, mPL does not require the ad hoc force scaling typically employed in methods based on Ref [4] and represents a generalization of Kraftwerk

18.6 OTHER ISSUES

Several important issues have not been addressed in the previous description of force-directed place-ment methods and we touch upon some of these issues here, including I/O placeplace-ment, fixed obstacles, and heterogeneous resource placement

Force-directed methods, includingKraftwerk,FDP,mFAR, andFastPlace, require I/Os

to be preplaced before force-directed placement because of the need for(x) to be positive definite.

However, in some circumstances, the placement of I/Os can be another degree of freedom, as their placement can impact overall quality Despite some efforts in the literature (c.f., [24,38]), it is possible that placeable I/Os can be handled more effectively within the context of force-directed and continuous placement methods

Another difficulty for force-directed methods potentially lies in the handling of fixed obstacles within the placement area Figure 18.12a shows theadaptec2circuit from theISPD05benchmark suite [39] This circuit contains a large number of preplaced macrocells In particular, there is a very large preplaced macrocell in the middle of the placement area Several heuristic strategies (c.f., [19]) have been presented for positioning fixed points on the boundaries of fixed obstacles to encourage cells to be pulled through fixed obstacles and to prevent other cells from remaining inside of fixed obstacles Figure 18.12b shows the positions of movable cells immediately after the solution to an unperturbed quadratic objective—the cells are highly overlapping, and many are located to the right

of the large macro In practice, it may be difficult for the movable cells in this example to push through or “move around” the obstacles If one considers the various incarnations of force-directed methods, there is nothing explicit, which indicates that the methods would fail to properly handle fixed obstacles In fact, placers includingAPlace,mPL,mFAR,FastPlace, andKraftwerk successfully placed the circuits in theISPD05benchmark suite, all of which contain a number of fixed obstacles However, it is possible that the handling of fixed obstacles can be improved Another issue for force-directed methods pertains to the proper handling of heterogeneous resources that commonly appear in the context structured ASIC placement Such resources cor-respond to specialized macroblocks (like RAM blocks, multiplier blocks, and IP cores) [40], which are generally much fewer in number than the remaining core (standard) cells Such blocks require placement at discrete positions inside of the structured ASIC and their placement imposes discrete

Định dạng
Số trang	10
Dung lượng	191,42 KB