Handbook of algorithms for physical design automation part 44 pdf

The Tetris algorithm packs a single cell at a time onto one side of the placement region, whereas this approach finds a set of cells to fill an entire standard cell row.. If the size of

Trang 1

20.3.2 TETRIS-BASEDLEGALIZATION

Rather than distributing logic elements with flow-like methods, an alternate approach based on packing is possible The Tetris legalization method by Hill [12] is remarkably simple, and trivial to implement We first discuss the approach in the context of standard cell design, and then show how

it extends to handle a mix of standard cells and macroblocks For each cell c i, we have a desired position(xi , y i ); the cells must be legalized into standard cell rows R = {r1, r2, , rk}, and assume that the left-most open position in each row is known

The legalization method first sorts the cells C by their x position, and then inserts them into the

left side of a row in a greedy manner, such that the displacement of each cell is minimized

{C} =All cells to be legalized;

Sort the cells inCby theirX-coordiates to getLs;

lj= left-most position of each rowrj;

fori=1 to the number of cells do

best=lim sup;

forj=1 to the number of rows do

cost=displacement of moving celliinLstolj;

best=cost;

best_row=j

end if

end for

Move celliinLsto the rowbest_row;

lbest _ row=lbest _ row+widthi;

end for

Pseudocode for the approach is shown in Algorithm 4 If one were to rotate the placement region counterclockwise, the legalization process might look very much like a game of Tetris, for which the algorithm is named

Our example code can be made to run more quickly by only considering rows close to the desired position of a cell; in practice, runtimes are linear with the number of cells to be legalized Our example also packs cells to the left, but this is not in general necessary; if space has been reserved for routability, it may be perferable to legalize into a position that is not to the left

Other obvious variants are to legalize from both the left and right sides, and to sort the cells

by their leftmost boundary, rather than their center The method can also be adapted to designs that contain macroblocks; [13] simply added a check for each row spanned by a macroblock, to find the leftmost position that did not result in an overlap

Although the Tetris method works well in practice for global placements that have the logic elements distributed evenly, even small areas with overlap can cause significant trouble When a block must be displaced during legalization, it may cause a ripple effect, resulting in the displacements of many other blocks In some cases, wirelengths can jump by 30 percent or more; the method is fast and effective for easy legalization problems, but performs poorly on more difficult cases

20.3.3 SINGLE-ROWDYNAMICPROGRAMMING-BASEDLEGALIZATION

Contrast to the simple Tetris legalization method is one which operates on a row-by-row basis [15], single-row dynamic programming-based legalization was developed for standard cell placements, and selects a subset of cells to place in a row using dynamic programming

Trang 2

The outline of the technique is given in Algorithm 5 Legalization is done from the bottom row to

the top row (although the reverse is also possible) Cells are first sorted by their initial Y -coordinates

in nondecreasing order Let Lsdenote this sorted list of cells A set of candidate cells C i

candare then

selected for assignment to a particular row, say R i The total width of cells in C i

candis greater than the

capacity of R iand satisfies the following relation:

k1∗ width(R i) ≤ width(c i

cand) = m

j=1

width(cj ) ≤ k2∗ width(R i)

where

k1, k2∈ and are constants such that 1 < k1 ≤ k2

width(Ri) is the width of row Ri

width(C i

cand) is the total width of cells in C i

cand

width(cj ) is the width of cell cj ∈ C i

cand

m = |C i

cand|

Note that cells in C i

candare always the lowermost m cells in Lsthat satisfy the above relation The Tetris algorithm packs a single cell at a time onto one side of the placement region, whereas this approach finds a set of cells to fill an entire standard cell row The method used to find this set of cells is similar to the dynamic programming technique of solving the classic 0–1 knapsack problem [16]

In the knapsack problem, there is a limit on the weight of the knapsack; the analogue of this constraint is the limit on total cell width in the row Similarly, the value of an item in the knapsack problem is modeled by the physical displacement of each cell

After identifying the set of candidate cells C i

cand, they are sorted by their X-coordinates in nondecreasing order These cells are considered for assignment to R i in this sorted order The knapsack-like problem is solved using dynamic programming, with the selected cells being packed

in, and the remaining cells being considered for legalization in the next row This process is repeated until all cells have been legalized Algorithm 5 illustrates the approach

(The overall approach is a variation of the classic method for the 0–1 knapsack problem)

{C} =All cells to be legalized;

Sort the cells inCby theirY-coordinates to getLs;

fori=1 toNr-1do

Select a set of candidate cells(Ci

cand)fromLs; Assign a subset of cellsCi

assignfromCi

candto rowRi;

Ls=Ls\Ci

assign;

end for

Assign the remaining cells inLsto rowRN r;

Every cell c j has two factors associated with it: (1) the cost of assigning c j to row R i, denoted by

w i

j and (2) the penalty of not assigning c j to R i , denoted by p i

j Obviously, if c j is assigned to R ithen,

p i

j = 0 and if it is not assigned to R i , then w i

j = 0

If cell c j is to be assigned to row R i, then the cost of assignment is given by the following equation:

w i

j = (x i

j − x j )2+ (y i

j − y j)2

where

(xj , y j ) is the location in Ri where c jis going to be placed

(xj , y j) is cs initial location

Trang 3

On the other hand, the penalty of not assigning c j to R iis given by

p i

j = (y i+1

j − y j)2

where

y i+1

j is the y-coordinate of the center of row R i+1

yj is cj s initial y-coordinate

20.4 LOCAL IMPROVEMENTS

Following legalization, a large number of well-studied techniques can be used to optimize a circuit

placement The computational complexity of many useful algorithms is exponential; O (2 n ) or O(n!) are common, where n is the number of logic elements being optimized Even on the fastest available computer, the practical values of n are small—far smaller than the number of elements in a placement

problem

For this reason, only a small number of elements are processed in any given step: those under the sliding window If the size of the sliding window can accomodate six cells, one might optimize the first six cells in a single row, and then move on to the third through ninth cell With only six

elements, a O (n!) algorithm is practical Overlaps between optimization windows help minimize the

impact of the local nature of this step Sliding window optimization is shown in Figure 20.9

20.4.1 CELLMIRRORING ANDPINASSIGNMENT

The first class of local improvements we consider is cell mirroring; this technique is easy to under-stand Each cell in a design contains a set of input and output pins, and these are not usually

symmetrically placed By mirroring a cell around its X axis, interconnect lengths can be improved

in many cases An illustration of this is shown in Figure 20.10

An early study of the problem by Cheng [17] showed that finding optimal orientations is in fact NP-complete; as such, we cannot hope to find an optimal solution to the problem In practice, however, relatively simple heuristic methods can be quite effective One approach is to simply evaluate each orientation on a cell-by-cell basis, selecting the better choice in a greedy manner Within a few passes, solution quality converges

Detailed placement techniques normally operate

on a small window into the circuit, making

local improvements Most techniques assume

macroblocks are fixed, and optimize around them.

Sliding window optimization is common; windows typically overlap slightly Optimization windows may be single- or multirow Multiple passes of optimization are also commonplace.

Fixed macroblock

FIGURE 20.9 Detailed placement techniques are normally applied on a subset of the design, moving from

one area to another with a sliding window approach

Trang 4

A

B

B C2

C1 Out

Out

Out Out

Default orientation for C1 Mirrored orientation

A

B

Out

A B

Out

A B

FIGURE 20.10 Standard cells can normally be mirrored along the Y axis This can reduce the length of some

interconnect segments, and may be used to slightly shift routing demand from one area to another Macroblocks frequently allow mirroring along both axis, as well as rotation

For typical standard cell design, mirroring is only possible along the X axis; power and ground lines are integrated into the cell design, and mirroring along the Y axis might not be possible When

implementing a detailed placement approach, one should consider the nature of the standard cells; similarly, the detailed placement tool should be considered in cell design

For macroblocks, mirroring around both the X and Y axes may be possible; it may also be possible

to rotate a block

Related to cell mirroring is the technique of pin assignment For many logic elements, there may

be functionally equivalent pins, for example, the inputs of a NAND gate are equivalent A circuit

designer may specify that signal net n i is connected to input pin a, while signal net n jis connected to

input pin b; some detailed placement tools support optimization of the assignment Individual pins

may have differing delay characteristics (due to the internal layout of transistors), so optimization of

a pin assignment may benefit wirelength, delay, or both

Finally, cell mirroring and pin assignment may be beneficial in improving the routability of a design In channel routing tools, an internal data structure known as a vertical constraint graph is frequently used; the data structure reveals an ordering of interconnect segments that will allow easy routing Depending on precise locations of individual pins, there may be cycles in the constraint graph—significantly complicating routing Slight changes to pin locations, which can be achieved with mirroring or pin assignment, may eliminate cycles and simplify routing greatly

Early works by Her [18] and Swartz [19] illustrate a tight coupling between detailed placement and routing Slight adjustments at a local level can improve routing, and these adjustments are difficult

to make without actually performing the detail routing step

20.4.2 REORDERING OFCELLS

A second common detailed placement method, sometimes integrated with cell mirroring, is improving

a design through the reordering of small groups of cells using a sliding window approach Figure 20.11 shows two cases; in the first, wirelength can be improved by swapping the positions of a pair of a cells In the second, the optimization window size required to find a better placement is three cells For standard cell design, it is common to see optimization window sizes ranging from three to eight Much larger windows carry a heavy runtime cost, while windows of size two have relatively small wirelength benefit

Brute force enumeration is easy to implement; some speed improvements can be obtained using branch-and-bound techniques Caldwell [20] presented an extensive study of the topic, considering different ways to to enumerate permutations, and to bound the solution space Pseudocode for a simple branch-and-bound approach to cell reordering is shown in Algorithm 6

Trang 5

A C B D B A D C A D B C A B C D

An optimization window of

size 2 can improve the

initial placement (swapping

the cells B and C).

A slightly different initial placement cannot be improved with an optimization window of size 2 To obtain a better placement, a larger optimization window is required.

FIGURE 20.11 Order of cells in a row (or group of rows) can be refined using small optimization windows;

the size of the window limits the improvement possible, and also impacts runtime considerably

With designs that are densely packed, evaluating any given permutation is relatively simple; cells

of different sizes are simply placed in the order, with the position of cell c ibeing immediately after

cell c i−1 When there is open space, however, the problem is slightly more difficult

partial wirelengths, the number of solutions explored can be reduced {C} =cells in the optimization window;

{F} =empty set;

left=left side of optimization window;

reorder(C,F,left) {

ifCis empty then

Evaluate current configuration;

Store configuration if improved;

else

placeciatleft;

reorder(C−ci,F+ci,left+width(ci));

end for

end if

}

Consider a simple case with cells c i and c j, and a small open space There are two different

possible permutations of the cells (c i cj and c jci), but a potentially infinite number of ways to divide the available space between the cells The optimal amount of space to be inserted before the first cell, between the first and second cell, and after the second cell can be determined (see Section 20.4.4), but this complicates the reordering process

Reordering of cells across multiple rows is also fairly common When cells are of different sizes, not all permutations may be valid

One method of implementing a multiple row reordering function is to first divide the available space on a row-by-row basis For any given permutation, each cell is inserted into the top-most open row; when a row is filled, processing moves to the next row If the permutation does not fit, this will

be discovered along the way, and the potential solution can be discarded

When implementing a reordering technique, a number of performance trade-offs must be considered

Trang 6

• If the size of the optimization window is small, brute force enumeration may be the best approach Branch-and-bound methods require incremental wirelength computations, and this has a runtime overhead that may outweigh the benefits of solution space pruning

• If the size of the optimization window is large, branch-and-bound method can have signifi-cant benefit Many permutations can be eliminated easily, with the savings outweighing the extra cost for incremental wirelength computation

• If the design contains space between cells, one must decide how to handle it Optimal space allocation can be done, but with a runtime cost; for large optimization windows, this is unlikely to be practical Space can be treated as a single monolithic unit, or divided equally between cells; what is best may depend on the design itself

• For optimization across multiple rows, many permutations may be eliminated if cells are

of different sizes This can increase the size of a practical optimization window in some cases When the design contains space, this may make more permutations possible, thereby lowering the practical optimization window size

20.4.3 OPTIMALINTERLEAVING

Another method of altering the order of a group of cells is optimal interleaving by Hur [21] The approach utilizes dynamic programming to allow for large window sizes with low runtimes We suggest the text by Cormen [16] as a good reference for dynamic programming (Figure 20.12)

To begin with, let us assume the following:

• C is a given set of cells.

mean the placement of cells within a single row Here we use the terms arrangement and placement synonymously

• C1= c11, c12, , c 1n and C2= c21, c22, , c 2m are two proper subsets of C of size |C1| = n

and|C2| = m such that, C1∪ C2 = C and C1∩ C2 = ∅

• p ij denotes the position of cell c ij

• p ij < pkl means that c ij precedes c kl in some arragement Note that c klmay not be immediately

next to c ij

X

Y Z

Optimal arrangement for (AB)(X)

FIGURE 20.12 Interleaving of two groups of standard cells can be done in an efficient manner with dynamic

programming In this figure, we have two sets of cells that are to be interleaved; optimal partial solutions are stored in a matrix

Trang 7

We are now ready to define interleaving An interleaving I n,m of C1and C2is an arrangement of

cells in C1and C2satisfying the following conditions:

If p 1i < p 1j in Ainitial, then p 1i < p 1j in I n,m

If p 2k < p 2l in Ainitial, then p 2k < p 2l in I n,m

where,

i k

Note that there are

n + m n

different ways of interleaving C1and C2 An optimal interleaving

is the one with minimum cost The cost of an interleaving I i,j denoted by w (Ii,j)(1 ≤ i ≤ n and

1 ≤ j ≤ m) is its total linear wirelength, ignoring the cells in {C\I i,j} Because placement rows are

typically horizontal, by linear wirelength we mean the wirelength in X-dimension.

Hur and Lillis [21] proposed a polynomial time algorithm for solving the optimal interleaving problem that utilizes dynamic programming The recurrence equation is stated as follows:

I0,0= ∅, w(I0,0) = 0

I i,j =

Ii −1,j c 1i if w (Ii −1,j c 1i ) < w(Ii,j−1c 2j ) Ii,j−1c 2j otherwise

Note that a dynamic programming formulation is only possible because for any subsequence

Ii,j , w (Ii,j) is independent of the ordering of cells in {C \ Ii,j } The runtime of the algorithm is O(nm + np(n + m)) where np is the number of pins belonging to the nets of cells in C, which is typically

proportional to the number of cells

Optimal interleaving can be applied to the entire legalized placement using the sliding window

technique described above In Ref [21], the subsets C1and C2are chosen arbitrarily

20.4.4 LINEARPLACEMENT WITHFIXEDORDERINGS

The placement problem is NP-complete, even for the dimensional case [22] The one-dimensional problem is commonly known as linear placement In this problem, candidate cells belonging to a single placement row are given and we need to find a placement that minimizes the

wirelength Note that we are only allowed to change the X-locations of cells; hence the name linear

placement A restricted variant of this problem, in which we cannot alter the cell ordering, can be solved optimally in polynomial time One solution method was proposed by Kahng [23], and later

on improved on by Brenner [24]

First, to formally state the problem, we assume that we are given the following:

• Set of cells{c1, c2, , cn } with the width of cell c idenoted by width(ci)

• Placement row of widthW : W ≥n

i=1width(ci )

• Legal initial placement P = {p1, p

2, , p n }, where p i + width(c i) ≤ p i+1for i = 1, ,

n − 1, 0 ≤ p1and p

n ≤ [W − width(c n)]

The optimization objective is to find a legal placement P = {p1, p2, , pn} with the minimum

possible bounding box wirelength such that if p

i < p j in P then p i < pj in P.

It should be obvious that to impact wirelength, white space must be present If the cells are densely packed, there is only a single possible placement, and no improvement is possible

Trang 8

20.4.4.1 Notations and Assumptions

• We are only concerned with nets with at least one movable cell Let m be the total number

of such nets

• Let c N

L and c N be the leftmost and rightmost movable cells of net N with locations denoted

by p N

L and p N (recall that we are only dealing with X-location here) Note that c N

L and c Nare known beforehand and do not change during the course of solving the problem as we are not allowed to change the ordering of cells

• Let f N

L and f N stand for the locations of the leftmost and rightmost fixed pins of net N (A net without a fixed pin can be broken down into two nets; one that connects c N

L with a

dummy cell at location W and another one that connects c N

Rwith a dummy cell at location 0)

• For the sake of simplicity, let us assume that all pins are located at the left edge of their cells

Our objective is the following:

minimize

N

max

f N

R, p N

R − min f N

L, p N

L

or equivalently,

minimize

N

f N

L

+

n

i=1

wi

where w i is the contribution of cell c ito the objective function and is given by:

w i =

N:ci =cLN

max

f N

N:ci=c N

max

p i − f N

R, 0

A careful observation of the function w i indicates that cells that are neither c N

L nor c Nfor any net

N can be placed arbitrarily, as they do not contribute anything to our objective function These cells

can be merged with their predecessors The remainder of our discussion is valid only for cells that

do not fall in this category

Because every net under consideration has a unique c N

L and c N

Rand it is possible to have c N

R

for a net (a net that has a single movable cell), we have the relation n ≤ 2m.

20.4.4.2 Analysis of the Cost Function

One can observe that the function w i is convex piecewise linear This can be deduced as follows:

take a cell c i and create a sorted list of the f N

L and f N

R values of all nets for which c i = c N

R,

respectively, and mark them on a number line For any location x, let a (b) denote the number of

f N

R(f N

L) locations to the right (left) of x Three possibilities occur

• a < b: wi is linear and decreasing in this interval as x increases

• a = b: w iis constant and minimum

• a > b: wi is linear and increasing as x increases

20.4.4.3 Dynamic Programming Algorithm

Solutions to the linear placement with fixed ordering problem are based on dynamic programming

We use the following notations for this section:

Trang 9

• Let S i denote the set of sites where cell c ican be placed without overlapping with any of its neighboring cells

• Let P i j be an optimal prefix placement for cell c i where all cells to the left of c ihave been

placed optimally and p i ≤ s j(∈ Si).

• Let W i j be the total cost of P i j and w j i denote the cost of placing c i at s j (∈ Si ).

Our goal is to find P n W−width(cn)

The dynamic programming formulation can be stated as follows:

P j0= ∅, W j

P j i =

P j −width(ci−1)

i−1 ∪ {p i = s j } if W j −width(ci−1)

i−1 + w j

i < W j−1

i

20.5 LIMITS OF LEGALIZATION AND DETAILED PLACEMENT

All of the operations performed in legalization and detailed placement are in some sense local, and thus fail to address global optimization objectives These algorithms are typically applied in

an interative manner, in ad hoc mixtures Many groups have experimented with different legalizers, window sizes for reordering, methods to perform space allocation, and so forth

There is no simple method to obtain the right strategy Fortunately, most placement tools from academic groups can be run in a legalization or detailed placement mode, making it easy to test out different techniques When compared with the runtimes for global placement, detailed placement times are usually low; it is worthwhile to explore different combinations

The first optimization normally applied would be space insertion to reduce routing congestion

As a rule of thumb, one might wish to keep the routing demand in any area less than about 70 percent

of the available routing resource Dense routing frequently results in the failure to complete all nets,

or in large net detours The achievable routing density depends on the specific routing tool used The wirelength increase due to space insertion may be less than what one might anticipate If the logic elements are spread apart, with the placement area increasing by 10 percent, the actual spreading performed is less than 5 percent in horizontal directions For this case, the worst one should expect would be a 5 percent increase in total wirelength

Going from an abstract global placement to a legal one typically incurs a small wirelength increase (perhaps a few percent); abstract placements with relatively little overlap generally have smaller increases If there is excess space for routing, it is quite possible that the legalized placement could actually reduce wirelengths If there is an increase, reordering, cell mirroring, and optimized linear placement can normally recover some of the wirelength loss

Within a few iterations, however, the improvements of any technique become asymptotically small The first few percent of wirelength improvements come easily; afterward, the progress is slow In many respects, it is a decision of the circuit designer as to how long to carry out detailed placement For some, a minor improvement in wirelength might have a great deal of value, warranting the extra compute time For others, a quick-and-dirty solution may be sufficient

It should be stressed that one of the most crucial considerations is to minimize the amount

of overlap during global placement, and ensure that there are no large sections of the design that have area demands that exceed the space available Legalization methods handle space distribution imperfectly; if the problems can be solved in global placement, the results will almost certainly be better Comparing the locations of placements before and after legalization can provide a great deal of insight There are a number of visualization tools for placements; any area that changes significantly during legalization should be examined carefully

Trang 10

By mapping a graphic image to the cells in a design, it is possible to compare two different placements The original image on the left is mapped to placements from Capo and Feng Shui for the benchmark Peko01 (which has a known optimal solution) Both placements are clearly suboptimal; as the benchmark

lacks pads, the images are rotated.

It is frequently useful to compare images before and after legalization, and at various stages of detailed placement Large distortions can identify problems with legalization, or areas where global placement

has performed poorly.

Placement

Detailed placement

FIGURE 20.13 Detailed placement techniques can resolve local problems, but placement suboptimality

is both a local and global phenomena In experiments with synthetic designs, one can compare an optimal configuration with the output of a placement tool by mapping an image The distortions of the image reveal how the placement deviates from the desired result

Although there have been significant advances in placement, the solutions obtained are far from optimal Recently, Chang presented a set of benchmark circuits with known optimal configura-tions [25] These placement examples with known optimal (PEKO) circuits attracted a great deal of attention, and the results of many placement tools on the benchmarks was surprising In Figure 20.13,

we illustrate the results by mapping an image onto the optimal placement of a synthetic PEKO benchmark, and then rearranging the placements to match results of a number of academic tools The PEKO benchmarks contain no pads; thus, there are multiple optimal configurations, cor-responding to mirroring or flipping of the design From the distortions of the images, it should be clear that many placement tools are globally correct; at a very high level, the placements resemble the optimal result At a local level, however, there is a great deal of suboptimality; sections of each placement are stretched or warped, resulting in higher interconnect lengths

While there is some question as to how closely the synthetic PEKO benchmarks resemble real circuitry, it is obvious that placement results could be improved, and that this improvement must span both the local and global levels

REFERENCES

1 J Lou, S Krishanmoorthy, and H S Sheng Estimating routing congestion using probabilistic analysis In

Proceedings of International Symposium on Physical Design, pp 112–117, 2001.

2 C J Alpert, G -J Nam, P G Villarrubia, and M C Yildiz Placement stability metrics In Proceedings of Asia South Pacific Design Automation Conference, pp 1144–1147, 2005.

3 N Viswanathan and C C -N Chu Fastplace: Efficient analytical placement using cell shifting, iterative

local refinement and a hybrid net model In Proceedings of International Symposium on Physical Design,

pp 26–33, 2004

4 Z Xiu and R A Rutenbar Timing-driven placement by grid-warping In Proceedings of Design Automation Conference, pp 585–591, 2005.

5 K Doll, F M Johannes, and K J Antreich Iterative placement improvement by network flow methods

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 13(10):1189–1200,

1994

Định dạng
Số trang	10
Dung lượng	242,61 KB