FIGURE 9.6 Illustration of three types of moves. (Expressions shown in the figure: 54V3H2V1H; after M1, 54V2H3V1H; after M3, 54V23HV1H; after M2, 54H23HV1H.)
in the rest of this chapter. It employs the technique of simulated annealing [17] and uses the set of normalized Polish expressions as the solution space; this avoids an unnecessarily large number of states and thus speeds up the search significantly. The following three types of moves, M1, M2, and M3, are used to modify a normalized Polish expression to get a neighboring one:
M1: Swap two adjacent modules. (Two modules are adjacent iff there is no other module between them in the expression.)
M2: Complement a chain of cuts. (A chain of cuts is a sequence of consecutive elements in the expression such that each element is a cut, i.e., V or H. V and H are complements of each other.)
M3: Swap an adjacent module and cut. (A module and a cut are adjacent iff they are consecutive elements in the expression.)
It is clear that M1 and M2 each always produce a normalized Polish expression. M3, however, might not. To generate an M3 move, a pair of adjacent module and cut is repeatedly chosen until swapping them leads to a normalized Polish expression. It is claimed that the three types of moves are sufficient to ensure that any normalized Polish expression can be reached from any other via a finite number of moves. Figure 9.6 demonstrates the three types of moves.
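The three moves can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an expression is a list of single-character tokens, the function names `is_normalized`, `move_m1`, `move_m2`, and `move_m3` are hypothetical, and M2 is applied to the maximal chain containing the chosen cut. The usage lines replay the expressions listed for Figure 9.6.

```python
CUTS = {"V", "H"}

def is_normalized(expr):
    """True if expr is a normalized Polish expression:
    every prefix has more modules than cuts, and no cut repeats back to back."""
    modules = cuts = 0
    for k, tok in enumerate(expr):
        if tok in CUTS:
            cuts += 1
            if cuts >= modules or (k > 0 and expr[k - 1] == tok):
                return False
        else:
            modules += 1
    return modules == cuts + 1

def move_m1(expr, i):
    """M1: swap the i-th and (i+1)-th modules (0-based, counting modules only)."""
    pos = [k for k, tok in enumerate(expr) if tok not in CUTS]
    out = list(expr)
    out[pos[i]], out[pos[i + 1]] = out[pos[i + 1]], out[pos[i]]
    return out

def move_m2(expr, k):
    """M2: complement the maximal chain of cuts that contains position k."""
    if expr[k] not in CUTS:
        raise ValueError("position k must hold a cut")
    out = list(expr)
    start = end = k
    while start > 0 and out[start - 1] in CUTS:
        start -= 1
    while end + 1 < len(out) and out[end + 1] in CUTS:
        end += 1
    for j in range(start, end + 1):
        out[j] = "H" if out[j] == "V" else "V"
    return out

def move_m3(expr, k):
    """M3: swap the adjacent module/cut pair at positions k and k+1.
    Returns None when the pair is not module/cut or the result is not normalized."""
    a, b = expr[k], expr[k + 1]
    if (a in CUTS) == (b in CUTS):
        return None
    out = list(expr)
    out[k], out[k + 1] = b, a
    return out if is_normalized(out) else None

e0 = list("54V3H2V1H")
e1 = move_m1(e0, 2)   # swap modules 3 and 2
e2 = move_m3(e1, 4)   # swap the cut H with module 3
e3 = move_m2(e2, 2)   # complement the chain consisting of the first V
print("".join(e1), "".join(e2), "".join(e3))
```

In an annealer, a rejected M3 attempt (a `None` result) would simply trigger another random choice of pair, which mirrors the repeated selection described above.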
Each normalized Polish expression generated in the annealing process is evaluated as follows. Let T_f denote the slicing structure corresponding to a normalized Polish expression f. The area A and the total wirelength W of f are defined to be the area and the total wirelength of a minimum-area floorplan of T_f. The cost function for measuring the quality of a normalized Polish expression is A + λW, where λ is a user-specified constant that controls the relative importance of A and W. The minimum-area floorplan can be computed efficiently by the slicing floorplan area optimization algorithms [3–5]. In fact, it is observed that a minimum-area floorplan can be computed incrementally, because the calculation only needs to be performed on the tree nodes that are changed by a move of type M1, M2, or M3.
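To make the cost A + λW concrete, the sketch below evaluates a Polish expression with a stack. It is a simplification and not the shape-curve algorithms of Refs. [3–5]: every module is assumed to be hard with a single fixed orientation, and the wirelength is assumed to be supplied by the caller. The names `bounding_box` and `cost` are hypothetical.

```python
def bounding_box(expr, dims):
    """Width and height of the floorplan described by a Polish expression.
    dims maps a module name to (width, height).
    ijV puts i to the left of j; ijH puts i below j."""
    stack = []
    for tok in expr:
        if tok in ("V", "H"):
            w2, h2 = stack.pop()
            w1, h1 = stack.pop()
            if tok == "V":
                stack.append((w1 + w2, max(h1, h2)))
            else:
                stack.append((max(w1, w2), h1 + h2))
        else:
            stack.append(dims[tok])
    (width, height), = stack   # exactly one entry must remain
    return width, height

def cost(expr, dims, wirelength, lam):
    """Cost A + lam * W, with W passed in (an assumption of this sketch)."""
    width, height = bounding_box(expr, dims)
    return width * height + lam * wirelength

dims = {"1": (2, 3), "2": (1, 3), "3": (3, 1)}
print(bounding_box(list("12V3H"), dims))       # modules 1 and 2 side by side, 3 on top
print(cost(list("12V3H"), dims, wirelength=10, lam=0.5))
```

An incremental version would cache the (width, height) pair at each tree node and recompute only the nodes on the paths affected by a move.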
9.6 SLICING FLOORPLAN DESIGN CONSIDERING PLACEMENT CONSTRAINTS
In floorplan design, it is useful if a designer is allowed to specify placement constraints to be satisfied in the final floorplan. Typical placement constraints that have been addressed for slicing floorplans are boundary constraints [18–20], range constraints [21], abutment constraints [22], and clustering constraints [23]. In this section, we describe each type of placement constraint and highlight existing techniques to handle it during floorplan design. In fact, most of the techniques are extensions of the Wong–Liu algorithm [2].
9.6.1 BOUNDARY CONSTRAINTS
A boundary constraint forces some modules to be positioned along one of the four sides of a floorplan. It is particularly useful when a designer wants to place some modules along the boundaries for shorter input–output connections. Besides, floorplan design is usually done hierarchically: modules are grouped into different units, and the floorplan of each unit is determined independently. In this case, it helps if some modules are constrained to be placed along a boundary of the unit so that they are closer to other modules in neighboring units.
Boundary constraints can be specified as follows. The set of all modules is divided into five disjoint module sets M_F, M_L, M_R, M_T, and M_B. Each module in M_F is a free module that can be placed anywhere in a floorplan. On the other hand, each module in M_L (M_R, M_T, M_B, respectively) is a boundary-constrained module with the left (right, top, bottom, respectively) boundary constraint and has to be placed along the left (right, top, bottom, respectively) boundary of the floorplan. Note that M_L (M_R, M_T, M_B, respectively) may be an empty set if no module must be placed along the left (right, top, bottom, respectively) boundary. For example, for the two slicing floorplans shown in Figure 9.7a and b, if module 2 is constrained to be placed along the left boundary, then the one in Figure 9.7a is infeasible while the one in Figure 9.7b is feasible.
A result on the boundary-constrained problem is reported by Young and Wong in Ref. [18]. They enhance the Wong–Liu algorithm [2] to handle boundary constraints. Their main idea is to check the normalized Polish expression in each iteration of the simulated annealing process to see whether the given boundary constraints are satisfied. Their algorithm then fixes as many of the violated constraints (if any) as possible and includes a term in the cost function to penalize the remaining violations.
A linear-time method is used to find the boundary information for each module in a normalized Polish expression. The boundary information of a module tells whether there are modules on the left of, on the right of, above, and below the module. Because a Polish expression or a slicing tree carries the relative position information among modules (e.g., the Polish expression ijH means that module i is below module j, while the expression ijV means that module i is on the left of module j), the following facts can be observed. If module i must be placed on the right (left, respectively) boundary of the final floorplan, i cannot be in the left (right, respectively) subtree of any internal node labeled with V. On the other hand, if module i must be placed at the top (bottom, respectively) boundary of the final floorplan, i cannot be in the left (right, respectively) subtree of any internal node labeled with H. On the basis of these facts, the boundary information of each module can be obtained by scanning the Polish expression once from right to left and by using a stack.
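The right-to-left scan can be sketched as follows. This is one possible reading of the facts above, not necessarily the exact procedure of Ref. [18]: the stack holds, for each pending subtree, the set of floorplan sides that the subtree can still touch, and the function name `boundary_info` is hypothetical. The result is topological; it reports the sides touched by the room of each module.

```python
def boundary_info(expr):
    """Map each module to the set of floorplan sides ('L', 'R', 'T', 'B')
    that its room touches, using one right-to-left scan and a stack."""
    info = {}
    stack = [frozenset("LRTB")]          # the root touches all four sides
    for tok in reversed(expr):
        sides = stack.pop()
        if tok == "V":
            stack.append(sides - {"R"})  # left subtree cannot touch the right side
            stack.append(sides - {"L"})  # right subtree is scanned next
        elif tok == "H":
            stack.append(sides - {"T"})  # left subtree lies below
            stack.append(sides - {"B"})  # right subtree lies above, scanned next
        else:
            info[tok] = sides
    return info

info = boundary_info(list("12V3H"))
for module in sorted(info):
    print(module, "".join(sorted(info[module])))
```

Each token is handled in constant time, so the scan is linear in the length of the expression. A module in M_L, for instance, violates its constraint exactly when 'L' is missing from its set.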
Once the boundary information is known, all the modules violating their boundary constraints can be determined, and the algorithm fixes as many violations as possible by shuffling the modules. If module i is not placed along the required boundary, it will be shuffled with another module j that is closest to i in the Polish expression and is placed at a position satisfying the boundary constraint of i. This fixing procedure takes O(mn) time for each expression, where m is the number of constrained modules and n is the total number of modules. If some constraints still cannot be satisfied after all possible shufflings, a penalty term is included in the cost function. The penalty term is measured by the total distance of the modules from the boundaries of the floorplan along which they are required to be placed.
FIGURE 9.7 Illustration of infeasible and feasible floorplans with a boundary constraint. (a) An infeasible floorplan where module 2 cannot be placed along the left boundary and (b) a feasible floorplan where module 2 can be placed along the left boundary.
FIGURE 9.8 Examples of feasible and infeasible topologies. (a) A floorplan having an infeasible topology, (b) a floorplan having a feasible topology, obtained by performing an O1 operation on the root of the tree in (a), (c) a floorplan having a feasible topology, obtained by performing an O2 operation on the root of the tree in (a), and (d) a floorplan having a feasible topology, obtained by performing an O3 operation on the root of the tree in (a).
Because the above-mentioned algorithm [18] adopts the same set of moves as the Wong–Liu algorithm (i.e., M1, M2, and M3 described in Section 9.5.3) to generate a neighboring normalized Polish expression, it is very likely that the boundary information changes for only a subset of modules in the new normalized Polish expression, and therefore only the boundary information for those modules needs to be recomputed. On the basis of this observation, three speedup methods capable of performing incremental calculation of boundary information are given in Ref. [19].
A drawback of the algorithm in Ref. [18] is that the shuffling method may not always resolve all constraint violations, even though a penalty term is added to the cost function to account for those violations. This has been empirically confirmed in Ref. [20], implying that the algorithm in Ref. [18] cannot guarantee that a floorplan satisfying all given boundary constraints is always obtainable unless the annealing process is long enough. To cope with this difficulty, Liu et al. developed a quadratic-time method that transforms a normalized Polish expression with constraint violations into another one with all violations eliminated [20]. The main idea is to examine each internal node of a slicing tree (constructed from a normalized Polish expression) in a bottom-up fashion and determine whether the node has a feasible topology. (An internal node has a feasible topology if it is feasible to place each boundary-constrained module along the required boundary in its corresponding subfloorplan.) If the node has an infeasible topology, the tree is modified such that the node ends up with a feasible topology. Three operations, O1, O2, and O3, are given to perform modifications. An O1 operation changes the cut direction of a node, an O2 operation swaps the left and the right subtrees of a node, and an O3 operation performs an O1 operation followed by an O2 operation. For example, suppose subfloorplan A contains modules having the right boundary constraint (denoted by R), subfloorplan B contains modules having the left boundary constraint (denoted by L), and both A and B have feasible topologies. Figure 9.8a produces a floorplan having an infeasible topology, but the floorplans in Figure 9.8b through d all have feasible topologies. It is clear that Figure 9.8b through d can be transformed from Figure 9.8a by performing an O2, an O1, and an O3 operation on the root, respectively. Although in most cases the three basic operations can transform nodes into ones having feasible topologies, there are also some cases where they fail. Besides, even if an internal node has a feasible topology, the tree sometimes needs to be modified as well to ensure that the root has a feasible topology later on. To handle these difficult cases, additional transformation operations are provided [20].
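The three basic operations are small tree edits. The following sketch shows them on a minimal slicing tree node; the `Node` class and the helper `to_polish` are assumptions made for illustration and are not taken from Ref. [20].

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    label: str                       # "V" or "H" for a cut, otherwise a leaf name
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def o1(node):
    """O1: change the cut direction of an internal node."""
    if node.label not in ("V", "H"):
        raise ValueError("O1 applies to internal nodes only")
    node.label = "H" if node.label == "V" else "V"

def o2(node):
    """O2: swap the left and the right subtrees of a node."""
    node.left, node.right = node.right, node.left

def o3(node):
    """O3: an O1 operation followed by an O2 operation."""
    o1(node)
    o2(node)

def to_polish(node):
    """Postorder string of the tree, for inspection."""
    if node.left is None:
        return node.label
    return to_polish(node.left) + to_polish(node.right) + node.label

for op in (o1, o2, o3):
    root = Node("V", Node("A"), Node("B"))   # A to the left of B
    op(root)
    print(op.__name__, to_polish(root))
```

With A holding right-constrained modules and B holding left-constrained modules, the starting tree ABV is the infeasible case discussed above, and each of the three results removes the conflict.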
9.6.2 RANGE CONSTRAINTS
A range constraint forces a module to be placed within a given rectangular region in the final floorplan. It is less restrictive than a preplaced constraint, which requires a module to be placed at a fixed position in the final floorplan.* In fact, the range constraint problem is the more general problem, because any preplaced constraint for a module can be written as a range constraint by specifying a rectangular region whose dimensions are the same as those of the module. Figure 9.9a gives an example where module 1 with a preplaced constraint must be placed with its lower left corner at (3, 2) and module 2 with a range constraint must be placed within the dotted-line region. The floorplan shown in Figure 9.9a satisfies both constraints.
* Note that the problem of floorplan design with obstacles can be solved by treating the obstacles as preplaced modules.
FIGURE 9.9 (a) Preplaced and range constraints. (b) Range constraint representation. (Labels in panel (b): right(A) = x1 + w, left(A) = x2 − w, top(A) = y1 + h, bottom(A) = y2 − h.)
The preplaced constraint problem for slicing floorplans is considered in Ref. [24], which extends the Wong–Liu algorithm [2] by using the notion of a reference point to construct shape curves in the presence of preplaced constraints. Young and Wong also present a slicing floorplanner with range constraints in Ref. [21] and empirically observe that when their floorplanner is set up to handle preplaced constraints, it outperforms the one in Ref. [24]. Therefore, in this subsection we only introduce the algorithm in Ref. [21].
In the range constraint problem, two sets of modules, M_F and M_Range, are given, where each module in M_F does not have any placement constraint and each module in M_Range is a hard module and has a specified range constraint. The algorithm in Ref. [21] extends the Wong–Liu algorithm [2] to handle range constraints. The main contribution is a novel shape curve computation that takes range constraints into consideration. When two modules are combined vertically or horizontally, if at least one of them has a range constraint, the resultant subfloorplan will also have a range constraint. Therefore, the range constraint information is propagated upward from the leaves to the root during the bottom-up shape curve construction process, and both the dimensional information (that is, the height and the width) and the range constraint information need to be kept.
Let A be a subfloorplan (containing one or more modules) with a range constraint. The following four variables are used to represent the constraint:
• top(A): Shortest distance of the upper boundary of A from the x-axis
• bottom(A): Longest distance of the lower boundary of A from the x-axis
• right(A): Shortest distance of the right boundary of A from the y-axis
• left(A): Longest distance of the left boundary of A from the y-axis
We use Figure 9.9b to explain the four variables. In Figure 9.9b, A has width w and height h, and it is constrained to be placed inside the rectangle {(x, y) | x1 ≤ x ≤ x2, y1 ≤ y ≤ y2}. Then top(A) = y1 + h, bottom(A) = y2 − h, right(A) = x1 + w, and left(A) = x2 − w. With a range constraint represented in this manner, the traditional shape curve construction methods [3–5] are enhanced to compute the range constraint and dimensional information of a subfloorplan X from those of its two child subfloorplans A and B. In addition, if a normalized Polish expression does not satisfy all range constraints, the algorithm adds to the cost function a penalty term, which is measured by the total distance of the modules having range constraints from their desired regions.
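The four variables follow directly from the region and the dimensions of A, as the short sketch below shows. The names `RangeInfo`, `range_info`, and `satisfies` are hypothetical; the check assumes that (x, y) is the lower left corner of A in a candidate placement. It does not cover the propagation of range information through shape curves described in Ref. [21].

```python
from typing import NamedTuple

class RangeInfo(NamedTuple):
    top: float
    bottom: float
    right: float
    left: float

def range_info(w, h, x1, y1, x2, y2):
    """Range variables of a w-by-h subfloorplan confined to [x1, x2] x [y1, y2]."""
    if w > x2 - x1 or h > y2 - y1:
        raise ValueError("the region is too small for the subfloorplan")
    return RangeInfo(top=y1 + h, bottom=y2 - h, right=x1 + w, left=x2 - w)

def satisfies(x, y, w, h, info):
    """True if a w-by-h block with lower left corner (x, y) meets the constraint."""
    return (x <= info.left and x + w >= info.right
            and y <= info.bottom and y + h >= info.top)

info = range_info(w=2, h=1, x1=0, y1=0, x2=5, y2=4)
print(info)
print(satisfies(3, 3, 2, 1, info), satisfies(4, 0, 2, 1, info))

# A preplaced constraint at (3, 2) is a range constraint whose region
# has exactly the dimensions of the module.
preplaced = range_info(w=2, h=1, x1=3, y1=2, x2=5, y2=3)
print(satisfies(3, 2, 2, 1, preplaced), satisfies(3.5, 2, 2, 1, preplaced))
```

For the preplaced case, left and right pin x to a single value, and likewise for y, which is why the fixed position is the only feasible one.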
9.6.3 ABUTMENT CONSTRAINTS
In floorplan design, a designer may want to have some modules abut one another to favor the transmission of data among those modules. This abutment problem is common in practice, but few floorplanning algorithms can handle it. In this subsection, we describe a solution [22] to this problem for slicing floorplans. Although the floorplanning algorithm given in Ref. [22] is capable of handling L-shaped and T-shaped modules as well, a discussion of this is beyond the scope of this chapter.
FIGURE 9.10 Horizontal abutment.
Two modules i and j are said to abut horizontally (Figure 9.10), denoted by Habut(i, j), iff a vertical boundary L_i of module i and a vertical boundary L_j of module j abut such that L_i is immediately on the left of L_j and the abutment length is min{l(L_i), l(L_j)}, where l(L_i) is the length of L_i and l(L_j) is the length of L_j. The vertical abutment constraint Vabut(i, j) can be defined similarly. Abutment constraints can also be generalized to involve more than two modules.
The algorithm extends the Wong–Liu algorithm [2] to handle abutment constraints. In each step of the simulated annealing process, a normalized Polish expression is scanned once, with the help of a stack, to find the top, bottom, left, and right neighbors of every module. These topological relationships are independent of the dimensions of the modules and can be derived from the following observation. Let L[X], R[X], T[X], and B[X] denote the sets of modules lying along the left boundary, right boundary, top boundary, and bottom boundary of a subfloorplan X, respectively. Consider putting X to the left of Y to get a new subfloorplan. If both R[X] and L[Y] have more than one module, the top module in R[X] will abut horizontally with the top module in L[Y], and the bottom module in R[X] will abut horizontally with the bottom module in L[Y]. On the other hand, if either R[X] or L[Y] has only one module, every module in R[X] will abut horizontally with every module in L[Y]. The vertical neighborhood relationship can be observed similarly.
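The observation above translates into a single left-to-right scan with a stack of boundary lists. The sketch below is an interpretation, not the code of Ref. [22]: the lists L and R are assumed to be ordered bottom to top, T and B left to right, and the vertical rule is assumed to be the mirror image of the horizontal one. The names `neighbor_pairs` and `_pairs` are hypothetical.

```python
def _pairs(first, second):
    """Neighbor pairs between two facing boundary lists, following the rule above."""
    if len(first) > 1 and len(second) > 1:
        return {(first[0], second[0]), (first[-1], second[-1])}
    return {(a, b) for a in first for b in second}

def neighbor_pairs(expr):
    """Return (habut, vabut): pairs (i, j) with i immediately left of j,
    and pairs (i, j) with i immediately below j."""
    habut, vabut = set(), set()
    stack = []
    for tok in expr:
        if tok == "V":                       # X is placed to the left of Y
            y = stack.pop()
            x = stack.pop()
            habut |= _pairs(x["R"], y["L"])
            stack.append({"L": x["L"], "R": y["R"],
                          "T": x["T"] + y["T"], "B": x["B"] + y["B"]})
        elif tok == "H":                     # X is placed below Y
            y = stack.pop()
            x = stack.pop()
            vabut |= _pairs(x["T"], y["B"])
            stack.append({"L": x["L"] + y["L"], "R": x["R"] + y["R"],
                          "T": y["T"], "B": x["B"]})
        else:                                # a single module lies on all four sides
            stack.append({side: [tok] for side in "LRTB"})
    return habut, vabut

print(neighbor_pairs(list("12H34HV")))
print(neighbor_pairs(list("12H3V")))
```

Checking a constraint such as Habut(i, j) then amounts to testing whether (i, j) is in the first set.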
Once the neighbors of each module are known, each abutment constraint can be checked. If the normalized Polish expression does not satisfy all abutment constraints, the algorithm swaps modules to satisfy as many abutment constraints as possible. When a vertical abutment constraint for modules i and j, Vabut(i, j), is violated, the algorithm first tries to move j to the top of i by swapping j with the closest top neighbor of i in the Polish expression. If this fails (for example, because all the top neighbors of i are fixed in their positions), the algorithm tries to move i to the bottom of j by swapping i with the closest bottom neighbor of j. The fixing procedure for a horizontal abutment constraint is defined similarly. It is likely that some constraints are still violated after all the possible swappings, and therefore a penalty term is added to the cost function to penalize those violations. Scanning a Polish expression once to find the neighbors of every module takes O(n) time, and swapping modules to fix violated abutment constraints takes O(mn) time, where n is the total number of modules and m is the total number of abutment constraints.
9.6.4 CLUSTERING CONSTRAINTS
A clustering constraint forces some modules to be placed geometrically adjacent to each other in the final floorplan. By imposing the clustering constraint on modules that are heavily connected, the routing cost among those modules can be reduced. In this subsection, we describe a recent work on slicing floorplan design considering clustering constraints [23].
FIGURE 9.11 Feasible floorplan having three modules involved in a clustering constraint.
A clustering constraint can be specified by a subset of modules such that all the modules in this subset are geometrically adjacent to each other in the final floorplan. Figure 9.11 shows an example of the clustering constraint, where modules 4, 5, and 8 are the modules to be clustered and they are placed adjacent to each other in the feasible floorplan as shown. In Ref. [23], an extension of the Wong–Liu algorithm [2] is presented to handle the clustering constraint problem. The extension includes a linear-time method to locate the neighbors of a module from a normalized Polish expression and a method to swap the modules to satisfy the given clustering constraint. In each iteration of the annealing process, these two methods are used to transform a normalized Polish expression into another one that satisfies the given clustering constraint. A term measured by the sum of the center-to-center distances between the modules with the same clustering constraint is added to the cost function so that those modules are placed more closely.
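The cost term for a cluster can be illustrated as below. The text does not say which distance metric Ref. [23] uses, so Euclidean distance is an assumption here, as are the name `cluster_term` and the placement format (x, y, w, h) with (x, y) the lower left corner.

```python
from itertools import combinations
from math import hypot

def cluster_term(placement, cluster):
    """Sum of center-to-center distances over all pairs of modules in a cluster.
    Euclidean distance is assumed; the source does not fix the metric."""
    centers = []
    for name in cluster:
        x, y, w, h = placement[name]
        centers.append((x + w / 2, y + h / 2))
    return sum(hypot(ax - bx, ay - by)
               for (ax, ay), (bx, by) in combinations(centers, 2))

placement = {"a": (0, 0, 2, 2), "b": (3, 0, 2, 2), "c": (0, 4, 2, 2)}
print(cluster_term(placement, ["a", "b", "c"]))
```

In the annealing cost, this value would be multiplied by a weight and added to A + λW; the weight is a tuning parameter that the text does not specify.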
9.7 OTHER ADVANCES IN SLICING FLOORPLANS
We next address two important advances, one in the theoretical study of area-optimal slicing floorplans [25], and the other in the completeness of the slicing tree representation for general floorplans [26]. We then describe the respective algorithms [27,28] that make modern heterogeneous FPGAs and 3D ICs realizable in slicing floorplans.
One possible concern about slicing floorplans is that even the slicing floorplan with optimal area may not pack modules as tightly as a nonslicing one and hence may lead to a larger chip area. Although there is empirical evidence showing that comparable slicing floorplan results can be obtained using much less runtime [18], it is important to have a mathematical analysis that guarantees their performance.
In Ref. [25], the following area-optimal slicing floorplan design problem is addressed: given n soft modules, each with the same shape flexibility r (≥ 1), what is the minimum area among all possible slicing floorplans? A module is said to have shape flexibility r iff it can be represented by any rectangle with the same area as long as the aspect ratio of the rectangle is between 1/r and r. It is proved that if r ≥ 2, then there exists a slicing floorplan F such that area(F) ≤ min{1 + 1/⌊√r⌋, 5/4, 1 + θ} A_total, where A_total is the total area of all the modules, A_max is the maximum module area, and θ = √(2A_max/(rA_total)). The shape of such a slicing floorplan closely resembles a square. The first term in the upper bound, that is, (1 + 1/⌊√r⌋)A_total, favors larger r; for example, if r = 25, the first term is 1.2A_total. The second term gives a better bound than the first one if r ≤ 16. The third term considers the ratios of the module areas and gives a good bound when all the module areas are small compared with the total module area; for example, if r = 2 and A_max = A_total/100, the third term is 1.1A_total. In fact, the experimental results reported in Ref. [25] show that by applying the Wong–Liu slicing floorplanner [2] to more than 20 test cases, each with 100 soft modules of shape flexibility 2, slicing floorplans with areas smaller than the above-mentioned mathematical bound can be produced.
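The two numerical examples in this paragraph can be checked with a few lines of Python. The sketch assumes that the bound has the form min{1 + 1/⌊√r⌋, 5/4, 1 + θ} A_total with θ = √(2A_max/(rA_total)), which is a reading consistent with the examples given (1.2 for r = 25, 1.1 for r = 2 and A_max = A_total/100); the function name `bound_terms` is hypothetical.

```python
from math import floor, sqrt

def bound_terms(r, a_max, a_total):
    """The three candidate factors of the area bound, as multiples of a_total.
    Assumes the bound min{1 + 1/floor(sqrt(r)), 5/4, 1 + theta} for r >= 2."""
    if r < 2:
        raise ValueError("the bound is stated for r >= 2")
    theta = sqrt(2 * a_max / (r * a_total))
    return (1 + 1 / floor(sqrt(r)), 5 / 4, 1 + theta)

first, second, third = bound_terms(r=25, a_max=10, a_total=100)
print(round(first, 3))                  # first term for r = 25

terms = bound_terms(r=2, a_max=1, a_total=100)
print(round(terms[2], 3), round(min(terms), 3))   # third term dominates here
```

For r = 2 the first factor is 2 and the second is 1.25, so the third term is what makes the bound tight when all modules are small relative to the total area.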
To get each term in the upper bound, a slicing floorplan is constructed such that its area meets the bound. Each construction uses a different method to classify modules into groups based on module areas, and modules in different groups are represented by rectangles of different widths (while modules in the same group have the same width). Modules are then placed one at a time, from the one with the largest area to the one with the smallest area, by putting a module on the lowest possible level and moving it to the leftmost position on that level.
In addition, when r is slightly less than 2, that is, r = 2 − ε with ε being a small positive number, it can also be proved that there exists a slicing floorplan whose area is upper bounded by min{(5/4)[1 + (ε/2)], [1 + θ + (ε/2)]} A_total.
Although it can be mathematically proved that a slicing floorplan capable of packing modules closely is achievable, there are constraints (e.g., the same shape flexibility for all modules) that must be satisfied beforehand. On the other hand, it is still commonly believed that an area-optimal slicing floorplan may suffer from poor utilization of space when all modules are hard. For this reason, much effort has been devoted to creating nonslicing floorplan representations. These representations are effective and efficient for handling hard modules, but many of them are still unable to fully exploit the shape flexibility of soft modules.
In Ref. [26], it is proved that when augmented with a simple compaction procedure, the slicing tree representation can generate all maximally compact placements of modules. (Note that no module in a maximally compact placement, called an admissible placement in Ref. [29], can be moved horizontally to the left or vertically downward without moving any other modules.) As a result, the slicing tree is a complete representation of both slicing and nonslicing floorplans. We now describe the idea behind the proof.
Given a slicing tree, its corresponding slicing placement is defined to be the area-optimal floorplan such that no vertical cutlines can be moved to the left, no horizontal cutlines can be moved downward, and each module is placed in the lower left corner of a basic rectangle. For example, Figure 9.12f shows a slicing placement of the slicing tree in Figure 9.12d.
A horizontal adjacency graph G = (V, E) can be constructed from a placement of modules. The set V of vertices corresponds to the set of modules. There is an edge (u, v) in the edge set E iff the left boundary of v is immediately adjacent to the right boundary of u. It is clear that G is a directed acyclic graph (DAG). A vertex u in G is said to be a left-boundary vertex iff u is a module placed along the left boundary of the placement. Clearly, all left-boundary vertices have in-degree 0. In general the converse may not be true, but it is true for maximally compact placements. Another key fact is that for a maximally compact placement, all vertices in G are connected to the set of left-boundary vertices.
Given any maximally compact placement P, let G_P be the horizontal adjacency graph of P, and let B = {b1, b2, ..., b_k} be the set of left-boundary vertices in G_P. Because every vertex in G_P is reachable from at least one vertex in B, a spanning forest Q = {T1, T2, ..., T_k} of G_P can be found, where T_i is a tree rooted at b_i, i = 1, 2, ..., k. For example, Figure 9.12a gives a maximally compact placement of nine modules and Figure 9.12b shows the horizontal adjacency graph, where vertices 7, 4, and 1 are left-boundary vertices. Figure 9.12c shows a spanning forest containing three trees rooted at 7, 4, and 1, respectively.
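The construction of the horizontal adjacency graph and of a spanning forest can be sketched as follows. The example placement is a small hypothetical one (it is not the nine-module placement of Figure 9.12), coordinates are assumed to be integers so that boundaries can be compared exactly, and breadth-first search is just one way to pick the forest; Ref. [26] may choose it differently.

```python
from collections import deque

def horizontal_adjacency(placement):
    """Edges (u, v) where the right boundary of u touches the left boundary of v
    over a segment of positive length. placement maps name -> (x, y, w, h)."""
    edges = {u: [] for u in placement}
    for u, (ux, uy, uw, uh) in placement.items():
        for v, (vx, vy, vw, vh) in placement.items():
            overlap = min(uy + uh, vy + vh) - max(uy, vy)
            if u != v and ux + uw == vx and overlap > 0:
                edges[u].append(v)
    return edges

def spanning_forest(placement):
    """Roots (left-boundary modules) and a parent map describing one spanning forest."""
    edges = horizontal_adjacency(placement)
    left = min(x for x, _, _, _ in placement.values())
    roots = [u for u, (x, _, _, _) in placement.items() if x == left]
    parent = {b: None for b in roots}
    queue = deque(roots)
    while queue:                         # multi-source breadth-first search
        u = queue.popleft()
        for v in edges[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return roots, parent

# Hypothetical maximally compact placement of four modules.
placement = {"a": (0, 0, 2, 1), "b": (0, 1, 1, 1), "c": (2, 0, 1, 2), "d": (1, 1, 1, 1)}
print(horizontal_adjacency(placement))
print(spanning_forest(placement))
```

For a maximally compact placement every module ends up in the parent map, which is the reachability property stated above.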
Now for the trees rooted at b1, b2, ..., b_k, k − 1 horizontal cutlines dividing a floorplan into k parts are constructed. For each tree, the transformations shown in Figure 9.13 are then recursively applied to its child subtrees to further expand the slicing structure. Figure 9.12d and e show the final slicing tree and floorplan. From the slicing floorplan, the corresponding slicing placement P′ can be constructed (Figure 9.12f). Because the positions of the modules in the x-direction in P′ are the same as those in P, performing a compaction along the y-direction, which moves modules downward as far as possible, transforms P′ into P. This shows that the slicing tree is a complete representation of floorplans in the sense that, with an additional compaction step, all maximally compact placements can be produced as well.
FIGURE 9.12 Generating a slicing placement from a maximally compact placement. (a) A maximally compact placement of 9 modules, (b) the horizontal adjacency graph of (a), (c) a spanning forest of (b), (d) the slicing tree derived from (c), (e) the slicing floorplan derived from (c), and (f) the slicing placement derived from (d) (or (e)).
Modern FPGAs can accommodate multimillion gates, and their future generations will be even more complex. As a result, a hierarchical approach based upon partitioning and floorplanning becomes necessary to successfully realize a design on an FPGA. Owing to the heterogeneous logic and routing resources on a modern FPGA, FPGA floorplanning is very different from floorplanning for application-specific integrated circuits (ASICs). Although there are some previous works on FPGA floorplanning, for example, Ref. [30], they all target older generations of FPGAs consisting of configurable logic blocks (CLBs) only. In this subsection, we consider modern heterogeneous FPGA chips that consist of columns of CLBs, with column pairs of RAMs and multipliers interleaved between them (Figure 9.14a). The Xilinx Spartan-3 family and Virtex-II family conform to this architecture [31]. For such an FPGA architecture, the first slicing floorplan design algorithm is given in Ref. [27]. The algorithm uses slicing trees to represent floorplans during a simulated annealing process similar to Ref. [2], but it nontrivially extends the slicing floorplan area optimization algorithms [3,5] to find the optimal realization for each slicing tree. The main idea behind this extension is explained below.
FIGURE 9.13 Three types of transformations for expanding a slicing structure. (a) Type 1 transformation, applied when Z has no child, (b) Type 2 transformation, applied when Z has only one child, and (c) Type 3 transformation, applied when Z has two or more children.
FIGURE 9.14 (a) Slicing floorplan. (b) Slicing tree. (c) Pattern. (Legend: CLB, RAM, multiplier.)
Assume that a set of modules is given, where each module has an associated resource requirement vector ϕ = (m1, m2, m3), indicating that this module requires m1 CLBs, m2 RAMs, and m3 multipliers. The FPGA floorplanning problem is to place modules on the chip such that each region assigned to a module satisfies the resource requirements of the module, regions for different modules do not overlap, and a given cost function is optimized. For example, if there are six modules, and their resource requirement vectors are ϕ1 = (12, 2, 1), ϕ2 = (30, 4, 4), ϕ3 = (15, 1, 1), ϕ4 = (24, 4, 4), ϕ5 = (18, 2, 2), and ϕ6 = (30, 2, 2), then Figure 9.14a is a feasible slicing floorplan for these modules (see Figure 9.14b for the corresponding slicing tree). For easier illustration, a coordinate system is adopted on the chip. In Figure 9.14a the horizontal unit is the width of a CLB, and the vertical unit is the height of a CLB. The lower left CLB has coordinate (0, 0), the lower left RAM occupies coordinates (1, 0) through (1, 2), and the lower left multiplier occupies coordinates (2, 0) through (2, 2). Let H and W be the height and the width of the chip, respectively.
Any rectangular region r in the chip is denoted by a four-tuple (x, y, w, h), where (x, y), w, and h are the lower left coordinate, the width, and the height of r, respectively. The notations x(r), y(r), w(r), and h(r) each denote the corresponding field of r. Let R_i denote the set of rectangular regions in the chip that satisfy the resource requirements of module i. Each region in R_i is said to be a realization of module i because it is feasible to place module i in that region. A realization r1 in R_i is redundant if there is another realization r2 in R_i such that both realizations have the same lower left coordinate (i.e., x(r1) = x(r2) and y(r1) = y(r2)) and r2 is not larger than r1 in both dimensions (i.e., w(r2) ≤ w(r1) and h(r2) ≤ h(r1)). Clearly, all redundant realizations can be discarded. Let L(i, x, y) denote the irreducible realization list (IRL) for module i starting at coordinate (x, y); it contains all the irredundant realizations in R_i with (x, y) as their lower left coordinate. Therefore, all irredundant realizations of a module are organized into different IRLs for different starting coordinates. Each IRL is sorted in decreasing height order, and hence it is also sorted in increasing width order.
The definition of an IRL can be extended to the nodes in a slicing tree. Given two rectangles r1 and r2, the bounding rectangle of r1 and r2 is a rectangle r with x(r) = min{x(r1), x(r2)}, y(r) = min{y(r1), y(r2)}, w(r) = max{x(r1) + w(r1), x(r2) + w(r2)} − x(r), and h(r) = max{y(r1) + h(r1), y(r2) + h(r2)} − y(r). Given a tree node u, if u represents module i, then R_u = R_i. On the other hand, if u is an internal node, let u1 and u2 be the left and the right children of u. If u is a vertical cut (a horizontal cut, respectively), R_u consists of all bounding rectangles of r1 in R_{u1} and r2 in R_{u2}, where r1 is to the left of (below, respectively) r2. The IRL for a tree node u starting at coordinate (x, y) is defined as L(u, x, y) = {r | r is irredundant in R_u, x(r) = x, and y(r) = y}. Therefore, the set of all irredundant realizations of a tree node is also organized into different IRLs for different starting coordinates.
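The two building blocks of this definition, pruning redundant realizations and forming a bounding rectangle, can be sketched as below. Rectangles are plain (x, y, w, h) tuples, the names `irreducible_list` and `bounding_rect` are hypothetical, and the pruning routine assumes that all input realizations share the same lower left coordinate. This is only an illustration of the definitions, not the algorithm of Ref. [27].

```python
def bounding_rect(r1, r2):
    """Smallest rectangle (x, y, w, h) that encloses rectangles r1 and r2."""
    x = min(r1[0], r2[0])
    y = min(r1[1], r2[1])
    w = max(r1[0] + r1[2], r2[0] + r2[2]) - x
    h = max(r1[1] + r1[3], r2[1] + r2[3]) - y
    return (x, y, w, h)

def irreducible_list(realizations):
    """Drop redundant realizations that share one lower left coordinate.
    The result is sorted by increasing width, hence by decreasing height."""
    irl = []
    for r in sorted(set(realizations), key=lambda r: (r[2], r[3])):
        # r is kept only if it is strictly shorter than every narrower survivor
        if not irl or r[3] < irl[-1][3]:
            irl.append(r)
    return irl

candidates = [(0, 0, 4, 4), (0, 0, 2, 6), (0, 0, 3, 5), (0, 0, 3, 4), (0, 0, 6, 2)]
print(irreducible_list(candidates))
print(bounding_rect((0, 0, 2, 3), (2, 0, 3, 2)))
```

Keeping the list sorted is what makes the later combination step cheap: for a given height limit, the admissible realizations form a contiguous suffix of the list.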
Given a slicing tree, the IRLs of each node are calculated from the leaves to the root. Obviously, the IRLs of each module only need to be calculated once, right at the beginning of the simulated annealing process. Suppose u is the internal node under consideration and is a vertical cut. Let u1 and u2 be the left and the right children of u. Assume L(u1, x, y) = {r1, r2, ..., r_k} is sorted as expected. It can be shown that it is enough to combine every realization r_i in L(u1, x, y) with realizations in L(u2, x + w(r_i), y) to generate L(u, x, y). Moreover, when combining r_i with realizations in L(u2, x + w(r_i), y), we may not need to consider all combinations. For those realizations in L(u2, x + w(r_i), y) with heights not larger than h(r_i), we only need to consider the highest one to get a minimum width. We also do not need to combine r_i with a realization r in L(u2, x + w(r_i), y) if h(r) ≥ h(r_{i−1}). The procedure to construct L(u, x, y) when u is a horizontal cut can be derived similarly. It takes O(l log l) time to construct L(u, x, y), where l = max{H, W}.
The above-mentioned method, however, should not be implemented directly on the chip, because finding IRLs for every coordinate makes the space complexity formidable. Fortunately, a real FPGA chip is very regular, with repetitions of a basic pattern. Consider the example chip in Figure 9.14a, whose basic pattern is shown in Figure 9.14c. It turns out that this repetition property can be exploited so that the computation only needs to be done on the pattern instead of the whole chip. As a result, evaluating a slicing tree takes O(nml log l) time and needs O(nlm) memory space, where n is the number of modules and m is the number of points on the pattern.
Complementary metal oxide semiconductor (CMOS) technology has continuously scaled into the nanometer regime, and it has become more difficult to improve chip performance just by shrinking sizes on a planar wafer. The 3D IC is a promising technology to keep the speed of ICs advancing. 3D ICs provide significant performance benefits over two-dimensional integrated circuits (2D ICs), mainly by reducing the interconnect lengths and introducing new geometrical arrangements of modules [32]. To improve performance, circuit modules are not confined to a single layer in 3D ICs. This poses a problem for current 2D floorplanning tools. As a result, 3D floorplanning algorithms are required for 3D IC design.
A 3D slicing floorplan design problem is addressed in Ref. [28]. This problem is formulated as that of placing a given set of 3D rectangular modules without overlapping while a given cost function is optimized. Assuming each 3D module is a hard module but with free rotation, an algorithm for solving the 3D slicing floorplan design problem is reported in Ref. [28]; it generalizes slicing trees to represent different 3D floorplans and uses simulated annealing to search for a good slicing floorplan. The main idea of the algorithm is highlighted below.
A 3D slicing floorplan can be obtained by cutting a 3D block by 2D planes (which are perpendicular to the x-, y-, or z-axis) into a set of 3D subblocks such that each 3D subblock is large enough to accommodate the 3D module assigned to it (Figure 9.15a). Slicing trees can be generalized to represent 3D slicing floorplans such that each internal node of a slicing tree is now labeled by X, Y, or Z. The label X (Y, Z, respectively) means that the corresponding subfloorplan is cut by a plane that is perpendicular to the x-axis (y-axis, z-axis, respectively). Figure 9.15b gives a slicing tree to represent the 3D floorplan shown in Figure 9.15a.
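The size of the block represented by a 3D slicing tree can be computed bottom-up, in the same way as in the 2D case. The sketch below assumes hard modules in a fixed orientation (the rotation freedom considered in Ref. [28] is ignored) and encodes a tree as nested tuples (label, left, right); both the encoding and the name `block_size` are hypothetical.

```python
def block_size(tree, dims):
    """Bounding dimensions (dx, dy, dz) of a 3D slicing tree.
    A leaf is a module name; an internal node is (label, left, right) with
    label 'X', 'Y', or 'Z' naming the axis the cutting plane is perpendicular to."""
    if isinstance(tree, str):
        return dims[tree]
    label, left, right = tree
    a = block_size(left, dims)
    b = block_size(right, dims)
    axis = "XYZ".index(label)
    # sizes add up along the cut axis and take the maximum along the other two
    return tuple(a[k] + b[k] if k == axis else max(a[k], b[k]) for k in range(3))

dims = {"m1": (1, 2, 3), "m2": (2, 2, 1), "m3": (3, 1, 1)}
tree = ("Z", ("X", "m1", "m2"), "m3")
dx, dy, dz = block_size(tree, dims)
print((dx, dy, dz), dx * dy * dz)
```

The product of the three dimensions gives the volume term that a 3D annealer could use in place of the area A.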
Because a slicing tree is a full binary tree (each internal node of the tree has two children), a static array with 2n − 1 elements can be used to represent all the nodes of the tree,