Handbook of Algorithms for Physical Design Automation
FIGURE 46.19 Three critical path configurations and delay variations of a switch matrix; the surviving panel labels are (c) Config 3 and (d) Delay variation. (Based on Matsumoto, Y. et al., Proceedings of the 2007 ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays, ACM Press, New York, 2007. With permission.)
where $Y_1(\text{Target})$ is defined as

$$Y_1(\text{Target}) = \int_{-\infty}^{T_{\text{Target}}} \ldots$$
In Equation 46.12, the likelihood that all n configurations fail is subtracted from 1. In their work, they assume complete independence between critical paths in different configurations, which enables them to evaluate Equations 46.12 and 46.13 analytically. This assumption is not valid: spatial correlations exist between circuit elements, and critical paths in different configurations might share routing resources, especially close to the source and sink nodes. They propose a routing algorithm that keeps track of the routing resources used by critical paths and tries to avoid them in the configurations generated subsequently. The method is similar to the congestion avoidance procedure used in VPR; that is, resources used by critical paths in other configurations are penalized so that the router avoids them if other paths with the same delay exist.
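The effect of the independence assumption can be illustrated with a minimal sketch. It assumes that each configuration has a known single-configuration yield and that failures are independent, so the probability that all configurations fail is simply the product of the individual failure probabilities. The function name and the sample yields are hypothetical and are not taken from the cited work.

```python
from math import prod

def multi_config_yield(single_yields):
    """Probability that at least one configuration meets the timing target.

    Assumes the configurations fail independently, which the text above
    points out is not valid in practice (shared resources, spatial correlation).
    """
    # Probability that every configuration fails, subtracted from 1.
    return 1.0 - prod(1.0 - y for y in single_yields)

# Three configurations, each meeting the target with probability 0.8.
print(multi_config_yield([0.8, 0.8, 0.8]))
```

Because real critical paths are correlated, a value computed this way should be read as an optimistic estimate rather than as the actual yield.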
REFERENCES
1. J. Cong and K. Minkovich, Optimality study of logic synthesis for LUT-based FPGAs, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 26(2): 230–239, 2007.
2. D. Chen and J. Cong, DAOmap: A depth-optimal area optimization mapping algorithm for FPGA designs, in ICCAD ’04: Proceedings of the 2004 IEEE/ACM International Conference on Computer-Aided Design, pp. 752–759, IEEE Computer Society, Washington DC, 2004.
3. Berkeley Logic Synthesis and Verification Group, ABC: A system for sequential synthesis and verification. Available at http://www.eecs.berkeley.edu/∼alanmi/abc/
4. A. Mishchenko, S. Chatterjee, and R. Brayton, Improvements to technology mapping for LUT-based FPGAs, in FPGA ’06: Proceedings of the 2006 ACM/SIGDA 14th International Symposium on Field Programmable Gate Arrays, pp. 41–49, ACM Press, New York, 2006.
5. J. Cong and Y. Ding, FlowMap: An optimal technology mapping algorithm for delay optimization in lookup-table based FPGA designs, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 13(1): 1–12, 1994.
6. V. Betz and J. Rose, VPR: A new packing, placement and routing tool for FPGA research, in Field-Programmable Logic and Applications (W. Luk, P. Y. Cheung, and M. Glesner, eds.), pp. 213–222, Springer-Verlag, Berlin, Germany, 1997.
7. A. S. Marquardt, V. Betz, and J. Rose, Using cluster-based logic blocks and timing-driven packing to improve FPGA speed and density, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 37–46, 1999.
8. E. Bozorgzadeh, S. Ogrenci-Memik, and M. Sarrafzadeh, Rpack: Routability-driven packing for cluster-based FPGAs, in Proceedings of the Asia-South Pacific Design Automation Conference, Yokohama, Japan, pp. 629–634, 2001.
9. A. Singh and M. Marek-Sadowska, Efficient circuit clustering for area and power reduction in FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 59–66, 2002.
10. A. DeHon, Balancing interconnect and computation in a reconfigurable computing array (or, why you don’t really want 100% LUT utilization), in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 69–78, 1999.
11. L. Cheng and M. D. F. Wong, Floorplan design for multi-million gate FPGAs, in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, pp. 292–299, 2004.
12. Y. Sankar and J. Rose, Trading quality for compile time: Ultra-fast placement for FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, San Jose, CA, pp. 157–166, 1999.
13. J. M. Emmert and D. Bhatia, A methodology for fast FPGA floorplanning, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 47–56, 1999.
14. K. Bazargan, R. Kastner, and M. Sarrafzadeh, Fast template placement for reconfigurable computing systems, IEEE Design and Test, Special Issue on Reconfigurable Computing, 17: 68–83, January 2000.
15. E. L. Horta, J. W. Lockwood, D. E. Taylor, and D. Parlour, Dynamic hardware plugins in an FPGA with partial runtime reconfiguration, in Proceedings of the ACM/IEEE Design Automation Conference, New Orleans, LA, pp. 343–347, 2002.
16. J. Chen, J. Moon, and K. Bazargan, A reconfigurable FPGA-based readback signal generator for hard-drive read channel simulator, in Proceedings of the ACM/IEEE Design Automation Conference, New Orleans, LA, pp. 349–354, 2002.
17. M. Handa and R. Vemuri, An efficient algorithm for finding empty space for online FPGA placement, in Proceedings of the ACM/IEEE Design Automation Conference, San Diego, CA, pp. 960–965, 2004.
18. L. Singhal and E. Bozorgzadeh, Multi-layer floorplanning on a sequence of reconfigurable designs, in FPL’06: Proceedings of the 2006 International Conference on Field Programmable Logic and Applications, Madrid, 2006.
19. J. Cong, M. Romesis, and M. Xie, Optimality and stability study of timing-driven placement algorithms, in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, p. 472, 2003.
20. C.-L. E. Cheng, Risa: Accurate and efficient placement routability modeling, in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, pp. 690–695, 1994.
21. A. Marquardt, V. Betz, and J. Rose, Timing-driven placement for FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 203–213, 2000.
22. S. Nag and R. A. Rutenbar, Performance-driven simultaneous placement and routing for FPGA’s, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 17(6): 499–518, 1998.
23. P. Maidee, C. Ababei, and K. Bazargan, Timing-driven partitioning-based placement for island style FPGAs, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 24(3): 395–406, 2005.
24. S. A. Senouci, A. Amoura, H. Krupnova, and G. Saucier, Timing driven floorplanning on programmable hierarchical targets, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 85–92, 1998.
25. M. Hutton, K. Adibsamii, and A. Leaver, Timing-driven placement for hierarchical programmable logic devices, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 3–11, 2001.
26. G. Chen and J. Cong, Simultaneous timing-driven placement and duplication, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 51–59, 2005.
27. D. P. Singh and S. D. Brown, Incremental placement for layout-driven optimizations on FPGAs, in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, pp. 752–759, 2002.
28. S.-W. Hur and J. Lillis, Mongrel: Hybrid techniques for standard cell placement, in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, pp. 165–170, 2000.
29. T. J. Callahan, P. Chong, A. DeHon, and J. Wawrzynek, Fast module mapping and placement for datapaths in FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 123–132, 1998.
30. C. Ababei and K. Bazargan, Non-contiguous linear placement for reconfigurable fabrics, International Journal of Embedded Systems (IJES), special issue on Reconfigurable Architectures Workshop (RAW), 2(1/2): 86–94, 2006.
31. M. Hutton, Y. Lin, and L. He, Placement and timing for FPGAs considering variations, in FPL’06: Proceedings of the 2006 International Conference on Field Programmable Logic and Applications, Madrid, 2006.
32. L. Cheng, J. Xiong, L. He, and M. Hutton, FPGA performance optimization via chipwise placement considering process variations, in FPL’06: Proceedings of the 2006 International Conference on Field Programmable Logic and Applications, Madrid, 2006.
33. C. Visweswariah, K. Ravindran, K. Kalafala, S. G. Walker, S. Narayan, D. K. Beece, J. Piaget, N. Venkateswaran, and J. G. Hemmett, First-order incremental block-based statistical timing analysis, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 25: 2170–2180, October 2006.
34. Y. Lin and L. He, Stochastic physical synthesis for FPGAs with pre-routing interconnect uncertainty and process variation, in FPGA ’07: Proceedings of the 2007 ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays, pp. 80–88, ACM Press, New York, 2007.
35. A. Gayasen, Y. Tsai, N. Vijaykrishnan, M. Kandemir, M. J. Irwin, and T. Tuan, Reducing leakage energy in FPGAs using region-constrained placement, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 51–58, 2004.
36. Y. Lin and L. He, Leakage efficient chip-level dual-Vdd assignment with time slack allocation for FPGA power reduction, in Proceedings of the ACM/IEEE Design Automation Conference, Anaheim, CA, pp. 720–725, 2005.
37. L. McMurchie and C. Ebeling, Pathfinder: A negotiation-based performance-driven router for FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 473–482, 1995.
38. Y.-W. Chang, K. Zhu, and D. F. Wong, Timing-driven routing for symmetrical array-based FPGAs, ACM Transactions on Design Automation of Electronic Systems, 5(3): 433–450, 2000.
39. G.-J. Nam, K. A. Sakallah, and R. A. Rutenbar, A new FPGA detailed routing approach via search-based Boolean satisfiability, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 21(6): 674–684, 2002.
40. J.-M. Lin, S.-R. Pan, and Y.-W. Chang, Graph matching-based algorithms for array-based FPGA segmentation design and routing, in Proceedings of the Asia-South Pacific Design Automation Conference, Kitakyushu, Japan, pp. 851–854, 2003.
41. N. Sherwani, Algorithms for VLSI Physical Design Automation, 2nd edn., Kluwer Academic Publishers, Boston, MA, 1995.
42. K. Eguro and S. Hauck, Armada: Timing-driven pipeline-aware routing for FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 169–178, 2006.
43. P. Kannan, S. Balachandran, and D. Bhatia, On metrics for comparing routability estimation methods for FPGAs, in Proceedings of the ACM/IEEE Design Automation Conference, New Orleans, LA, pp. 70–75, 2002.
44. S. Sivaswamy and K. Bazargan, Variation-aware routing for FPGAs, in FPGA ’07: Proceedings of the 2007 ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays, pp. 71–79, ACM Press, New York, 2007.
45. Y. Matsumoto, M. Hioki, T. Kawanami, T. Tsutsumi, T. Nakagawa, T. Sekigawa, and H. Koike, Performance and yield enhancement of FPGAs with within-die variation using multiple configurations, in FPGA ’07: Proceedings of the 2007 ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays, pp. 169–177, ACM Press, New York, 2007.
Three-Dimensional Circuits
Kia Bazargan and Sachin S. Sapatnekar
CONTENTS
47.1 Introduction
47.2 Standard Cell-Based Designs
47.2.1 Thermal Vias
47.2.2 3D Floorplanning
47.2.3 3D Placement
47.2.4 Routing Algorithms
47.3 3D FPGA Designs
47.3.1 Estimation Methods
47.3.2 Placement and Routing Algorithms
47.3.2.1 Partitioning the Circuit between Tiers
47.3.2.2 Partitioning-Based Placement within Tiers
47.3.2.3 Simulated Annealing Placement Phase
References
47.1 INTRODUCTION
Recent advances in process technology have brought three-dimensional (3D) circuits into the realm of reality. This new design paradigm will require a major change from contemporary design methodologies, because an optimal 3D design has very different characteristics from an optimal 2D design. The move from conventional 2D to 3D is inherently a topological change, and therefore many of the problems that are unique to 3D circuits lie in the domain of physical design.
The essential idea of a 3D circuit is to place multiple tiers of active devices (transistors) above each other, as opposed to a conventional 2D circuit, where all transistors and gates lie in a single tier. An example of a 3D circuit is shown in Figure 47.1.
One of the primary motivators for 3D technologies is the dominant effect of interconnects in nanoscale technologies; the addition of a third dimension provides significant relief in this respect. This relief comes from reductions in the average interconnect length (in comparison with 2D implementations of the same circuit size), lower wire congestion, and denser integration, which replaces chip-to-chip interconnections with intrachip connections. In addition, the increased packing density improves the computation per unit volume.
FIGURE 47.1 Schematic of a 3D integrated circuit. (Labels: intratier wires, devices, intertier via, silicon substrate, Tiers 1 through 4.)
For instance, Figure 47.2 shows a 2D layout on a chip of dimension 2L × 2L on the left, where the longest (nondetoured) wire, going from one end of the layout to the other, has a length of 4L. If this design is built on four tiers, as shown at right, assuming the same total silicon area and a square aspect ratio for each tier, the silicon area in each tier is L × L. Therefore, the longest possible undetoured wirelength, going from one end in the lowest tier to the other end in the uppermost tier, is approximately 2L (because the intertier thickness is negligible). Because the delay of a buffered two-pin interconnect is proportional to its length, this implies that the delay is halved. Moreover, the reduced wire lengths also reduce the likelihood of congestion bottlenecks, potentially reducing the need to detour wires. A more precise distribution of the wirelength has been reported in Ref. [1], which shows that the histogram of wirelength distributions moves progressively to the left as the number of tiers is increased.
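The wirelength argument above can be checked with a short calculation. The sketch below assumes a square die that is folded into equally sized square tiers of the same total area, measures wires by Manhattan distance, and neglects the intertier thickness, as the text does. The function name is hypothetical.

```python
import math

def longest_wire(total_side, tiers):
    """Longest undetoured Manhattan wire, corner to opposite corner.

    total_side: side length of the original square 2D die.
    tiers: number of equal square tiers sharing the same total silicon area.
    The intertier (z) distance is neglected.
    """
    tier_side = total_side / math.sqrt(tiers)
    return 2.0 * tier_side

L = 1.0
print(longest_wire(2 * L, 1))  # 2D layout of side 2L: length 4L
print(longest_wire(2 * L, 4))  # four tiers of side L: length 2L
```

Under these assumptions the longest wire shrinks with the square root of the number of tiers, which matches the halving from 4L to 2L for four tiers.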
In addition, 3D designs can enable new paradigms, for example, heterogeneous integration, where each tier could be a different material (e.g., a silicon-based circuit on one tier and a GaAs-based circuit on another). Even for purely silicon-based circuits, 3D designs permit analog/RF and digital circuits to be built on different tiers, which improves their noise behavior; additionally, it is possible to construct shielding structures such as Faraday cages between tiers for enhanced noise reduction.
Various flavors of 3D technologies have been proposed and are in use. One of the simplest forms involves wafer stacking, where the distance between active devices in the third dimension (or the “z dimension”) equals the thickness of a wafer. However, the thickness of a wafer is of the order of several hundreds of microns, and the full potential of 3D is not achieved by this approach because of the long distance that a wire must traverse in the z dimension. Further progress has resulted in the development of integrated 3D circuits in industrial [2], government [3], and academic [4] settings, which have demonstrated 3D designs with intertier separations of the order of a few microns.
Today, it is only possible to build a few tiers in the third dimension, as a result of which many of these technologies are often referred to as 2.5D rather than fully 3D. Nevertheless, even the half dimension can provide substantial performance improvements, and perhaps future technological improvements will enable truly 3D integration.
In this chapter, we present an overview of physical design technologies for 3D circuits. We begin with a brief overview of a typical 3D technology, and then discuss physical design problems in the custom/ASIC design as well as the FPGA paradigms. Generally speaking, the number of tiers is taken as a technology input by the 3D tools described in this chapter.
FIGURE 47.2 Comparison of the maximum wirelength in a 2D layout of dimension 2L × 2L (left) and in its 3D counterpart with tiers of dimension L × L (right). For clarity, the intertier thicknesses in the 3D circuit are shown to be exaggeratedly large.
47.2 STANDARD CELL-BASED DESIGNS
A typical cell-based flow begins with a floorplanning step, where the system is laid out at the level of macroblocks, followed by detailed placement of the cells in the layout, and then routing. In the 3D context, each of these steps must be modified to adapt to the constraints imposed by 3D circuits. In addition to conventional metrics, 3D-specific geometrical considerations must be used, for example, for wirelength metrics. In addition, temperature is treated as a first-class citizen during these optimizations.∗ Moreover, intertier via reduction is considered to be a desirable goal, because the number of available vias is restricted and must be shared between signal nets and supply and clock nets.
In addition to floorplanning, placement, and routing, a 3D-specific optimization that makes the temperature distribution more uniform is the judicious positioning of thermal vias within the layout. These vias are intertier metal connections that have no electrical function but instead constitute a passive cooling technology that draws heat from the problem areas to the heat sink. Thermal via insertion can be built into each of these steps or performed as an independent postprocessing step, depending on the design methodology.
It is instructive to view the result of a typical 3D thermally aware placement [5]: a layout for the benchmark circuit ibm01 in a four-tier 3D process is displayed in Figure 47.3. The cells are positioned in ordered rows on each tier, and the layout in each individual tier looks similar to a 2D standard cell layout. The heat sink is placed at the bottom of the 3D chip, and the lighter shaded regions are hotter than the darker shaded regions. The coolest cells are those in the bottom tier, next to the heat sink, and the temperature increases as we move to higher tiers. The thermal placement method consciously mitigates the temperature by making the upper tiers sparser than the lower tiers, in terms of the percentage of area populated by the cells.
47.2.1 THERMAL VIAS
Although silicon is a good thermal conductor, with half or more of the conductivity of typical metals, many of the materials used in 3D technologies are strong insulators that place severe restrictions on the amount of heat that can be removed, even under the best placement solution. These materials include the epoxy bonding materials used to attach 3D tiers, field oxide, and the insulator in an SOI technology. Therefore, the use of deliberate metal lines that serve as heat-removing channels, called thermal vias, is an important ingredient of the total thermal solution. The second step in the flow determines the optimal positions of thermal vias in the placement that provide an overall improvement in the temperature distribution. In realistic 3D technologies, the footprints of these intertier vias are of the order of 5 × 5 µm.
FIGURE 47.3 Placement for the benchmark ibm01 in a four-tier 3D technology; the shading scale runs from hot to cool. (From Ababei, C., et al., IEEE Design and Test, 22, 520, 2005. Copyright IEEE. With permission.)
∗ A description of techniques for thermal analysis is provided in Section 3.4.
In principle, the problem of placing thermal vias can be viewed as one of choosing between two conductivities (corresponding to the presence or absence of metal) at every candidate point where a thermal via may be placed in the chip. In practice, however, such an approach leads to an extremely large search space that is exponential in the number of possible positions; note that the set of possible positions is itself extremely large.
Quite apart from the size of the search space, such an approach is unrealistic for several other reasons. First, the wanton addition of thermal vias in arbitrary regions of the layout would lead to nightmares for a router, which would have to navigate around these blockages. Second, from a practical standpoint, it is unreasonable to perform full-chip thermal analysis, particularly in the inner loop of an optimizer, at the granularity of individual thermal vias. At this level of detail, individual elements would have to correspond to the size of a thermal via, and the size of the thermal simulation matrix would become extremely large.
Fortunately, there are reasonable ways to overcome each of these issues. The blockage problem may be controlled by enforcing discipline within the design, designating a specific set of areas within the chip as potential thermal via sites. These could be chosen as specific interrow regions in the cell-based layout, and the optimizer would determine the density with which these are filled with thermal vias. The advantage to the router is obvious: only these regions are potential blockages, which is much easier to handle. To control the finite element analysis (FEA) stiffness matrix size, one could work with a two-level scheme with relatively large elements, where the average thermal conductivity of each region is a design variable. Once this average conductivity is chosen, it could be translated back into a precise distribution of thermal vias within the element that achieves that average conductivity.
Various published methods take different approaches to thermal via insertion. We now describe an algorithm for post facto thermal via insertion [6]; other procedures that perform thermal via insertion during floorplanning, placement, or routing are discussed in the appropriate sections.
For a given placed 3D circuit, an iterative method was developed in which, during each iteration, the thermal conductivities of certain FEA elements (thermal via regions) are incrementally modified so that thermal problems are reduced or eliminated. Thermal vias are generically added to elements to achieve the desired thermal conductivities. The goal of this method is to satisfy given thermal requirements using as few thermal vias as possible, that is, keeping the thermal conductivities as low as possible.
The approach uses the finite element equations to determine a target thermal conductivity. A key observation in this work is that the insertion of thermal vias is most useful in areas with a high thermal gradient, rather than areas with a high temperature. Effectively, the thermal via acts as a pipe that allows the heat to be conducted from the higher temperature region to the lower temperature region; this, in turn, leads to temperature reductions in areas of high temperature. This is illustrated in Figure 47.4, which shows the 3D layout of the benchmark struct before and after the addition of thermal vias. The hottest region is the center of the uppermost tier, and a major reason for its elevated temperature is that the tier below it is hot. Adding thermal vias to remove heat from the second tier therefore also significantly reduces the temperature of the top tier. For this reason, the regions where the insertion of thermal vias is most effective are those that have high thermal gradients.
Therefore, the method in Ref. [6] employs an iterative update formula of the type

$$K_i^{\text{new}} = K_i^{\text{old}} \left( \frac{g_i^{\text{old}}}{g_{i,\text{ideal}}} \right)$$

where $K_i^{\text{new}}$ and $K_i^{\text{old}}$ are, respectively, the new and old thermal conductivities in each direction, that is, the values after and before each iteration, $g_i^{\text{old}}$ is the old thermal gradient, and $g_{i,\text{ideal}}$ is a heuristically selected ideal thermal gradient.
FIGURE 47.4 Thermal profile of struct before (left) and after (right) thermal via insertion. The top four layers of the figure at right correspond to the four layers in the figure at left. (From Goplen, B. and Sapatnekar, S. S., IEEE Transactions on Computer-Aided Design, 26, 692, 2006. Copyright IEEE. With permission.)
Each iteration begins with a distribution of the thermal vias; this distribution is corrected using the above update formula, and the $K_i^{\text{new}}$ value is then translated to a thermal via density, and then to a precise layout of thermal vias, using precharacterization. The iterations end when the desired temperature profile is achieved. This general iterative framework has also been used in several other published techniques that insert thermal vias either as an independent step or concurrently with another optimization, such as floorplanning, placement, or routing, as described in succeeding sections.
47.2.2 3D FLOORPLANNING
The 3D floorplanning problem is analogous to the 2D problem discussed in Chapters 8 through 13, with all the constraints and opportunities that arise with the move to the third dimension. Typical cost functions include a mix of the conventional wirelength and total area costs, together with the temperature and the number of intertier vias.
The work in Ref. [7], one of the first approaches to 3D floorplanning, uses the transitive closure graph (TCG) representation [8], described in Section 11.7, for each tier, and a bucket structure for the third dimension. Each bucket represents a 2D region over all tiers and stores, for each tier, the indices of the blocks that intersect that bucket. Together, the TCGs and this bucket structure can quickly determine any adjacency information. A simulated annealing engine is then utilized, with the moves corresponding to perturbations within a tier and across tiers; in each case, the corresponding TCGs and buckets are updated as necessary.
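The bucket structure described above can be sketched as a dictionary keyed by 2D grid cell, with one list of block indices per tier. This is only an illustrative data layout under assumed conventions (axis-aligned rectangular blocks, a uniform square grid, half-open block extents); the names are hypothetical and the TCG part of the representation is not modeled.

```python
import math

def build_buckets(blocks, bucket_size):
    """Map each bucket (bx, by) to {tier: [indices of blocks intersecting it]}.

    blocks: list of (x, y, width, height, tier) with positive width and height.
    bucket_size: side length of each square bucket.
    """
    buckets = {}
    for idx, (x, y, w, h, tier) in enumerate(blocks):
        bx0 = int(x // bucket_size)
        by0 = int(y // bucket_size)
        # Last bucket touched by the half-open extent [x, x + w).
        bx1 = math.ceil((x + w) / bucket_size) - 1
        by1 = math.ceil((y + h) / bucket_size) - 1
        for bx in range(bx0, bx1 + 1):
            for by in range(by0, by1 + 1):
                per_tier = buckets.setdefault((bx, by), {})
                per_tier.setdefault(tier, []).append(idx)
    return buckets

# Block 0 on tier 0 and block 1 on tier 1 overlap in bucket (0, 0).
blocks = [(0, 0, 2, 2, 0), (1, 1, 2, 2, 1)]
print(build_buckets(blocks, bucket_size=2)[(0, 0)])
```

Looking up one bucket then gives the blocks stacked over the same 2D region on different tiers, which is the kind of vertical adjacency query the text refers to.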
A simple thermal analysis procedure is built into this solution, using a finite difference approximation of the thermal network to build an RC thermal network. Under the assumption that heat flows purely in the z direction and there is no lateral heat conduction, the RC model obtained from a finite difference approximation has a tree structure, and Elmore-like computations (Section 47.3.1) can be performed to determine the temperature. The optimization heuristically attempts to make this a self-fulfilling assumption by introducing a cost function parameter that discourages strong horizontal gradients, and hence lateral heat conduction. A hybrid approach performs an exact thermal analysis once every 20 iterations or so and uses the approximate approach for the other iterations.
The work in Ref. [9] expands the idea of thermally driven floorplanning by integrating thermal via insertion into the simulated annealing procedure. A thermal analysis procedure based on random walks [10] is built into the method, and an iterative formula, similar to that of Ref. [6], is used in a thermal-via insertion step between successive simulated annealing iterations.
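To illustrate why the z-only heat flow assumption makes the analysis cheap, the sketch below evaluates steady-state temperatures along a single vertical column of tiers. It assumes a simple chain of thermal resistances from the heat sink upward, with all heat leaving through the sink; this is an illustrative simplification with hypothetical names, not the exact model of Ref. [7].

```python
def column_temperatures(powers, resistances, t_sink=0.0):
    """Steady-state temperatures of a vertical stack with z-only heat flow.

    powers[j]:      power dissipated in tier j (tier 0 is next to the heat sink)
    resistances[j]: thermal resistance between tier j and the tier (or sink) below
    Each resistance carries the total power of all tiers at or above it,
    so one pass over the chain suffices (an Elmore-like accumulation).
    """
    temps = []
    heat_through = sum(powers)  # heat crossing the resistance nearest the sink
    t = t_sink
    for p, r in zip(powers, resistances):
        t += r * heat_through   # temperature rise across this resistance
        temps.append(t)
        heat_through -= p       # tiers above no longer include this tier's power
    return temps

print(column_temperatures([1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]))
```

With equal powers and resistances the temperature rises monotonically away from the sink, consistent with the earlier observation that the upper tiers are the hottest.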
47.2.3 3D PLACEMENT
In the placement step, the precise positions of cells in a layout are determined, and they are arranged in rows within the tiers of the 3D circuit. Because thermal considerations are particularly important in 3D cell-based circuits, this procedure must spread the cells to achieve a reasonable temperature distribution, while also capturing traditional placement requirements.
Several approaches to 3D placement have been proposed in the literature. The work in Ref. [11] embeds the netlist hypergraph into the layout area. A recursive bipartitioning procedure is used to assign nodes of the hypergraph to partitions, using mincut as the primary objective and under partition capacity constraints. Partitioning in the z direction corresponds to tier assignment, and xy partitions to assigning standard cells to rows. No thermal considerations are taken into account.
The procedure in Ref. [5] presents a 3D-specific force-directed placer that incorporates thermal objectives directly into the placer. Instead of the finite difference method that is used in many floorplanners, this approach employs FEA, which discretizes the design space into regions known as elements. For rectangular structures of the type encountered in integrated circuits, a rectangular cuboidal element can simulate heat conduction in the lateral directions without aberrations in the prime directions. As described in Chapter 3, FEA results in a matrix equation of the type

$$K T = P$$

where $T$ is the vector of nodal temperatures and $P$ is the vector of power (heat source) terms. The left-hand-side matrix, K, known as the global stiffness matrix, can be constructed using stamps for the finite elements and the boundary conditions. The FEA equations are solved rapidly using an iterative linear solver, with clever adjustments of the convergence criteria to achieve greater or lesser accuracy, as required at different stages of the iterative placement process.
The placement engine is based on a force-directed approach, the key idea of which is described in Chapter 18. Attractive forces are created between interconnected cells, and these are proportional to the quadratic function of the cell coordinates that represents the Euclidean distance between the blocks. The constants of proportionality are chosen to be higher in the z direction to discourage intertier vias.
Apart from design criteria such as cell overlap, in the 3D context, thermal criteria are also used to generate repulsive forces, to prevent hot spots. The temperature gradient (which itself can be related to the stiffness matrix and its derivative) is used to determine the magnitudes and directions of these forces.
Once the entire system of attractive and repulsive forces is generated, the system is solved for the minimum energy state, that is, the equilibrium location. Ideally, this minimizes the wirelengths while at the same time satisfying the other design criteria such as the temperature distribution. The main loop of the iterative force-directed approach proceeds as follows. Initially, forces are updated based on the previous placement. Using these new forces, the cell positions are then calculated. These two steps of calculating forces and finding cell positions are repeated until the exit criteria are satisfied. The specifics of the force-directed approach to thermal placement, including the mathematical details, are presented in Ref. [5]. Once the iterations converge, a final postprocessing step is used to legalize the placement. Even though forces have been added to discourage overlaps, the force-directed engine solves the problem in the continuous domain, and the task of legalization is to align cells to tiers, and to rows within each tier.
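The force-and-position loop can be illustrated with a deliberately small toy. The sketch below places cells along one axis using only attractive spring forces with quadratic energy, for which the zero-force position of a cell is the weighted average of its neighbors; it repeats that update until the positions settle. Repulsive overlap and thermal forces, the z-direction weighting, and legalization are all omitted, and every name here is hypothetical rather than taken from Ref. [5].

```python
def relax_positions(positions, fixed, springs, iterations=200):
    """Find the equilibrium of a 1D spring system by repeated local updates.

    positions: dict mapping cell name to its starting coordinate
    fixed:     set of names (for example, pads) that must not move
    springs:   list of (cell_a, cell_b, weight) attractive connections
    """
    pos = dict(positions)
    for _ in range(iterations):
        for cell in pos:
            if cell in fixed:
                continue
            weighted_sum = 0.0
            total_weight = 0.0
            for a, b, w in springs:
                if a == cell:
                    weighted_sum += w * pos[b]
                    total_weight += w
                elif b == cell:
                    weighted_sum += w * pos[a]
                    total_weight += w
            if total_weight > 0.0:
                # Zero net force: the cell sits at the weighted mean of its neighbors.
                pos[cell] = weighted_sum / total_weight
    return pos

start = {"pad_left": 0.0, "c1": 0.0, "c2": 0.0, "pad_right": 3.0}
springs = [("pad_left", "c1", 1.0), ("c1", "c2", 1.0), ("c2", "pad_right", 1.0)]
print(relax_positions(start, {"pad_left", "pad_right"}, springs))
```

In this example the two movable cells spread evenly between the fixed pads. Raising the weight of a spring pulls its endpoints closer, which is the same mechanism the text describes for discouraging intertier vias with larger constants in the z direction.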
Another method, in Ref. [12], maps an existing 2D placement to a 3D placement through transformations based on dividing the layout into $2^k$ regions, for integer values of k, and then defining local transformations to heuristically refine the layout.
More recent work in Ref. [13] observes that because 3D layouts have very limited flexibility in the third dimension (with a small number of layers and a fixed set of discrete locations), partitioning works better than a force-directed method. Accordingly, this work performs global placement using recursive bisectioning. Thermal effects are incorporated through thermal resistance reduction nets, which are attractive forces that induce high-power nets to remain close to the heat sink. The global placement step is followed by coarse legalization, in which a novel cell-shifting approach is proposed. This generalizes the methods in FastPlace, described in Chapter 18, by allowing shift moves to adjust the boundaries of both sparsely and densely populated cells using a computationally simple method. Finally, detailed legalization generates a final nonoverlapping layout. The approach is shown to provide excellent trade-offs between parameters such as the number of interlayer vias, wirelength, and temperature.
47.2.4 ROUTING ALGORITHMS
During routing, several objectives and constraints must be taken into consideration, including avoiding blockages due to areas occupied by thermal vias, incorporating the effect of temperature on the delays of the routed wires, and, of course, traditional objectives such as wirelength, timing, congestion, and routing completion.
Once the cells have been placed and the locations of the thermal vias determined, the routing stage finds the optimal interconnections between the wires. As in 2D routing, it is important to optimize the wirelength, the delay, and the congestion. In addition, several 3D-specific issues come into play. First, the delay of a wire increases with its temperature, so that more critical wires should avoid the hottest regions as far as possible. Second, intertier vias are a valuable resource that must be optimally allocated among the nets. Third, congestion management and blockage avoidance are more complex with the addition of a third dimension. For instance, a signal via or thermal via that spans two or more tiers constitutes a blockage that wires must navigate around.
Consider the problem of routing in a three-tier technology, as illustrated in Figure 47.5. The layout is gridded into rectangular tiles, each with a horizontal and vertical capacity that determines the number of wires that can traverse the tile, and an intertier via capacity that determines the number of free vias available in that tile. These capacities account for the resources allocated for nonsignal wires (e.g., power and clock wires) as well as the resources used by thermal vias. For a single net, as shown in the figure, the available degrees of freedom lie in choosing the locations of the intertier vias and in selecting the precise routes within each tier. The locations of intertier vias will depend on the resource contention for vias within each tile. Moreover, critical wires should avoid the high-temperature tiles as far as possible.
The work in Ref. [14] presents a thermally conscious router that uses a multilevel routing paradigm similar to Refs. [15,16], with integrated intertier via planning, and incorporates thermal considerations. An initial routing solution is constructed by building a 3D minimum spanning tree (MST) for each multipin net, and using maze routing to avoid obstacles.
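The initial 3D MST construction can be sketched with Prim's algorithm over the pins of one net. The distance measure used here, Manhattan distance within a tier plus a per-tier via cost, is an assumption made for illustration, as are the function and parameter names; obstacle avoidance by maze routing is not included.

```python
def mst_3d(pins, via_cost=1.0):
    """Prim's algorithm on pins given as (x, y, tier) tuples.

    Returns (total_cost, edges), where each edge is a pair of pin indices
    (parent, child) in the order the pins were attached to the tree.
    """
    def dist(a, b):
        # Assumed cost: Manhattan distance plus a penalty per tier crossed.
        return abs(a[0] - b[0]) + abs(a[1] - b[1]) + via_cost * abs(a[2] - b[2])

    if not pins:
        return 0.0, []
    # best[i] = (cheapest known connection cost from the tree to pin i, parent)
    best = {i: (dist(pins[0], pins[i]), 0) for i in range(1, len(pins))}
    total, edges = 0.0, []
    while best:
        i = min(best, key=lambda k: best[k][0])
        d, parent = best.pop(i)
        total += d
        edges.append((parent, i))
        for j in best:
            dj = dist(pins[i], pins[j])
            if dj < best[j][0]:
                best[j] = (dj, i)
    return total, edges

pins = [(0, 0, 0), (2, 0, 0), (2, 2, 1), (0, 2, 2)]
print(mst_3d(pins))
```

Raising `via_cost` makes tier crossings less attractive, which is one simple way to reflect the scarcity of intertier vias in the initial tree.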
At each level of the multilevel scheme, the intertier via planning problem assigns vias in a given region at level k − 1 of the multilevel hierarchy to tiles at level k. The problem is formulated as
FIGURE 47.5 Example route for a net in a three-tier 3D technology, spanning Tiers 1 through 3. (From Ababei, C., et al., IEEE Design and Test, 22, 520, 2005. Copyright IEEE. With permission.)