Handbook of Algorithms for Physical Design Automation, Part 101


FIGURE 46.19 Three critical path configurations and delay variations of a switch matrix. Recoverable panel titles: (c) Config 3, (d) Delay variation. (Based on Matsumoto, Y. et al., Proceedings of the 2007 ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays, ACM Press, New York, 2007. With permission.)

where $Y_1(\mathrm{Target})$ is defined as

$$Y_1(\mathrm{Target}) = \int_{-\infty}^{T_{\mathrm{Target}}} \cdots$$

In Equation 46.12, the likelihood that all n configurations fail is subtracted from 1. In their work, they assume complete independence between critical paths in different configurations, which enables them to analytically evaluate Equations 46.12 and 46.13. This assumption is not valid, as we know spatial correlations exist between circuit elements, and critical paths across different configurations might also share routing resources, especially close to the source and sink nodes. They propose a routing algorithm that keeps track of the usage of routing resources by critical paths and tries to avoid them in consecutive configurations that are generated. The method is similar to the congestion avoidance procedure used in VPR, that is, resources that are used by critical paths in other configurations are penalized so that the router avoids them if other paths with the same delay exist.
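The independence-based calculation described above can be sketched in a few lines of Python. This is only an illustration of "one minus the probability that every configuration fails"; it is not a reproduction of Equation 46.12, and the function name and the sample yields are hypothetical.

```python
from math import prod


def multi_config_yield(single_yields):
    """Probability that at least one configuration meets the timing target.

    Assumes failures are independent across configurations, which is the
    simplifying assumption the text criticizes. single_yields[i] plays the
    role of Y1(Target) for configuration i (illustrative naming only).
    """
    for y in single_yields:
        if not 0.0 <= y <= 1.0:
            raise ValueError("each yield must lie in [0, 1]")
    # Probability that every configuration fails, subtracted from 1.
    return 1.0 - prod(1.0 - y for y in single_yields)


if __name__ == "__main__":
    # Hypothetical per-configuration yields for three configurations.
    print(multi_config_yield([0.6, 0.6, 0.6]))
```

When critical paths are positively correlated, for example because they share routing resources, the real benefit of extra configurations is typically smaller than this estimate suggests.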

REFERENCES

1. J. Cong and K. Minkovich, Optimality study of logic synthesis for LUT-based FPGAs, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 26(2): 230–239, 2007.
2. D. Chen and J. Cong, DAOmap: A depth-optimal area optimization mapping algorithm for FPGA designs, in ICCAD ’04: Proceedings of the 2004 IEEE/ACM International Conference on Computer-Aided Design, pp. 752–759, IEEE Computer Society, Washington DC, 2004.
3. Berkeley Logic Synthesis and Verification Group, ABC: A system for sequential synthesis and verification. Available at http://www.eecs.berkeley.edu/~alanmi/abc/
4. A. Mishchenko, S. Chatterjee, and R. Brayton, Improvements to technology mapping for LUT-based FPGAs, in FPGA ’06: Proceedings of the 2006 ACM/SIGDA 14th International Symposium on Field Programmable Gate Arrays, pp. 41–49, ACM Press, New York, 2006.
5. J. Cong and Y. Ding, FlowMap: An optimal technology mapping algorithm for delay optimization in lookup-table based FPGA designs, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 13(1): 1–12, 1994.


6. V. Betz and J. Rose, VPR: A new packing, placement and routing tool for FPGA research, in Field-Programmable Logic and Applications (W. Luk, P. Y. Cheung, and M. Glesner, eds.), pp. 213–222, Springer-Verlag, Berlin, Germany, 1997.
7. A. S. Marquardt, V. Betz, and J. Rose, Using cluster-based logic blocks and timing-driven packing to improve FPGA speed and density, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 37–46, 1999.
8. E. Bozorgzadeh, S. Ogrenci-Memik, and M. Sarrafzadeh, Rpack: Routability-driven packing for cluster-based FPGAs, in Proceedings of the Asia-South Pacific Design Automation Conference, Yokohama, Japan, pp. 629–634, 2001.
9. A. Singh and M. Marek-Sadowska, Efficient circuit clustering for area and power reduction in FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 59–66, 2002.
10. A. DeHon, Balancing interconnect and computation in a reconfigurable computing array (or, why you don’t really want 100% LUT utilization), in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 69–78, 1999.
11. L. Cheng and M. D. F. Wong, Floorplan design for multi-million gate FPGAs, in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, pp. 292–299, 2004.
12. Y. Sankar and J. Rose, Trading quality for compile time: Ultra-fast placement for FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, San Jose, CA, pp. 157–166, 1999.
13. J. M. Emmert and D. Bhatia, A methodology for fast FPGA floorplanning, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 47–56, 1999.
14. K. Bazargan, R. Kastner, and M. Sarrafzadeh, Fast template placement for reconfigurable computing systems, IEEE Design and Test, Special Issue on Reconfigurable Computing, 17: 68–83, January 2000.
15. E. L. Horta, J. W. Lockwood, D. E. Taylor, and D. Parlour, Dynamic hardware plugins in an FPGA with partial runtime reconfiguration, in Proceedings of the ACM/IEEE Design Automation Conference, New Orleans, LA, pp. 343–347, 2002.
16. J. Chen, J. Moon, and K. Bazargan, A reconfigurable FPGA-based readback signal generator for hard-drive read channel simulator, in Proceedings of the ACM/IEEE Design Automation Conference, New Orleans, LA, pp. 349–354, 2002.
17. M. Handa and R. Vemuri, An efficient algorithm for finding empty space for online FPGA placement, in Proceedings of the ACM/IEEE Design Automation Conference, San Diego, CA, pp. 960–965, 2004.
18. L. Singhal and E. Bozorgzadeh, Multi-layer floorplanning on a sequence of reconfigurable designs, in FPL’06: Proceedings of the 2006 International Conference on Field Programmable Logic and Applications, Madrid, 2006.
19. J. Cong, M. Romesis, and M. Xie, Optimality and stability study of timing-driven placement algorithms, in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, p. 472, 2003.
20. C.-L. E. Cheng, RISA: Accurate and efficient placement routability modeling, in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, pp. 690–695, 1994.
21. A. Marquardt, V. Betz, and J. Rose, Timing-driven placement for FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 203–213, 2000.
22. S. Nag and R. A. Rutenbar, Performance-driven simultaneous placement and routing for FPGA’s, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 17(6): 499–518, 1998.
23. P. Maidee, C. Ababei, and K. Bazargan, Timing-driven partitioning-based placement for island style FPGAs, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 24(3): 395–406, 2005.
24. S. A. Senouci, A. Amoura, H. Krupnova, and G. Saucier, Timing driven floorplanning on programmable hierarchical targets, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 85–92, 1998.
25. M. Hutton, K. Adibsamii, and A. Leaver, Timing-driven placement for hierarchical programmable logic devices, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 3–11, 2001.
26. G. Chen and J. Cong, Simultaneous timing-driven placement and duplication, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 51–59, 2005.


27. D. P. Singh and S. D. Brown, Incremental placement for layout-driven optimizations on FPGAs, in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, pp. 752–759, 2002.
28. S.-W. Hur and J. Lillis, Mongrel: Hybrid techniques for standard cell placement, in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, pp. 165–170, 2000.
29. T. J. Callahan, P. Chong, A. DeHon, and J. Wawrzynek, Fast module mapping and placement for datapaths in FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 123–132, 1998.
30. C. Ababei and K. Bazargan, Non-contiguous linear placement for reconfigurable fabrics, International Journal of Embedded Systems (IJES), special issue on Reconfigurable Architectures Workshop (RAW), 2(1/2): 86–94, 2006.
31. M. Hutton, Y. Lin, and L. He, Placement and timing for FPGAs considering variations, in FPL’06: Proceedings of the 2006 International Conference on Field Programmable Logic and Applications, Madrid, 2006.
32. L. Cheng, J. Xiong, L. He, and M. Hutton, FPGA performance optimization via chipwise placement considering process variations, in FPL’06: Proceedings of the 2006 International Conference on Field Programmable Logic and Applications, Madrid, 2006.
33. C. Visweswariah, K. Ravindran, K. Kalafala, S. G. Walker, S. Narayan, D. K. Beece, J. Piaget, N. Venkateswaran, and J. G. Hemmett, First-order incremental block-based statistical timing analysis, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 25: 2170–2180, October 2006.
34. Y. Lin and L. He, Stochastic physical synthesis for FPGAs with pre-routing interconnect uncertainty and process variation, in FPGA ’07: Proceedings of the 2007 ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays, pp. 80–88, ACM Press, New York, 2007.
35. A. Gayasen, Y. Tsai, N. Vijaykrishnan, M. Kandemir, M. J. Irwin, and T. Tuan, Reducing leakage energy in FPGAs using region-constrained placement, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 51–58, 2004.
36. Y. Lin and L. He, Leakage efficient chip-level dual-Vdd assignment with time slack allocation for FPGA power reduction, in Proceedings of the ACM/IEEE Design Automation Conference, Anaheim, CA, pp. 720–725, 2005.
37. L. McMurchie and C. Ebeling, PathFinder: A negotiation-based performance-driven router for FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 473–482, 1995.
38. Y.-W. Chang, K. Zhu, and D. F. Wong, Timing-driven routing for symmetrical array-based FPGAs, ACM Transactions on Design Automation of Electronic Systems, 5(3): 433–450, 2000.
39. G.-J. Nam, K. A. Sakallah, and R. A. Rutenbar, A new FPGA detailed routing approach via search-based Boolean satisfiability, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 21(6): 674–684, 2002.
40. J.-M. Lin, S.-R. Pan, and Y.-W. Chang, Graph matching-based algorithms for array-based FPGA segmentation design and routing, in Proceedings of the Asia-South Pacific Design Automation Conference, Kitakyushu, Japan, pp. 851–854, 2003.
41. N. Sherwani, Algorithms for VLSI Physical Design Automation, 2nd edn., Kluwer Academic Publishers, Boston, MA, 1995.
42. K. Eguro and S. Hauck, Armada: Timing-driven pipeline-aware routing for FPGAs, in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, CA, pp. 169–178, 2006.
43. P. Kannan, S. Balachandran, and D. Bhatia, On metrics for comparing routability estimation methods for FPGAs, in Proceedings of the ACM/IEEE Design Automation Conference, New Orleans, LA, pp. 70–75, 2002.
44. S. Sivaswamy and K. Bazargan, Variation-aware routing for FPGAs, in FPGA ’07: Proceedings of the 2007 ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays, pp. 71–79, ACM Press, New York, 2007.
45. Y. Matsumoto, M. Hioki, T. Kawanami, T. Tsutsumi, T. Nakagawa, T. Sekigawa, and H. Koike, Performance and yield enhancement of FPGAs with within-die variation using multiple configurations, in FPGA ’07: Proceedings of the 2007 ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays, pp. 169–177, ACM Press, New York, 2007.


Three-Dimensional Circuits

Kia Bazargan and Sachin S. Sapatnekar

CONTENTS

47.1 Introduction
47.2 Standard Cell-Based Designs
47.2.1 Thermal Vias
47.2.2 3D Floorplanning
47.2.3 3D Placement
47.2.4 Routing Algorithms
47.3 3D FPGA Designs
47.3.1 Estimation Methods
47.3.2 Placement and Routing Algorithms
47.3.2.1 Partitioning the Circuit between Tiers
47.3.2.2 Partitioning-Based Placement within Tiers
47.3.2.3 Simulated Annealing Placement Phase
References

47.1 INTRODUCTION

Recent advances in process technology have brought three-dimensional (3D) circuits to the realm of reality. This new design paradigm will require a major change from contemporary design methodologies, because an optimal 3D design has very different characteristics from an optimal 2D design. The move from conventional 2D to 3D is inherently a topological change, and therefore, many of the problems that are unique to 3D circuits lie in the domain of physical design.

The essential idea of a 3D circuit is to place multiple tiers of active devices (transistors) above each other, as opposed to a conventional 2D circuit where all transistors and gates lie in a single tier. An example of a 3D circuit is shown in Figure 47.1.

One of the primary motivators for 3D technologies is related to the dominant effects of interconnects in nanoscale technologies, and the addition of a third dimension provides significant relief in this respect. This is achieved by reductions in the average interconnect lengths (in comparison with 2D implementations, for the same circuit size), lower wire congestion, as well as by denser integration, which results in the replacement of chip-to-chip interconnections by intrachip connections. In addition, the increased packing density improves the computation per unit volume.

FIGURE 47.1 Schematic of a 3D integrated circuit. Labeled elements: intratier wires, devices, intertier via, silicon substrate, and Tiers 1 through 4.

For instance, Figure 47.2 shows a 2D layout on a chip of dimension 2L × 2L on the left, where the longest (nondetoured) wire, going from one end of the layout to the other, has a length of 4L. If this design is built on four tiers, as shown at right, assuming the same total silicon area and a square aspect ratio for each tier, the silicon area in each tier is L × L. Therefore, the longest possible undetoured wirelength, going from one end in the lowest tier to the other end in the uppermost tier, is approximately 2L (because the intertier thickness is negligible). Because, for a buffered two-pin interconnect, the delay of a wire is proportional to its length, this implies that the delay is halved. Moreover, the reduced wire lengths also reduce the likelihood of congestion bottlenecks, potentially reducing the need to detour wires. A more precise distribution of the wirelength has been reported in Ref. [1], which shows that the histogram of wirelength distributions moves progressively to the left as the number of tiers is increased.
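The arithmetic behind the 4L versus 2L comparison can be checked with a short sketch. It assumes, as the text does, a square aspect ratio per tier, equal total silicon area, and negligible intertier thickness; the function name is illustrative.

```python
from math import sqrt


def longest_wire(side_2d, tiers):
    """Longest undetoured (Manhattan) wire when a square 2D layout of side
    `side_2d` is split into `tiers` square tiers with the same total silicon
    area. The intertier distance is neglected, as in the text."""
    if tiers < 1:
        raise ValueError("tiers must be at least 1")
    tier_side = side_2d / sqrt(tiers)  # each tier has area side_2d**2 / tiers
    return 2.0 * tier_side             # corner-to-corner Manhattan distance


L = 1.0
print(longest_wire(2 * L, 1))  # single-tier (2D) layout
print(longest_wire(2 * L, 4))  # four-tier layout
```

Under these assumptions the longest wire scales as one over the square root of the number of tiers, so going from one tier to four halves it.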

In addition, 3D designs can result in new paradigms, for example, heterogeneous integration, where each tier could be a different material (e.g., a silicon-based circuit on one tier and a GaAs-based circuit on another). Even for purely silicon-based circuits, 3D designs permit analog/RF and digital circuits to be built on different tiers, which improves their noise behavior; additionally, it is possible to construct shielding structures such as Faraday cages between tiers for enhanced noise reduction.

Various flavors of 3D technologies have been proposed and are in use. One of the simplest forms involves wafer stacking, where the distance between active devices in the third dimension (or the “z dimension”) equals the thickness of a wafer. However, the thickness of a wafer is of the order of several hundreds of microns, and the full potential of 3D is not achieved by this approach due to the long distance that a wire must traverse in the z dimension. Further progress has resulted in the development of integrated 3D circuits in industrial [2], government [3], and academic [4] settings, which have demonstrated 3D designs with intertier separations of the order of a few microns.

Today, it is only possible to build a few tiers in the third dimension, as a result of which many of these technologies are often referred to as 2.5D rather than fully 3D. Nevertheless, even the half dimension can provide the potential for substantial performance improvements, and perhaps future technological improvements will enable truly 3D integration.

In this chapter, we present an overview of physical design technologies for 3D circuits. We begin with a brief overview of a typical 3D technology, and then discuss physical design problems in the custom/ASIC design as well as the FPGA paradigms. Generally speaking, the number of tiers is taken in as a technology input by the 3D tools described in this chapter.

FIGURE 47.2 Comparison of the maximum wirelength in a 2D layout (left, side 2L) and in its 3D counterpart (right, side L per tier). For clarity, the intertier thicknesses in the 3D circuit are shown to be exaggeratedly large.


47.2 STANDARD CELL-BASED DESIGNS

A typical cell-based flow begins with a floorplanning step, where the system is laid out at the level of macroblocks, followed by detailed placement of the cells in the layout, and then routing. In the 3D context, each of these steps must be modified to adapt to the constraints imposed by 3D circuits. In addition to conventional metrics, 3D-specific geometrical considerations must be used, for example, for wirelength metrics. Temperature is also treated as a first-class citizen during these optimizations.∗ Moreover, intertier via reduction is considered to be a desirable goal, because the number of available vias is restricted and must be shared between signal nets and supply and clock nets.

In addition to floorplanning, placement, and routing, a 3D-specific optimization that makes the temperature distribution more uniform is the judicious positioning of thermal vias within the layout. These vias are intertier metal connections that have no electrical function but instead constitute a passive cooling technology that draws heat from the problem areas to the heat sink. Thermal via insertion can be built into each of these steps or performed as an independent postprocessing step, depending on the design methodology.

It is instructive to view the result of a typical 3D thermally aware placement [5]: a layout for the benchmark circuit, ibm01, in a four-tier 3D process, is displayed in Figure 47.3. The cells are positioned in ordered rows on each tier, and the layout in each individual tier looks similar to a 2D standard cell layout. The heat sink is placed at the bottom of the 3D chip, and the lighter shaded regions are hotter than the darker shaded regions. The coolest cells are those in the bottom tier, next to the heat sink, and the temperature increases as we move to higher tiers. The thermal placement method consciously mitigates the temperature by making the upper tiers sparser, in terms of the percentage of area populated by the cells, than the lower tiers.

47.2.1 THERMAL VIAS

Although silicon is a good thermal conductor, with half or more of the conductivity of typical metals, many of the materials used in 3D technologies are strong insulators that place severe restrictions on the amount of heat that can be removed, even under the best placement solution. These materials include the epoxy bonding materials used to attach 3D tiers, field oxide, and the insulator in an SOI technology. Therefore, the use of deliberate metal lines that serve as heat-removing channels, called thermal vias, is an important ingredient of the total thermal solution. The second step in the flow determines the optimal positions of thermal vias in the placement that provide an overall improvement in the temperature distribution. In realistic 3D technologies, the footprints of these intertier vias are of the order of 5 µm × 5 µm.

FIGURE 47.3 Placement for the benchmark ibm01 in a four-tier 3D technology, shaded on a scale from hot to cool. (From Ababei, C., et al., IEEE Design and Test, 22, 520, 2005. Copyright IEEE. With permission.)

∗ A description of techniques for thermal analysis is provided in Section 3.4.

In principle, the problem of placing thermal vias can be viewed as one of determining one of two conductivities (corresponding to the presence or absence of metal) at every candidate point where a thermal via may be placed in the chip. However, in practice, it is easy to see that such an approach could lead to an extremely large search space that is exponential in the number of possible positions; note that the set of possible positions in itself is extremely large.

Quite apart from the size of the search space, such an approach is unrealistic for several other reasons. First, the wanton addition of thermal vias in any arbitrary region of the layout would lead to nightmares for a router, which would have to navigate around these blockages. Second, from a practical standpoint, it is unreasonable to perform full-chip thermal analysis, particularly in the inner loop of an optimizer, at the granularity of individual thermal vias. At this level of detail, individual elements would have to correspond to the size of a thermal via, and the size of the thermal simulation matrix would become extremely large.

Fortunately, there are reasonable ways to overcome each of these issues. The blockage problem may be controlled by enforcing discipline within the design, designating a specific set of areas within the chip as potential thermal via sites. These could be chosen as specific interrow regions in the cell-based layout, and the optimizer would determine the density with which these are filled with thermal vias. The advantage to the router is obvious, because only these regions are potential blockages, which is much easier to handle. To control the finite element analysis (FEA) stiffness matrix size, one could work with a two-level scheme with relatively large elements, where the average thermal conductivity of each region is a design variable. Once this average conductivity is chosen, it could be translated back into a precise distribution of thermal vias within the element that achieves that average conductivity.

Various published methods take different approaches to thermal via insertion. We now describe an algorithm for post facto thermal via insertion [6]; other procedures, which perform thermal via insertion during floorplanning, placement, or routing, are discussed in the appropriate sections.

For a given placed 3D circuit, an iterative method was developed in which, during each iteration, the thermal conductivities of certain FEA elements (thermal via regions) are incrementally modified so that thermal problems are reduced or eliminated. Thermal vias are generically added to elements to achieve the desired thermal conductivities. The goal of this method is to satisfy given thermal requirements using as few thermal vias as possible, that is, keeping the thermal conductivities as low as possible.

The approach uses the finite element equations to determine a target thermal conductivity.

A key observation in this work is that the insertion of thermal vias is most useful in areas with a high thermal gradient, rather than areas with a high temperature. Effectively, the thermal via acts as a pipe that allows the heat to be conducted from the higher temperature region to the lower temperature region; this, in turn, leads to temperature reductions in areas of high temperature. This is illustrated in Figure 47.4, which shows the 3D layout of the benchmark struct, before and after the addition of thermal vias. The hottest region is the center of the uppermost tier, and a major reason for its elevated temperature is that the tier below it is hot. Adding thermal vias to remove heat from the second tier, therefore, also significantly reduces the temperature of the top tier. For this reason, the regions where the insertion of thermal vias is most effective are those that have high thermal gradients.

Therefore, the method in Ref. [6] employs an iterative update formula of the type

$$K_i^{\text{new}} = K_i^{\text{old}} \left( \frac{g_i^{\text{old}}}{g_{i,\text{ideal}}} \right)$$

where $K_i^{\text{new}}$ and $K_i^{\text{old}}$ are, respectively, the new and old thermal conductivities in each direction, after and before each iteration, $g_i^{\text{old}}$ is the old thermal gradient, and $g_{i,\text{ideal}}$ is a heuristically selected ideal thermal gradient.

FIGURE 47.4 Thermal profile of struct before (left, panel titled “Before thermal via placement”) and after (right, panel titled “After thermal via placement”) thermal via insertion. The top four layers of the figure at right correspond to the four layers in the figure at left. (From Goplen, B. and Sapatnekar, S. S., IEEE Transactions on Computer-Aided Design, 26, 692, 2006. Copyright IEEE. With permission.)

Each iteration begins with a distribution of the thermal vias; this distribution is corrected using the above update formula, and the $K_i^{\text{new}}$ value is then translated to a thermal via density, and then to a precise layout of thermal vias, using precharacterization. The iterations end when the desired temperature profile is achieved. This general iterative framework has also been used in several other published techniques that insert thermal vias either concurrently with another optimization (floorplanning, placement, or routing, as described in succeeding sections) or as an independent step.
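The translation from a target conductivity to a via density is done by precharacterization in the source, and the details are not given here. As a rough stand-in, the sketch below assumes a simple area-weighted (parallel-path) mixing model for vertical heat flow, k_eff = f * k_via + (1 - f) * k_background, and inverts it for the via area fraction f. The model, the function names, and the material values are all assumptions for illustration; only the 5 µm × 5 µm footprint comes from the text.

```python
def via_density_for_conductivity(k_target, k_background, k_via, max_density=1.0):
    """Via area fraction f that yields k_target under the assumed
    parallel-path model k_eff = f * k_via + (1 - f) * k_background."""
    if k_via <= k_background:
        raise ValueError("via material must conduct better than the background")
    f = (k_target - k_background) / (k_via - k_background)
    return min(max(f, 0.0), max_density)


def via_count(density, region_area_um2, via_footprint_um2=25.0):
    """Number of vias of the given footprint (5 um x 5 um = 25 um^2)
    needed to cover `density` of a region's area, rounded down."""
    return int(density * region_area_um2 // via_footprint_um2)


# Hypothetical conductivities in arbitrary consistent units.
f = via_density_for_conductivity(k_target=101.0, k_background=1.0, k_via=401.0)
print(f, via_count(f, region_area_um2=10000.0))
```

A precharacterized lookup table, as the text describes, would replace the closed-form inversion with data obtained from detailed simulation of via patterns.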

47.2.2 3D FLOORPLANNING

The 3D floorplanning problem is analogous to the 2D problem discussed in Chapters 8 through 13, with all the constraints and opportunities that arise with the move to the third dimension. Typical cost functions include a mix of the conventional wirelength and total area costs, together with the temperature and the number of intertier vias.

The work in Ref. [7] presented one of the first approaches to 3D floorplanning. It used the transitive closure graph (TCG) representation [8], described in Section 11.7, for each tier, and a bucket structure for the third dimension. Each bucket represents a 2D region over all tiers, and stores, for each tier, the indices of the blocks that intersect that bucket. In other words, the TCG and this bucket structure can quickly determine any adjacency information. A simulated annealing engine is then utilized, with the moves corresponding to perturbations within a tier and across tiers; in each such case, the corresponding TCGs and buckets are updated, as necessary.
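The bucket structure can be sketched as a uniform grid over the xy footprint, where each bucket keeps one set of block indices per tier. This is a minimal illustration of the idea only; the class, its methods, and the uniform bucket size are assumptions and do not reproduce the data structure of Ref. [7].

```python
import math
from collections import defaultdict


class BucketGrid:
    """Uniform xy grid; each bucket stores, per tier, the indices of the
    blocks whose rectangles intersect that bucket (illustrative sketch)."""

    def __init__(self, bucket_size, tiers):
        self.size = bucket_size
        self.tiers = tiers
        self.buckets = defaultdict(lambda: [set() for _ in range(tiers)])

    def _covered(self, x, y, w, h):
        # Grid indices of all buckets touched by the rectangle [x, x+w) x [y, y+h).
        s = self.size
        for ix in range(math.floor(x / s), math.ceil((x + w) / s)):
            for iy in range(math.floor(y / s), math.ceil((y + h) / s)):
                yield ix, iy

    def insert(self, block_id, tier, x, y, w, h):
        for key in self._covered(x, y, w, h):
            self.buckets[key][tier].add(block_id)

    def remove(self, block_id, tier, x, y, w, h):
        for key in self._covered(x, y, w, h):
            self.buckets[key][tier].discard(block_id)

    def vertical_candidates(self, tier, x, y, w, h):
        """Blocks on other tiers that share a bucket with the rectangle.
        Sharing a bucket is a cheap filter, not a proof of overlap."""
        found = set()
        for key in self._covered(x, y, w, h):
            if key in self.buckets:
                for t, ids in enumerate(self.buckets[key]):
                    if t != tier:
                        found |= ids
        return found


grid = BucketGrid(bucket_size=10, tiers=2)
grid.insert(0, 0, x=0, y=0, w=15, h=5)    # block 0 on tier 0
grid.insert(1, 1, x=12, y=2, w=5, h=5)    # block 1 on tier 1, above block 0
grid.insert(2, 1, x=30, y=30, w=5, h=5)   # block 2 on tier 1, far away
print(grid.vertical_candidates(0, 0, 0, 15, 5))
```

After an annealing move, only the buckets touched by the moved block need a `remove` followed by an `insert`, which is what keeps the per-move update cheap.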

A simple thermal analysis procedure is built into this solution, using a finite difference approximation of the thermal network to build an RC thermal network. Under the assumption that heat flows purely in the z direction and there is no lateral heat conduction, the RC model obtained from a finite difference approximation has a tree structure, and Elmore-like computations (Section 47.3.1) can be performed to determine the temperature. The optimization heuristically attempts to make this a self-fulfilling assumption, by discouraging lateral heat conduction, introducing a cost function parameter that discourages strong horizontal gradients. A hybrid approach performs an exact thermal analysis once every 20 iterations or so and uses the approximate approach for the other iterations.

The work in Ref. [9] expands the idea of thermally driven floorplanning by integrating thermal via insertion into the simulated annealing procedure. A thermal analysis procedure based on random walks [10] is built into the method, and an iterative formula, similar to Ref. [6], is used in a thermal-via insertion step between successive simulated annealing iterations.
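Under the purely vertical heat flow assumption, the temperature of one column of the chip can be computed in a single pass, in the same spirit as an Elmore delay computation on an RC chain. The sketch below is a steady-state, resistance-only illustration: tier 0 is next to the heat sink, `resistances[i]` is the thermal resistance between tier i and the tier (or sink) below it, and all heat generated in tier i and above flows through it. The names and the omission of thermal capacitance are assumptions.

```python
def column_temperatures(powers, resistances, t_sink=0.0):
    """Steady-state temperatures of a vertical stack with no lateral flow.

    powers[i]      : heat generated in tier i (tier 0 is nearest the sink)
    resistances[i] : thermal resistance between tier i and what lies below it
    Returns the temperature of every tier, bottom to top.
    """
    if len(powers) != len(resistances):
        raise ValueError("need one resistance per tier")
    n = len(powers)
    # Heat crossing resistance i is the total power of tiers i, i+1, ..., n-1.
    flow = [0.0] * n
    running = 0.0
    for i in reversed(range(n)):
        running += powers[i]
        flow[i] = running
    temps = []
    t = t_sink
    for i in range(n):
        t += resistances[i] * flow[i]  # temperature rise across layer i
        temps.append(t)
    return temps


# Hypothetical two-tier column in arbitrary consistent units.
print(column_temperatures([1.0, 1.0], [2.0, 3.0]))
```

The computation makes the qualitative behavior described earlier explicit: every tier inherits the temperature rise of all tiers below it, so upper tiers are hotter for the same power.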


47.2.3 3D PLACEMENT

In the placement step, the precise positions of cells in a layout are determined, and they are arranged in rows within the tiers of the 3D circuit. Because thermal considerations are particularly important in 3D cell-based circuits, this procedure must spread the cells to achieve a reasonable temperature distribution, while also capturing traditional placement requirements.

Several approaches to 3D placement have been proposed in the literature. The work in Ref. [11] embeds the netlist hypergraph into the layout area. A recursive bipartitioning procedure is used to assign nodes of the hypergraph to partitions, using mincut as the primary objective and under partition capacity constraints. Partitioning in the z direction corresponds to tier assignment, and xy partitions correspond to assigning standard cells to rows. No thermal considerations are taken into account.

The procedure in Ref. [5] presents a 3D-specific force-directed placer that incorporates thermal objectives directly into the placer. Instead of the finite difference method that is used in many floorplanners, this approach employs FEA, which discretizes the design space into regions known as elements. For rectangular structures of the type encountered in integrated circuits, a rectangular cuboidal element can simulate heat conduction in the lateral directions without aberrations in the prime directions. As described in Chapter 3, FEA results in a matrix equation of the type

$$K T = P$$

where T is the vector of nodal temperatures and P is the vector of heat sources. The left-hand side matrix, K, known as the global stiffness matrix, can be constructed using stamps for the finite elements and the boundary conditions. The FEA equations are solved rapidly using an iterative linear solver, with clever adjustments of the convergence criteria to achieve greater or lesser accuracy, as required at different stages of the iterative placement process.
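To illustrate how a convergence criterion trades accuracy for speed, the sketch below solves a small system of the form K T = P with the conjugate gradient method, where K stands for the stiffness matrix, T for the temperatures, and P for the heat sources (notation assumed here). The text says only that an iterative linear solver is used; conjugate gradient is a common choice for symmetric positive-definite systems and is an assumption, as is the tiny 3 × 3 matrix.

```python
def conjugate_gradient(K, P, tol=1e-8, max_iter=1000):
    """Solve K T = P for a symmetric positive-definite K (list of lists).
    Stops when the residual norm falls below tol times the norm of P.
    Returns (T, number_of_iterations_performed)."""
    n = len(P)
    T = [0.0] * n
    r = list(P)                      # residual P - K T for the start T = 0
    d = list(r)                      # search direction
    rs_old = sum(v * v for v in r)
    p_norm = rs_old ** 0.5 or 1.0    # avoid a zero reference norm
    iterations = 0
    while iterations < max_iter and rs_old ** 0.5 > tol * p_norm:
        Kd = [sum(K[i][j] * d[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(d[i] * Kd[i] for i in range(n))
        T = [T[i] + alpha * d[i] for i in range(n)]
        r = [r[i] - alpha * Kd[i] for i in range(n)]
        rs_new = sum(v * v for v in r)
        d = [r[i] + (rs_new / rs_old) * d[i] for i in range(n)]
        rs_old = rs_new
        iterations += 1
    return T, iterations


# Hypothetical stiffness matrix of a three-node chain.
K = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
P = [1.0, 0.0, 1.0]
print(conjugate_gradient(K, P, tol=1e-8))   # tight tolerance
print(conjugate_gradient(K, P, tol=0.8))    # loose tolerance, fewer iterations
```

A placer could use a loose tolerance in early iterations, when cell positions are still changing substantially, and tighten it as the placement converges.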

The placement engine is based on a force-directed approach, the key idea of which is described in Chapter 18. Attractive forces are created between interconnected cells, and these are proportional to the quadratic function of the cell coordinates that represents the Euclidean distance between the blocks. The constants of proportionality are chosen to be higher in the z direction to discourage intertier vias.
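A quadratic attraction with a heavier weight in the z direction can be sketched as below. The energy is the weighted squared distance between connected cells, and the force on a cell is its negative gradient. The two-pin net model, the function names, and the value of `z_weight` are illustrative assumptions; the actual formulation is given in Ref. [5].

```python
def attractive_energy(positions, nets, z_weight=4.0):
    """Sum over two-pin connections (a, b, w) of
    w * (dx^2 + dy^2 + z_weight * dz^2)."""
    total = 0.0
    for a, b, w in nets:
        xa, ya, za = positions[a]
        xb, yb, zb = positions[b]
        total += w * ((xa - xb) ** 2 + (ya - yb) ** 2 + z_weight * (za - zb) ** 2)
    return total


def attractive_force(positions, nets, cell, z_weight=4.0):
    """Negative gradient of the energy with respect to one cell's position."""
    fx = fy = fz = 0.0
    x, y, z = positions[cell]
    for a, b, w in nets:
        if a == b or cell not in (a, b):
            continue
        xo, yo, zo = positions[b if cell == a else a]
        fx -= 2.0 * w * (x - xo)
        fy -= 2.0 * w * (y - yo)
        fz -= 2.0 * w * z_weight * (z - zo)  # stronger pull along z
    return fx, fy, fz


# Hypothetical two-cell example; z is the tier coordinate.
positions = {"u": (0.0, 0.0, 0.0), "v": (3.0, 4.0, 1.0)}
nets = [("u", "v", 1.0)]
print(attractive_energy(positions, nets))
print(attractive_force(positions, nets, "u"))
```

Because a one-tier separation costs as much as a lateral separation of `sqrt(z_weight)` units, connected cells are pulled onto the same tier more strongly than they are pulled together laterally.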

Apart from design criteria such as cell overlap, in the 3D context, thermal criteria are also used to generate repulsive forces, to prevent hot spots. The temperature gradient (which itself can be related to the stiffness matrix and its derivative) is used to determine the magnitudes and directions of these forces.

Once the entire system of attractive and repulsive forces is generated, the system is solved for the minimum energy state, that is, the equilibrium location. Ideally, this minimizes the wirelengths while at the same time satisfying the other design criteria such as the temperature distribution. The iterative force-directed approach proceeds through the following steps in the main loop. Initially, forces are updated based on the previous placement. Using these new forces, the cell positions are then calculated. These two steps of calculating forces and finding cell positions are repeated until the exit criteria are satisfied. The specifics of the force-directed approach to thermal placement, including the mathematical details, are presented in Ref. [5]. Once the iterations converge, a final postprocessing step is used to legalize the placement. Even though forces have been added to discourage overlaps, the force-directed engine solves the problem in the continuous domain, and the task of legalization is to align cells to tiers, and to rows within each tier.

Another method in Ref. [12] maps an existing 2D placement to a 3D placement through transformations based on dividing the layout into 2^k regions, for integer values of k, and then defining local transformations to heuristically refine the layout.

More recent work in Ref. [13] observes that because 3D layouts have very limited flexibility in the third dimension (with a small number of layers and a fixed set of discrete locations), partitioning works better than a force-directed method. Accordingly, this work performs global placement using recursive bisectioning. Thermal effects are incorporated through thermal resistance reduction nets, which are attractive forces that induce high power nets to remain close to the heat sink. The global placement step is followed by coarse legalization, in which a novel cell-shifting approach is proposed. This generalizes the methods in FastPlace, described in Chapter 18, by allowing shift moves to adjust the boundaries of both sparsely and densely populated cells using a computationally simple method. Finally, detailed legalization generates a final nonoverlapping layout. The approach is shown to provide excellent trade-offs between parameters such as the number of interlayer vias, wirelength, and temperature.

47.2.4 ROUTING ALGORITHMS

During routing, several objectives and constraints must be taken into consideration, including avoiding blockages due to areas occupied by thermal vias, incorporating the effect of temperature on the delays of the routed wires, and of course, traditional objectives such as wirelength, timing, congestion, and routing completion.

Once the cells have been placed and the locations of the thermal vias determined, the routing stage finds the optimal interconnections between the wires. As in 2D routing, it is important to optimize the wirelength, the delay, and the congestion. In addition, several 3D-specific issues come into play. First, the delay of a wire increases with its temperature, so that more critical wires should avoid the hottest regions, as far as possible. Second, intertier vias are a valuable resource that must be optimally allocated among the nets. Third, congestion management and blockage avoidance are more complex with the addition of a third dimension. For instance, a signal via or thermal via that spans two or more tiers constitutes a blockage that wires must navigate around.

Consider the problem of routing in a three-tier technology, as illustrated in Figure 47.5. The layout is gridded into rectangular tiles, each with a horizontal and vertical capacity that determines the number of wires that can traverse the tile, and an intertier via capacity that determines the number of free vias available in that tile. These capacities account for the resources allocated for nonsignal wires (e.g., power and clock wires) as well as the resources used by thermal vias. For a single net, as shown in the figure, the degrees of freedom that are available are in choosing the locations of the intertier vias, and selecting the precise routes within each tier. The locations of intertier vias will depend on the resource contention for vias within each grid. Moreover, critical wires should avoid the high-temperature tiles, as far as possible.
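The tile model described above can be captured in a small data structure, together with a toy cost for choosing where a net places an intertier via. The capacities follow the description in the text; the cost function, its weights, and all names are hypothetical and serve only to show how via contention and temperature could both enter the decision.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    h_cap: int                 # horizontal wire capacity
    v_cap: int                 # vertical wire capacity
    via_cap: int               # free intertier vias available to signals
    h_used: int = 0
    v_used: int = 0
    via_used: int = 0
    temperature: float = 0.0   # normalized tile temperature (assumption)

    def via_available(self):
        return self.via_used < self.via_cap


def via_cost(tile, criticality, temp_weight=1.0):
    """Hypothetical cost of placing one more intertier via in this tile:
    via congestion plus a temperature penalty scaled by net criticality."""
    if not tile.via_available():
        return float("inf")
    congestion = (tile.via_used + 1) / tile.via_cap
    return congestion + temp_weight * criticality * tile.temperature


def pick_via_tile(tiles, criticality):
    """Return the key of the cheapest candidate tile for the via."""
    return min(tiles, key=lambda key: via_cost(tiles[key], criticality))


tiles = {
    (0, 0): Tile(4, 4, 2, via_used=2, temperature=0.1),  # no vias left
    (1, 0): Tile(4, 4, 2, via_used=0, temperature=0.9),  # free but hot
    (2, 0): Tile(4, 4, 2, via_used=1, temperature=0.2),  # busier but cool
}
print(pick_via_tile(tiles, criticality=0.0))  # noncritical net
print(pick_via_tile(tiles, criticality=1.0))  # critical net
```

In this example the noncritical net takes the uncongested hot tile, while the critical net accepts more via contention in exchange for a cooler tile.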

The work in Ref. [14] presents a thermally conscious router, using a multilevel routing paradigm similar to Refs. [15,16], with integrated intertier via planning and incorporating thermal considerations. An initial routing solution is constructed by building a 3D minimum spanning tree (MST) for each multipin net, and using maze routing to avoid obstacles.
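A 3D MST for one multipin net can be built with Prim's algorithm over the pins, using a distance that adds a per-tier penalty to the Manhattan distance. The distance function and the `via_cost` parameter are assumptions for illustration; the source does not specify the metric used in Ref. [14], and the maze-routing step that follows is not shown.

```python
def mst_3d(pins, via_cost=10.0):
    """Prim's algorithm on pins given as (x, y, tier) tuples.
    Edge weight = |dx| + |dy| + via_cost * |dtier| (assumed metric).
    Returns (edges, total_cost), with edges as (parent_index, child_index)."""
    n = len(pins)
    if n == 0:
        return [], 0.0

    def dist(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1]) + via_cost * abs(a[2] - b[2])

    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known connection to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges, total = [], 0.0
    for _ in range(n):
        # Pick the cheapest pin not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
            total += best[u]
        # Relax connections from the newly added pin.
        for v in range(n):
            if not in_tree[v]:
                d = dist(pins[u], pins[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges, total


# Hypothetical four-pin net spanning two tiers.
pins = [(0, 0, 0), (3, 0, 0), (3, 4, 0), (0, 0, 1)]
print(mst_3d(pins, via_cost=2.0))
```

Raising `via_cost` makes the tree prefer to connect pins within a tier first, which is one simple way to reflect the scarcity of intertier vias in the initial solution.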

At each level of the multilevel scheme, the intertier via planning problem assigns vias in a given region at level k − 1 of the multilevel hierarchy to tiles at level k. The problem is formulated as

FIGURE 47.5 Example route for a net in a three-tier 3D technology, with tiers labeled Tier 1, Tier 2, and Tier 3. (From Ababei, C., et al., IEEE Design
