Handbook of Algorithms for Physical Design Automation

Edited by Charles J. Alpert, Dinesh P. Mehta, and Sachin S. Sapatnekar

AN AUERBACH BOOK

CRC Press is an imprint of the Taylor & Francis Group, an informa business
Boca Raton   London   New York
Auerbach Publications
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2009 by Taylor & Francis Group, LLC, except for Chapter 19, © by Jason Cong and Joseph R. Shinnerl. Printed with permission.
Auerbach is an imprint of Taylor & Francis Group, an Informa business.
No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1
International Standard Book Number-13: 978-0-8493-7242-1 (Hardcover)
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged, please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data

Handbook of algorithms for physical design automation / edited by Charles J. Alpert, Dinesh P. Mehta, Sachin S. Sapatnekar.
p. cm.
Includes bibliographical references and index.
ISBN-13: 978-0-8493-7242-1
ISBN-10: 0-8493-7242-9
1. Integrated circuit layout--Mathematics--Handbooks, manuals, etc. 2. Integrated circuit layout--Data processing--Handbooks, manuals, etc. 3. Integrated circuits--Very large scale integration--Design and construction--Data processing--Handbooks, manuals, etc. 4. Algorithms. I. Alpert, Charles J. II. Mehta, Dinesh P. III. Sapatnekar, Sachin S., 1967- IV. Title.
To the wonderful girls in my life:
Cheryl, Candice, Ciara, and Charlie
Charles J Alpert
To the memory of my grandparents:
Nalinee and Gajanan Kamat, Radha and Shreenath Mehta
Dinesh P Mehta
To Ofelia and Arunito
Sachin S Sapatnekar
CONTENTS

Editors
Contributors

Chapter 1  Introduction to Physical Design
    Charles J. Alpert, Dinesh P. Mehta, and Sachin S. Sapatnekar
Chapter 2  Layout Synthesis: A Retrospective
    Ralph H.J.M. Otten
Chapter 3  Metrics Used in Physical Design
    Frank Liu and Sachin S. Sapatnekar
Chapter 4  Basic Data Structures
    Dinesh P. Mehta and Hai Zhou
Chapter 5  Basic Algorithmic Techniques
    Vishal Khandelwal and Ankur Srivastava
Chapter 6  Optimization Techniques for Circuit Design Applications
    Zhi-Quan Luo
Chapter 7  Partitioning and Clustering
    Dorothy Kucar
Chapter 8  Floorplanning: Early Research
    Susmita Sur-Kolay
Chapter 9  Slicing Floorplans
    Ting-Chi Wang and Martin D.F. Wong
Chapter 10  Floorplan Representations
    Evangeline F.Y. Young
Chapter 11  Packing Floorplan Representations
    Tung-Chieh Chen and Yao-Wen Chang
Chapter 12  Recent Advances in Floorplanning
    Dinesh P. Mehta and Yan Feng
Chapter 13  Industrial Floorplanning and Prototyping
    Louis K. Scheffer
Chapter 14  Placement: Introduction/Problem Formulation
    Gi-Joon Nam and Paul G. Villarrubia
Chapter 15  Partitioning-Based Methods
    Jarrod A. Roy and Igor L. Markov
Chapter 16  Placement Using Simulated Annealing
    William Swartz
Chapter 17  Analytical Methods in Placement
    Ulrich Brenner and Jens Vygen
Chapter 18  Force-Directed and Other Continuous Placement Methods
    Andrew Kennings and Kristofer Vorwerk
Chapter 19  Enhancing Placement with Multilevel Techniques
    Jason Cong and Joseph R. Shinnerl
Chapter 20  Legalization and Detailed Placement
    Ameya R. Agnihotri and Patrick H. Madden
Chapter 21  Timing-Driven Placement
    David Z. Pan, Bill Halpin, and Haoxing Ren
Chapter 22  Congestion-Driven Physical Design
    Saurabh N. Adya and Xiaojian Yang
Chapter 23  Global Routing Formulation and Maze Routing
    Muhammet Mustafa Ozdal and Martin D.F. Wong
Chapter 24  Minimum Steiner Tree Construction
    Gabriel Robins and Alexander Zelikovsky
Chapter 25  Timing-Driven Interconnect Synthesis
    Jiang Hu, Gabriel Robins, and Cliff C.N. Sze
Chapter 26  Buffer Insertion Basics
    Jiang Hu, Zhuo Li, and Shiyan Hu
Chapter 27  Generalized Buffer Insertion
    Miloš Hrkić and John Lillis
Chapter 28  Buffering in the Layout Environment
    Jiang Hu and Cliff C.N. Sze
Chapter 29  Wire Sizing
    Sanghamitra Roy and Charlie Chung-Ping Chen
Chapter 30  Estimation of Routing Congestion
    Rupesh S. Shelar and Prashant Saxena
Chapter 31  Rip-Up and Reroute
    Jeffrey S. Salowe
Chapter 32  Optimization Techniques in Routing
    Christoph Albrecht
Chapter 33  Global Interconnect Planning
    Cheng-Kok Koh, Evangeline F.Y. Young, and Yao-Wen Chang
Chapter 34  Coupling Noise
    Rajendran Panda, Vladimir Zolotov, and Murat Becer
Chapter 35  Modeling and Computational Lithography
    Franklin M. Schellenberg
Chapter 36  CMP Fill Synthesis: A Survey of Recent Studies
    Andrew B. Kahng and Kambiz Samadi
Chapter 37  Yield Analysis and Optimization
    Puneet Gupta and Evanthia Papadopoulou
Chapter 38  Manufacturability-Aware Routing
    Minsik Cho, Joydeep Mitra, and David Z. Pan
Chapter 39  Placement-Driven Synthesis Design Closure Tool
    Charles J. Alpert, Nathaniel Hieter, Arjen Mets, Ruchir Puri, Lakshmi Reddy, Haoxing Ren, and Louise Trevillyan
Chapter 40  X Architecture Place and Route: Physical Design for the X Interconnect Architecture
    Steve Teig, Asmus Hetzel, Joseph Ganley, Jon Frankle, and Aki Fujimura
Chapter 41  Inductance Effects in Global Nets
    Yehea I. Ismail
Chapter 42  Clock Network Design: Basics
    Chris Chu and Min Pan
Chapter 43  Practical Issues in Clock Network Design
    Chris Chu and Min Pan
Chapter 44  Power Grid Design
    Haihua Su and Sani Nassif
Chapter 45  Field-Programmable Gate Array Architectures
    Steven J.E. Wilton, Nathalie Chan King Choy, Scott Y.L. Chin, and Kara K.W. Poon
Chapter 46  FPGA Technology Mapping, Placement, and Routing
    Kia Bazargan
Chapter 47  Physical Design for Three-Dimensional Circuits
    Kia Bazargan and Sachin S. Sapatnekar

Index
EDITORS

Charles J. Alpert (Chuck) was born in Bethesda, Maryland, in 1969. He received two undergraduate degrees from Stanford University in 1991 and his doctorate in computer science from the University of California, Los Angeles, California, in 1996. Upon graduation, Chuck joined IBM's Austin Research Laboratory, where he currently manages the Design Productivity Group, whose mission is to develop design automation tools and methodologies to improve designer productivity and reduce design cost. Chuck has over 100 conference and journal publications and has thrice received the best paper award from the ACM/IEEE Design Automation Conference. He has been active in the academic community, serving as chair for the Tau Workshop on Timing Issues and the International Symposium on Physical Design. He also serves as an associate editor of the IEEE Transactions on Computer-Aided Design. He received the Mahboob Khan Mentor Award in 2001 and 2007 for his work in mentoring. He was also named an IEEE fellow in 2005.
Dinesh P. Mehta received his BTech in computer science and engineering from the Indian Institute of Technology, Bombay, India, in 1987; his MS in computer science from the University of Minnesota, Minneapolis, Minnesota, in 1990; and his PhD in computer science from the University of Florida, Gainesville, Florida, in 1992. He was on the faculty at the University of Tennessee Space Institute, Tullahoma, Tennessee, from 1992 to 2000, where he received the Vice President's Award for Teaching Excellence in 1997. He was a visiting professor at Intel's Strategic CAD Labs in 1996 and 1997. He has been on the faculty in the mathematical and computer science department at the Colorado School of Mines, Golden, Colorado, since 2000, where he is a professor and currently also serves as department head. He is a coauthor of Fundamentals of Data Structures in C++ and a coeditor of the Handbook of Data Structures and Applications. His publications and research interests are in VLSI design automation and applied algorithms and data structures. He is a former associate editor of the IEEE Transactions on Circuits and Systems-I.
Sachin S. Sapatnekar received his BTech from the Indian Institute of Technology, Bombay, India, in 1987; his MS from Syracuse University, New York, in 1989; and his PhD from the University of Illinois at Urbana–Champaign, Urbana, Illinois, in 1992. From 1992 to 1997, he was an assistant professor in the Department of Electrical and Computer Engineering at Iowa State University, Ames, Iowa. Since then, he has been on the faculty of the Department of Electrical and Computer Engineering at the University of Minnesota, Minneapolis, Minnesota, where he is currently the Robert and Marjorie Henle Professor. He has published widely in the area of computer-aided design of VLSI circuits, particularly in the areas of timing, layout, and power. He has held positions on the editorial boards of the IEEE Transactions on CAD (he is currently the deputy editor-in-chief), the IEEE Transactions on VLSI Systems, and the IEEE Transactions on Circuits and Systems II. He has served on the technical program committee for various conferences, as a technical program co-chair for the Design Automation Conference (DAC), and as a technical program and general chair for both the IEEE/ACM Tau Workshop and the ACM International Symposium on Physical Design. He is a recipient of the NSF Career Award, three best paper awards at DAC and one at the International Conference on Computer Design (ICCD), and the Semiconductor Research Corporation Technical Excellence Award. He is a fellow of the IEEE.
CONTRIBUTORS

Magma Design Automation
San Jose, California
Department of Electrical Engineering and Graduate Institute of Electronics Engineering
National Taiwan University
Taipei, Taiwan
Charlie Chung-Ping Chen
Department of Electrical Engineering
National Taiwan University
Taipei, Taiwan
Tung-Chieh Chen
Graduate Institute of Electronics Engineering
National Taiwan University
Taipei, Taiwan
Scott Y.L. Chin
Electrical and Computer Engineering
University of British Columbia
Vancouver, British Columbia, Canada
Minsik Cho
Electrical and Computer Engineering Department
University of Texas
Austin, Texas
Nathalie Chan King Choy
Electrical and Computer Engineering
University of British Columbia
Vancouver, British Columbia, Canada
Asmus Hetzel
Magma Design Automation, Inc
San Jose, California
Texas A&M University
College Station, Texas
Electrical and Computer Engineering and
Computer Science and Engineering
Zhuo Li
IBM Corporation
Austin, Texas
Zhi-Quan Luo
Department of Electrical and Computer Engineering
University of Minnesota
Minneapolis, Minnesota
Patrick H. Madden
Computer Science Department
Binghamton University
Binghamton, New York
Igor L. Markov
Department of Electrical Engineering and Computer Science
University of Michigan
Ann Arbor, Michigan
Joydeep Mitra
Electrical and Computer Engineering Department
University of Texas
Austin, Texas
Gi-Joon Nam
IBM Corporation
Austin, Texas
Sani Nassif
IBM Corporation
Austin, Texas
Ralph H.J.M. Otten
Eindhoven University of Technology
Eindhoven, the Netherlands
Muhammet Mustafa Ozdal
Cadence Design Systems, Inc
San Jose, California
Electrical and Computer Engineering
University of British Columbia
Vancouver, British Columbia, Canada
Sanghamitra Roy
Department of Electrical and Computer Engineering
University of Wisconsin–Madison
Madison, Wisconsin
Sachin S. Sapatnekar
Electrical and Computer Engineering Department
University of Minnesota
Minneapolis, Minnesota
Steven J.E. Wilton
Electrical and Computer Engineering
University of British Columbia
Vancouver, British Columbia, Canada
Martin D.F. Wong
Department of Electrical and Computer Engineering
University of Illinois at Urbana–Champaign
Urbana, Illinois
Xiaojian Yang
Synopsys, Inc
Sunnyvale, California
Evangeline F.Y. Young
Department of Computer Science and Engineering
Chinese University of Hong Kong
Shatin, Hong Kong, China
Vladimir Zolotov
IBM Corporation
Yorktown Heights, New York
Part I
Introduction
1 Introduction to Physical Design
Charles J. Alpert, Dinesh P. Mehta, and Sachin S. Sapatnekar
CONTENTS
1.1 Introduction
1.2 Overview of the Physical Design Process
1.3 Overview of the Handbook
1.4 Intended Audience
Note about References
1.1 INTRODUCTION
The purpose of VLSI physical design is to embed an abstract circuit description, such as a netlist, into silicon, creating a detailed geometric layout on a die. In the early years of semiconductor technology, the task of laying out gates and interconnect wires was carried out manually (i.e., by hand on graph paper, or later through the use of layout editors). However, as semiconductor fabrication processes improved, making it possible to incorporate large numbers of transistors onto a single chip (a trend that is well captured by Moore's law), it became imperative for the design community to turn to the use of automation to address the resulting problem of scale. Automation was facilitated by the improvement in the speed of computers that would be used to create the next generation of computer chips, resulting in their own replacement! The importance of automation was reflected in the scientific community by the formation of the Design Automation Conference in 1963 and of both the International Conference on Computer-Aided Design and the IEEE Transactions on Computer-Aided Design in 1983; today, there are several other conferences and journals on design automation.
While the problems of scale have been one motivator for automation, other factors have also come into play. Most notably, improvements in technology have resulted in the invalidation of some critical assumptions made during physical design: one of these is related to the relative delay between gates and the interconnect wires used to connect gates to each other. Initially, gate delays dominated interconnect delays to such an extent that interconnect delay could essentially be ignored when computing the delay of a circuit. With technology scaling causing feature sizes to shrink by a factor of 0.7 every 18 months or so, gates became faster from one generation to the next, while wires became more resistive and slower. Early metrics that modeled interconnect delay as proportional to the length of the wire first became invalid (as wire delays scale quadratically with their lengths) and then valid again (as optimally buffered interconnects show such a trend). New signal integrity effects began to manifest themselves as power grid noise or in the form of increased crosstalk as wire cross-sections became "taller and thinner" from one technology generation to the next. Other problems came into play: for instance, the number of buffers required on a chip began to show trends that increased at alarming rates; the delays of long interconnects increased to the range of several clock cycles; and new technologies emerged, such as 3D stacked structures with multiple layers of active devices, opening up, literally and figuratively, a new dimension in physical design. All of these have changed, and are continuing to change, the fundamental nature of classical physical design.
A major consequence of interconnect dominance is that the role of physical design has moved upstream to other stages of the design cycle. Synthesis was among the first to feel the impact: traditional 1980s-style logic synthesis (which lasted well into the 1990s) used simplified wire-load models for each gate, but the corresponding synthesis decisions were later unable to meet timing specifications, because they operated under gross and incorrect timing estimates. This realization led to the advent of physical synthesis techniques, where synthesis and physical design work hand in hand. More recently, multicycle interconnects have been seen to impact architectural decisions, and there has been much research on physically driven microarchitectural design.
These are not the only issues facing the designer. In sub-90 nm technologies, manufacturability issues have come to the forefront, and many of them are seen to impact physical design. Traditionally, design and manufacturing inhabited different worlds, with minimal handoffs between the two, but in light of* issues related to subwavelength lithography and planarization, a new area of physical design has opened up, where manufacturability has entered the equation. The explosion in mask costs associated with these issues has resulted in the emergence of special niches for field-programmable gate arrays (FPGAs) for lower performance designs and for fast prototyping; physical design problems for FPGAs have their own flavors and peculiarities.

* Pun unintended.
Although there were some early texts on physical design automation in the 1980s (such as the ones by Preas/Lorenzetti and Lengauer), university-level courses in VLSI physical design did not become commonplace until the 1990s, when more recent texts became available. The field continues to change rapidly, with new problems coming up in successive technology generations. The developments in this area have motivated the formation of the International Symposium on Physical Design (ISPD), a conference that is devoted solely to the discipline of VLSI physical design; this and other conferences became the major forum for the learning and dissemination of new knowledge. However, existing textbooks have failed to keep pace with these changes. One of the goals of this handbook is to provide a detailed survey of the field of VLSI physical design automation, with a particular emphasis on state-of-the-art techniques, trends, and improvements that have emerged as a result of the dramatic changes seen in the field in the last decade.
1.2 OVERVIEW OF THE PHYSICAL DESIGN PROCESS
Back when the world was young and life was simple, when Madonna and Springsteen ruled the pop charts, interconnect delays were insignificant and physical design was a fairly simple process. Starting with a synthesized netlist, the designer used floorplanning to figure out where big blocks (such as arrays) were placed, and then placement handled the rest of the logic. If the design met its timing constraints before placement, then it would typically meet its timing constraints after placement as well. One could perform clock tree synthesis followed by routing and iterate over these processes in a local manner.

Of course, designs of today are much larger and more complex, which requires a more complex physical design flow. Floorplanning is harder than ever, and despite all the algorithms and innovations described here, it is still a very manual process. During floorplanning, the designers plan their I/Os and global interconnect, restrict the location of logic to certain areas, and, of course, place the blocks (of which there are more than ever). They often must do this in the face of incomplete timing data. Designers iterate on their floorplans by performing fast physical synthesis and routing congestion estimation to identify key problem areas.

Once the main blocks are fixed in location and other logic is restricted, global placement is used to place the rest of the cells, followed by detailed placement to make local improvements. The placing of cells introduces long wires that increase delays in unexpected places. These delays are then reduced
by wire synthesis techniques of buffering and wire sizing. Iteration between incremental placement and incremental synthesis to satisfy timing constraints today takes place in a single process called physical synthesis. Physical synthesis embodies just about all traditional physical design processes: floorplanning, placement, clock tree construction, and routing, while sprinkling in the ability to adapt to the timing of the design. Of course, with a poor floorplan, physical synthesis will fail, so the designer must use this process to identify poor block and logic placement and plan global interconnects in an iterative process.

The successful exit of physical synthesis still requires post-timing-closure fix-up to address noise, variability, and manufacturability issues. Unfortunately, repairing these can sometimes force the designer back to earlier stages in the flow.
Of course, this explanation is an oversimplification. The physical design flow depends on the size of the design, the technology, the number of designers, the clock frequency, and the time to complete the design. As technology advances and design styles change, physical design flows are constantly reinvented as traditional phases are removed or combined by advances in algorithms (e.g., physical synthesis) while new ones are added to accommodate changes in technology.
1.3 OVERVIEW OF THE HANDBOOK
This handbook consists of the following ten parts:
1. Introduction: In addition to this chapter, this part includes a personal perspective from Ralph Otten, looking back on the major technical milestones in the history of physical design automation. A discussion of the physical design objective functions that drive the techniques discussed in subsequent parts is also included in this part.

2. Foundations: This part includes reviews of the underlying data structures and basic algorithmic and optimization techniques that form the basis of the more sophisticated techniques used in physical design automation. This part also includes a chapter on partitioning and clustering. Many texts on physical design have traditionally included partitioning as an integral step of physical design. Our view is that partitioning is an important step in several stages of the design automation process, and not just in physical design; therefore, we decided to include a chapter on it here rather than devote a full handbook part.

3. Floorplanning: This identifies relative locations for the major components of a chip and may be used as early as the architecture stage. This part includes a chapter on early methods for floorplanning that mostly viewed floorplanning as a two-step process (topology generation and sizing) and reviews techniques such as rectangular dualization, analytic floorplanning, and hierarchical floorplanning. The next chapter exclusively discusses the slicing floorplan representation, which was first used in the early 1970s and is still used in a lot of the recent literature. The succeeding two chapters describe floorplan representations that are more general: an active area of research during the last decade. The first of these focuses on mosaic floorplan representations (these consider the floorplan to be a dissection of the chip rectangle into rooms that will be populated by modules, one to each room) and the second on packing representations (these view the floorplan as directly consisting of modules that need to be packed together). The penultimate chapter describes recent variations of the floorplanning problem. It explores formulations that more accurately account for interconnect and formulations for specialized architectures such as analog designs, FPGAs, and three-dimensional ICs. The final chapter in this part describes the role of floorplanning and prototyping in industrial design methodologies.

4. Placement: This is a classic physical design problem for which design automation solutions date back to the 1970s. Placement has evolved from a pure wirelength-driven formulation to one that better understands the needs of design closure: routability, white space distribution, big block placement, and timing. The first chapter in this part overviews how the placement problem has changed with technology scaling and explains the new types of constraints and objectives that this problem must now address.

There has been a renaissance in placement algorithms over the last few years, and this can be gleaned from the chapters on cut-based, force-directed, multilevel, and analytic methods. This part also explores specific aspects of placement in the context of design closure: detailed placement, timing, congestion, noise, and power.

5. Net Layout and Optimization: During the design closure process, one needs to frequently estimate the layout of a particular net to understand its expected capacitance and impact on timing and routability. Traditionally, maze routing and Steiner tree algorithms have been used for laying out a given net's topology, and this is still the case today. The first two chapters of this part overview these fundamental physical design techniques.

Technology scaling for transistors has occurred much faster than for wires, which means that interconnect delays dominate much more than in previous generations. The delays due to interconnect are much more significant, so more care needs to be taken when laying out a net's topology. The third chapter in this part overviews timing-driven interconnect structures, and the next three chapters show how buffering interconnect has become an absolutely essential step in timing closure. The buffers in effect create shorter wires, which mitigate the effect of technology scaling. Buffering is not a simple problem, because one has to not only create a solution for a given net but also be cognizant of the routing and placement resources available for the rest of the design. The final chapter explores another dimension of reducing interconnect delay, wire sizing.

6. Routing Multiple Signal Nets: The previous part focused on optimization techniques for a single net. These approaches need conflict resolution techniques when there are scarce routing resources. The first chapter explores fast techniques for predicting routing congestion so that other optimizations have a chance to mitigate routing congestion without having to actually perform global routing. The next two chapters focus on techniques for global routing: the former on the classic rip-up and reroute approach and the latter on alternative techniques like network flows. The next chapter discusses planning of interconnect, especially in the context of global buffer insertion. The final chapter addresses a very important effect of technology scaling: the impact of noise on coupled interconnect lines. Noise issues must be modeled and mitigated earlier in the design closure flows, as they have become so pervasive.

7. Manufacturability and Detailed Routing: The requirements imposed by manufacturability and yield considerations place new requirements on the physical design process. This part discusses various aspects of manufacturability, including the use of metal fills, resolution-enhancement techniques, and subresolution assist features. These techniques have had a major impact on design rules, so that classical techniques for detailed routing cannot be used directly, and we proceed to discuss the impact of manufacturability considerations on detailed routing.

8. Physical Synthesis: Owing to the effects that have become apparent in deep submicron technologies, wires play an increasingly dominant role in determining circuit performance. Therefore, traditional approaches to synthesis that ignored physical design have been supplanted by a new generation of physical synthesis methods that integrate logic synthesis with physical design. This part overviews the most prominent approaches in this domain.

9. Designing Large Global Nets: In addition to signal nets, global nets for supply and clock signals consume a substantial fraction of on-chip routing resources and play a vital role in the functional correctness of the chip. This part presents an overview of design techniques that are used to route and optimize these nets.

10. Physical Design for Specialized Technologies: Although most of the book deals with mainstream microprocessor or ASIC style designs, the ideas described in this book are largely applicable to other paradigms such as FPGAs and to emerging technologies such as 3D integration. These problems require unique solution techniques that can satisfy these requirements. The last part overviews constraints in these specialized domains and the physical design solutions that address the related problems.
NOTE ABOUT REFERENCES
The following abbreviations may have been used to refer to conferences and journals in which physical design automation papers are published.
ASPDAC Asia and South Pacific Design Automation Conference
DAC Design Automation Conference
EDAC European Design Automation Conference
GLSVLSI Great Lakes Symposium on VLSI
ICCAD International Conference on Computer-Aided Design
ICCD International Conference on Computer Design
ISCAS International Symposium on Circuits and Systems
ISPD International Symposium on Physical Design
IEEE TCAD IEEE Transactions on the Computer-Aided Design of Integrated Circuits
IEEE TCAS IEEE Transactions on Circuits and Systems
ACM TODAES ACM Transactions on the Design Automation of Electronic Systems
IEEE TVLSI IEEE Transactions on VLSI Systems
2 Layout Synthesis: A Retrospective

Ralph H.J.M. Otten

2.1 THE FIRST ALGORITHMS (UP TO 1970)
Design automation has a history of over half a century if we look at its algorithms. The first algorithms were not motivated by the design of electronic circuits. Willard Van Orman Quine's work on simplifying truth functions emanated from the philosopher's research and teaching on mathematical logic. It produced a procedure for simplifying two-level logic that remained at the core of logic synthesis for decades (and still is in most of its textbooks). Closely involved in its development were the first pioneers in layout synthesis: Sheldon B. Akers and Chester Y. Lee. Their work on switching networks, both combinational and sequential, and their representation as binary decision programs came from the same laboratory as the above simplification procedure, and preceded the landmark 1961 paper on routing.
2.1.1 LEE'S ROUTER
What Lee [1] described is now called a grid expansion algorithm or maze runner, to set it apart from earlier independent research on the similar abstract problem: the early paper of Edsger W. Dijkstra on shortest path and labyrinth problems [2] and Edward F. Moore's paper on shortest paths through a maze [3] were already written in 1959. But in Lee's paper, the problem of connecting two points on a grid, with its application to printed circuit boards, was developed through a systematization of the intuitive procedure: identify all grid cells that can be reached in an increasing number of steps until the target is among them, or no unlabeled, nonblocked cells are left. In the latter case, no such path exists. In the former case, retracing provides a shortest path between the source and the target (Figure 2.1).
The input consists of a grid with blocked and nonblocked cells. The algorithm then goes through three phases after the source and target have been chosen, and the source has been labeled with 0 (a sketch in code follows the list):

1. Wave propagation, in which all unlabeled, nonblocked neighbors of labeled cells are labeled one higher than in the preceding wave.

2. Retracing, which starts when the target has received a label and consists of repeatedly finding a neighboring cell with a lower label, thus marking a shortest path between the source and the target.

3. Label clearance, which prepares the grid for another search by adding the cells of the path just found to the set of blocked cells and removing all labels.
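The three phases map directly onto a breadth-first search. Below is a minimal sketch in Python (a modern illustration, not code from Lee's paper; all names are ours). It assumes 4-neighbor connectivity and represents obstacles as a set of blocked cells:

```python
from collections import deque

def lee_route(rows, cols, blocked, source, target):
    """Lee's maze router on a rows x cols grid.

    blocked -- set of (r, c) cells that may not be used
    Returns a shortest source-target path as a list of cells, or None.
    """
    # Phase 1: wave propagation -- label cells wave by wave (plain BFS).
    label = {source: 0}
    frontier = deque([source])
    while frontier and target not in label:
        r, c = frontier.popleft()
        for cell in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= cell[0] < rows and 0 <= cell[1] < cols
                    and cell not in blocked and cell not in label):
                label[cell] = label[(r, c)] + 1   # one higher than last wave
                frontier.append(cell)
    if target not in label:
        return None                               # no nonblocked cells left

    # Phase 2: retracing -- step to any neighbor with a lower label.
    path = [target]
    while path[-1] != source:
        r, c = path[-1]
        for cell in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if label.get(cell, -1) == label[(r, c)] - 1:
                path.append(cell)
                break
    path.reverse()

    # Phase 3: label clearance -- discard the labels and block the path
    # cells before routing the next net.
    blocked.update(path)
    return path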
Retracing is not unique: any neighbor one wave earlier will do, which is the nonuniqueness mentioned in the caption of Figure 2.1.

FIGURE 2.1 Wave propagation and retracing. Waves are sets of grid cells with the same label. The source S gets label 0. The target T gets the length of the shortest path as a label (if any). Retracing is not unique in general.

The time needed to find a path is O(L^2) if L is the length of the path. This makes it worst case O(N^2) on an N × N grid (and if each cell has to be part of the input, that is, any cell can be initially blocked, it is a linear-time algorithm). Its space complexity is also O(N^2). These complexities were soon seen as serious problems when applied to real-world cases. Some relief in memory use was found in coding the labels: instead of labeling each explored cell with its distance to the source, it suffices to record that number modulo 3, which works for any path search on an unweighted graph. Here, however, the underlying structure is bipartite, and Akers [4] observed that wave fronts with a label sequence in which a certain label bit is twice on and then twice off (i.e., 1, 1, 0, 0, 1, 1, 0, …) suffice. Trivial speedup techniques were soon standard in maze running, mostly aimed at reducing the wave size. Examples are designating the most off-center terminal as the source, starting waves from both terminals, and limiting the search to a box slightly larger than the minimum containing the terminals.
More significant techniques to reduce complexity were discovered in the second part of the decade. Two techniques deserve a mention in retrospect. The first technique, line probing, was discovered by David W. Hightower [5] and independently by Koichi Mikami and Kinya Tabuchi [6]. It addressed both the memory and time aspects of the router's complexity. The idea is, for each so-called base point, to investigate the perpendicular line segments that contain the base point and extend those segments to the first obstacles on their way. The first base points are the terminals, and their lines are called trial lines of level 0. Mikami and Tabuchi choose as the next base points all grid points on the lines thus generated. The trial lines of the next level are the line segments perpendicular to the trial line containing their base point. The process is stopped when lines originating from different terminals intersect. The algorithm guarantees a path if one exists, and it will have the lowest possible number of bends. This guarantee soon becomes very expensive, because all possible trial lines of the deepest possible level have to be examined. Hightower therefore traded it for more efficiency in the early stages by limiting the base points to the so-called escape points, that is, only the closest grid point that allows extension beyond the obstacle that blocked the trial line of the previous level. Line expansion, a combination of maze running and line probing, came some ten years later [7], with the salient feature of producing a path whenever one existed, though not necessarily with the minimum number of bends.
The essence of line probing is in working with line segments for representing the routing space and paths. Intuitively, it saves memory and time, especially when the search space is not congested. The complexity very much depends on the data structures maintained by the algorithm. The original papers were vague about this, and it was not until the 1980s that specialists in computational geometry could come up with a rigorous analysis [8]. In practice, line probers were used for the first nets, with distant terminals. Once the routing space gets congested, more like a labyrinth where trial lines are bound to be very short, a maze runner takes over.
The second technique worth mentioning is based on the observation that, from a graph theoretical point of view, Lee's router is just a breadth-first search that may take advantage of special features like regularity and bipartiteness. But a significant speed advantage can be achieved by including a sense of direction in the wave propagation phase, preferring cells closer to the target. Frank Rubin [9] implemented such an idea by sorting the cells in the wavefront with a key representing the grid distance to the target. It shifts the character of the algorithm from breadth-first to depth-first search. This came close to what was developed simultaneously, but in the field of artificial intelligence: the A* algorithm [10]. Here the search is ordered by an optimistic estimate of the source–target pathlength through the cell. The sum of the number of steps to reach that cell (exactly as in the original paper of Lee) plus the grid distance to the target (as introduced by Rubin) is a satisfactory estimate, because the true pathlength can never be less than that estimate. This means that it will find the shortest route while exploring the least number of grid cells. See Chapter 23 for a more detailed description of maze routing.
Lee's concept combined with A* is still the basis of modern industrial routers. But many more issues than just the shortest two-pin net have to be considered. An extension to multiterminal nets is easy (e.g., after connecting two pins, take the cells on that route as the initial wavefront and find the shortest path to another terminal, etc.), but it will not in general produce the shortest connecting tree (for this the Steiner problem on a grid has to be solved, a well-known NP-hard problem, which is discussed in Chapter 24). Routing in many wiring layers can also straightforwardly be incorporated by adopting a three-dimensional grid. Even bipartiteness is preserved, but it loses its significance because of preferences in layers and the usually built-in resistance against creating vias. The latter and some other desirable features can be taken care of by using cost functions other than just distance and tuning these costs for satisfactory results. Also, a net ordering strategy has to be determined, mostly to achieve close to full wire list completion. And taking into account sufficient effects of modern technology (e.g., cross talk, antenna phenomena, metal fill, lithography demands) makes router design a formidable task, today even more than in the past. This will be the subject of Chapters 34 through 36 and 38.
2.1.2 ASSIGNMENT AND PLACEMENT
Placement was initially seen as an assignment problem where n modules have to be assigned to at least n slots. The easiest formulation associated a cost with every module assignment to each slot, independent of other assignments. The Hungarian method (also known as Munkres' algorithm [11]) was already known and solved the problem in polynomial time. This was, however, an unsatisfactory problem formulation, and the cost function was soon replaced by

    \sum_{i,j} c_{i,j} d_{p(i),p(j)} + \sum_i a_{i,p(i)}

where

    d_{p(i),p(j)} is the distance between the slots assigned to modules i and j,
    a_{i,p(i)} is a cost associated with assigning module i to slot p(i), and
    c_{i,j} is a weight factor (e.g., the number of wires between modules i and j) penalizing the distance between modules i and j.

With all c_{i,j} equal to zero, it reduces to the assignment problem above, and with all a_{i,p(i)} equal to zero, it is called the quadratic assignment problem, which is now known to be NP-hard (the traveling salesperson problem is but a special case).
Paul C. Gilmore [12] soon provided (in 1962) a branch-and-bound solution to the quadratic assignment problem, even before that approach had got its name. In spite of its bounding techniques, it was already impractical for some 15 modules, and was therefore unable to replace an earlier heuristic of Leon Steinberg [13]. He used the fact that the problem can be easily solved when all c_{i,j} = 0 in an iterative technique to find an acceptable solution for the general problem. His algorithm generated some independent sets (originally all maximal independent sets, but the algorithm generated independent sets in increasing size and one can stop any time). For each such set, the wiring cost for all its members for all positions occupied by that set (and the empty positions) was calculated. These numbers are of course independent of the positions of the other members of that set. By applying the Hungarian method, these modules were placed with minimum cost. Cycling through these independent sets continues until no improvement is achieved during one complete cycle. Steinberg's method was repeatedly improved and generalized in the 1960s.*

* Steinberg's 34-module/36-slot example, the first benchmark in layout synthesis, was only recently optimally solved for the Euclidean norm, almost 40 years after its publication in 1961. The wirelength was 4119.74. The best result of the 1960s was by Frederick S. Hiller (4475.28).
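The Hungarian step at the core of Steinberg's loop is easy to reproduce with today's tools. The sketch below is an illustration under our own naming conventions (scipy's linear_sum_assignment is one available implementation of the Hungarian method); it re-places one independent set optimally, exploiting exactly the property mentioned above: because the set's members share no nets, each member's cost for each candidate slot depends only on the fixed modules outside the set.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def place_independent_set(members, slots, pos, c, slot_xy):
    """One Steinberg step: optimally re-place an independent set.

    members -- list of modules with no nets among themselves
    slots   -- candidate slots (the set's own slots plus any empty ones)
    pos     -- pos[i] is the slot currently holding module i
    c       -- c[i][j], wiring weight between modules i and j
    slot_xy -- slot_xy[s] = (x, y) coordinates of slot s
    """
    fixed = [j for j in range(len(pos)) if j not in set(members)]

    def dist(s, t):  # rectilinear slot-to-slot distance
        return (abs(slot_xy[s][0] - slot_xy[t][0])
                + abs(slot_xy[s][1] - slot_xy[t][1]))

    # cost[k][a]: wiring cost of member k in candidate slot a; it is
    # independent of where the other members go (they share no nets).
    cost = np.array([[sum(c[i][j] * dist(s, pos[j]) for j in fixed)
                      for s in slots] for i in members])
    row, col = linear_sum_assignment(cost)   # Hungarian method
    for k, a in zip(row, col):
        pos[members[k]] = slots[a]
```

Cycling this step over a family of independent sets until a full cycle brings no improvement is the whole of Steinberg's heuristic.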
Among the other iterative methods to improve such assignments proposed in these early years were force-directed relaxation [14] and pairwise interchange [15]. In the former method, two modules in a placement are assumed to attract each other with a force proportional to their distance. The proportionality constant is something like the weight factor c_{i,j} above. As a result, a module is subjected to a resultant force that is the vector sum of all attracting forces between the pairs it is involved in. If modules could move freely, they would move to the lowest energy state of the system. This is mostly not a desirable assignment, because many modules may opt for the same slot. Algorithms therefore move one module at a time to a position close to its zero-tension point, sketched below.
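For completeness, the zero-tension point under these linear spring forces is nothing but the c-weighted centroid of the module's neighbors; a small illustrative sketch:

```python
def zero_tension_point(i, pos, c):
    """Point where the resultant spring force on module i vanishes:
    solving sum_j c[i][j] * (pos[j] - p) = 0 for p gives the
    c-weighted average of the connected modules' positions."""
    w = sum(c[i][j] for j in range(len(pos)) if j != i)
    x = sum(c[i][j] * pos[j][0] for j in range(len(pos)) if j != i) / w
    y = sum(c[i][j] * pos[j][1] for j in range(len(pos)) if j != i) / w
    return x, y
```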
A simple method to avoid occupied slots is pairwise interchange. Two modules are selected, and if interchanging their slot positions improves the assignment, the interchange takes place. Of course, only the cost contribution of the signal nets involved has to be updated. However, the pair selection is not obvious. Random selection is an option, ordering modules by connectedness was already tried before 1960, and using the forces above in various ways quickly followed after the idea got into publication. But a really satisfactory pair selection was not shown to exist.
The constructive methods in the remainder of that decade had the same problem. They were ad hoc heuristics based on a selection rule (the next module to be placed had to have the strongest bond with the ones already placed) followed by a positioning rule (such as pair linking and cluster development). They were used in industrial tools of the 1970s, but were readily replaced by simulated annealing when that became available. But one development was overlooked, probably because it was published in a journal not at all read by the community involved in layout synthesis. It was the first analytic placer [16], minimizing in one dimension

    \sum_{i,j} c_{i,j} (p(i) - p(j))^2

with the constraints p^T p = 1 and \sum_i p(i) = 0, to avoid the trivial solution where all components of p are the same. That is, the objective is the weighted sum of all squared distances. Simply rewriting that objective in matrix notation yields

    2 p^T A p

where A = D − C, D being the diagonal matrix of row sums of C. All eigenvalues of such a matrix are nonnegative. If the wiring structure is connected, there will be exactly one eigenvalue of A equal to 0 (corresponding to that trivial solution), and the eigenvector associated with the next smallest eigenvalue will minimize the objective under the given constraints. The minimization problem is the same for the other dimension, but to avoid a solution where all modules would be placed on one line, we add the constraint that the two vectors must be orthogonal. The solution of the two-dimensional problem is the one where the coordinates correspond to the components of the eigenvectors associated with the second and third smallest eigenvalues.

The placement method is called Hall placement, to give credit to its inventor, Kenneth M. Hall. When applied to the placement of components on a chip or board, it corresponds to the quadratic placement problem. Whether this is the right way to formulate the wirelength objective will be extensively discussed in Chapters 17 and 18, but it predates the first analytic placer in layout synthesis by more than a decade!
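Hall's construction is a few lines with a modern linear algebra library. A sketch (assuming numpy and a connected, symmetric, nonnegative weight matrix C; names are ours):

```python
import numpy as np

def hall_placement(C):
    """Hall's quadratic placement: coordinates from the eigenvectors of
    A = D - C belonging to the second and third smallest eigenvalues.

    C -- symmetric weight matrix, C[i][j] = wires between modules i and j
    Returns an (n, 2) array of module coordinates.
    """
    D = np.diag(C.sum(axis=1))
    A = D - C                    # the graph Laplacian; eigenvalues >= 0
    w, V = np.linalg.eigh(A)     # ascending eigenvalues, orthonormal V
    # V[:, 0] is the trivial constant vector (eigenvalue 0 if connected);
    # the next two columns minimize the weighted squared wirelength under
    # p^T p = 1, sum_i p(i) = 0, and mutual orthogonality.
    return V[:, 1:3]

# Small example: four modules connected in a cycle with unit weights.
C = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
print(hall_placement(C))
```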
2.1.3 SINGLE-LAYER WIRING
Most of the above industrial developments were meant for printed circuit boards (in which integrated circuits with at most a few tens of transistors are interconnected in two or more layers) and backplanes (in which boards are combined and connected). Integrated circuits were not yet subject to automation. Research, both in industry and academia, started to get interesting toward the end of the decade. With only one metal layer available, the link with graph planarity was quickly discovered. Lots of effort went into designing planarity tests, a problem soon to be solved with linear-time algorithms. What was needed, of course, was planarization: using technological possibilities (sharing collector islands, small diffusion resistors, multiple substrate contacts, etc.) to implement a circuit using a planarized model. Embedding the planar result onto the plane while accounting for the formation of isolated islands, and connecting the component pins, were the remaining steps [17].
Today the constraints of those early chips are obsolete. Extensions are still of some validity in analogue applications, but are swamped by a multitude of more severe demands. Planarization resurfaced when rectangular duals got attention in floorplan design. Planar mapping as used in these early design flows started a whole new area in graph theory, the so-called visibility graphs, but without further applications in layout synthesis.*

The geometry of the islands provided the first models for rectangular dissections and their optimization, and for the compaction algorithms based on longest path search in constraint graphs. These graphs, originally called polar graphs and illustrated in Figure 2.3, were borrowed† from early works in combinatorics (how to dissect rectangles into squares?) [20]. They enabled systematic generations of all dissection topologies, and for each such topology a set of linear equations as part of the optimization tableau for obtaining the smallest rectangle under (often linearized) constraints. The generation could not be done in polynomial time of course, but linear optimization was later proven to be efficient.

* In this context, they were called horvert representations [18].
† The introduction of polar graphs in layout synthesis [19] was one of the many contributions that Tatsuo Ohtsuki gave to the community.
A straightforward application of Lee's router for single-layer wiring was not adequate, because planarity had to be preserved. Its ideas, however, were used in what was a first form of contour routing. Contour routing turned out to be useful in the more practical channel routers of the 1980s.
2.2 EMERGING HIERARCHIES (1970–1980)
Ten years of design automation for layout synthesis produced a small research community with a firm basis in graph theory and a growing awareness of computational complexity. Stephen Cook's famous theorem was not yet published, and complexity issues were tackled by bounding techniques, smart speedups, and of course heuristics. Ultimately, and in fact quite soon, they proved to be insufficient. Divide-and-conquer strategies were the obvious next approaches, leading to hierarchies, both uniform, requiring few well-defined subproblems, and pluriform, leaving many questions unanswered.
2.2.1 DECOMPOSING THE ROUTING SPACE
A very effective and elegant way of decomposing a problem was achieved by dividing the routing space into channels and solving each channel by using a channel router. It found immediate application in two design styles: standard cell (or polycell), where the channels were height adjustable and channel routing tried to use as few tracks as possible (see Figure 2.2 for terminology), and gate arrays, where the channels had a fixed height, which meant that the channel router had to find a solution within a given number of tracks. If efficient minimization were possible, the same algorithm would suffice, of course. The decision problems, however, were shown to be NP-complete.
The classical channel-routing problem allows two layers of wires: one containing the pins at grid positions and all latitudinal parts (branches), exactly one per pin, and one containing all longitudinal parts (trunks), exactly one for each net. This generates two kinds of constraints: nets with overlapping intervals need different tracks (these are called horizontal constraints), and wires that have pins at the same longitudinal height must change layer before they overlap (the so-called vertical constraints).
FIGURE 2.2 Terminology in channel routing.
The problem does not always have a solution. If the vertical constraints form cycles, then the routing cannot be completed in the classical model. Otherwise a routing does exist, but finding the minimum number of tracks is NP-hard [21].

In the absence of vertical constraints, the problem can be solved optimally in almost linear time by a pretty simple algorithm [22], originally owing to Akihiro Hashimoto and James Stevens, that is known as the left-edge algorithm.* Actually, there are two simple greedy implementations, both delivering a solution with the minimum number of tracks. One fills the tracks one by one from left to right, each time trying the unplaced intervals in sequence of their left edges. The other places the intervals, in that sequence, in the first available track that can take them (this variant is sketched in code below). In practice, the left-edge algorithm gets quite far in routing channels, in spite of possible vertical constraints. Many heuristics therefore started with left-edge solutions.
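A sketch of the second variant, assuming no vertical constraints and nets given as trunk intervals (illustrative names):

```python
def left_edge(intervals):
    """Left-edge channel routing without vertical constraints.

    intervals -- list of (left, right) trunk spans, one per net
    Returns a track index per net; the number of tracks used equals
    the channel density, which is optimal in this constraint-free case.
    """
    order = sorted(range(len(intervals)), key=lambda n: intervals[n][0])
    track_right = []                # rightmost occupied column per track
    assignment = {}
    for n in order:
        left, right = intervals[n]
        for t, r in enumerate(track_right):
            if r < left:            # fits after the last trunk in track t
                track_right[t] = right
                assignment[n] = t
                break
        else:                       # no existing track fits: open a new one
            track_right.append(right)
            assignment[n] = len(track_right) - 1
    return assignment

print(left_edge([(0, 3), (1, 5), (4, 8), (6, 9)]))
# two overlapping pairs -> two tracks: {0: 0, 1: 1, 2: 0, 3: 1}
```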
To obtain a properly wired channel in two layers, the requirements that latitudinal parts are one-to-one with the pins and that each net can have only one longitudinal part are mostly dropped by introducing doglegs.† Allowing doglegs enables in practice always a two-layer routing with latitudinal and longitudinal parts never in the same layer, although in theory problems exist that cannot be solved. It has been shown that the presence of a single column without pins guarantees the existence of a solution [23]. Finding the solution with the least number of tracks remains NP-hard [24].
Numerous channel routers have been published, mainly because it was a problem that could be easily isolated. The most effective implementation, without the more or less artificial constraints of the classical problem and its derivations, is the contour router of Patrick R. Groeneveld [25]. It solves all problems, although in practice not many really difficult channels were encountered. In modern technologies, with a number of layers approaching ten, channel routing has lost its significance.
2.2.2 PARTITIONING

Designs grew, in essence following Moore's law of exponential complexity growth. Partitioning was seen as the way to manage complex design. Familiarity with partitioning was already present, because the first pioneers were involved in or close to teams that had to make sure that subsystems of a logic design could be built in cabinets of convenient size. These subsystems were divided over cards, and these cards might contain replaceable standard units. One of these pioneers, Uno R. Kodres, who had already provided in 1959 an algorithm for the geometrical positioning of circuit elements [26] in a computer, possibly the first placement algorithm in the field, gave an excellent overview of these early partitioners [27]. They started with one or more seed modules for each block in the partitioning. Then, based once more on a selection rule, blocks are extended by assigning one module at a time to one block. Many variations are possible and were tried, but all these early attempts were soon wiped out by module migration methods, and first by the one of Brian W. Kernighan and Shen Lin [28]. They started from a balanced two-partition of the netlist, that is, a division of all modules into two nonoverlapping blocks of approximately equal size. The quality of that two-partition was measured by the number of nets connecting modules in both blocks, the so-called cutsize. This number was to be made as low as possible. This was tried in a number of iterations. For each iteration, the gain of swapping two modules, one from each block, was calculated, that is, the reduction in cutsize as a consequence of that swap. Gains can be positive, zero, or negative. The pairs are unlocked and ordered from largest to smallest gain. In that order, each unlocked pair is swapped, locked to prevent it from moving back, and its consequence (new blocks and updated gains) is recorded. When all modules (except possibly one) are locked, the best cutsize encountered is accepted. A new iteration can take place if there is a positive gain left (a sketch of one such pass in code follows).
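A compact sketch of one Kernighan–Lin pass for two-pin nets, with c a symmetric weight matrix, c[v][v] = 0, and integer module ids (illustrative names; the bookkeeping is simplified compared to the original paper):

```python
def kl_pass(A, B, c):
    """One Kernighan-Lin pass on a two-partition (A, B) of modules.

    Returns the improved blocks and the total gain of the accepted
    prefix of swaps (zero means the pass found no improvement)."""
    A, B = set(A), set(B)
    # D[v] = external cost - internal cost of module v.
    D = {v: sum(c[v][u] for u in (B if v in A else A))
            - sum(c[v][u] for u in (A if v in A else B) if u != v)
         for v in A | B}
    swaps, gains, lockedA, lockedB = [], [], set(), set()
    for _ in range(min(len(A), len(B))):
        # Best unlocked pair; swapping (a, b) changes the cutsize by -gain.
        a, b, gain = max(((a, b, D[a] + D[b] - 2 * c[a][b])
                          for a in A - lockedA for b in B - lockedB),
                         key=lambda t: t[2])
        swaps.append((a, b)); gains.append(gain)
        lockedA.add(a); lockedB.add(b)
        for x in A - lockedA:                 # classic D-value update
            D[x] += 2 * c[x][a] - 2 * c[x][b]
        for y in B - lockedB:
            D[y] += 2 * c[y][b] - 2 * c[y][a]
    # Accept the prefix of swaps with the largest cumulative gain.
    best_k, best_total, total = 0, 0, 0
    for k, g in enumerate(gains, 1):
        total += g
        if total > best_total:
            best_k, best_total = k, total
    for a, b in swaps[:best_k]:
        A.remove(a); A.add(b); B.remove(b); B.add(a)
    return A, B, best_total
```

Repeating the pass until best_total is zero reproduces the iteration scheme described above.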
Famous as it is, the Kernighan–Lin procedure left plenty of room for improvement. Halfway through the decade, it was proven that the decision problem of graph partitioning is NP-complete, so the fact that it mostly produced only a local optimum was unavoidable, but the limitations to balanced partitions and only two-pin nets had to be removed. Besides, a time complexity of O(n^3) for an n-module problem was soon unacceptable. The repair of these shortcomings appeared in a 1982 paper by Charles M. Fiduccia and Robert M. Mattheyses [29]. It handled hyperedges (and therefore multipin nets), and instead of pair swapping it used module moves while keeping bounds on balance deviations, possibly with weighted modules. More importantly, it introduced a bucket data structure that enabled a linear-time updating scheme. Details can be found in Chapter 7.

At the same time, one was not unaware of the relation between partitioning and eigenvalues. This relation, not unlike the theory behind Hall's placement [16], was extensively researched by William E. Donath and Alan J. Hoffman [30]. Apart from experiments with simulated annealing (not very adequate for the partitioning problem, in spite of the very early analogy with spin glasses) and using migration methods for multiway partitioning, it would be well into the 1990s before partitioning was carefully scrutinized again.
2.2.3 MINCUT PLACEMENT
Applying partitioning in a recursive fashion while at the same time slicing the rectangular silicon estate in two subrectangles according to the area demand of each block is called mincut placement. The process continues until blocks with known layouts, or suitable for dedicated algorithms, are obtained. The slicing cuts can alternate between horizontal and vertical cuts, or have the direction depend on the shape of the subrectangle or the area demand. Later, procedures performing four-way partition (quadrisection) along with dividing into four subrectangles were also developed. A strict alternation scheme is not necessary, and many more sophisticated cut-line sequences have been developed. Melvin A. Breuer's paper [31] on mincut placement did not envision deep partitioning, but large geometrically fixed blocks had to be arranged in a nonoverlapping configuration by positioning and orienting. Ulrich Lauther [32] connected the process with the polar graph illustrated in Figure 2.3. The mincut process by itself builds a series-parallel polar graph, but Lauther also defined three local operations, to wit mirroring, rotating, and squeezing, that more or less preserved the relative positions.
FIGURE 2.3 Polar graph of a rectangle dissection.
The first two are pretty obvious and do not change the topology of the polar graph. The last one, squeezing, does change the graph and might result in a polar graph that is not series-parallel. The intuition behind mincut placement is that if fewer wires cross the first cut lines, there will be fewer long connections in the final layout. An important drawback of the early mincut placers, however, is that they treat lower levels of partitioning independently from the blocks created earlier, that is, without any awareness of the subrectangles to which connected modules were assigned. Modules in those external blocks may be connected to modules in the block to be partitioned, and be forced unnecessarily far from those modules. Al Dunlop and Kernighan [33] therefore tried to capture such connectivities by propagating modules external to the block to be partitioned as fixed terminals to the periphery of that block. This way, their connections to the inner modules are taken into account when calculating cutsizes. Of course, now the order in which blocks are treated has an impact on the final result.
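The recursion just described is easy to skeletonize. The sketch below is an illustrative skeleton (our names): bipartition stands for any netlist bisector, for example repeated Kernighan–Lin passes as above, and terminal propagation in the sense of Dunlop and Kernighan would be added inside bipartition.

```python
def mincut_place(modules, area, rect, bipartition, horizontal=True):
    """Recursive mincut placement skeleton.

    modules     -- list of module ids
    area        -- area[m], the area demand of module m
    rect        -- (x, y, w, h) rectangle to fill
    bipartition -- callable returning two roughly balanced blocks with
                   a small cutsize
    Returns {module: rectangle}; the cut direction alternates per level.
    """
    if len(modules) <= 1:
        return {m: rect for m in modules}
    left, right = bipartition(modules)
    x, y, w, h = rect
    # Split the rectangle in proportion to the blocks' area demands.
    f = sum(area[m] for m in left) / float(sum(area[m] for m in modules))
    if horizontal:                  # vertical cut line: split the width
        r1, r2 = (x, y, w * f, h), (x + w * f, y, w * (1 - f), h)
    else:                           # horizontal cut line: split the height
        r1, r2 = (x, y, w, h * f), (x, y + h * f, w, h * (1 - f))
    out = mincut_place(left, area, r1, bipartition, not horizontal)
    out.update(mincut_place(right, area, r2, bipartition, not horizontal))
    return out
```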
2.2.4 CHIP FABRICATION AND LAYOUT STYLES
Layout synthesis provides masks for chip fabrication, or more precisely, it provides data structures from which masks are derived. Hundreds of masks may be needed in a modern process, and with today's feature sizes, optical correction is needed in addition to numerous constraints on the configurations. Still, layout synthesis is only concerned with a few partitions of the Euclidean plane to specify these masks.
When all masks are specific to producing a particular chip, we speak of full-custom design. It is the most expensive setup and usually needs high volume to be cost-effective. Generic memory was always in that category, but certain application-specific designs also qualified. Even in the early 1970s, the major computer seller of the day saw the advantage of sharing masks over as many different products as possible. They called it the master image, but ten years later it became known in the literature as the gate-array style. Customization in these styles was limited to the connection layers, that is, the layers in which fixed rows of components were provided with their interconnect. Because many masks were never changed in a generation of gate-array designs, these were known as semi-custom designs. Wiring was kept in channels of fixed width in early gate arrays.
Another master-image style, developed in the 1990s, differed from gate arrays by not leaving space for wires between the components. It was called sea-of-gates, because the unwired chip was little more than alternating rows of p-type and n-type metal oxide semiconductor (MOS) transistors. Contacts with the gates were made on either side of the row, while channel contacts were made between the gates. A combination of routers was used to achieve this over-the-cell routing. The routers were mostly based on channel routers developed for full-custom chips. Early field-programmable gate arrays predated (and survived) the sea-of-gates approach, which never became more than a niche in the cost-profit landscape of the chip market. They allow individualization away from the chip production plant by establishing or removing small pieces of interconnect.
Academia believed in full-custom design, probably biased by its initial focus on chips for analogue applications. Many of its early adventures in complete chip design for digital applications grew out of the experience described in Section 2.1.3 and were encouraged by publications from researchers in industry such as Satoshi Goto [34], and Bryan T. Preas and Charles W. Gwyn [35]. Rather than a methodology, as suggested by the award-winning paper in 1978, it established a terminology. Macrocell layout and general-cell assemblies in particular remained for several years names for styles without much of a method behind them.
Standard-cell (or polycell) layout was a full-custom style that lent itself to automation. Cells with uniform height and aligned supply and clock lines were called from a library to form rows in accordance with a placement result. Channel routing was used to determine the geometry of the wires in between the rows. The main difference from gate-array channels was that the width was to be determined by the algorithm. Whereas in gate-array styles the routers had to fit all interconnect in channels of fixed width, the problem in standard-cell layouts was to minimize the number of tracks and, whatever the result, to reserve enough space on the chip to accommodate them.
2.3 ITERATION-FREE DESIGN
By 1980, industrial tools had developed into what was called spaghetti code, depending on a few people with inside knowledge of how it had grown from the initial straightforward idea, sufficient for the simple examples of the early 1970s, into a sequence of patches with multiple escapes from which it could end up in almost any part of the code. In the meantime, academia was dreaming of compiling chips. Carver A. Mead and Lynn Conway wrote the seminal textbook [36] on very large scale integration between 1977 and 1979, and, although not spelled out there, the idea of (automatically) deriving masks from a functional specification was born shortly after its publication in 1980. A year later, David L. Johannsen defended his thesis on silicon compilation.
2.3.1 FLOORPLAN DESIGN
From the various independent algorithms for special problems grew layout synthesis as constrained optimization: wirelength and area minimization under technology design rules. The target was functionality with acceptable yield. Speed was not yet an issue: optimum performance was achieved with multichip designs, and it would take another ten years before single-chip microprocessors would come into their ballpark.
The real challenge in those days was the phase problem between placement and routing. Obviously, placement has a great impact on what is achievable with routing, and can even produce unroutable configurations. Yet, it was difficult to think about routing without coordinates, that is, geometrical positions of modules with pins to be connected. The dream of silicon compilation and of designs scalable over many generations of technology was in 1980 no more than a firm belief in hierarchical approaches, with little to go by apart from severe restrictions in routing architecture.∗ A breakthrough came with the introduction of the concept of floorplans in the design trajectory of chips by Ralph H.J.M. Otten [37]. A floorplan was a data structure capturing relative positions rather than fixed
coordinates. In a sense, floorplan design is a generalization of placement. Instead of manipulating fixed geometrical objects into a nonoverlapping arrangement in the plane, floorplan design treats modules as objects with varying degrees of flexibility and tries to decide on their position relative to the positions of others.

∗ There was an exception: in 1970, Akers teamed up with James M. Geyer and Donald L. Roberts [38] and tried grid expansion to make designs routable. It consisted of finding cuts of horizontal and vertical segments that crossed only conductor areas in one direction and conductor-free lines in the other. Furthermore, the cutting segment in the conductor area had to be perpendicular to all wires cut. The problems that it created were an early inspiration for slicing.
In the original paper, the relative positions were captured by a point configuration in the plane. By a clever transformation of the netlist into the so-called Dutch metric, an optimal embedding of these points could be obtained. The points became the centers of rectangular modules of appropriate size, which led to a set of overlapping rectangles when the point configuration was more or less fitted into the assessed chip footprint. The removal of overlap was done by formulating the problem as a mathematical program.
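As a hedged illustration of overlap removal by mathematical programming (a generic linear program on two one-dimensional modules, not the formulation of the original paper):

from scipy.optimize import linprog

# Modules 0 and 1 with widths w and overlapping ideal centers from a
# point embedding; keep 0 left of 1 without overlap while minimizing
# total displacement, split into nonnegative parts d+ and d-.
w = [4.0, 6.0]
ideal = [5.0, 7.0]
# Variables: x0, x1, d0+, d0-, d1+, d1-
c = [0, 0, 1, 1, 1, 1]                 # minimize the sum of displacements
A_eq = [[1, 0, -1, 1, 0, 0],           # x0 - d0+ + d0- = ideal0
        [0, 1, 0, 0, -1, 1]]           # x1 - d1+ + d1- = ideal1
b_eq = ideal
A_ub = [[1, -1, 0, 0, 0, 0]]           # x0 - x1 <= -(w0 + w1)/2
b_ub = [-(w[0] + w[1]) / 2]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
print(res.x[:2])                       # separated centers, 5 units apart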
Data structures other than Cartesian coordinates were proposed. A significant related data structure was the sequence pair of Hiroshi Murata, Kunihiro Fujiyoshi, Shigetoshi Nakatake, and Yoji Kajitani in 1997 [39]. Before that, a number of graphs, including the good old polar graphs from combinatorial theory, were used, and especially around the year 2000 many other proposals were published. Chapters 9 through 11 describe several floorplan data structures.
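A small sketch of how a sequence pair encodes relative positions (simplified to the x direction; the names are ours): module i lies to the left of module j iff i precedes j in both sequences, and the x-coordinates then follow from a longest-path computation over module widths (the y direction is symmetric, with i above j iff i precedes j in the first sequence but follows it in the second).

def sequence_pair_x(seq_a, seq_b, widths):
    pos_a = {m: i for i, m in enumerate(seq_a)}
    pos_b = {m: i for i, m in enumerate(seq_b)}
    x = {}
    # Processing in seq_b order visits every left-neighbor first.
    for m in sorted(widths, key=lambda m: pos_b[m]):
        left = [x[n] + widths[n] for n in x
                if pos_a[n] < pos_a[m] and pos_b[n] < pos_b[m]]
        x[m] = max(left, default=0)
    return x

# 'a' precedes 'b' in both sequences, so 'a' is left of 'b';
# 'c' relates to both only vertically.
print(sequence_pair_x(['c', 'a', 'b'], ['a', 'b', 'c'],
                      {'a': 2, 'b': 3, 'c': 4}))   # {'a': 0, 'b': 2, 'c': 0}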
The term floorplan design came from house architecture. Already in the 1960s, James Grason [40] tried to convert preferred neighbor relationships into rectangles realizing these relations. The question came down to whether a given graph of such relations had a rectangular dual. He characterized such graphs in a forbidden-graph theorem. The algorithms he proposed were hopelessly complex, but the ideas found a new following in the mid-1980s. Soon simple necessary and sufficient conditions were formulated, and in 1986 Jayaram Bhasker and Sartaj Sahni produced a linear-time algorithm for testing the existence of a rectangular dual and, in the affirmative case, constructing a corresponding dissection [41].
The success of floorplanning was partially due to its giving answers that seemed to fit the questions of the day like a glove: it lent itself naturally to hierarchical approaches∗ and enabled global wiring as a preparation for the detailed routing that took place after the geometrical optimization of the floorplan. It was also helped by the fact that the original method could reconstruct good solutions from abstracted data in extremely short computation times, even for thousands of modules. The latter was also a weakness, because basically it was the projection of a multidimensional Euclidean space with the exact Dutch distances onto the plane of its main axes. Significant distances perpendicular to that plane were annihilated.

∗ Many even identified floorplanning with hierarchical layout design, clearly an undervaluation of the concept.
2.3.2 CELL COMPILATION
Hierarchical application of floorplanning ultimately leads to modules that are not further dissected. They are to be filled with a library cell, or by a special algorithm determining the layout of that cell depending on its specification and assessed environment. The former has a shape constraint with fixed dimensions (sometimes rotatable). The latter are often macrocells with a standard-cell layout style. They lead to staircase functions as shape constraints, where a step corresponds to a choice of the number of rows.
In the years of research toward silicon compilers, circuit families tended to grow. The elementary static complementary metal oxide semiconductor (CMOS) gate has limitations, specifically in the number of transistors in series, which severely limits the number of distinct gates. New circuit techniques allowed larger families. Domino logic, for example, having only a pull-down network determining its function, allows much more variety. Single gates with up to 60 transistors were used in designs of the 1980s. This could only be supported if cells could be compiled from their functional specification.
The core of the problem was finding a linear transistor array in which only transistors sharing contact areas could be neighbors. This implied that the charge or discharge network needed the topology of an Euler graph. In static CMOS, both networks had to be Eulerian, preferably with the same sequence
of input signals controlling the gate. The problem even attracted a later Fields Medalist in the person of Curtis T. McMullen [42], but the final word came from the thesis of Robert L. Maziasz [43], a student of John P. Hayes. Once the sequence was established, the left-edge algorithm could complete the network, provided the number of tracks fit on the array, which was a mild constraint in practice; but
an interesting open question for research is to find an Euler path leading to a number of tracks under
a given maximum.
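For reference, a minimal version of the left-edge algorithm on intervals (ignoring the vertical constraints that complicate real channel routing):

def left_edge(intervals):
    # Sort by left edge; place each interval on the lowest track whose
    # last occupant ends at or before the interval's left edge.
    tracks = []                        # right end of last interval per track
    assignment = {}
    for left, right in sorted(intervals):
        for t, end in enumerate(tracks):
            if end <= left:
                tracks[t] = right
                assignment[(left, right)] = t
                break
        else:                          # no existing track fits
            tracks.append(right)
            assignment[(left, right)] = len(tracks) - 1
    return assignment

print(left_edge([(0, 3), (2, 5), (4, 7), (6, 9)]))   # two tracks suffice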
2.3.3 LAYOUT COMPACTION
Area minimization was considered to be the most important objective in layout synthesis before
1990. It was believed that other objectives, such as minimum signal delay and yield, would benefit from it. A direct relation between yield and active area was not difficult to derive, and with gate delay dominating the overall speed performance, chips usually came out faster than expected. The placement tools of the day had the reputation of using more chip area than needed, a belief based mainly on the fact that manual design often outperformed automatic generation of cell layouts. Manual design was considered infeasible for emerging chip complexities, however, and it was felt that a final compaction step could only improve the result. Systematic ways of taking a complete layout of a chip and producing a smaller design-rule-correct chip, while preserving the topology, therefore became of much interest.
Compaction is difficult (one may see it as the translation of topologies in the graph domain into mask geometries that have to satisfy the design rules of the target technology). Several concepts were proposed to provide a handle on the problem: symbolic layout systems, layout languages, virtual grids, etc. At the bottom, there is the combinatorial problem of minimizing the size of a complicated arrangement of many objects in several related and aligned planes. Even for simple abstractions the two-dimensional problem is complex (most versions are NP-hard). An acceptable solution was often found in a sequence of one-dimensional compactions, combined with heuristics to handle the interaction between the two dimensions (sometimes called 1½-dimensional compaction). Many one-dimensional compaction routines are efficiently solvable, often in linear time. The basis is found in the longest-path problem, already popular in this context during the 1970s. Compaction is discussed in several texts on VLSI physical design, such as those authored by Majid Sarrafzadeh and Chak-Kuen Wong [44], Sadiq
M. Sait and Habib Youssef [45], and Naveed Sherwani [46], but above all in the book by Thomas Lengauer [47].
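The longest-path basis of one-dimensional compaction can be sketched in a few lines (hypothetical constraint data; the constraint graph is assumed acyclic and topologically numbered):

def compact_1d(n, constraints):
    # Each constraint (i, j, d) demands x[j] >= x[i] + d, e.g., a
    # minimum width or spacing rule. Relaxing edges in topological
    # order computes longest paths in time linear in nodes plus edges.
    x = [0] * n
    succ = {i: [] for i in range(n)}
    for i, j, d in constraints:
        succ[i].append((j, d))
    for i in range(n):
        for j, d in succ[i]:
            x[j] = max(x[j], x[i] + d)
    return x

# Three elements in a row; the spacing rules force x = [0, 5, 9].
print(compact_1d(3, [(0, 1, 5), (1, 2, 4), (0, 2, 7)]))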
2.3.4 FLOORPLAN OPTIMIZATION
Floorplan optimization is the derivation of a compatible rectangle dissection (i.e., one in which the relative positions of the floorplan are respected), optimal under a given contour score (e.g., area or perimeter, possibly constrained), in which each undissected rectangle satisfies its shape constraint. A shape constraint can be a size requirement with or without minima imposed on the lengths of its sides, but it can in general be any constraint where the length of one side is monotonically nonincreasing with respect to the length of the other side.
The common method well into the 1980s was to capture the relative positions as Kirchhoff equations of the polar graph. This yields a set of linear equalities. For piecewise-linear shape constraints that are convex, a number of linear inequalities can be added. The perimeter can then be optimized in polynomial time. For nonconvex shape constraints or nonlinear objectives, one had to resort to branch-and-bound or cutting-plane methods: for general rectangle dissections with nonconvex shape constraints the problem is NP-hard. Larry Stockmeyer [48] proved that even a pseudo-polynomial algorithm does not exist unless P = NP.
The initial success of floorplan design was, besides the facts mentioned in Section 2.3.1, also due to a restriction that was introduced already in the original paper. It was called slicing, because the geometry of a compatible rectangle dissection was recognizable by cutting lines recursively slicing completely through the rectangle. That is, rectangles resulting from slicing the parent rectangle could either be sliced as well or were not further dissected. This induces a tree, the slicing tree, which in a hierarchical approach that started with a functional hierarchy produced a refinement: functional submodules remained descendants of their supermodule.
More importantly, many optimization problems were tractable for slicing structures, among them floorplan optimization. A rectangle dissection has the slicing property iff its polar graph is series-parallel. It is straightforward to derive the slicing tree from that graph. Dynamic programming can then produce a compatible rectangle dissection, optimal under any quasi-concave contour score and satisfying all shape constraints [49]. Also, labeling a partition tree with slicing directions can be done optimally in polynomial time if the tree is more or less balanced and the shape constraints are staircase functions, as Lengauer [50] showed. Together with Lukas P.P.P. van Ginneken, Otten then showed that floorplans given as point configurations could be converted to such optimal rectangle dissections, compatible in the sense that the modules in the same slice respect the relative point positions [51]. The complexity of that optimization for N rectangles was, however, O(N⁶), unacceptable for hundreds of modules. The procedure was therefore not used for more than 30 modules, and the complexity was reduced to O(N³) by simple but reasonable tricks. Modules with more than 30 submodules were treated as flexible rectangles with limitations on their aspect ratio.
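The flavor of this dynamic program can be shown with one combine step on staircase shape functions (a sketch only, not the full algorithm of [49]): each node of the slicing tree carries a list of nondominated (width, height) options, and a vertical slice adds widths while taking the maximum of heights (a horizontal slice is symmetric).

def combine_vertical(left, right):
    options = [(wl + wr, max(hl, hr))
               for wl, hl in left for wr, hr in right]
    options.sort()
    pruned = []
    for w, h in options:               # keep the lower staircase only
        if not pruned or h < pruned[-1][1]:
            pruned.append((w, h))
    return pruned

# Two rotatable leaf modules, two orientations each:
a = [(2, 3), (3, 2)]
b = [(1, 4), (4, 1)]
print(combine_vertical(a, b))          # [(3, 4), (6, 3), (7, 2)]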
2.3.5 BEYOND LAYOUT SYNTHESIS
It cannot be denied that research in layout synthesis had an impact on optimization in other contexts and on optimization in general. The left-edge algorithm may be rather simple and restricted (it needs an interval representation); simulated annealing is of all approaches the most generic. A patent request was submitted in 1981 by C. Daniel Gelatt and E. Scott Kirkpatrick, but by then its implementation (MCPlace) had already been compared (by having Donald W. Jepsen watch the process on a screen and reset the temperature if it seemed stuck in a local minimum) against IBM's warhorse in placement (APlace), which it soon replaced [52]. Independent research by Vladimir Cerny [53] was conducted around the same time. Both used the Metropolis loop from 1953 [54], which analyzed the energy content of a system of particles at a given temperature, and both used an analogy from metallurgy, where large crystals with few defects are obtained by annealing, that is, controlled slow cooling.
The invention was called simulated annealing, but it could not be called an optimization algorithm because of the many uncertainties about the schedule (initial temperature, decrements, stopping criterion, loop length, etc.) and the manual intervention. The annealing algorithm was therefore developed from the idea of optimizing the performance within a given amount of elapsed CPU time [55]. Given this one parameter, the algorithm resolved the uncertainties by creating a Markov chain that enhanced the probability of a low final score.
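For flavor, a bare-bones annealing loop (the fixed geometric schedule below is our illustrative assumption, precisely the kind of ad hoc choice the self-tuning algorithm of [55] was meant to avoid):

import math, random

def simulated_annealing(cost, move, state, t0=10.0, alpha=0.95,
                        steps=200, inner=50):
    # Metropolis acceptance: always take improving moves, take uphill
    # moves with probability exp(-delta/t); cool geometrically.
    best = current = state
    t = t0
    for _ in range(steps):
        for _ in range(inner):
            candidate = move(current)
            delta = cost(candidate) - cost(current)
            if delta <= 0 or random.random() < math.exp(-delta / t):
                current = candidate
                if cost(current) < cost(best):
                    best = current
        t *= alpha
    return best

# Toy usage: minimize (x - 3)^2 over the integers with +/-1 moves.
print(simulated_annealing(lambda x: (x - 3) ** 2,
                          lambda x: x + random.choice((-1, 1)), 50))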
The generic nature of the method led to many applications. Further research, notably by Sara A. Solla, Gregory B. Sorkin, and Steve R. White, showed that, in spite of some statements about its asymptotic behavior, annealing was not the method of choice in many cases [56]. Even the application described in the original paper of 1983, graph partitioning, did not allow the construction of a state space suitable for efficient search in that way. It was also shown, however, that placement with wirelength minimization as the objective lent itself quite well, in the sense that even simple pairwise interchange produced a space with the properties shown to be desirable by the above researchers. Carl Sechen exploited that fact, and with coworkers he created a sequence of releases of the widely used TimberWolf program [57], a tool based on annealing for placement. It is described in detail in Chapter 16.
It is not at all clear that simulated annealing performs well for floorplan design, where the sizes of objects differ by orders of magnitude. Yet, almost invariably, it is the method of choice. There was of course the success of Martin D.F. Wong and Chung Laung (Dave) Liu [58], who represented the slicing tree in Polish notation and defined a move set on it (that move set, by the way, is not unbiased, violating a requirement underlying many statements about annealing). Since then the community has been flooded with innovative representations of floorplans, slicing and nonslicing, each time