Part I
Background and Motivation
About This Presentation
This presentation is intended to support the use of the textbook
Computer Architecture: From Microprocessors to Supercomputers,
Oxford University Press, 2005, ISBN 0-19-515455-X. It is updated regularly by the author as part of his teaching of the upper-division course ECE 154, Introduction to Computer Architecture, at the
University of California, Santa Barbara. Instructors can use these slides freely in classroom teaching and for other educational
purposes. Any other use is strictly prohibited. © Behrooz Parhami
First June 2003 July 2004 June 2005 Mar 2006 Jan 2007
I Background and Motivation
Topics in This Part
Chapter 1 Combinational Digital Circuits
Chapter 2 Digital Circuits with Memory
Chapter 3 Computer System Technology
Chapter 4 Computer Performance
Provide motivation, paint the big picture, introduce tools:
• Review components used in building digital circuits
• Present an overview of computer technology
• Understand the meaning of computer performance (or why a 2 GHz processor isn’t 2× as fast as a 1 GHz model)
1 Combinational Digital Circuits
First of two chapters containing a review of digital design:
• Combinational, or memoryless, circuits in Chapter 1
• Sequential circuits, with memory, in Chapter 2
Topics in This Chapter
1.1 Signals, Logic Operators, and Gates
1.2 Boolean Functions and Expressions
1.3 Designing Gate Networks
1.4 Useful Combinational Parts
1.5 Programmable Combinational Parts
1.6 Timing and Circuit Considerations
1.1 Signals, Logic Operators, and Gates
Figure 1.1 Some basic elements of digital logic circuits, with
operator signs used in this book highlighted
OR: at least one input is 1
XOR: inputs are not equal
The Arithmetic Substitution Method
(when doing the algebra, set z^k = z)
x ∨ y = x + y − xy OR converted to arithmetic form
x ⊕ y = x + y − 2xy XOR converted to arithmetic form
Example: Prove the identity xyz ∨ x′ ∨ y′ ∨ z′ ≡? 1
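The substitution rules above can be checked mechanically. The following Python sketch evaluates the arithmetic form of the left-hand side for every 0/1 assignment; the helper names `NOT`, `OR`, and `lhs` are illustrative and not from the textbook, and NOT is taken as 1 − x.

```python
from itertools import product

def NOT(x):
    # Complement in arithmetic form: x' = 1 - x (assumed rule)
    return 1 - x

def OR(x, y):
    # x OR y = x + y - xy
    return x + y - x * y

def lhs(x, y, z):
    # Arithmetic form of xyz OR x' OR y' OR z'
    return OR(OR(OR(x * y * z, NOT(x)), NOT(y)), NOT(z))

# Evaluate the expression on all eight input combinations.
results = [lhs(x, y, z) for x, y, z in product((0, 1), repeat=3)]
identity_holds = all(r == 1 for r in results)
```

Exhaustive evaluation is a numeric check rather than the algebraic proof the slide asks for, but it confirms what the algebra should produce.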
Variations in Gate Symbols
Figure 1.2 Gates with more than two inputs and/or with
inverted signals at input or output
Gates as Control Elements
Figure 1.3 An AND gate and a tristate buffer act as controlled switches
or valves. An inverting buffer is logically the same as a NOT gate.
(c) Model for AND switch
Wired OR and Bus Connections
Figure 1.4 Wired OR allows tying together of several
(b) Wired OR of tristate outputs
Control/Data Signals and Signal Bundles
Figure 1.5 Arrays of logic gates represented by a single gate symbol
(a) 8 NOR gates (b) 32 AND gates (c) k XOR gates
1.2 Boolean Functions and Expressions
Ways of specifying a logic function
• Truth table: 2^n rows, “don’t-care” in input or output
• Logic expression: w′(x ∨ y ∨ z), product-of-sums,
sum-of-products, equivalent expressions
• Word statement: Alarm will sound if the door
is opened while the security system is engaged,
or when the smoke detector is triggered
• Logic circuit diagram: Synthesis vs analysis
Manipulating Logic Expressions
Table 1.2 Laws (basic identities) of Boolean algebra
DeMorgan’s: (x ∨ y)′ = x′y′ and (xy)′ = x′ ∨ y′
Proving the Equivalence of Logic Expressions
• Case analysis: two cases, x = 0 or x = 1
• Logic expression manipulation
Example: x ⊕ y ≡? x′y ∨ xy′
x + y − 2xy ≡? (1 − x)y + x(1 − y) − (1 − x)yx(1 − y)
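The example can also be settled by case analysis over all input values. This Python sketch compares the two arithmetic forms on each of the four cases; the function names are illustrative only.

```python
def xor_arith(x, y):
    # Arithmetic form of x XOR y
    return x + y - 2 * x * y

def sop_arith(x, y):
    # Arithmetic form of x'y OR xy', using a OR b = a + b - ab
    a = (1 - x) * y
    b = x * (1 - y)
    return a + b - a * b

# Case analysis: compare both sides for every combination of 0/1 inputs.
equivalent = all(
    xor_arith(x, y) == sop_arith(x, y)
    for x in (0, 1)
    for y in (0, 1)
)
```

In the algebraic route, the product term (1 − x)yx(1 − y) vanishes because x(1 − x) = x − x² = 0 once x² is replaced by x, leaving x + y − 2xy on both sides.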
1.3 Designing Gate Networks
• AND-OR, NAND-NAND, OR-AND, NOR-NOR
• Logic optimization: cost, speed, power dissipation
(a) AND-OR circuit
Seven-Segment Display of Decimal Digits
Figure 1.7 Seven-segment display
of decimal digits. The three open segments may be optionally used. The digit 1 can be displayed in two ways, with the more common right-side version shown.
Optional segment
1.4 Useful Combinational Parts
• High-level building blocks
• Much like prefab parts used in building a house
• Arithmetic components (adders, multipliers, ALUs) will be covered in Part III
• Here we cover three useful parts:
multiplexers, decoders/demultiplexers, encoders
Figure 1.9 Multiplexer (mux), or selector, allows one of several inputs
to be selected and routed to output depending on the binary value of a set of selection or address signals provided to it.
(a) 2-to-1 mux (b) Switch view (c) Mux symbol
(d) Mux array (e) 4-to-1 mux with enable (f) 4-to-1 mux design
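A behavioral sketch of the multiplexer may help here. The Python below models a 2-to-1 mux as a logic expression and composes three of them into a 4-to-1 mux with enable; the function names and the three-mux composition are illustrative assumptions, not a transcription of the figure.

```python
def mux2(x0, x1, y):
    # 2-to-1 mux: output x0 when y = 0, x1 when y = 1
    return ((1 - y) & x0) | (y & x1)

def mux4(x, y1, y0, enable=1):
    # 4-to-1 mux built from three 2-to-1 muxes (assumed structure).
    # x is a list of four data bits; (y1, y0) is the 2-bit address.
    low = mux2(x[0], x[1], y0)
    high = mux2(x[2], x[3], y0)
    # When enable is 0 the output is forced to 0.
    return enable & mux2(low, high, y1)

# Select input 2 (address bits y1 = 1, y0 = 0).
selected = mux4([0, 0, 1, 0], y1=1, y0=0)
```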
Figure 1.10 A decoder allows the selection of one of 2^a options using
an a-bit address as input. A demultiplexer (demux) is a decoder that
only selects an output if its enable signal is asserted.
(Enable)
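The decoder and demultiplexer behavior described in the caption can be sketched as follows. This is a behavioral Python model with an illustrative function name, not a gate-level design.

```python
def decoder(address, a, enable=1):
    # Returns 2**a output bits. The output whose index equals the
    # address is asserted; with enable = 0 (demux use) none is.
    if not 0 <= address < 2 ** a:
        raise ValueError("address does not fit in a bits")
    return [enable if i == address else 0 for i in range(2 ** a)]

outputs = decoder(address=2, a=2)             # 2-to-4 decoder
disabled = decoder(address=2, a=2, enable=0)  # demux with enable off
```

Treating the enable line as the data input is what turns the decoder into a demultiplexer: the data bit is routed to the addressed output.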
Figure 1.11 A 2^a-to-a encoder outputs an a-bit binary number
equal to the index of the single 1 among its 2^a inputs.
(a) 4-to-2 encoder (b) Encoder symbol
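The encoder is the inverse of the decoder. The Python sketch below models it behaviorally; the function name and the choice to reject inputs without exactly one 1 are assumptions made for illustration.

```python
def encoder(inputs):
    # inputs: list of 2**a bits with exactly one bit equal to 1.
    n = len(inputs)
    if n < 2 or n & (n - 1) != 0:
        raise ValueError("number of inputs must be a power of 2")
    if sum(inputs) != 1:
        raise ValueError("exactly one input must be 1")
    index = inputs.index(1)
    a = (n - 1).bit_length()
    # Return the a-bit binary index, most significant bit first.
    return [(index >> k) & 1 for k in reversed(range(a))]

code = encoder([0, 0, 1, 0])  # 4-to-2 encoder, input 2 asserted
```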
1.5 Programmable Combinational Parts
• Programmable ROM (PROM)
• Programmable array logic (PAL)
• Programmable logic array (PLA)
A programmable combinational part can do the job of many gates or gate networks.
It is programmed by cutting existing connections (fuses)
or establishing new connections (antifuses).
Figure 1.12 Programmable connections and their use in a PROM
Inputs
Outputs (a) Programmable
PALs and PLAs
Figure 1.13 Programmable combinational logic: general structure and two classes known as PAL and PLA devices. Not shown is PROM with fixed AND array (a decoder) and programmable OR array.
Inputs
Outputs (a) General programmable
combinational logic
(b) PAL: programmable AND array, fixed OR array
8-input ANDs
(c) PLA: programmable AND and OR arrays
6-input ANDs
4-input ORs
1.6 Timing and Circuit Considerations
• Gate delay δ: from a fraction of a nanosecond to a few nanoseconds
• Wire delay, previously negligible, is now important (electronic signals travel about 15 cm per ns)
• Circuit simulation to verify function and timing
Changes in gate/circuit output, triggered by changes in its inputs, are not instantaneous
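To get a feel for why wire delay now matters, the following Python sketch converts a wire length into a delay using the 15 cm/ns figure quoted above. The 3 cm wire and the 0.1 ns gate delay are hypothetical values chosen only for illustration.

```python
SIGNAL_SPEED_CM_PER_NS = 15.0  # propagation speed quoted on the slide

def wire_delay_ns(length_cm, speed=SIGNAL_SPEED_CM_PER_NS):
    # Time for a signal to traverse a wire of the given length
    return length_cm / speed

def delay_in_gate_units(length_cm, gate_delay_ns):
    # Wire delay expressed as a multiple of one gate delay
    return wire_delay_ns(length_cm) / gate_delay_ns

# Hypothetical case: a 3 cm wire and a gate delay of 0.1 ns.
delay = wire_delay_ns(3.0)
ratio = delay_in_gate_units(3.0, 0.1)
```

Under these assumed numbers the wire costs about as much as two gates, which is why it can no longer be ignored in timing analysis.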
CMOS Transmission Gates
Figure 1.15 A CMOS transmission gate and its use in building
(a) CMOS transmission gate:
circuit and symbol
(b) Two-input mux built of two
2 Digital Circuits with Memory
Second of two chapters containing a review of digital design:
• Combinational (memoryless) circuits in Chapter 1
• Sequential circuits (with memory) in Chapter 2
Topics in This Chapter
2.1 Latches, Flip-Flops, and Registers
2.2 Finite-State Machines
2.3 Designing Sequential Circuits
2.4 Useful Sequential Parts
2.5 Programmable Sequential Parts
2.6 Clocks and Timing of Events
2.1 Latches, Flip-Flops, and Registers
Figure 2.1 Latches, flip-flops, and registers
Reading and Modifying FFs in the Same Cycle
Figure 2.3 Register-to-register operation with edge-triggered
Clock Propagation delay
2.2 Finite-State Machines
Example 2.1
Figure 2.4 State table and state diagram for a vending
machine coin reception unit
(Diagram labels: inputs Dime, Quarter, and Reset; table entries give the next state for each input; the initial state and the final state are marked)
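A state table of this kind maps directly to a transition function. The Python sketch below assumes, since the slide text does not state it, that the unit accumulates dimes and quarters toward a 35-cent total and that Reset returns it to the initial state; the state encoding (cents deposited, capped at the target) and all names are illustrative.

```python
COIN_VALUE = {"dime": 10, "quarter": 25}
TARGET_CENTS = 35  # assumed threshold for the final state

def next_state(state, event):
    # state: cents deposited so far, capped at TARGET_CENTS
    if event == "reset":
        return 0  # back to the initial state
    # The final state absorbs any further coins.
    return min(state + COIN_VALUE[event], TARGET_CENTS)

def run(events, state=0):
    # Apply a sequence of input events starting from the initial state.
    for event in events:
        state = next_state(state, event)
    return state

final = run(["dime", "quarter"])
```

Capping the total keeps the state set finite: only 0, 10, 20, 25, 30, and 35 are reachable under these assumptions.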
Sequential Machine Implementation
Figure 2.5 Hardware realization of Moore and Mealy
sequential machines
(Diagram labels: next-state logic; present state; output logic; one connection marked “Only for Mealy machine”)
2.3 Designing Sequential Circuits
2.4 Useful Sequential Parts
• High-level building blocks
• Much like prefab closets used in building a house
• Other memory components will be covered in
Chapter 17 (SRAM details, DRAM, Flash)
• Here we cover three useful parts:
shift register, register file (SRAM basics), counter
Shift Register
Figure 2.8 Register with single-bit left shift and parallel load
capabilities. For logical left shift, serial data in line is connected to 0.
Parallel data out
Serial data out
MSB
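The register's two operations can be modeled in a few lines. This Python sketch is a behavioral model with illustrative class and method names; it stores the MSB at index 0, shifts one position left per call, and defaults the serial input to 0 for a logical left shift.

```python
class ShiftRegister:
    def __init__(self, width):
        self.width = width
        self.bits = [0] * width  # bits[0] is the MSB

    def load(self, data):
        # Parallel load of all bits at once
        if len(data) != self.width:
            raise ValueError("data width mismatch")
        self.bits = list(data)

    def shift_left(self, serial_in=0):
        # The MSB leaves on the serial output; serial_in enters at the LSB.
        serial_out = self.bits[0]
        self.bits = self.bits[1:] + [serial_in]
        return serial_out

reg = ShiftRegister(4)
reg.load([1, 0, 1, 1])
out = reg.shift_left()  # logical left shift
```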
Register File and FIFO
(a) Register file with random access
(b) Graphic symbol for register file
(c) FIFO symbol
(Diagram labels: write data, write addr; read addr 0, read enable; read data 0, read data 1; k-bit input and output; pop, full, empty)
(a) SRAM block diagram (b) SRAM read mechanism
(Diagram labels: chip select; row and column; row buffer; g bits data out)
2.5 Programmable Sequential Parts
• Programmable array logic (PAL)
• Field-programmable gate array (FPGA)
• Both types contain macrocells and interconnects
A programmable sequential part contains gates and
memory elements.
It is programmed by cutting existing connections (fuses)
or establishing new connections (antifuses).
PAL and FPGA
(a) Portion of PAL with storable output (b) Generic structure of an FPGA
(Diagram labels: 8-input ANDs; programmable connections; CLBs)
2.6 Clocks and Timing of Events
Clock is a periodic signal: clock rate = clock frequency
The inverse of clock rate is the clock period: 1 GHz ↔ 1 ns
Constraint: Clock period ≥ t_prop + t_comb + t_setup + t_skew
Figure 2.13 Determining the required length of the clock period
(Diagram labels: other inputs; combinational logic; FF1 begins to change; FF1 change observed. The clock period must be wide enough to accommodate worst-case delays.)
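The clock-period constraint translates into a simple calculation. In the Python sketch below, the four delay values are hypothetical numbers in nanoseconds chosen only to illustrate the arithmetic, and the function names are not from the textbook.

```python
def min_clock_period_ns(t_prop, t_comb, t_setup, t_skew):
    # Clock period must be at least the sum of the four worst-case delays.
    return t_prop + t_comb + t_setup + t_skew

def max_clock_rate_ghz(period_ns):
    # Clock rate is the inverse of the period: 1 ns <-> 1 GHz
    return 1.0 / period_ns

# Hypothetical worst-case delays, in ns.
period = min_clock_period_ns(t_prop=0.25, t_comb=1.5, t_setup=0.125, t_skew=0.125)
rate = max_clock_rate_ghz(period)
```

With these assumed delays the combinational logic dominates the period, so shortening the longest logic path is the most direct way to raise the clock rate.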
(a) Simple synchronizer (b) Two-FF synchronizer
(c) Input and output waveforms
(Diagram labels: FF1 with terminals D, C, and Q; synch version)
Level-Sensitive Operation
Figure 2.15 Two-phase clocking with nonoverlapping clock signals
Combinational logic
3 Computer System Technology
Interplay between architecture, hardware, and software
• Architectural innovations influence technology
• Technological advances drive changes in architecture
Topics in This Chapter
3.1 From Components to Applications
3.2 Computer Systems and Their Parts
3.3 Generations of Progress
3.4 Processor and Memory Technologies
3.5 Peripherals, I/O, and Communications
3.6 Software Systems and Applications
3.1 From Components to Applications
Figure 3.1 Subfields or views in computer system engineering
What Is (Computer) Architecture?
Figure 3.2 Like a building architect, whose place at the
engineering/arts and goals/means interfaces is seen in this diagram, a computer architect reconciles many conflicting or competing demands.
3.2 Computer Systems and Their Parts
Figure 3.3 The space of computer systems, with what we normally mean by the word “computer” highlighted
Automotive Embedded Computers
Figure 3.5 Embedded computers are ubiquitous, yet invisible. They are found in our automobiles, appliances, and many other places.
Brakes Airbags
Personal Computers and Workstations
Figure 3.6 Notebooks, a common class of portable computers, are much smaller than desktops but offer substantially the same capabilities. What are the main reasons for the size difference?
Digital Computer Subsystems
Figure 3.7 The (three, four, five, or) six main units of a digital computer. Usually, the link unit (a simple bus or a more elaborate network) is not explicitly included in such diagrams.
Memory innovations   I/O devices introduced          Dominant look & feel
drum                 Paper tape, magnetic tape       Hall-size cabinet
core                 Drum, printer, text terminal    Room-size mainframe
chip                 Disk, keyboard, video monitor   Desk-size mini
                     Sensor/actuator, point/click    Invisible, embedded
IC Production and Yield
Figure 3.8 The manufacturing process for an IC part
(Diagram labels: ~1 cm; die tester; mounting; microchip or other part; part tester; usable part to ship)
Effect of Die Size on Yield
Figure 3.9 Visualizing the dramatic decrease in yield with larger dies
120 dies, 109 good; 26 dies, 15 good
Die yield =def (number of good dies) / (total number of dies)
Die yield = Wafer yield × [1 + (Defect density × Die area) / a]^(−a)
Die cost = (cost of wafer) / (total number of dies × die yield)
= (cost of wafer) × (die area / wafer area) / (die yield)
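The yield and cost formulas can be applied to the two wafers in Figure 3.9. In this Python sketch the die counts come from the figure, while the $1000 wafer cost is a hypothetical value used only to show how die cost scales; the function names are illustrative.

```python
def observed_die_yield(good_dies, total_dies):
    # Die yield by definition: good dies / total dies
    return good_dies / total_dies

def modeled_die_yield(wafer_yield, defect_density, die_area, a):
    # Die yield = wafer yield * [1 + (defect density * die area) / a]^(-a)
    return wafer_yield * (1 + defect_density * die_area / a) ** (-a)

def die_cost(wafer_cost, total_dies, die_yield):
    # Die cost = wafer cost / (total dies * die yield)
    return wafer_cost / (total_dies * die_yield)

small_yield = observed_die_yield(109, 120)  # small dies in Figure 3.9
large_yield = observed_die_yield(15, 26)    # large dies in Figure 3.9

WAFER_COST = 1000.0  # hypothetical wafer cost in dollars
small_cost = die_cost(WAFER_COST, 120, small_yield)
large_cost = die_cost(WAFER_COST, 26, large_yield)
cost_ratio = large_cost / small_cost
```

Because total dies × die yield equals the number of good dies, the cost ratio here is simply 109/15: the larger die costs roughly seven times as much, even though the wafer holds only about 4.6 times fewer of them.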
3.4 Processor and Memory Technologies
Figure 3.11 Packaging of processor, memory, and other components
(a) 2D or 2.5D packaging now common (b) 3D packaging of the future
(Diagram labels: PC board, backplane, memory, CPU, bus, connector; die; stacked layers glued together; interlayer connections deposited on the outside of the stack)
Figure 3.10 Trends in processor performance and DRAM
Pitfalls of Computer Technology Forecasting
“DOS addresses only 1 MB of RAM because we cannot imagine any applications needing more.” Microsoft, 1980
“640K ought to be enough for anybody.” Bill Gates, 1981
“Computers in the future may weigh no more than 1.5
tons.” Popular Mechanics
“I think there is a world market for maybe five
computers.” Thomas Watson, IBM Chairman, 1943
“There is no reason anyone would want a computer in their home.” Ken Olsen, DEC founder, 1977
“The 32-bit machine would be an overkill for a personal
computer.” Sol Libes, ByteLines
3.5 Input/Output and Communications
Figure 3.12 Magnetic and optical disk memory units
(a) Cutaway view of a hard disk drive (b) Some removable storage media
Typically
2-9 cm
Floppy disk CD-ROM
Magnetic tape cartridge
Figure 3.13 Latency and bandwidth characteristics of different
classes of communication links
I/O network
System-area network
Local-area network (LAN)
Metro-area network (MAN)
Wide-area network (WAN)
Geographically distributed
Same geographic location
3.6 Software Systems and Applications
Figure 3.15 Categorization of software, with examples in each class
System
Manager:
virtual memory, security, file system,
Coordinator:
scheduling, load balancing, diagnostics,
Enabler:
disk driver, display driver, printing,
High- vs Low-Level Programming
Figure 3.14 Models and abstractions in programming.
Assembly language instructions, mnemonic
Machine language instructions, binary (hex)
One task = many statements
One statement = several instructions
Mostly one-to-one
More abstract, machine-independent;
easier to write, read, debug, or maintain
More concrete, machine-specific, error-prone; harder to write, read, debug, or maintain
4 Computer Performance
Performance is key in design decisions; also cost and power
• It has been a driving force for innovation
• Isn’t quite the same as speed (higher clock rate)
Topics in This Chapter
4.1 Cost, Performance, and Cost/Performance
4.2 Defining Computer Performance
4.3 Performance Enhancement and Amdahl’s Law
4.4 Performance Measurement vs Modeling
4.5 Reporting Computer Performance
4.6 The Quest for Higher Performance
4.1 Cost, Performance, and Cost/Performance
Figure 4.1 Performance improvement as a function of cost.
Sublinear:
diminishing returns
Linear (ideal?)
4.2 Defining Computer Performance
Figure 4.2 Pipeline analogy shows that imbalance between processing
power and I/O capabilities leads to a performance bottleneck
Processing Input Output
CPU-bound task
I/O-bound task
Performance of Aircraft: An Analogy
Table 4.1 Key characteristics of six passenger aircraft: all figures are approximate; some relate to a specific model/configuration of
the aircraft or are averages of cited range of values
(km)
Speed (km/h)
Price ($M)
Different Views of Performance
Performance from the viewpoint of a passenger: Speed
Note, however, that flight time is but one part of total travel time. Also, if the travel distance exceeds the range of a faster plane,
a slower plane may be better due to not needing a refueling stop
Performance from the viewpoint of an airline: Throughput
Measured in passenger-km per hour (relevant if ticket price were proportional to distance traveled, which in reality it is not)