Advanced Computer Architecture - Lecture 2: Quantitative principles. This lecture will cover the following: detailed discussion on the computer performance – the key to quantitative design and analysis; growth in processor performance; price-performance design; CPU performance metrics;...
CS 704 Advanced Computer Architecture
Lecture 2
Quantitative Principles
Detailed discussion of computer performance, the key to quantitative design and analysis
Summary
Recap of Lecture 1
Computer Systems:
Architecture refers to those attributes of a
computer visible to a programmer or compiler
writer, e.g., instruction set, addressing techniques, I/O mechanisms, etc.
Organization refers to how the features of a
computer are implemented, i.e., control signals
Recap of Lecture 1
Computer Development:
• Academically, modern computer developments had
their infancy in 1944-49
• Commercially, the first machine was built by
Eckert-Mauchly Computer Corporation in 1949
• Technological developments, from vacuum tubes to VLSI circuits, dynamic memory and network technology, gave birth to four different generations of computers
• Microprocessors and PCs were introduced from 1971 onward
Recap of Lecture 1
Design Perspectives:
Processor: ISA, ILP and cache
Memory hierarchy: multilevel cache and virtual memory
Input/output and storage
MAC/VU-Advanced Computer Architecture
Recap of Lecture 1
Computer Design Cycle:
• The computer design and development has been under the influence of
Growth in Processor Performance
• Supercomputers and mainframes, costing
millions of dollars and occupying very
large spaces, were the prevailing form of computing in
the 1960s; they were replaced by relatively low-cost and
smaller-sized minicomputers in the 1970s
• In the 1980s, very low-cost microprocessor-based
desktop computing machines in the form of
Growth in Processor Performance
• The growth in processor performance since the
mid-1980s has been substantially higher than in earlier years
• Prior to the mid-1980s, microprocessor
performance growth averaged about 35% per year
• By 2001 the growth rate had risen to a factor of about 1.58 per
year, i.e., about 58% per year
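To see how quickly these two rates diverge, here is a minimal sketch. It assumes constant annual growth factors of 1.35 and 1.58, the figures quoted on this slide; the function name and the 15-year horizon are illustrative choices, not part of the lecture.

```python
def relative_performance(annual_factor: float, years: int) -> float:
    """Performance after `years` of constant compound growth, relative to year 0."""
    return annual_factor ** years


years = 15  # illustrative horizon (assumption)
slow = relative_performance(1.35, years)  # about 35% per year
fast = relative_performance(1.58, years)  # about 58% per year

print(f"35% per year for {years} years: {slow:.0f}x")
print(f"58% per year for {years} years: {fast:.0f}x")
print(f"gap between the two curves:    {fast / slow:.1f}x")
```

Because growth compounds, even a modest difference in the yearly factor produces a large gap over a decade or more.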
Growth in Processor Performance
[Figure: processor performance growth over time; surviving data-point labels: MIPS R2000, DEC Alpha]
Price-Performance Design
Technology improvements are used to lower cost and increase performance
The relationship between cost and
price is a complex one
The cost is the total amount spent to
produce a product
The price is the amount for which a
finished good is sold
Price-Performance Design
The cost passes through
different stages before it
becomes price
A small change in cost may
have a big impact on price
Price vs Cost
• Manufacturing costs: total amount spent to
produce a component
- Component cost: cost at which the
components are available to the designer; it ranges from 40% to 50% of the list price of the product
- Recurring costs: labor, purchasing,
scrap, warranty; 4% to 16% of list price
- Gross margin (non-recurring cost): R&D,
marketing, sales, equipment, rental,
maintenance, financing cost, pre-tax profits, taxes
Price vs Cost
• List price:
• Amount for which the finished good is
sold
• It includes an average discount of
15% to 35% of the list price, given as volume discounts and/or retailer markup
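The breakdown of a list price can be sketched as follows. This is only an illustration: it assumes the gross margin is whatever remains after component cost, recurring cost and average discount are taken out of the list price, and the function name, the price of 1000 and the mid-range fractions are hypothetical.

```python
def price_breakdown(list_price, component_frac, recurring_frac, discount_frac):
    """Split a list price into its parts.

    Assumption: gross margin is the remainder after the other three fractions.
    """
    gross_margin_frac = 1.0 - component_frac - recurring_frac - discount_frac
    if gross_margin_frac < 0:
        raise ValueError("fractions exceed 100% of the list price")
    return {
        "component cost": list_price * component_frac,
        "recurring cost": list_price * recurring_frac,
        "gross margin": list_price * gross_margin_frac,
        "average discount": list_price * discount_frac,
    }


# Hypothetical product: list price 1000, fractions picked from the quoted ranges.
parts = price_breakdown(1000.0, component_frac=0.45, recurring_frac=0.10,
                        discount_frac=0.25)
for name, amount in parts.items():
    print(f"{name:>16}: {amount:8.2f}")
```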
Price vs Cost: Price-Performance Design (Cont'd)
Cost-effective IC Design: Price-Performance Design
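Yield: the percentage of manufactured components surviving testing
Volume: higher volume increases manufacturing efficiency, hence decreases the list price, and improves purchasing efficiency
Feature size: the minimum size of a transistor or wire in either the x or y direction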
Cost-effective IC Design: Price-Performance Design
Reduction in feature size from 10 microns in
1971 to 0.18 microns in 2001 has resulted in:
- Quadratic rise in transistor count
- Linear increase in performance
- 4-bit to 64-bit microprocessor
- Desktops have replaced time-sharing
machines
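The "quadratic rise" can be checked with a short calculation. The sketch below assumes ideal scaling on a die of fixed area, so the number of transistors per unit area grows with the inverse square of the feature size; the function name is illustrative.

```python
def density_gain(old_feature_um: float, new_feature_um: float) -> float:
    """Ideal gain in transistors per unit area when the feature size shrinks."""
    return (old_feature_um / new_feature_um) ** 2


linear_shrink = 10.0 / 0.18          # feature size, 1971 vs 2001
gain = density_gain(10.0, 0.18)      # quadratic in the linear shrink

print(f"linear shrink: {linear_shrink:.1f}x")
print(f"density gain:  {gain:.0f}x")
```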
Cost of Integrated Circuits
Integrated circuit manufacturing passes through many stages:
Chopping the wafer into dies
Cost of Integrated Circuits
Die: the square area of the wafer containing the integrated circuit
Note that fitting square dies on the round wafer leaves some area as
waste
Cost of a die: the cost of a die is determined
from the cost of a wafer, the number of dies that fit
on a wafer, and the percentage of dies that work, i.e., the die yield
Dies of Integrated Circuits
Cost of Integrated Circuits
• The cost of an integrated circuit can be determined as the ratio of the total cost,
i.e., the sum of the die cost, the die testing cost, the packaging cost and the final chip testing cost, to the final test yield
Calculating Integrated Circuits Costs
$$\text{IC cost} = \frac{\text{die cost} + \text{die testing cost} + \text{packaging cost} + \text{final testing cost}}{\text{final test yield}}$$
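As a quick illustration of this ratio, the sketch below adds up the four per-chip costs and divides by the final test yield. The function name and all numeric inputs are hypothetical and in arbitrary currency units.

```python
def ic_cost(die_cost, die_test_cost, packaging_cost, final_test_cost,
            final_test_yield):
    """Cost per good packaged chip: total per-chip cost over final test yield."""
    if not 0 < final_test_yield <= 1:
        raise ValueError("final_test_yield must be in (0, 1]")
    total = die_cost + die_test_cost + packaging_cost + final_test_cost
    return total / final_test_yield


# Hypothetical figures: 13.5 units spent per chip, 90% pass the final test.
print(ic_cost(10.0, 1.0, 2.0, 0.5, final_test_yield=0.9))
```

Dividing by the yield spreads the money spent on chips that fail the final test over the chips that pass.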
Cost of Integrated Circuits
• The cost of a die is determined from the wafer cost, the number of dies
per wafer and the die yield
Calculating Integrated Circuits Costs
$$\text{die cost} = \frac{\text{wafer cost}}{\text{dies per wafer} \times \text{die yield}}$$
Calculating Integrated Circuits Costs
$$\text{dies per wafer} = \frac{\pi \times (\text{wafer diameter}/2)^2}{\text{die area}} - \frac{\pi \times \text{wafer diameter}}{\sqrt{2 \times \text{die area}}}$$
Example: Calculating Number of Dies
For a die 0.7 cm on a side, find the number of dies per wafer
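The example can be worked with a short sketch. It uses the common estimate of wafer area divided by die area, minus a correction for partial dies along the circular edge. The slide as extracted does not state the wafer diameter, so the 30 cm used here is an assumption, and the function name is illustrative.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Estimated dies per wafer: area ratio minus the loss along the round edge."""
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


die_side_cm = 0.7
wafer_diameter_cm = 30.0  # ASSUMPTION: wafer size is not given on the slide

n = dies_per_wafer(wafer_diameter_cm, die_side_cm ** 2)
print(math.floor(n))  # only whole dies count
```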
Calculating Die Yield
• Die yield is the fraction or percentage of
good dies among the dies on a wafer
• Wafer yield accounts for completely bad
wafers, which need not be tested
• Die yield depends on the defect
density and on a parameter α, which depends on the number of masking levels
• A good estimate of α for CMOS is 4.0 and
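The yield formula itself did not survive on this slide, so the sketch below assumes a widely used empirical model: die yield = wafer yield × (1 + defect density × die area / α)^(−α). The defect density, wafer cost and die count are hypothetical inputs, and both function names are illustrative.

```python
def die_yield(defect_density_per_cm2, die_area_cm2, alpha=4.0, wafer_yield=1.0):
    """Assumed model: wafer_yield * (1 + D * A / alpha) ** (-alpha)."""
    return wafer_yield * (1 + defect_density_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def die_cost(wafer_cost, dies_per_wafer, yield_fraction):
    """Cost of one good die: wafer cost spread over the dies that work."""
    return wafer_cost / (dies_per_wafer * yield_fraction)


# Hypothetical inputs: 0.6 defects per cm^2, a 0.7 cm x 0.7 cm die, CMOS alpha = 4.0.
y = die_yield(defect_density_per_cm2=0.6, die_area_cm2=0.49)
print(f"die yield: {y:.3f}")

# Hypothetical wafer cost of 5000 with 1347 dies per wafer.
print(f"cost per good die: {die_cost(5000.0, 1347, y):.2f}")
```

Larger dies are penalised twice: fewer of them fit on a wafer, and each one is more likely to contain a defect.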
Calculating Integrated Circuits Costs
Price-Performance Design
• Time to run the task:
• Execution time, response time, latency
• Throughput or bandwidth:
• Tasks per day, hour, week, sec, ns …
Price-Performance Design
• Example:
• To carry 2400 passengers from Lahore to Islamabad:
• The train completes the task in 4.0 hours, while the airplane completes the same task
in 6.0 hours
• i.e., the airplane completes 66.67% of the task in the same time; the throughput, and hence the performance, of the train is 50% more than that of the airplane
Price-Performance Design: Example
Metric                    Train                             Plane
Cost / person             300 Rs                            3000 Rs
Time Lah to Isb           4.0 hours                         45 min
Passengers / trip         2400                              300
Execution time / person   6.0 sec                           9.0 sec
Cost-performance          300 x 6 = 1,800 Rs-sec/person     3000 x 9 = 27,000 Rs-sec/person
Time to complete job      4.0 hours                         45 x 8 min = 6.0 hours
The plane is about 5.3 times faster per trip (45 min vs 4.0 hours) but takes
50% more time to complete the job, i.e., it has lower throughput;
thus the performance of the train is
50% better than that of the plane
The time per person and the cost per person of the train are less than those of the plane. Thus the cost-performance ratio of train to plane
is 1:15
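The figures in this example can be reproduced with a small sketch. It assumes, as the slide does, that plane trips run back to back and that return legs are ignored; the function and key names are illustrative.

```python
import math


def job_stats(cost_per_person, trip_minutes, passengers_per_trip, total_passengers):
    """Latency and throughput figures for moving all passengers."""
    trips = math.ceil(total_passengers / passengers_per_trip)
    sec_per_person = trip_minutes * 60 / passengers_per_trip
    return {
        "trips": trips,
        "job_hours": trips * trip_minutes / 60,      # time to complete the job
        "sec_per_person": sec_per_person,            # execution time per person
        "cost_performance": cost_per_person * sec_per_person,  # Rs-sec/person
    }


train = job_stats(300, 4 * 60, 2400, 2400)
plane = job_stats(3000, 45, 300, 2400)

print("train:", train)
print("plane:", plane)
print("plane job time / train job time:", plane["job_hours"] / train["job_hours"])
print("plane / train cost-performance: ",
      plane["cost_performance"] / train["cost_performance"])
```

The plane wins on latency for a single trip, while the train wins on throughput and on cost-performance for the whole job.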
Metrics of Performance
[Figure: system levels (Application, Programming Language, Compiler, Instruction Set Architecture, Datapath, Control, Function) and their performance metrics (answers per month, operations per second, megabytes per second)]
Aspects of CPU Performance
                Inst Count    CPI    Clock Rate
Program             √
Compiler            √
Inst Set            √          √
Organization                   √         √
Technology                               √
Cycles Per Instruction
CPI = CPU Clock Cycles for program / Instruction Count
    = (CPU Time * Clock Rate) / Instruction Count
For an instruction mix, the relative frequency of occurrence of each type of instruction is given as:
F_i = IC_i / Total instruction count, where IC_i is the instruction count of the i-th instruction type
Example: Calculating average CPI
Base Machine (Reg / Reg)
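The table for this example was lost in extraction, so the sketch below uses a hypothetical instruction mix rather than the lecture's own numbers. It computes the average CPI as the frequency-weighted sum of per-class cycle counts, then rearranges the CPI definition to obtain CPU time; the function names, the instruction count and the 500 MHz clock are assumptions.

```python
def average_cpi(mix):
    """mix: list of (relative frequency F_i, cycles CPI_i); F_i must sum to 1."""
    total_freq = sum(freq for freq, _ in mix)
    if abs(total_freq - 1.0) > 1e-9:
        raise ValueError("relative frequencies must sum to 1")
    return sum(freq * cycles for freq, cycles in mix)


def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = instruction count * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# HYPOTHETICAL mix (frequency, cycles): ALU, load, store, branch.
mix = [(0.5, 1), (0.2, 2), (0.1, 2), (0.2, 2)]

cpi = average_cpi(mix)
print(f"average CPI: {cpi:.2f}")
print(f"CPU time for 1e6 instructions at 500 MHz: {cpu_time(1_000_000, cpi, 500e6):.6f} s")
```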
Cycles Per Instruction
Geometric mean time:
$$\text{Geometric mean} = \sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i}$$
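A minimal sketch of the geometric mean of execution time ratios follows. It works in the log domain, which is mathematically the same as taking the n-th root of the product but avoids overflow for long lists; the function name and the sample ratios are illustrative.

```python
import math


def geometric_mean(ratios):
    """n-th root of the product of n positive execution time ratios."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical execution time ratios of one machine relative to a reference.
ratios = [2.0, 8.0]
print(geometric_mean(ratios))
```

A useful property of this summary is that the geometric mean of ratios equals the ratio of geometric means, so the ranking of machines does not depend on which machine is used as the reference.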
Summary: Price-Performance Design
Computer cost:
The total cost of manufacturing a computer is distributed among different parts of the system, such as the cabinet,
processor board and I/O devices
Performance: time is the key measurement of performance
Comparing the performance of two designs: the ratio
η = Execution time Y / Execution time X
tells how many times faster machine X is
compared to Y; as performance is the inverse of execution time,
η = Performance X / Performance Y
Instruction Execution Rate - MIPS
MIPS specifies performance inversely to execution time
For a given program:
MIPS = (instruction count) / (execution time x 10^6)
MIPS cannot be calculated from the instruction mix
Relative MIPS for a machine M is defined with respect to a
reference machine as:
Relative MIPS = [Performance M / Performance reference] x MIPS reference
or
= [Time reference / Time M] x MIPS reference
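Both rates can be computed with a short sketch. The function names and the sample numbers are hypothetical; the reference machine is assumed here to be rated at 1 MIPS.

```python
def mips(instruction_count: float, execution_time_s: float) -> float:
    """Native MIPS: millions of instructions executed per second."""
    return instruction_count / (execution_time_s * 1e6)


def relative_mips(time_reference_s: float, time_machine_s: float,
                  mips_reference: float) -> float:
    """MIPS rating of machine M relative to a reference machine."""
    return (time_reference_s / time_machine_s) * mips_reference


# Hypothetical program: 50 million instructions executed in 2 seconds.
print(mips(50_000_000, 2.0))

# Hypothetical comparison: reference takes 10 s, machine M takes 2.5 s.
print(relative_mips(10.0, 2.5, mips_reference=1.0))
```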
CPU Benchmark Suites
Performance Comparison: the execution time of
the same workload running on two machines without
running the actual programs
Benchmarks: the programs specifically chosen to measure the performance
Five levels of programs: in the decreasing order
SPEC: System Performance Evaluation Cooperative
First Round 1989: 10 programs yielding a single number – SPECmarks
Second Round 1992: SPECint92 (6 integer programs) and SPECfp92 (14 floating point programs)
Third Round 1995
– new set of programs: SPECint95 (8 integer programs) and
SPECfp95 (10 floating point)
– “benchmarks useful for 3 years”
Summary: Designing and performance comparison
• Designing to Last through Trends
Capacity Speed
• Time to run the task
– Execution time, response time, latency
• Tasks per day, hour, week, sec, ns, …
– Throughput, bandwidth
• “X is n times faster than Y” means
$$n = \frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)}$$
Summary (Cont'd)
CPI Law: CPU time = Instruction count x CPI / Clock rate
Execution time is the REAL measure of computer performance!
Good products are created when we have:
– Good benchmarks, good ways to summarize
performance
Die cost goes roughly with (die area)^4
Summary (Cont'd)
“For better or worse, benchmarks shape a field”
Good products are created when we have:
– Good benchmarks
– Good ways to summarize performance
Given that sales are in part a function of performance relative to the competition, companies invest in improving the product as reported
by the performance summary
If the benchmarks/summary are inadequate, then one must choose between improving the product for real programs and improving the product
to get more sales;
Sales almost always wins!
Execution time is the measure of computer performance!