Advanced Computer Architecture - Lecture 9: Computer hardware design. This lecture will cover the following: multi cycle and pipeline - datapath and control design; features of multi cycle design; multi cycle control design; introduction to pipeline datapath; high level view of multiple cycle datapath;...
CS 704
Advanced Computer Architecture
Lecture 9
Computer Hardware Design
(Multi Cycle and Pipeline - Datapath and Control Design)
Prof Dr M Ashraf Chughtai
Today’s Topics
Recap: multi cycle datapath and control
Features of Multi cycle design
Multi Cycle Control Design
Introduction to Pipeline datapath
Summary
Recap: Lecture 8
Information flow and control signals for the
single cycle datapath to execute:
– Add/Subtract Instruction
– Immediate Instruction
– Load/Store Instructions
– Control Instructions
Analysis of single cycle data path
How effectively are different sections used?
How effectively are different sections used?
– Memory is used twice, at different times
(i.e., Instruction Fetch and Load or Store)
– Adders in the IF section are used only once, for a fraction
of the time (Fetch Phase)
– ALU is used for the execution of R-type
instructions and memory address calculation
Conclusion:
We can reduce H/W without hurting
performance by using extra control
Multiple Cycle Approach
[Figure: clock timing diagram. Cycles 1 to 5 correspond to the steps I fetch, ID/Reg, Exec, Mem and Wr.]
The single cycle operations are performed in five steps:
Instruction Fetch
Instruction Decode and Register Read
Execute (R-type, I-type, or address calculation for Load/Store/Branch)
Memory (Read/Write)
Write Back
Multiple Cycle Approach
In the Single Cycle implementation, the cycle time
is set to accommodate the longest instruction, the Load instruction.
In the Multiple Cycle implementation, the cycle time is set to accommodate the longest step, the
memory read/write.
Consequently, the cycle time for the Single Cycle implementation can be five times longer than that of the Multiple Cycle implementation.
As an example, if T = 5 µs for the single cycle implementation, then
T = 1 µs for the multi cycle implementation.
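The relation between the two cycle times can be sketched as follows. This is only an illustration: the step names follow the five steps listed earlier, and the equal 1 µs latency per step is an assumption chosen to match the example figures, not a measured value.

```python
# Hypothetical step latencies in microseconds (assumed equal, 1 µs each).
STEP_LATENCY_US = {
    "I fetch": 1.0,
    "ID/Reg": 1.0,
    "Exec": 1.0,
    "Mem": 1.0,
    "Wr": 1.0,
}


def single_cycle_time(latencies):
    """One long cycle must cover every step of the longest instruction (Load)."""
    return sum(latencies.values())


def multi_cycle_time(latencies):
    """Each short cycle only has to cover the slowest individual step."""
    return max(latencies.values())


t_single = single_cycle_time(STEP_LATENCY_US)
t_multi = multi_cycle_time(STEP_LATENCY_US)
print(f"single cycle T = {t_single} us, multi cycle T = {t_multi} us")
```

With unequal step latencies the ratio would be smaller than five, since the multi cycle clock is bounded by the slowest step.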
Single Cycle vs Multiple Cycle
[Figure: timing diagram of the Multiple Cycle Implementation over clock cycles 1 to 10, with the instruction steps I fetch, ID/Reg, Exec, Mem and Wr.]
Single Cycle vs Multiple Cycle: Explanation
For different classes of instructions, the Multi Cycle implementation may take 3, 4 or 5 cycles to fetch and execute an instruction.
To compare the performance of the
single cycle and multi cycle implementations, let
us consider a program segment comprising three instructions, in the sequence:
Load
Store
R-type (say Add)
Single Cycle vs Multiple Cycle: Explanation
The execution time for these three instructions
using the single cycle implementation, with a cycle
length of 5 µs, is:
T_exe = 3 × 5 µs = 15 µs
Note that here the cycle time is long enough for
the Load instruction, but it is too long for the Store
and R-type instructions.
So the last part of the cycle in the case of the Store,
and the 4th (memory) part in the case of the R-type instruction,
is wasted.
Single Cycle vs Multiple Cycle: Explanation
In the multi cycle implementation, Load is completed
in 5 cycles, and Store and R-type each take 4
cycles to complete.
Thus, these three instructions take 5 + 4 + 4 = 13
cycles; if the cycle length is 1 µs, then the
execution time for the three instructions is:
T_exe = 13 × 1 µs = 13 µs
Conclusion:
The multi cycle implementation is 15/13 ≈ 1.15 times faster
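The comparison above can be reproduced with a short calculation. The cycle counts and cycle lengths are the ones quoted in the example; the variable and function names are illustrative only.

```python
# Cycle counts per instruction class, as quoted in the example.
CYCLES_PER_INSTRUCTION = {"Load": 5, "Store": 4, "R-type": 4}
PROGRAM = ["Load", "Store", "R-type"]

SINGLE_CYCLE_US = 5.0  # one long cycle per instruction
MULTI_CYCLE_US = 1.0   # one short cycle per step


def single_cycle_exec_time(program):
    """Every instruction takes exactly one long cycle."""
    return len(program) * SINGLE_CYCLE_US


def multi_cycle_exec_time(program):
    """Each instruction takes only as many short cycles as it needs."""
    return sum(CYCLES_PER_INSTRUCTION[i] for i in program) * MULTI_CYCLE_US


t_single = single_cycle_exec_time(PROGRAM)
t_multi = multi_cycle_exec_time(PROGRAM)
speedup = t_single / t_multi
print(f"{t_single} us vs {t_multi} us, speedup = {speedup:.2f}")
```

The ratio 15/13 evaluates to about 1.15, so the gain for this short sequence is modest; it grows when the program contains more instructions that need fewer than five cycles.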
High Level View of Multiple Cycle Datapath
[Figure: high-level block diagram of the multiple cycle datapath, including the register file with its register number and data ports.]
High Level View of Multiple Cycle Datapath: Explanation
Here, a shared memory is used, as the instruction fetch and
data read/write are performed in different cycles.
A single ALU is shared among instruction fetch, execution of arithmetic and logic instructions, and address calculation, in
different cycles.
The use of a shared functional unit (ALU) requires additional
multiplexers or wider multiplexers.
New temporary registers (Instruction Register, memory data register,
operand registers A and B, and ALUout) are included to hold
information for use in a later cycle.
E.g., data read from memory in cycle 4 is written in cycle 5 (Load); operand registers A and B, read in cycle 2, may be used in cycle 3 or 4,
and so on.
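Multiple Cycle Datapath Design
[Figure: multiple cycle datapath. Labels recoverable from the diagram: Ideal Memory (WrAdr, Din), register file (Rs, Rt, Rd, Rb, busA, busB, RegWr), multiplexers (Mux 4, Mux 6), and the control signals PCWr and ALUSelA.]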
Multiple Cycle Datapath Architecture
Minimized hardware: 1 memory, 1 adder
Cycle 1 - [Instruction Fetch]:
First, the select input IorD of MUX-1 is set to 0, so the PC is
connected to the memory read address input
RAdr; the instruction is fetched from the memory at
Dout and is placed in the Instruction Register by asserting IRWr [Yellow Path].
Second, the select input ALUSelA of MUX-3 is
set to 0 and ALUSelB of MUX-5 is set
to 00 to add 4 to the PC; then PCSrc of MUX-2 is set so that the PC receives the
address of the next instruction.
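The fetch cycle can be sketched behaviourally as below. This is a minimal model under stated assumptions: the memory is a hypothetical Python dictionary keyed by byte address, `mux2` is an illustrative helper, and the multiplexer inputs not mentioned in the text (such as the second input of MUX-3) are placeholders.

```python
def mux2(select, in0, in1):
    """Two-input multiplexer: select = 0 passes in0, select = 1 passes in1."""
    return in1 if select else in0


def instruction_fetch(pc, alu_out, bus_a, memory):
    """Cycle 1: return (instruction_register, next_pc)."""
    IorD, ALUSelA, ALUSelB = 0, 0, 0b00

    # MUX-1: IorD = 0 connects the PC to the memory read address RAdr.
    r_adr = mux2(IorD, pc, alu_out)
    # IRWr asserted: the word at Dout is latched into the Instruction Register.
    instruction_register = memory[r_adr]

    # MUX-3: ALUSelA = 0 passes the PC to the ALU (bus A is assumed as the other input).
    alu_in_a = mux2(ALUSelA, pc, bus_a)
    # MUX-5: ALUSelB = 00 passes the constant 4.
    alu_in_b = 4 if ALUSelB == 0b00 else None
    next_pc = (alu_in_a + alu_in_b) & 0xFFFFFFFF
    return instruction_register, next_pc


# Hypothetical memory contents, for illustration only.
memory = {0: 0x12345678, 4: 0x000000AB}
ir, next_pc = instruction_fetch(pc=0, alu_out=0, bus_a=0, memory=memory)
print(hex(ir), next_pc)
```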
Multiple Cycle Datapath Architecture
Cycle 2 - [ID and Reg Rd.]
The Rd and Imm16 fields are made available on their respective lines (shown in orange),
and the register operands are available at buses A and B, respectively.
Multiple Cycle Datapath Architecture
Cycle 3 - [Exe]
The select inputs ALUSelA and ALUSelB, of
MUX-3 and MUX-5 respectively, are set for the instruction
in hand; the operation is available at the ALUop input of the ALU
Control Unit.
- For R-type Instructions:
ALUSelA = 1 and ALUSelB = 01 to connect bus
A and bus B to the ALU to perform the operation
[Green Path]
Multiple Cycle Datapath Architecture
- For I-type and Memory Instructions:
ALUSelA = 1 and ALUSelB = 11 to connect bus
A and the sign-extended Imm16 to the ALU to perform the operation on immediate data [Red Path].
The ALU output is kept in the ALUout register, as the result of the ALU operation in the case of an I-type
operation and as the memory address in the case of the
memory instructions Load/Store.
Multiple Cycle Datapath Architecture
- For Branch Instructions:
1: Condition test: ALUSelA = 1 and
ALUSelB = 01; ALUop = SUB.
If the ALU output Zero = 1, then
assert PCWrCond.
2: PC ← PC + 4 + [sign-extended Imm16 shifted left by 2 bits]:
ALUSelA = 0; ALUSelB = 10. Assert BrWr, and set PCSrc of MUX-2 = 1 to pass the target address to the PC [Blue Path].
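The two steps above (condition test by subtraction, then target address calculation) can be sketched as follows. This is an illustrative model assuming 32-bit addresses and a 16-bit immediate; the function names are hypothetical and the control signals appear only as comments.

```python
def sign_extend_16(imm16):
    """Interpret a 16-bit field as a signed two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_next_pc(pc, bus_a, bus_b, imm16):
    """Return the PC value that follows a conditional branch."""
    # Step 1: ALUSelA = 1, ALUSelB = 01, ALUop = SUB; Zero = 1 when A == B.
    zero = ((bus_a - bus_b) & 0xFFFFFFFF) == 0
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    if zero:
        # Step 2: ALUSelA = 0, ALUSelB = 10; PCSrc = 1 passes the target to the PC.
        offset = sign_extend_16(imm16) << 2
        return (pc_plus_4 + offset) & 0xFFFFFFFF
    return pc_plus_4


print(hex(branch_next_pc(0x100, 7, 7, 3)))  # operands equal: branch taken
print(hex(branch_next_pc(0x100, 7, 8, 3)))  # operands differ: fall through
```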
Multiple Cycle Datapath Architecture
Cycle 4 - [Memory Instruction Load/Store]
IorD = 1 to pass the ALUout register as the RAdr (read address) input of the memory, to read data at Dout [Dark Green Path].
[…] output is wired to WrAdr (write address input) and
[…] is wired to Din (data in) [Dark Blue Path] of the
memory.
Multiple Cycle Datapath Architecture
Cycle 5 - [Write Back]
- For R-type Instructions:
RegDest of MUX-4 = 1 to select Rd as the
destination address; MemToReg = 0 to connect
ALUout to Bus-W, and RegWr is asserted.
- For I-type Instructions:
RegDest of MUX-4 = 0 to select Rt as the
destination address; MemToReg = 0 to connect
ALUout to Bus-W, and RegWr is asserted.
Multiple Cycle Datapath Architecture
Cycle 5 - [Write Back]
- For the Load Instruction:
RegDest of MUX-4 = 0 to select Rt as the
destination address; MemToReg = 1 to connect
Dout of the memory to Bus-W of the register file, and RegWr is asserted.
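The three write-back cases can be collected into one small table and exercised as below. The signal values are those given for cycle 5; labelling the first two cases as R-type and I-type is an assumption based on their destination fields (Rd and Rt), and the register file is modelled as a plain Python list.

```python
# Cycle 5 control signals per instruction class (class labels are assumed).
WRITE_BACK_CONTROL = {
    "R-type": {"RegDest": 1, "MemToReg": 0, "RegWr": 1},
    "I-type": {"RegDest": 0, "MemToReg": 0, "RegWr": 1},
    "Load":   {"RegDest": 0, "MemToReg": 1, "RegWr": 1},
}


def write_back(kind, rd, rt, alu_out, mem_dout, registers):
    """Apply the write-back step to a register file modelled as a list."""
    ctl = WRITE_BACK_CONTROL[kind]
    # MUX-4: RegDest = 1 selects Rd, RegDest = 0 selects Rt.
    dest = rd if ctl["RegDest"] == 1 else rt
    # MemToReg = 1 puts memory Dout on Bus-W, otherwise ALUout.
    bus_w = mem_dout if ctl["MemToReg"] == 1 else alu_out
    if ctl["RegWr"]:
        registers[dest] = bus_w
    return registers


regs = [0] * 32
write_back("R-type", rd=3, rt=2, alu_out=99, mem_dout=7, registers=regs)
write_back("Load", rd=3, rt=2, alu_out=99, mem_dout=7, registers=regs)
print(regs[3], regs[2])
```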
Multi Cycle Control Design
Control may be designed starting from one of the following initial representations:
Finite State Machine
- Here, the sequencing control is defined by explicit next-state functions, the logic is represented by logic equations, and PLAs are usually used to
implement the machine.
Micro-program
- Here, a micro-program counter and a dispatch
ROM define the sequencing control, the logic is
represented by a truth table, and the control is
implemented using ROM.
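An explicit next-state function of the kind used in the finite state machine approach might look like the sketch below. The state names and the path taken by each instruction class are assumptions chosen to reproduce the 3, 4 and 5 cycle counts mentioned earlier (the 3-cycle case is assumed to be the branch); this is not the controller specification itself.

```python
def next_state(state, kind):
    """Explicit next-state function for a five-step multi cycle controller."""
    if state == "IFetch":
        return "Decode"
    if state == "Decode":
        return "Exec"
    if state == "Exec":
        if kind in ("Load", "Store"):
            return "Mem"
        if kind in ("R-type", "I-type"):
            return "WrBack"
        return "IFetch"  # Branch finishes after the execute step (assumed)
    if state == "Mem":
        return "WrBack" if kind == "Load" else "IFetch"
    return "IFetch"  # after WrBack, fetch the next instruction


def cycles(kind):
    """Count the cycles spent before the controller returns to IFetch."""
    state, count = "IFetch", 0
    while True:
        count += 1
        state = next_state(state, kind)
        if state == "IFetch":
            return count


for kind in ("Branch", "R-type", "Store", "Load"):
    print(kind, cycles(kind))
```

In hardware, the same function would be written as logic equations over the state register and opcode bits, which is what a PLA implements.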
Multi Cycle Controller FSM Specifications
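Micro-program Controller
[Figure: micro-programmed controller block diagram. Labels recoverable from the diagram: Opcode, State Reg, Address Select Logic, Adder (+1), Inputs, and Outputs to the Datapath.]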
[Figure: main memory holds the user program (e.g., ADD, SUB, AND) plus data; this can change. Each instruction is mapped into a microsequence, such as the AND microsequence; e.g., Fetch, Calc Operand Addr, Fetch Operand(s), Calculate, Save Answer(s).]
Designing a Microinstruction Set
1) Start with a list of control signals
2) Group signals together that make sense (vs. random):
called “fields”
3) Place fields in some logical order
(e.g., ALU operation & ALU operands first and
microinstruction sequencing last)
4) Create a symbolic legend for the microinstruction format, showing the names of field values and how they set the control signals
– Use computers to design computers
5) To minimize the width, encode operations that will never be used at the same time
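Step 5 can be illustrated with a small width calculation. The list of ALU operations below is hypothetical; the point is only that operations which never occur together can share one encoded field instead of each taking its own control bit.

```python
# Hypothetical set of mutually exclusive ALU operations.
ALU_OPS = ["ADD", "SUB", "AND", "OR", "XOR"]


def one_hot_width(n_ops):
    """Unencoded field: one control bit per operation."""
    return n_ops


def encoded_width(n_ops):
    """Encoded field: just enough bits to number the operations."""
    return max(1, (n_ops - 1).bit_length())


def encode(op, ops):
    """Return the binary code of one operation within its field."""
    return format(ops.index(op), f"0{encoded_width(len(ops))}b")


print(one_hot_width(len(ALU_OPS)), "bits unencoded")
print(encoded_width(len(ALU_OPS)), "bits encoded")
print("AND ->", encode("AND", ALU_OPS))
```

The saving in width is paid for with a decoder between the microinstruction field and the individual control lines.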
Specialized state diagrams are easily captured by a microsequencer
– simple increment & “branch” fields
– datapath control fields
Control design reduces to microprogramming
Microprogramming is a fundamental concept
– implement an instruction set by building a very
simple processor and interpreting the instructions
– essential for very complex instructions and when few register transfers are possible
Microprogramming: inspiration for RISC
If simple instructions could execute at a very high clock rate…
If you could even write compilers to produce …
Then why not skip instruction interpretation by a
micro-program and simply compile directly into the lowest language
of the machine? (Microprogramming is overkill when the ISA
matches the datapath 1-1)
Summary
Single cycle versus multi cycle datapath
Key components of the multi cycle datapath
Design and information flow in the multi cycle datapath
Multi cycle control unit design
Finite State Machine-based control unit
Micro-program-based controller
and ALLAH Hafiz