Copyright © 2010-2011 by Daniel D. Gajski, EECS31/CSE31, University of California, Irvine
Finite-state machine
Binary system and data representation
Generalized finite-state machines
Combinational components
Sequential design techniques
Storage components
Register-transfer design
Processor components
Register-transfer-level design
of one or more datapaths and control units
Motivation: Ones-counter
[Figure: FSMD for the ones counter (Ocount = 0 when Start = 1; loop until Data = 0; then Done = 1 and Output = Ocount) beside its datapath: control unit, control signals, 8 × m register file with WA/WE, RAA/REA and RAB/REB ports, selector, and ALU; the controller issues 20-bit control words]
Problem:
Generate controller & control words for given FSMD & Datapath
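Ones Counter from C Code
Function-based C code:
int OnesCounter(int Data) {
    int Ocount = 0;
    int Temp, Mask = 1;
    while (Data > 0) {
        Temp = Data & Mask;       /* isolate the least-significant bit */
        Ocount = Ocount + Temp;   /* add it to the running count */
        Data >>= 1;               /* move on to the next bit */
    }
    return Ocount;
}
RTL-based C code (excerpt, original line numbers kept):
08: Temp = Data & Mask;
09: Ocount = Ocount + Temp;
10: Data >>= 1;
11: }
12: Output = Ocount;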
CDFG for Ones Counter
[Figure: control/data flow graph of the ones counter with &, >>1 and + operations on Data and Ocount, and a Done output]
• Data dependences inside basic blocks (BBs)
08: Temp = Data & Mask;
09: Ocount = Ocount + Temp;
CDFG to FSMD for Ones Counter
[Figure: CDFG of the ones counter (&, >>1, + operations; test Data > 0; outputs Done and Output) beside the FSMD derived from it, whose loop states perform Temp = Data AND Mask and Ocount = Ocount + Temp]
FSMD for Ones Counter
• FSMD is more detailed than CDFG
• States may represent clock cycles
• Conditionals and statements executed
Temp = Data AND Mask
Ocount = Ocount + Temp
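The idea that each FSMD state may stand for one clock cycle can be sketched in C as a state machine that is stepped once per cycle. This is only an illustration: the state names, the one-statement-per-state split, and the helper names `fsmd_step` and `ones_counter_fsmd` are assumptions, since the slide's state diagram is not recoverable. The sketch also uses unsigned data, so the loop test becomes `Data != 0`.

```c
#include <stdbool.h>

/* Hypothetical state names; the original state diagram is not available. */
typedef enum { S_WAIT, S_INIT, S_TEST, S_MASK, S_ADD, S_SHIFT, S_DONE } State;

typedef struct {
    State state;
    unsigned Data, Ocount, Temp, Mask;  /* datapath variables */
    unsigned Output;                    /* datapath output */
    bool Done;                          /* control output */
} OnesCounter;

/* Advance the machine by one clock cycle. */
void fsmd_step(OnesCounter *m, bool Start, unsigned Input)
{
    switch (m->state) {
    case S_WAIT:                        /* idle until Start = 1 */
        if (Start) m->state = S_INIT;
        break;
    case S_INIT:                        /* load input, clear the count */
        m->Data = Input; m->Ocount = 0; m->Mask = 1;
        m->Done = false; m->state = S_TEST;
        break;
    case S_TEST:                        /* loop condition */
        m->state = (m->Data != 0) ? S_MASK : S_DONE;
        break;
    case S_MASK:                        /* Temp = Data AND Mask */
        m->Temp = m->Data & m->Mask; m->state = S_ADD;
        break;
    case S_ADD:                         /* Ocount = Ocount + Temp */
        m->Ocount += m->Temp; m->state = S_SHIFT;
        break;
    case S_SHIFT:                       /* Data >>= 1 */
        m->Data >>= 1; m->state = S_TEST;
        break;
    case S_DONE:                        /* publish the result */
        m->Output = m->Ocount; m->Done = true; m->state = S_WAIT;
        break;
    }
}

/* Assert Start for one cycle, then clock the machine until Done is raised. */
unsigned ones_counter_fsmd(unsigned Input)
{
    OnesCounter m = { .state = S_WAIT, .Done = false };
    fsmd_step(&m, true, Input);
    while (!m.Done)
        fsmd_step(&m, false, Input);
    return m.Output;
}
```

Calling `ones_counter_fsmd(7)` clocks the machine through three loop iterations before it reaches the final state. Merging several statements into one state would shorten the schedule at the cost of more datapath hardware.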
FSMD Definition
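We defined an FSM as a quintuple < S, I, O, f, h >, where S is a set of states, I and O are the sets of input and output symbols, f : S × I → S is the next-state function, and h : S × I → O is the output function. An FSMD extends this model with a set of variables V, which defines the state of the datapath by defining the values of all variables in each state with the set of expressions Expr(V):
$\mathrm{Expr}(V) = \mathrm{Const} \cup V \cup \{\, e_i \mathbin{\#} e_j \mid e_i, e_j \in \mathrm{Expr}(V),\ \# \text{ is an operation} \,\}$
Notes:
1. Status signals are signals in I.
2. Control signals are signals in O.
3. Datapath inputs and outputs are variables in V.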
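The recursive definition of Expr(V) maps naturally onto an expression tree: a node is a constant, a variable from V, or an operation applied to two sub-expressions. The sketch below is a hypothetical encoding, not part of the original slides; the type `Expr`, the function `expr_eval`, and the three supported operations are assumptions chosen to match the ones-counter example.

```c
#include <stddef.h>

typedef enum { EXPR_CONST, EXPR_VAR, EXPR_OP } ExprKind;

typedef struct Expr {
    ExprKind kind;
    unsigned value;               /* EXPR_CONST: the constant */
    size_t var;                   /* EXPR_VAR: index into V */
    char op;                      /* EXPR_OP: '&', '+', or '>' (shift right) */
    const struct Expr *lhs, *rhs; /* EXPR_OP: the operands e_i and e_j */
} Expr;

/* Evaluate an expression against the current values of the variables V. */
unsigned expr_eval(const Expr *e, const unsigned *V)
{
    switch (e->kind) {
    case EXPR_CONST:
        return e->value;
    case EXPR_VAR:
        return V[e->var];
    case EXPR_OP: {
        unsigned a = expr_eval(e->lhs, V);
        unsigned b = expr_eval(e->rhs, V);
        switch (e->op) {
        case '&': return a & b;
        case '+': return a + b;
        case '>': return a >> b;  /* assumes b is below the bit width */
        }
        break;
    }
    }
    return 0; /* unknown operation: treated as 0 in this sketch */
}

/* Example: Ocount + (Data & Mask), with V = { Data, Ocount, Mask }. */
unsigned example_next_ocount(unsigned Data, unsigned Ocount, unsigned Mask)
{
    const unsigned V[3] = { Data, Ocount, Mask };
    const Expr data   = { .kind = EXPR_VAR, .var = 0 };
    const Expr ocount = { .kind = EXPR_VAR, .var = 1 };
    const Expr mask   = { .kind = EXPR_VAR, .var = 2 };
    const Expr temp   = { .kind = EXPR_OP, .op = '&', .lhs = &data, .rhs = &mask };
    const Expr sum    = { .kind = EXPR_OP, .op = '+', .lhs = &ocount, .rhs = &temp };
    return expr_eval(&sum, V);
}
```

In an FSMD, each state would assign expressions like `sum` to variables in V, while the controller's conditions test status signals computed from the same variables.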
RTL Design Model
[Figure: high-level block diagram (control unit and datapath, with control inputs, control outputs, datapath inputs, datapath outputs, control signals and status signals) and register-transfer-level block diagram, in which the control unit consists of a state register built from D flip-flops plus next-state logic, and the datapath components communicate over Bus 1, Bus 2 and Bus 3]
RTL Design Model
[Figure: high-level and register-transfer-level block diagrams (control inputs, control outputs, datapath inputs, datapath outputs, control signals, status signals; Bus 1, Bus 2, Bus 3) in which the control unit is built from a program counter and next-address logic]
C-to-RTL design
controller
datapath
state register (program counter)
output logic (program memory)
next-state logic (next-address generator)
RTL component and connectivity selection,
expression mapping (variable and operation mapping)
scheduling and pipelining
Square Root Approximation: C to CDFG
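[Figure: C code and control/data flow graph of the approximation; inputs a = In 1 and b = In 2, outputs Out and Done]
Example: $\sqrt{a^2 + b^2} \approx \max(0.875x + 0.5y,\ x)$, where $x = \max(|a|, |b|)$ and $y = \min(|a|, |b|)$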
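A minimal C sketch of the approximation follows, assuming integer inputs and that the result is the larger of 0.875x + 0.5y and x. The temporaries use the names t1 to t7 that appear in the deck's variable lists, but which operation each one holds is an assumption here, as is the function name `sra`. The multiplications are replaced by shifts and a subtraction: 0.875x = x - (x >> 3) and 0.5y = y >> 1.

```c
#include <stdlib.h>

/* Approximate sqrt(a*a + b*b) with shifts, adds and comparisons only.
 * Assumes a and b are not INT_MIN (abs would overflow). */
int sra(int a, int b)
{
    int t1 = abs(a);
    int t2 = abs(b);
    int x  = (t1 > t2) ? t1 : t2;   /* x = max(|a|, |b|) */
    int y  = (t1 > t2) ? t2 : t1;   /* y = min(|a|, |b|) */
    int t3 = x >> 3;                /* 0.125x */
    int t4 = y >> 1;                /* 0.5y */
    int t5 = x - t3;                /* 0.875x */
    int t6 = t4 + t5;               /* 0.875x + 0.5y */
    int t7 = (t6 > x) ? t6 : x;     /* max(0.875x + 0.5y, x) */
    return t7;
}
```

For a = 3 and b = 4 the temporaries are x = 4, y = 3, t3 = 0, t4 = 1, t5 = 4, t6 = 5, so the function returns 5, which happens to equal the exact value. Integer shifts truncate, so small inputs lose some accuracy.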
Square Root Approximation: Scheduling
[Figure: scheduled control/data flow graphs of the approximation (inputs a = In 1 and b = In 2, outputs Out and Done); panel labels "Resource-" (truncated in the source) and "ASAP"]
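ASAP (as-soon-as-possible) scheduling places every operation in the earliest control step after all of its predecessors. The sketch below applies it to a dependence graph for the square-root approximation. The graph itself, the operation numbering, and the names `pred`, `asap_schedule` and `asap_length` are assumptions made for illustration, since the slide's figure did not survive.

```c
#define N_OPS 9

/* pred[i][k] is the k-th predecessor of operation i, or -1 if there is none.
 * Operations are listed in topological order (assumed dependence graph). */
static const int pred[N_OPS][2] = {
    { -1, -1 },  /* 0: t1 = |a|          */
    { -1, -1 },  /* 1: t2 = |b|          */
    {  0,  1 },  /* 2: x  = max(t1, t2)  */
    {  0,  1 },  /* 3: y  = min(t1, t2)  */
    {  2, -1 },  /* 4: t3 = x >> 3       */
    {  3, -1 },  /* 5: t4 = y >> 1       */
    {  2,  4 },  /* 6: t5 = x - t3       */
    {  5,  6 },  /* 7: t6 = t4 + t5      */
    {  7,  2 },  /* 8: t7 = max(t6, x)   */
};

/* step[i] receives the control step (1-based) of operation i. */
void asap_schedule(int step[N_OPS])
{
    for (int i = 0; i < N_OPS; i++) {
        step[i] = 1;                        /* no predecessors: first step */
        for (int k = 0; k < 2; k++) {
            int p = pred[i][k];
            if (p >= 0 && step[p] + 1 > step[i])
                step[i] = step[p] + 1;      /* one step after the latest predecessor */
        }
    }
}

/* Number of control steps used by the ASAP schedule. */
int asap_length(void)
{
    int step[N_OPS], len = 0;
    asap_schedule(step);
    for (int i = 0; i < N_OPS; i++)
        if (step[i] > len)
            len = step[i];
    return len;
}
```

ASAP ignores how many functional units exist, so two operations of the same kind may land in the same step. A resource-constrained schedule would delay some of them when fewer units are available.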
Square Root Approximation: CDFG to FSMD
[Figure: control/data flow graph of the approximation (outputs Out and Done) beside the FSMD obtained from its ASAP schedule]
Square Root Approximation: FSMD Design
[Figure: FSMD beginning in state s0 (a = In 1, b = In 2; stays in s0 while Start = 0 and advances when Start = 1) and a block diagram of the control unit and datapath with ports Start, Done, In 1, In 2 and Out]
• Storage allocation and sharing
• Functional unit allocation and sharing
• Bus allocation and sharing
Resource usage in SRA
[Figure: square-root approximation FSMD with a variable usage table (number of live variables per state: 1 2 3 3 2 2 2) and an operation usage table whose rows include max and min]
Resource usage in SRA
[Figure: square-root approximation FSMD with an operation usage table (rows include max and min; columns give the number of operations per state and the maximum number of units); variables: a, b, t1, t2, x, y, t3, t4, t5, t6, t7]
Register sharing (Variable merging)
the design
Merging variables with common sources
Register sharing (Variable merging)
[Figure: variable usage table; number of live variables per state: 1 2 3 3 2 2 2]
Register sharing (Variable merging)
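Variables whose lifetimes do not overlap can share one register. One common way to find such a grouping is the left-edge algorithm, shown here as an illustrative stand-in; the deck's own merging procedure is not recoverable from the extracted text and may differ. The lifetimes below are assumptions derived from a one-operation-chain-per-state schedule of the square-root approximation, and the names `Lifetime`, `vars` and `left_edge` are hypothetical.

```c
#define N_VARS 11

typedef struct {
    const char *name;
    int def;       /* state in which the variable is written */
    int last_use;  /* last state in which it is read */
} Lifetime;

/* Assumed lifetimes for the SRA variables, sorted by definition state. */
const Lifetime vars[N_VARS] = {
    { "a",  0, 1 }, { "b",  0, 1 }, { "t1", 1, 2 }, { "t2", 1, 2 },
    { "x",  2, 6 }, { "y",  2, 3 }, { "t3", 3, 4 }, { "t4", 3, 5 },
    { "t5", 4, 5 }, { "t6", 5, 6 }, { "t7", 6, 7 },
};

/* Left-edge allocation: fill one register at a time with the earliest
 * variables that fit. A variable may reuse a register in the same state
 * in which the previous occupant is read for the last time, because the
 * new value is stored at the end of that state.
 * v must be sorted by def; reg[i] receives the register index of v[i].
 * Returns the number of registers used. */
int left_edge(const Lifetime *v, int n, int *reg)
{
    int assigned = 0, nregs = 0;
    for (int i = 0; i < n; i++)
        reg[i] = -1;
    while (assigned < n) {
        int free_at = 0;  /* earliest state in which this register can be rewritten */
        for (int i = 0; i < n; i++) {
            if (reg[i] < 0 && v[i].def >= free_at) {
                reg[i] = nregs;
                free_at = v[i].last_use;
                assigned++;
            }
        }
        nregs++;
    }
    return nregs;
}
```

With these assumed lifetimes the allocation uses three registers holding [a, t1, x, t7], [b, t2, y, t3, t5, t6] and [t4], which agrees with the R1, R2, R3 grouping listed for the SRA datapath later in the deck.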
FU sharing (Operator merging)
Operator-merging for SRA
[ abs/min/+/- ]
Datapath after variable and operator merging
Bus sharing (Connection merging)
Connection merging in SRA datapath
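[Figure: SRA datapath with connections labeled B, C, D, E, F, G, J, K, L and M, and connection usage tables in which X marks the states that use each connection]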
Connection merging in SRA datapath
[Figure: connection usage tables (X marks) for connections including C, D and J, continued]
Register merging into Register files
therefore number of buses
Datapath after register merging
Register access table
Square-root approximation
Chaining and multi-cycling
operations in each state
performance
two or more clock cycles
resource utilization
SRA datapath with chained units
Square-root approximation
R1 = [a, t1, x, t7]
R2 = [b, t2, y, t3, t5, t6]
R3 = [t4]
SRA datapath with multi-cycle units
Square-root approximation
R1 = [a, t1, x, t7]
R2 = [b, t2, y, t3, t5, t6]
R3 = [t4]
Pipelining
additional cost
concurrently for different data (assembly line principle)
(a) Unit pipelining (b) Control pipelining (c) Datapath pipelining
SRA datapath with single AU
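[Figure: FSMD beginning in state s0 (a = In 1, b = In 2; stays in s0 while Start = 0 and advances when Start = 1) and timing diagrams for a datapath with a single arithmetic unit (AU) supporting min, max, + and -; rows include Read R2, AU stage 2 and Write R2, with values such as |a|, |b|, t1, x and t7]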
Pipelined FSMD implementation
[Figure: standard FSMD implementation (control unit with state register, next-state logic and output logic; datapath with register file RF, selector, ALU, Bus 1 and Bus 2; control signals and status signals between them) and timing diagrams for the pipelined version, with rows Read CReg, Read RF, Write ALUIn and Read Status; the SRA arithmetic unit includes >>1 and >>3 shifters and min, max, + and - operations]
Summary