Computer Architecture, Vo Tan Phuong, Chapter 4 exercises (sinhvienzone.com)


TP.HCM

2013

dce

COMPUTER ARCHITECTURE CE2013

Faculty of Computer Science and Engineering

Department of Computer Engineering

Vo Tan Phuong

http://www.cse.hcmut.edu.vn/~vtphuong


Chapter 4

Single-cycle & Pipeline Processor


Single-Cycle Processor Overview

[Figure: single-cycle datapath. Blocks: Next PC (30-bit PC, +1, Jump or Branch Target Address, PCSrc), Instruction Memory (Address, Instruction), Registers (RA, RB, RW, BusA, BusB, BusW), extender (Imm16), ALU (ALU result, zero), Data Memory (Address, Data_in, Data_out), multiplexers, and clk. Instruction fields: Rs, Rt, Rd, Imm16, Imm26. Main Control outputs: RegDst, RegWrite, ExtOp, ALUSrc, J, Beq, Bne, MemRead, MemWrite, MemtoReg, ALUop. ALU Ctrl inputs: ALUop, func.]


Exercise 1

Fill in the values of the control signals for the following instructions:

a. slt $t0, $s0, $zero

b. bne $t0, $zero, exit_label

Instruction | RegDst | RegWrite | ExtOp | ALUSrc | Beq | Bne | J | MemRead | MemWrite | MemtoReg
a           |        |          |       |        |     |     |   |         |          |
b           |        |          |       |        |     |     |   |         |          |
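One way to check an answer to Exercise 1 is to write the control word for each instruction as a lookup table. The sketch below is not taken from the slides. It assumes the usual mux encodings for this kind of datapath (RegDst = 1 selects Rd, ALUSrc = 0 selects BusB, MemtoReg = 0 selects the ALU result) and uses "x" for a don't care. The name `control_signals` is hypothetical.

```python
# Assumed encodings (not stated in the slides):
#   RegDst = 1 selects Rd, ALUSrc = 0 selects BusB, MemtoReg = 0 selects the ALU result.
# "x" marks a don't-care value.
SIGNALS = ["RegDst", "RegWrite", "ExtOp", "ALUSrc", "Beq",
           "Bne", "J", "MemRead", "MemWrite", "MemtoReg"]

CONTROL_TABLE = {
    # slt is R-type: writes Rd from the ALU result, no memory access, no branch.
    "slt": [1, 1, "x", 0, 0, 0, 0, 0, 0, 0],
    # bne compares two registers: no register write, no memory access, Bne asserted.
    "bne": ["x", 0, "x", 0, 0, 1, 0, 0, 0, "x"],
}

def control_signals(op):
    """Return a {signal name: value} dict for the given mnemonic."""
    return dict(zip(SIGNALS, CONTROL_TABLE[op]))

if __name__ == "__main__":
    for op in ("slt", "bne"):
        print(op, control_signals(op))
```

If the muxes in the figure use the opposite polarity, only the RegDst, ALUSrc, and MemtoReg entries change; the write, memory, and branch signals do not depend on that choice.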


Exercise 2

• We wish to add the instruction jalr (jump and link register) to the single-cycle datapath. Add any necessary datapath and control signals and draw the resulting datapath. Show the values of the control signals that control the execution of the jalr instruction.

• The jump and link register instruction is described below:


Exercise 2

• One solution:

(Comment: JReg means Jump Register; RA means Return Address)


Exercise 2

• The main control signals for the JALR instruction are the same as for other R-type instructions, such as ADD and SUB. These control signals are shown in the table below:

• The ALU Control signals for the JALR instruction are shown below: JReg = 1 and RA = 1. ALUCtrl is a don't care.
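The effect of the two new signals can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the datapath drawn in the slides: it assumes standard MIPS jalr semantics (the destination register receives the address of the next instruction and the PC receives the value of Rs), a 30-bit word-addressed PC as in the overview figure, and it ignores branches and ordinary jumps. The function names `next_pc` and `write_back_data` are hypothetical.

```python
MASK30 = (1 << 30) - 1  # the PC in this datapath is a 30-bit word address

def next_pc(pc, bus_a, jreg):
    """Select the next 30-bit PC (branches and plain jumps are ignored here).

    When JReg = 1 the PC is loaded from the register value on BusA;
    otherwise it advances to the next word.
    """
    if jreg:
        return (bus_a >> 2) & MASK30  # drop the two byte-offset bits
    return (pc + 1) & MASK30

def write_back_data(pc, alu_result, ra):
    """Select the value written to the register file on BusW.

    When RA = 1 the return address (byte address of the next instruction)
    is written; otherwise the ALU result is written.
    """
    if ra:
        return ((pc + 1) & MASK30) << 2
    return alu_result

if __name__ == "__main__":
    pc = 0x10  # word address of the jalr instruction
    print(hex(next_pc(pc, bus_a=0x400, jreg=1)))            # jump target (word address)
    print(hex(write_back_data(pc, alu_result=0, ra=1)))      # return address (byte address)
```

With JReg = 1 and RA = 1 the ALU result is never used, which is consistent with ALUCtrl being a don't care for jalr.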


Exercise 3

We want to compare the performance of a single-cycle CPU design with a multi-cycle CPU. Suppose we add the multiply and divide instructions. The operation times are as follows:

o Instruction memory access time = 190 ps, Data memory access time = 190 ps

o Register file read access time = 150 ps, Register file write access time = 150 ps

o ALU delay for basic instructions = 190 ps, ALU delay for multiply or divide = 550 ps

Ignore the other delays in the multiplexers, control unit, sign-extension, etc.

Assume the following instruction mix: 30% ALU, 15% multiply & divide, 15% load, 15% store, 15% branch, and 10% jump.

a. What is the total delay for each instruction class and the clock cycle for the single-cycle CPU design?

b. Assume we fix the clock cycle to 200 ps for a multi-cycle CPU. What is the CPI for each instruction class and the speedup over a fixed-length clock cycle?


Exercise 3

a. Total delay for each instruction class:

Clock cycle = max delay = 1040 ps
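The per-class delays behind this maximum can be reproduced by summing the units each instruction class passes through. The sketch below uses the delays given in the exercise; the list of units per class is an assumption based on the usual single-cycle datapath, and counting the register read (decode) time for jump is also an assumption, chosen to match the 2-cycle jump in part b.

```python
# Unit delays in picoseconds, taken from the exercise statement.
DELAY_PS = {
    "imem": 190,       # instruction memory access
    "reg_read": 150,   # register file read
    "alu": 190,        # ALU, basic instructions
    "muldiv": 550,     # ALU, multiply or divide
    "dmem": 190,       # data memory access
    "reg_write": 150,  # register file write
}

# Assumed path of each instruction class through the datapath.
PATHS = {
    "ALU": ["imem", "reg_read", "alu", "reg_write"],
    "Multiply & divide": ["imem", "reg_read", "muldiv", "reg_write"],
    "Load": ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "Store": ["imem", "reg_read", "alu", "dmem"],
    "Branch": ["imem", "reg_read", "alu"],
    "Jump": ["imem", "reg_read"],  # assumption: fetch plus decode
}

total_delay = {cls: sum(DELAY_PS[u] for u in units) for cls, units in PATHS.items()}
clock_cycle = max(total_delay.values())  # single-cycle clock must fit the slowest class

if __name__ == "__main__":
    for cls, delay in total_delay.items():
        print(f"{cls}: {delay} ps")
    print(f"Clock cycle = {clock_cycle} ps")
```

The multiply & divide class sets the clock: 190 + 150 + 550 + 150 = 1040 ps.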


Exercise 3

b. CPI for each instruction class:

CPI for Basic ALU = 4 cycles

CPI for Multiply & Divide = 6 cycles (ALU takes 3 cycles)

CPI for Load = 5 cycles

CPI for Store = 4 cycles

CPI for Branch = 3 cycles

CPI for Jump = 2 cycles

Average CPI = 0.3 * 4 + 0.15 * 6 + 0.15 * 5 + 0.15 * 4 + 0.15 * 3 + 0.1 * 2 = 4.1

Speedup of multi-cycle over single-cycle = (1040 * 1) / (200 * 4.1) = 1.27
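The cycle counts and the speedup can be checked with a short calculation: each step takes as many 200 ps cycles as are needed to cover its delay. As a sketch, the per-class step lists below are assumptions that follow the usual multi-cycle breakdown (fetch, decode/register read, execute, memory, write back); the delays, the instruction mix, and the 1040 ps single-cycle clock come from the exercise.

```python
import math

CLOCK_PS = 200          # multi-cycle clock period from part b
SINGLE_CYCLE_PS = 1040  # single-cycle clock period from part a

# Delay in ps of each step an instruction class goes through (assumed breakdown).
STEPS_PS = {
    "ALU": [190, 150, 190, 150],
    "Multiply & divide": [190, 150, 550, 150],
    "Load": [190, 150, 190, 190, 150],
    "Store": [190, 150, 190, 190],
    "Branch": [190, 150, 190],
    "Jump": [190, 150],
}

# Instruction mix from the exercise, as fractions.
MIX = {
    "ALU": 0.30, "Multiply & divide": 0.15, "Load": 0.15,
    "Store": 0.15, "Branch": 0.15, "Jump": 0.10,
}

def cycles(delay_ps):
    """Number of whole clock cycles needed to cover one step."""
    return math.ceil(delay_ps / CLOCK_PS)

cpi = {cls: sum(cycles(d) for d in steps) for cls, steps in STEPS_PS.items()}
average_cpi = sum(MIX[cls] * cpi[cls] for cls in cpi)
speedup = (SINGLE_CYCLE_PS * 1) / (CLOCK_PS * average_cpi)

if __name__ == "__main__":
    print(cpi)
    print(f"Average CPI = {average_cpi:.2f}, speedup = {speedup:.2f}")
```

The 550 ps multiply or divide step is the only one that needs more than one cycle, which is why that class takes 6 cycles rather than 4.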


Exercise 4

• Identify all the RAW data dependencies in the following code. Which dependencies are data hazards that will be resolved by forwarding? Which dependencies are data hazards that will cause a stall? Using a graphical representation of the pipeline, show the forwarding paths and stalled cycles, if any.

add $3, $4, $2

sub $5, $3, $1

lw $6, 200($3)

add $7, $3, $6


Exercise 4

• RAW dependencies:

add $3, $4, $2 and sub $5, $3, $1 (forwarding)

add $3, $4, $2 and lw $6, 200($3) (forwarding)

lw $6, 200($3) and add $7, $3, $6 (stall 1, forward)

add $3, $4, $2 and add $7, $3, $6 (from register)
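The list above can be reproduced mechanically by tracking the most recent writer of each register. The sketch below assumes the classic 5-stage pipeline with full forwarding and a register file that is written in the first half of a cycle and read in the second half. Under that assumption a consumer directly after a load needs one stall, consumers one or two instructions after the producer are served by forwarding, and anything further away reads the register file. The instruction tuples and function names are hypothetical.

```python
# Each instruction: (text, destination register, source registers, is_load)
PROGRAM = [
    ("add $3, $4, $2", "$3", ["$4", "$2"], False),
    ("sub $5, $3, $1", "$5", ["$3", "$1"], False),
    ("lw $6, 200($3)", "$6", ["$3"], True),
    ("add $7, $3, $6", "$7", ["$3", "$6"], False),
]

def classify(distance, producer_is_load):
    """Classify a RAW dependency by the distance (in instructions) to its producer."""
    if distance == 1:
        # A load result is only available after MEM, one cycle too late.
        return "stall 1, forward" if producer_is_load else "forwarding"
    if distance == 2:
        return "forwarding"
    return "from register"

def raw_dependencies(program):
    """Return (producer, consumer, register, resolution) for every RAW dependency."""
    last_writer = {}  # register name -> index of the most recent instruction writing it
    deps = []
    for i, (text, dest, sources, _is_load) in enumerate(program):
        for reg in sources:
            if reg in last_writer:
                j = last_writer[reg]
                deps.append((program[j][0], text, reg, classify(i - j, program[j][3])))
        last_writer[dest] = i
    return deps

if __name__ == "__main__":
    for producer, consumer, reg, how in raw_dependencies(PROGRAM):
        print(f"{producer} and {consumer} on {reg} ({how})")
```

The classification uses instruction distance only, so it does not model how a stall shifts later instructions; for this four-instruction sequence that does not change any result.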


Exercise 5

• We have a program of 10^6 instructions in the format of "lw, add, lw, add, …". The add instruction depends only on the lw instruction right before it. The lw instruction also depends only on the add instruction right before it. If this program is executed on the 5-stage MIPS pipeline:

a. Without forwarding, what would be the actual CPI?

It takes 6 cycles on average to complete one LW and one ADD:

1 cycle (to complete LW) + 2 cycles (bubbles) + 1 cycle (to complete ADD) + 2 cycles (bubbles) = 6 cycles

So, it takes 6 cycles to complete 2 instructions.

Average CPI = 6/2 = 3

b. With forwarding, what would be the actual CPI?

It takes only 3 cycles on average to complete one LW and one ADD:

1 cycle (to complete LW) + 1 cycle (bubble) + 1 cycle (to complete ADD) = 3 cycles

So, it takes 3 cycles to complete 2 instructions.

Average CPI = 3/2 = 1.5
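The same arithmetic can be written as a small function of the bubble counts. This is a sketch of the steady-state calculation only: it ignores the few cycles needed to fill the pipeline, which are negligible for 10^6 instructions. The name `average_cpi` is hypothetical.

```python
def average_cpi(bubbles_after_lw, bubbles_after_add):
    """Steady-state CPI for a repeating (lw, add) pair.

    Each instruction contributes 1 cycle, plus the bubbles inserted
    before the instruction that depends on it can proceed.
    """
    cycles_per_pair = 1 + bubbles_after_lw + 1 + bubbles_after_add
    return cycles_per_pair / 2  # two instructions per pair

if __name__ == "__main__":
    # Without forwarding: 2 bubbles after lw and 2 bubbles after add.
    print("without forwarding:", average_cpi(2, 2))
    # With forwarding: 1 bubble after lw (load-use), none after add.
    print("with forwarding:", average_cpi(1, 0))
```

Forwarding removes the bubbles after add completely, but one bubble after lw remains because the loaded value is not available until the end of the MEM stage.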
