Advanced Computer Architecture - Lecture 32: Memory hierarchy design

DOCUMENT INFORMATION

Basic information

Title: Memory hierarchy design
Instructor: Prof. Dr. M. Ashraf Chughtai
School: MAC/VU
Subject: Advanced computer architecture
Type: Lecture
Pages: 56
File size: 1.58 MB

Content

Advanced Computer Architecture - Lecture 32: Memory hierarchy design. This lecture covers the following: main memory performance; virtual memory performance; main memory as the destination for virtual memory; DRAM logical organization; double data rate DRAM; optimizing sequential access; avoiding handshaking; multiprocessors' demand for higher bandwidth;...

Page 1

CS 704

Advanced Computer Architecture

Lecture 32

Memory Hierarchy Design

(Main and Virtual Memories)

Prof. Dr. M. Ashraf Chughtai

Page 2

Today’s Topics

Recap: Memory Hierarchy and Cache performance

Main Memory Performance

Virtual Memory Performance

Summary

MAC/VU-Advanced Computer Architecture Lec 32 Memory Hierarchy Design (8) 2

Page 3

Design goal of the memory system

Cost as low as that of the cheapest memory; speed as fast as that of the fastest memory

Recap: Memory Hierarchy

Page 4

The fastest, smallest and most costly memories

The slowest, biggest and cheapest memories

Recap: Memory Hierarchy

Page 5

– Average access speed

– Cost

– Cheapest technology

Semiconductor memories

Static and Dynamic RAMs

Upper levels in the memory hierarchy

Recap: Memory Hierarchy

Page 6

Caches use Static Random Access Memory (SRAM)

Page 7

Cache and main memory are organized in equal sized blocks

Word transfer

Block transfer

The CPU requests the contents of main memory

Word transfer is fast

Recap: Cache Design

Page 9

Organizations of main memory

Source for Caches

Destination for virtual memory

Main Memory Organization

Page 10

DRAM logical organization (4 Mbit)

Figure labels: Column Decoder; Sense Amps & I/O; Memory Array

Page 11

Main Memory Performance

Page 12

Main Memory Performance

Fast page mode

Optimizes sequential access

Synchronous DRAM (SDRAM)

Avoid handshaking

Double Data Rate (DDR) DRAM

Transmit data

Page 14

Inputs/outputs and multiprocessors

Low-latency memory

Multiprocessors demand higher bandwidth

2nd-level caches with larger block size

Main Memory Performance

Page 15

The most commonly used techniques are

Improving Main Memory Performance

Page 16

1: Wider Main Memory

Page 17

1: Wider Main Memory

Main Memory

L1 cache

Wider L2 Cache

Page 18

1: Wider Main Memory: Example

4-word (i.e., 32-byte) block

Time to send address = 4 clock cycles

Time to send the data word = 4 clock cycles

Access time per word = 56 clock cycles

Miss penalty = number of words x (time to send address + time to send data word + time to access word)

Page 19

1: Wider Main Memory

1: For the 1-word organization

Miss penalty = 4 x (4 + 4 + 56) = 4 x 64 = 256 clock cycles

Memory bandwidth = bytes per clock cycle = 32/256 = 1/8 byte/cycle

2: For the 4-word organization

Miss penalty = 1 x (4 + 4 + 56) = 64 clock cycles

Memory bandwidth = 32/64 = 1/2 byte/cycle
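
The arithmetic of this example can be checked with a short script. This is only a sketch under the slide's timing model (4 cycles to send the address, 4 cycles to send a data word, 56 cycles of access time, a 4-word block of 32 bytes); the function and parameter names are illustrative and do not come from the lecture.

```python
def miss_penalty(block_words, width_words, t_addr=4, t_data=4, t_access=56):
    """Clock cycles to fetch one cache block over a memory `width_words` words wide."""
    # Number of separate memory transfers needed (ceiling division).
    transfers = -(-block_words // width_words)
    # Each transfer pays for address, data transfer and access time.
    return transfers * (t_addr + t_data + t_access)


def bandwidth(block_bytes, penalty_cycles):
    """Bytes delivered per clock cycle during a miss."""
    return block_bytes / penalty_cycles


BLOCK_WORDS = 4   # words per cache block (from the example)
BLOCK_BYTES = 32  # bytes per cache block (from the example)

narrow = miss_penalty(BLOCK_WORDS, width_words=1)
wide = miss_penalty(BLOCK_WORDS, width_words=4)

print(narrow, bandwidth(BLOCK_BYTES, narrow))  # 256 0.125
print(wide, bandwidth(BLOCK_BYTES, wide))      # 64 0.5
```

Widening the memory from one word to four cuts the number of transfers from four to one, which is where the fourfold bandwidth gain comes from.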

Page 20

1: Wider Main Memory: Demerits

Main Memory

L1 cache

Wider L2 Cache

Page 21

2: Interleaved Memory

Page 22

2: Interleaved Memory

Page 23

2: Interleaved Memory

– bank 0 has all words whose address MOD 4 = 0
– bank 1 has all words whose address MOD 4 = 1
– bank 2 has all words whose address MOD 4 = 2
– bank 3 has all words whose address MOD 4 = 3

Word addresses held in each bank:

Bank 0: 0, 4, 8
Bank 1: 1, 5, 9
Bank 2: 2, 6
Bank 3: 3, 7
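
The word-to-bank mapping can be sketched in a few lines. This assumes the four-bank, word-interleaved layout described on this slide; the helper names `bank_of` and `row_in_bank` are hypothetical.

```python
NUM_BANKS = 4  # four-way interleaving, as on the slide


def bank_of(word_address):
    """Bank that holds a given word: address MOD number of banks."""
    return word_address % NUM_BANKS


def row_in_bank(word_address):
    """Position of the word inside its bank (illustrative helper)."""
    return word_address // NUM_BANKS


# Distribute word addresses 0..9 across the banks.
banks = {b: [] for b in range(NUM_BANKS)}
for addr in range(10):
    banks[bank_of(addr)].append(addr)

for b, addrs in banks.items():
    print(f"bank {b}: {addrs}")
# bank 0: [0, 4, 8]
# bank 1: [1, 5, 9]
# bank 2: [2, 6]
# bank 3: [3, 7]
```

Consecutive word addresses land in consecutive banks, so a sequential block read touches every bank once.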

Page 24

Bandwidth Calculation:

Bandwidth of a 4-word interleaved memory, using the same timing model as for the wider memory

The miss penalty for the 4-word interleaved memory is:

= time to send address + time to access + number of banks x time to send data

= 4 + 56 + 4 x 4 = 76 clock cycles

Bandwidth = 32/76 ≈ 0.42 bytes per clock cycle

For comparison, the one-word-wide organization gives: Bandwidth = 32/256 = 1/8 = 0.125 bytes per clock cycle
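
The interleaved-memory figures can be reproduced as follows. This is a sketch under the slide's assumptions: the address is sent once, all banks are accessed in parallel, and the four words are then sent back one per 4-cycle transfer. The function name is illustrative.

```python
def interleaved_miss_penalty(num_banks, t_addr=4, t_access=56, t_data=4):
    """Miss penalty when all banks are accessed in parallel and the
    words are sent back one after another over a one-word bus."""
    return t_addr + t_access + num_banks * t_data


BLOCK_BYTES = 32  # 4 words per block, as in the example

penalty = interleaved_miss_penalty(num_banks=4)
bw = BLOCK_BYTES / penalty

print(penalty)       # 76
print(round(bw, 3))  # 0.421
```

The access time is paid only once because the banks overlap their accesses; only the data transfers remain serialized.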

Page 25

3: Independent Memory Banks

Memory banks offer independent accesses

Multiprocessors

I/O

CPU with Hit under n Misses

Non-blocking Caches

Page 26

3: Independent Memory Banks

Superbank

Page 27

3: Independent Memory Banks

– An input device may use one controller and one bank

– The cache read may use another and

– The cache write still another

Page 28

– Using memory banks

– Making memory and its bus wider

– Doing both

How many banks should there be?

Page 29

Summary: Main Memory Bandwidth Enhancement

This decision is essential to ensure that, if memory is being accessed sequentially (e.g., when processing an array), then by the time you try to read a second word from a bank, the first access has finished.

Otherwise the access sequence returns to the original bank before that bank has the next word ready.

Page 30

Summary: Main Memory Bandwidth Enhancement

8 banks, each 64 bits wide

Access time of 10 clock cycles

– Clock cycle 1

– Bank 0 after 10 clock cycles

– After 10 clock cycles,

– The bank 0 would fetch the next desired word

– 7 banks sequentially till the 18th clock cycle

Page 31

Summary: Main Memory Bandwidth Enhancement

– 18th clock cycle

– Bank 0

– CPU cannot start fetching

– Clock cycle 20

– 10 clock cycles again

Number of banks ≥ number of clock cycles to access a word in a bank
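
The rule can be illustrated with a small simulation. This is a deliberately simplified model and not the lecture's exact timing: it assumes the CPU tries to issue one sequential word request per clock, requests rotate round-robin over the banks, and a bank stays busy for the full access time. Clock cycles are counted from 0 here, so the absolute cycle numbers differ from those on the slides.

```python
def stall_cycles(num_banks, access_time, num_words):
    """Cycles the CPU waits because the target bank is still busy."""
    free_at = [0] * num_banks  # clock at which each bank can accept a request
    clock = 0
    stalls = 0
    for word in range(num_words):
        bank = word % num_banks
        if free_at[bank] > clock:
            # Bank has not finished its previous access: wait for it.
            stalls += free_at[bank] - clock
            clock = free_at[bank]
        free_at[bank] = clock + access_time
        clock += 1  # next request can be issued one clock later
    return stalls


print(stall_cycles(num_banks=8, access_time=10, num_words=16))   # 2
print(stall_cycles(num_banks=10, access_time=10, num_words=40))  # 0
```

With 8 banks and a 10-cycle access time the CPU comes back to bank 0 two cycles too early; with 10 banks the first access has always completed by the time the bank is needed again.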

Page 33

Virtual Memory System

Increasing gap

High cost of main memory

Physical DRAM as a cache for the disk

Single level store

Page 34

Virtual Memory system … Cont’d

Single level storage

Virtual Memory System

Manages two levels of the memory hierarchy

Main memory and secondary storage

Segments, referred to as pages

Page 35

Virtual Memory system … Cont’d

Page

Block

Contiguous pages

Page 36

Virtual Memory System … Cont’d

[Figure: a virtual memory address space holding pages A, B, C and D, mapped onto physical main memory and disk; virtual and physical addresses are marked in 4k steps]

Page 37

Virtual Memory: Attributes

– Protection

– Relocation

Page 38

Virtual Memory: Attributes … Cont’d

Page 39

Page Tables

Process i:

Physical Addr Read? Write?
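
A per-process page table with read and write permission bits can be sketched as below. The field names, the `ProtectionFault` exception and the table contents are all hypothetical; the slide only shows the column headings.

```python
from dataclasses import dataclass


@dataclass
class PageTableEntry:
    physical_page: int  # physical page the virtual page maps to
    can_read: bool      # "Read?" column
    can_write: bool     # "Write?" column


class ProtectionFault(Exception):
    """Raised when a process attempts an access its page table forbids."""


def access(page_table, virtual_page, is_write):
    """Return the physical page if the access is permitted."""
    entry = page_table[virtual_page]
    allowed = entry.can_write if is_write else entry.can_read
    if not allowed:
        kind = "write" if is_write else "read"
        raise ProtectionFault(f"{kind} to virtual page {virtual_page} denied")
    return entry.physical_page


# Hypothetical page table for "process i".
process_i = {
    0: PageTableEntry(physical_page=9, can_read=True, can_write=False),
    1: PageTableEntry(physical_page=4, can_read=True, can_write=True),
}

print(access(process_i, 0, is_write=False))  # 9
print(access(process_i, 1, is_write=True))   # 4
```

Because every process has its own table, the same check both translates the address and enforces protection between processes.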

Page 40

Virtual Memory: Attributes … Cont’d

Relocation

– Simplifies loading of program

– Allows a program to be placed anywhere

Page 41

Cache versus Virtual Memory

Page or segment is used for block

Page fault or address fault is used for miss

CPU produces virtual address

The virtual addresses are translated to main memory (physical) addresses

Page 42

Cache versus Virtual Memory

Page 43

Cache versus Virtual Memory

Virtual Address

Page Table

Main Memory Physical Address

Page 44

Cache versus Virtual Memory

Replacement on cache miss

Page fault

The size of the processor address

Cache size is independent of the processor address size

Page 45

Cache versus Virtual Memory

Secondary storage

Lower-level backing store for main memory

File system occupies the space on secondary storage

Page 46

Issues of Virtual Memory Design

Page 47

Issues of Virtual Memory Design

performance

size

Page 48

Typical System with Virtual Memory

Page 49

Typical System with Virtual Memory

The CPU generates the Virtual Address

Operating system manages a lookup table

Location of the page or segment

Virtual addresses to physical addresses
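
The lookup described on this slide can be sketched as follows. The 4 KB page size, the dictionary-based table, and the names `translate` and `PageFault` are assumptions made for illustration; a real system does this translation in hardware with OS-managed tables.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages


class PageFault(Exception):
    """Raised when the requested page is not in main memory."""


def translate(virtual_address, page_table):
    """Map a virtual address to a physical address via the lookup table."""
    # Split the address into a virtual page number and an offset in the page.
    virtual_page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table.get(virtual_page)
    if frame is None:
        raise PageFault(virtual_page)
    # The offset is unchanged; only the page number is replaced.
    return frame * PAGE_SIZE + offset


# Hypothetical table: virtual page number -> physical page frame.
page_table = {0: 2, 1: 0, 3: 1}

print(hex(translate(0x1234, page_table)))  # 0x234
print(hex(translate(0x0010, page_table)))  # 0x2010
```

Virtual page 2 has no entry in this table, so `translate(0x2000, page_table)` raises `PageFault`, the virtual-memory counterpart of a cache miss.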

Page 50

Page Faults (like “Cache Misses”)

Current process suspends

OS has full control over placement

Page 51

Page Faults (like “Cache Misses”)

CPU

Memory Page Table

Page 52

Servicing a Page Fault: 3 steps

Figure labels: Disk; I/O controller

Page 53

Servicing a Page Fault: 3 steps

Figure labels: I/O controller; Reg

(1) Initiate Block Read

(2) DMA Transfer

Page 54

Servicing a Page Fault: 3 Steps

Figure labels: Disk; I/O controller; Reg

(1) Initiate Block Read

(2) DMA Transfer

(3) Read Done
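
The three steps can be walked through in a toy model. Everything here is an illustrative assumption: disk and memory are plain dictionaries, a free frame is always available, and updating the page table after the read completes is the usual operating-system behaviour rather than something stated on the slide.

```python
def service_page_fault(virtual_page, disk, memory, page_table, free_frames):
    """Bring one page from disk into memory and record the steps taken."""
    steps = []

    # (1) Initiate block read: the processor tells the I/O controller
    #     which disk block to read and which memory frame to fill.
    frame = free_frames.pop()
    steps.append("(1) initiate block read")

    # (2) DMA transfer: the controller copies the block into memory
    #     without involving the processor.
    memory[frame] = disk[virtual_page]
    steps.append("(2) DMA transfer")

    # (3) Read done: the controller signals completion; the page table
    #     is updated (assumed here) so the suspended process can resume.
    page_table[virtual_page] = frame
    steps.append("(3) read done")
    return steps


# Hypothetical state: page 7 lives on disk, frame 3 is free.
disk = {7: b"page seven"}
memory = {}
page_table = {}
free_frames = [3]

for step in service_page_fault(7, disk, memory, page_table, free_frames):
    print(step)
print(page_table)  # {7: 3}
```

After the call, virtual page 7 is resident in frame 3 and a retried access to it would no longer fault.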

Page 55

Main memory design

Methods to improve the bandwidth of main memory

Concept of Virtual Memory

Servicing the page fault in Virtual Memory

Page 56

Allah Hafiz

Date posted: 05/07/2022, 11:56