Advanced Computer Architecture - Lecture 26: Memory hierarchy design

Page 1

CS 704

Advanced Computer Architecture

Lecture 26

Memory Hierarchy Design

(Concept of Caching and Principle of Locality)

Prof. Dr. M. Ashraf Chughtai

Page 2

Today’s Topics

Recap: Storage trends and memory hierarchy

Concept of Cache Memory

Principle of Locality

Cache Addressing Techniques

RAM vs Cache Transaction

Summary

MAC/VU-Advanced Computer Architecture Lecture 26 Memory Hierarchy (2)

Page 3

Recap: Storage Devices

Design features of semiconductor

Page 4

Recap: Speed and Cost per byte

SRAM

hold moderately large amount of data and instructions

– Disk storage is slowest and cheapest

data and instructions

Page 5

Recap: CPU-Memory Access-Time

The access-time gap of DRAM and disk with respect to the speed of the processor, as compared to that of SRAM, is increasing very fast with time

Page 6

CPU-Memory Gap … Cont’d

[Figure: CPU-memory gap plotted on a logarithmic axis (1 to 10,000); surviving series labels: SRAM access time, CPU cycle time]

Page 7

Memory Hierarchy Principles

The speeds of DRAM and CPU complement each other

Organize memory in a hierarchy, based on the:

– Concept of Caching; and

– Principle of Locality

Page 8

1: Concept of Caching

A cache is a staging area or temporary place to:

– store a frequently used subset of the data or instructions from the relatively cheaper, larger and slower memory; and

– avoid having to go to the main memory every time this information is needed

Page 9

Caching and Memory Hierarchy

A memory device of a different type is used for each value of k, the device level:

– the faster, smaller device at level k serves as a cache for the larger, slower device at level k+1

– programs tend to access the data or instructions at level k more often than they access the data at level k+1

Page 10

Caching and Memory Hierarchy

– Storage at level k+1 can be slower, but larger and cheaper per bit

A large pool of memory that costs as much as the cheap storage at the highest level (near the bottom of the hierarchy) serves data or instructions at the rate of the fast storage at the lowest level (near the top of the hierarchy)

Page 11

Examples of Caching in the Hierarchy

Cache Type    | What Cached          | Where Cached        | Latency (cycles) | Managed By
-             | 4-byte word          | CPU registers       | 0                | Compiler
TLB           | Address translations | On-Chip TLB         | 0                | Hardware
-             | 32-byte block        | On-Chip L1          | 1                | Hardware
-             | 32-byte block        | Off-Chip L2         | 10               | Hardware
-             | 4-KB page            | Main memory         | 100              | Hardware + OS
-             | -                    | Main memory         | 100              | OS
-             | -                    | Local disk          | 10,000,000       | AFS/NFS client
Browser cache | Web pages            | Local disk          | 10,000,000       | Web browser
-             | -                    | Remote server disks | 1,000,000,000    | Web proxy server

Page 12

2: Principle of Locality

Programs access a relatively small portion of the address space at any instant of time

Page 13

[Figure: book categories: Electronics, Computers, Chemistry, Civil Engg., Electrical Engg.]

We select 4 books, 2 each of Electronics and Computers, and place them on a small table for fast access

Page 14

Types of Locality

Temporal and Spatial

Temporal locality is the locality in time, which says that if an item is referenced, it will tend to be referenced again soon.

Page 16

A well-written program tends to reuse data and instructions which are:

– either near those they have used recently

– or that were recently referenced themselves

Page 17

– Spatial locality: items with nearby addresses (i.e., nearby in space) should be located at the same level, as they tend to be referenced close together in time

– Temporal locality: recently referenced items (i.e., referenced close in time) should be placed at the same memory level, as they are likely to be referenced again in the near future

Page 18

Locality Example: Program

Page 19

Locality Example

Spatial Locality:

All the array elements a[i] (data) are referenced in succession, one at each loop iteration, so all the array elements should be located at the same level

All the instructions of the loop are referenced repeatedly in sequence, and should therefore be located at the same level

Page 20

Locality Example

Temporal Locality

The data item sum is referenced in each iteration; i.e., recently referenced data is referenced again in each iteration

The instructions of the loop, sum += a[i], are cycled through repeatedly
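The program for this example did not survive extraction. A minimal sketch of the kind of loop the discussion refers to is given here, assuming an array a that is accumulated into a running sum. The function name sum_array and the variable total (standing in for sum) are illustrative choices, and Python is used only as notation; the locality argument applies to a compiled loop over a contiguous array.

```python
def sum_array(a):
    """Sum the elements of a, visiting them in index order."""
    total = 0                  # plays the role of `sum`: touched on every iteration (temporal locality)
    for i in range(len(a)):
        total += a[i]          # a[0], a[1], ... are accessed in succession (spatial locality)
    return total

print(sum_array([1, 2, 3, 4]))  # prints 10
```

The loop body itself is fetched again on every iteration, which is the temporal locality of instructions mentioned above.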

Page 21

Based on Locality Principle

How does the Memory Hierarchy work?

– the memory hierarchy will keep the more recently accessed data items closer to the processor, because chances are the processor will access them again soon

Page 22

Based on Locality Principle

How does the Memory Hierarchy work?

– NOT ONLY do we move the item that has just been accessed closer to the processor, but we ALSO move the data items that are adjacent to it

Page 23

Hierarchy List

Register File | Level 0 | Datapath
Main memory   | Level 3 | System Board DRAM
Disk cache    | Level 4 | Disk drive
Disk          | Level 5 | Magnetic disk
Optical       | Level 6 | CDs etc. (bulk storage)
Tape          | Level 7 | Huge, cheapest storage

Page 24

Intel Processor Cache

80386 – no on-chip cache

80486 – 8 KB on-chip cache, 16-byte lines

Pentium (all versions) – two on-chip L1 caches (data and instructions)

Page 25

Cache Devices

A cache device is a small SRAM which is made directly accessible to the processor; and

DRAM, which is accessible by the cache as well as by the user or programmer, is placed at the next higher level as the Main Memory

Larger storage, such as disk, is placed away from the main memory

Page 26

Cache Organization

[Figure: cache organization; surviving label: Main Memory]

Page 27

Caching in a Memory Hierarchy

– The level k+1 memory is partitioned into blocks (say 16 blocks)

– The level k memory caches a subset of the blocks (say 4 blocks) from level k+1

Page 28

Cache Organization

Page 29

Cache Addressing – Direct Addressing

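Level k: 4 blocks, addressed by a 2-bit code: zz

The n-th block from level k+1 is placed in block (n mod 4) of level k, i.e. the block selected by the two low-order bits zz of its 4-bit address:

zz = 00: 0000 0100 1000 1100
zz = 01: 0001 0101 1001 1101
zz = 10: 0010 0110 1010 1110
zz = 11: 0011 0111 1011 1111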
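The mapping can be sketched in a few lines, assuming 16 blocks at level k+1 and the (n mod 4) placement rule stated later in the lecture. The constant and function names are illustrative, not taken from the slides.

```python
NUM_LEVEL_K_BLOCKS = 4      # level k holds 4 blocks (2-bit code zz)
NUM_LEVEL_K1_BLOCKS = 16    # assumed: level k+1 holds 16 blocks (4-bit addresses)

def level_k_slot(n):
    """Return the level-k block that level-(k+1) block n maps to."""
    return n % NUM_LEVEL_K_BLOCKS

# List, for each level-k slot, the level-(k+1) block addresses that share it.
for slot in range(NUM_LEVEL_K_BLOCKS):
    sharers = [format(n, "04b") for n in range(NUM_LEVEL_K1_BLOCKS)
               if level_k_slot(n) == slot]
    print(format(slot, "02b"), sharers)
```

Each printed row lists four 4-bit block addresses ending in the same two bits, which is why only one of them can be resident in that slot at a time.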

Page 30

Memory Hierarchy Terminology

[Figure: the processor exchanges data with the Upper Level Memory (arrows "To Processor" and "From Processor"), which holds Blk X; the Lower Level Memory holds Blk Y]

Page 31

Memory Hierarchy Terminology

Hit: the data the processor wants to access appears in some block in the upper level (example: Block X)

– Hit Rate: the fraction of memory accesses that are found in the upper level (i.e., HIT)

– Hit Time: the time to access the upper level, which consists of

(i) RAM access time

(ii) Time to determine if this is hit or miss

Page 32

Memory Hierarchy Terminology … Cont’d

Miss: data needed by the processor is not found in the upper level and has to be retrieved from a block in the lower level (Block Y)

– Miss Penalty: the time

(i) to replace a block in the upper level

(ii) to deliver the block to the processor

Recommendation: Hit Time must be much, much smaller than Miss Penalty; otherwise there is no need for a memory hierarchy
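The recommendation can be made concrete with the standard average memory access time relation: hit time + miss rate × miss penalty. The relation is not stated on the slides, and the function name and the numbers below are illustrative assumptions.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty

# Assumed figures: 1-cycle hit time, 100-cycle miss penalty.
for miss_rate in (0.01, 0.05, 0.20):
    print(miss_rate, average_access_time(1, miss_rate, 100))
```

If the hit time were close to the miss penalty, the average would barely improve over going to the lower level every time, which is the point of the recommendation.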

Page 33

Request 14

– Object d is transferred to CPU

Page 34

Request 12

Cache Miss

The program needs object A, which is stored in some block C, say block 12, at level k+1

– Block C (block 12 from level k+1) is not at level k: it is a cache miss

– Hence, the level k cache must fetch it from level k+1; and

– transfer object A to the CPU
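The two requests can be traced with a small sketch. The initial contents of the level k cache (blocks 4, 9, 14 and 3), the treatment of request 14 as a hit, and the choice of which block to overwrite are assumptions made for illustration; the text only establishes that block 12 is not at level k.

```python
def access(cache, block, victim_index=0):
    """Look up `block` in the level-k cache (a list of block numbers).

    Returns "hit" if the block is present. Otherwise fetches it from
    level k+1 by overwriting the entry at victim_index and returns "miss".
    """
    if block in cache:
        return "hit"
    cache[victim_index] = block   # fetch from level k+1, evicting one block
    return "miss"

level_k = [4, 9, 14, 3]           # assumed initial contents
print(access(level_k, 14))        # hit: block 14 is already at level k
print(access(level_k, 12))        # miss: block 12 is fetched from level k+1
print(level_k)                    # [12, 9, 14, 3]
```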

Page 35

Placement and Replacement Policies

– If the level k cache is full, then some current block must be replaced (evicted). Which one?

– The replacement policy defines which block should be evicted
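The slides do not name a particular replacement policy. As one common example, the sketch below uses least-recently-used (LRU) eviction in a cache where any block may occupy any entry; the class name, the capacity of 4 and the reference sequence are all assumptions.

```python
from collections import OrderedDict

class LRUCache:
    """Cache of block numbers with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()      # least recently used first, most recent last

    def access(self, block):
        """Return (hit, evicted_block); evicted_block is None if nothing was evicted."""
        if block in self.blocks:
            self.blocks.move_to_end(block)                 # mark as most recently used
            return True, None
        evicted = None
        if len(self.blocks) >= self.capacity:
            evicted, _ = self.blocks.popitem(last=False)   # drop the least recently used block
        self.blocks[block] = True
        return False, evicted

cache = LRUCache(capacity=4)
for b in [4, 9, 14, 3, 4, 12]:
    print(b, cache.access(b))
```

In this trace the second reference to block 4 is a hit, so when block 12 arrives the victim is block 9, the entry that has gone unused the longest.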

Page 36

Types of misses

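Cold (compulsory) miss

– occurs at the beginning of the cache access, when the cache is still empty

Capacity miss

– occurs when the set of active blocks (working set) is larger than the cache

Conflict miss

– occurs when the cache is large enough, but multiple data objects all map to the same level k block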

Page 37

Conflict Miss: Example … Cont’d

If the placement policy is:

– Block n at level k+1 must be placed in block (n mod 4) at level k

then referencing blocks 0, 8, 0, 8, 0, 8, ... would miss every time, because 8 mod 4 = 0: blocks 0 and 8 compete for the same level k block
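A minimal simulation of this reference pattern, assuming a 4-block level k cache that starts empty and uses the (n mod 4) placement rule. The function name count_misses is illustrative.

```python
def count_misses(references, num_slots=4):
    """Count misses when block n may only be placed in slot n mod num_slots."""
    slots = [None] * num_slots          # None marks an empty slot
    misses = 0
    for n in references:
        slot = n % num_slots
        if slots[slot] != n:            # slot is empty or holds a different block
            misses += 1
            slots[slot] = n             # the new block evicts the old one
    return misses

print(count_misses([0, 8, 0, 8, 0, 8]))   # 6: every reference misses
print(count_misses([0, 9, 0, 9, 0, 9]))   # 2: only the two cold misses
```

Blocks 0 and 9 map to different slots, so the second pattern stops missing after the first touch of each block even though the cache is no larger.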

Page 38

Cache Design

We have observed that more than one block from the level k+1 memory (say, the main memory), having N blocks, may be placed at the same location in the level k memory (say, the cache), having M blocks; block n is placed at location n MOD M

Hence, a tag must be associated with each block in the level k (cache) memory to identify its position in the level k+1 memory (main memory)

Page 39

Direct Mapping Example

The 16 MB main memory has a 24-bit address bus

It is organized in 32-bit (4-byte) blocks

A 16 K word (64 KB) cache requires a 16-bit address and an 8-bit tag

Page 40

Direct Mapping Address Structure

24-bit address

2-bit word identifier (4-byte block)

22-bit block identifier for the main memory

– 8-bit tag (= 22 - 14)

– 14-bit slot (line, or index) value for the cache

No two blocks in the same line have the same Tag field

Check the contents of the cache by finding the line and checking the Tag
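The field split can be checked with a short sketch, assuming the 8-bit tag occupies the high-order bits, followed by the 14-bit line and the 2-bit word identifier. The function name and the example address are arbitrary choices.

```python
WORD_BITS = 2    # position within the 4-byte block
LINE_BITS = 14   # cache line (slot) number
TAG_BITS = 8     # remaining high-order bits of the 24-bit address

def split_address(addr):
    """Split a 24-bit address into (tag, line, word) fields."""
    if not 0 <= addr < 1 << (TAG_BITS + LINE_BITS + WORD_BITS):
        raise ValueError("address must fit in 24 bits")
    word = addr & ((1 << WORD_BITS) - 1)
    line = (addr >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = addr >> (WORD_BITS + LINE_BITS)
    return tag, line, word

tag, line, word = split_address(0x16339C)   # arbitrary example address
print(hex(tag), hex(line), word)            # 0x16 0xce7 0
```

Two addresses that differ only in their top 8 bits land in the same line, and the stored tag is what tells them apart.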

Page 41

Direct Mapping Cache Organization

Page 42

Let us consider another example with realistic numbers:

Assume we have a 1 KB direct-mapped cache with a block size equal to 32 bytes

In other words, each block associated with a cache tag will have 32 bytes in it (Row 1).

[Figure: cache array with Valid Bit and Cache Tag columns plus the data bytes, rows selected by Line Number or Index (0, 1, 2, 3, ...); surviving byte labels: Byte 63, Byte 992, Byte 1023]

Page 43

Address Translation – Direct Mapped Cache

Assume the level k+1 main memory is 4 GB, with a block size equal to 32 bytes, and the level k cache is 1 KB

[Figure: the address, bits 31 to 0, split into Cache Tag (stored as part of the cache “state”), Cache Index (ending at bit 9) and Byte Select (bits 4 to 0); surviving example values: Ex: 0x01, Byte Select Ex: 0x00; below it, the cache array with Valid Bit, Cache Tag and data bytes (Byte 63, Byte 992, Byte 1023)]

Page 44

Cache Design

With a block size equal to 32 bytes, the 5 least significant bits of the address will be used as byte select within the cache block.

Since the cache size is 1 KB, the upper 32 minus 10 bits, or 22 bits, of the address will be stored as the cache tag.

The rest of the address bits in the middle, that is, bits 5 through 9, will be used as the Cache Index to select the proper cache block entry.
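The same split can be written out as a sketch, assuming a 32-bit address. The function name decode and the example address 0x00014020 are chosen for illustration only.

```python
BLOCK_SIZE = 32      # bytes per block
CACHE_SIZE = 1024    # 1 KB direct-mapped cache

BYTE_SELECT_BITS = BLOCK_SIZE.bit_length() - 1               # 5 bits for 32 bytes
INDEX_BITS = (CACHE_SIZE // BLOCK_SIZE).bit_length() - 1     # 5 bits for 32 cache blocks

def decode(addr):
    """Split a 32-bit address into (cache tag, cache index, byte select)."""
    byte_select = addr & (BLOCK_SIZE - 1)                          # bits 0 to 4
    index = (addr >> BYTE_SELECT_BITS) & ((1 << INDEX_BITS) - 1)   # bits 5 to 9
    tag = addr >> (BYTE_SELECT_BITS + INDEX_BITS)                  # bits 10 to 31
    return tag, index, byte_select

tag, index, byte_select = decode(0x00014020)
print(hex(tag), hex(index), hex(byte_select))   # 0x50 0x1 0x0
```

A lookup would use index to pick one of the 32 cache entries, compare the stored tag with tag (and check the valid bit), and on a match use byte_select to pick the byte within the 32-byte block.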

Title: Memory Hierarchy Design
Instructor: Prof. Dr. M. Ashraf Chughtai
School: mac
Major: advanced computer architecture
Type: lecture
