Advanced Computer Architecture - Lecture 25: Memory hierarchy design. This lecture will cover the following: storage technologies trends and caching; what is outside the processor; storage technologies; RAM and enhanced DRAM; disk storage;...
CS 704
Advanced Computer Architecture
Lecture 25
Memory Hierarchy Design
(Storage Technologies Trends and Caching)
Prof Dr M Ashraf Chughtai
Today’s Topics
What is outside the processor
Data path and control design of
– what is outside the processor?
performance of processors?
Whatever is outside the processor is
referred to as the I/O system
The I/O systems include:
What is Outside the Processor? … Cont’d
MAC/VU-Advanced Computer Architecture, Lec 25 – Memory Hierarchy Design (1)
Memory system design for high
Fast memory is expensive and cheap memory is slow, therefore a memory hierarchy is organized into several
Memory Hierarchy System
Control
Datapath
Memory Processor
“The principle of locality”,
where each type of the module is located in the memory system?
Principle of Memory hierarchy
Advantage of the principle of
The semiconductor memories such as
in access speed but are expensive
Used in small size and placed either inside
or closest to the processor
storage at lowest cost per bit are placed
farthest away from the processor
Storage Type in Memory System
Levels of the Memory Hierarchy
Transfers between levels (managed by, transfer size):
– prog./compiler: 1-8 bytes
– cache control: 8-128 bytes
– OS: 512-4K bytes
– user/operator
Upper level: faster
Memory Hierarchy Pyramid
– registers: CPU registers hold words retrieved from L1 cache.
– on-chip L1 cache (SRAM): L1 cache holds cache lines retrieved from the L2 cache memory.
– off-chip L2 cache (SRAM): L2 cache holds cache lines retrieved from main memory.
– main memory (DRAM): Main memory holds disk blocks retrieved from local disks.
– local secondary storage
Classification of memory systems based on different attributes and their design
Attributes:
– Material – Semiconductor, magnetic, optical
– Accessing – Random, Sequential, Hybrid
– Store/Retrieve – ROM, RMM and RWM
Storage Systems
Random-Access Memory (RAM)
Key features
– RAM is packaged as a chip.
– Basic storage unit is a cell (one bit per cell).
– Multiple RAM chips form a memory
Types of RAM
– Static RAM (SRAM)
– Dynamic RAM (DRAM)
Static RAM: Basic Cell
6-Transistor SRAM Cell
word (row select)
3. Cell pulls one line low
4. Sense amp on column detects the difference between bit and bit-bar (the complementary bit line)
replaced with pullup
to save area
SRAM Organization: 16-word x 4-bit
[Figure: 16-word x 4-bit array of SRAM cells; each bit column has a write driver and precharger (Din) and a sense amplifier (Dout); address inputs A0 to A3.]
Q: Which is longer: word line or bit line?
Dynamic Random Access Memory (DRAM)
– Each cell stores bit with a capacitor and
Basic DRAM Cell
– Cell and bit line share charges
Here, only a very small voltage change occurs on the bit line; therefore
a sense amplifier is used to detect it
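DRAM Organization: 16 words x 8 bits
[Figure: 16 x 8 DRAM chip organized as 4 rows x 4 columns (each indexed 0 to 3) of supercells, with an internal row buffer; supercell (2,1) is highlighted. The memory controller (to CPU) connects to the chip through a 2-bit addr bus and an 8-bit data bus.]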
Reading DRAM Supercell (2,1)
Step 1(a): Row access strobe (RAS) selects row 2.
Step 1(b): Row 2 copied from DRAM array to row buffer.
[Figure: memory controller connected to the 16 x 8 DRAM chip (rows 0 to 3, columns 0 to 3) by a 2-bit addr bus and an 8-bit data bus.]
Reading DRAM Supercell (2,1)
Step 2(a): Column access strobe (CAS) selects column 1.
and eventually back to the CPU.
[Figure: 16 x 8 DRAM chip with its internal row buffer; supercell (2,1) travels over the 8-bit data bus to the memory controller.]
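The two-step read (RAS, then CAS) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of real hardware: the class name `DramChip` and its methods `ras` and `cas` are hypothetical, and timing, refresh, and address multiplexing are ignored.

```python
class DramChip:
    """Toy model of a 16 x 8 DRAM chip: 4 rows x 4 columns of 8-bit supercells."""

    def __init__(self, rows=4, cols=4):
        self.rows = rows
        self.cols = cols
        # Each supercell holds one byte (8 bits); start with all zeros.
        self.array = [[0] * cols for _ in range(rows)]
        self.row_buffer = None  # internal row buffer, empty until a RAS

    def ras(self, row):
        """Row access strobe: copy the whole selected row into the row buffer."""
        self.row_buffer = list(self.array[row])

    def cas(self, col):
        """Column access strobe: return one supercell from the row buffer."""
        if self.row_buffer is None:
            raise RuntimeError("CAS issued before any RAS")
        return self.row_buffer[col]


def read_supercell(chip, row, col):
    """Read supercell (row, col): step 1 selects the row, step 2 the column."""
    chip.ras(row)          # Step 1: row copied from the array to the row buffer
    return chip.cas(col)   # Step 2: supercell copied from the row buffer to the data lines


chip = DramChip()
chip.array[2][1] = 0xAB            # illustrative contents of supercell (2,1)
value = read_supercell(chip, 2, 1)
print(hex(value))
```

Note that the CAS step never touches the array itself; it only reads the row buffer, which is what the page-mode enhancements exploit.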
64 MB DRAM Memory Module
[8x8MB DRAM Chips]
: supercell (i,j) addr (row = i, col = j)
Memory controller
bits 8-15
bits 16-23
bits 24-31
bits 32-39
bits 40-47
bits 48-55
bits 56-63
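A minimal sketch of how the memory controller might assemble a 64-bit doubleword from the eight chips: the same supercell address (row = i, col = j) goes to every chip, and each chip supplies one byte. The assumption that chip k supplies bits 8k to 8k+7 (so chip 0 supplies bits 0-7) and the tiny 4 x 4 chip geometry are illustrative only; the function name `read_doubleword` is hypothetical.

```python
NUM_CHIPS = 8  # eight DRAM chips, each contributing 8 bits of the 64-bit word


def read_doubleword(chips, row, col):
    """Combine the byte at supercell (row, col) of every chip into one 64-bit word."""
    word = 0
    for k, chip in enumerate(chips):
        byte = chip[row][col] & 0xFF
        word |= byte << (8 * k)  # assumption: chip k supplies bits 8k .. 8k+7
    return word


# Each chip is modeled as a 4 x 4 grid of bytes (toy size, not 8 MB).
chips = [[[0] * 4 for _ in range(4)] for _ in range(NUM_CHIPS)]
for k in range(NUM_CHIPS):
    chips[k][2][1] = k + 1  # chip 0 holds 0x01, ..., chip 7 holds 0x08

word = read_doubleword(chips, 2, 1)
print(hex(word))
```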
Enhanced DRAMs
– In normal DRAM, we can only read and write M bits at a time, because only one row and one column are selected at any time by the row and column addresses.
– In other words, for each M-bit memory access, we have to provide a row address followed by a column address.
Page Mode DRAM: Motivation
Regular DRAM Organization:
– N rows x N columns x M bits
– Read & write M bits at a time
– Each M-bit access requires a RAS / CAS cycle
Fast Page Mode DRAM
– N x M “register” to save a row
Column Address
Mbit Output
CAS, CAS]
- instead of with 4 RAS and 4 CAS:
[(RAS,CAS), (RAS,CAS), (RAS,CAS), (RAS,CAS)]
Fast Page Mode Operation
Fast Page Mode DRAM
– N x M “SRAM” to save a row
After a row is read into the register
– Only CAS is needed to access
other M-bit blocks on that row
– RAS_L remains asserted while
N x M “SRAM”
Row Address
2nd Mbit 3rd Mbit 4th Mbit
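The saving from fast page mode can be illustrated by counting strobes. The sketch below assumes all accesses fall on the same row; the function names are hypothetical and no real timing is modeled.

```python
def strobes_regular(num_accesses):
    """Regular DRAM: every M-bit access needs its own (RAS, CAS) pair."""
    return ["RAS", "CAS"] * num_accesses


def strobes_fast_page_mode(num_accesses):
    """Fast page mode: one RAS latches the row, then one CAS per access on that row."""
    if num_accesses == 0:
        return []
    return ["RAS"] + ["CAS"] * num_accesses


regular = strobes_regular(4)
fpm = strobes_fast_page_mode(4)
print(regular)
print(fpm)
```

For four M-bit blocks on one row, the regular sequence uses 4 RAS and 4 CAS, while the page-mode sequence uses a single RAS followed by 4 CAS.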
Enhanced DRAMs
closely spaced CAS signals
– It is driven by the rising clock edge instead
of asynchronous control signals.
Enhanced DRAMs
(DDR SDRAM)
– It is an enhancement of SDRAM that uses both
clock edges as control signals.
Video RAM (VRAM)
– It is like FPM DRAM, but output is produced by
shifting the row buffer; and it is
– dual ported to allow concurrent reads and
writes.
Nonvolatile Memories
DRAM and SRAM are volatile memories
– Lose information if powered off.
Nonvolatile memories retain value even if powered off.
(ROM).
read and modified.
Nonvolatile Memories
Types of ROMs
Firmware
Boot time code, BIOS (basic
input/output system)
Disk Geometry (Multiple-Platter View)
Aligned tracks form a cylinder.
surface 0 surface 1 surface 2 surface 3 surface 4 surface 5
cylinder k
spindle
platter 0 platter 1 platter 2
Disk Capacity
Capacity: maximum number of bits that can
be stored.
– Recording density (bits/in): number of bits
that can be squeezed into a 1 inch segment of a track.
– Track density (tracks/in): number of
tracks that can be squeezed into a 1 inch radial segment.
– Areal density (bits/in²): product of
recording and track density.
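As a small worked example of the last definition, the product can be computed directly. The density values below are hypothetical and chosen only for illustration.

```python
def areal_density(recording_density, track_density):
    """Areal density (bits/in^2) = recording density (bits/in) x track density (tracks/in)."""
    return recording_density * track_density


# Hypothetical values: 500,000 bits/in along a track, 100,000 tracks/in radially.
density = areal_density(recording_density=500_000, track_density=100_000)
print(density)  # 50000000000
```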
Disk Operation (Multi-Platter View)
arm
read/write heads move in unison from cylinder to cylinder
spindle
Disk Access Time
Average time to access some target sector is approximated by:
Taccess = Tavg seek + Tavg rotation + Tavg transfer
Seek time (Tavg seek )
– Time to position heads over cylinder
containing target sector
– Typical Tavg seek = 9 ms
Disk Access Time
Rotational latency (Tavg rotation )
– Time waiting for first bit of target sector to pass
under r/w head.
– Tavg rotation = 1/2 x 1/RPMs x 60 sec/1 min
Transfer time (Tavg transfer )
– Time to read the bits in the target sector.
– Tavg transfer =
Disk Access Time Example
– Access time dominated by seek time and rotational latency.
– First bit in a sector is the most expensive, the rest are free.
– SRAM access time is about 4 ns/doubleword, DRAM about 60 ns.
– Disk is about 40,000 times slower than SRAM and 2,500 times slower than DRAM.
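A rough sketch of the arithmetic behind these observations, treating the average access time as the sum of seek time, rotational latency, and transfer time. The 9 ms seek time comes from the slides; the rotational rate (7200 RPM) and the average of 400 sectors per track are assumed values for illustration, and the transfer-time expression (one rotation divided by the sectors per track) is a common approximation assumed here rather than taken from the slides.

```python
def avg_rotation_ms(rpm):
    """Average rotational latency: half a rotation, in milliseconds."""
    return 0.5 * (60.0 / rpm) * 1000.0


def avg_transfer_ms(rpm, sectors_per_track):
    """Assumed approximation: time for one sector to pass under the head, in ms."""
    return (60.0 / rpm) / sectors_per_track * 1000.0


def avg_access_ms(seek_ms, rpm, sectors_per_track):
    """Average access time = seek + rotational latency + transfer."""
    return seek_ms + avg_rotation_ms(rpm) + avg_transfer_ms(rpm, sectors_per_track)


# 9 ms seek is from the slides; 7200 RPM and 400 sectors/track are assumptions.
rotation = avg_rotation_ms(7200)
transfer = avg_transfer_ms(7200, 400)
total = avg_access_ms(seek_ms=9.0, rpm=7200, sectors_per_track=400)
print(f"rotation={rotation:.3f} ms transfer={transfer:.3f} ms total={total:.2f} ms")
```

Under these assumptions the transfer term is a tiny fraction of the total, which is why seek time and rotational latency dominate.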
Logical Disk Blocks
– The set of available sectors is modeled as a
sequence of b-sized logical blocks (0, 1, 2, ...)
Mapping between logical blocks and actual
(physical) sectors
– Maintained by hardware/firmware device called
disk controller.
– Converts requests for logical blocks into
(surface, track, sector) triples.
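The conversion performed by the disk controller can be sketched as follows. This is a simplified illustration under loud assumptions: one logical block per sector, the same number of sectors on every track, and blocks laid out cylinder by cylinder. Real controllers use more involved mappings, and the function name `logical_to_physical` is hypothetical.

```python
def logical_to_physical(block, surfaces, sectors_per_track):
    """Map a logical block number to a (surface, track, sector) triple.

    Assumed layout: fill every sector of a track, then move to the next
    surface in the same cylinder, then to the next cylinder (track index).
    """
    blocks_per_cylinder = surfaces * sectors_per_track
    track = block // blocks_per_cylinder      # which cylinder / track index
    within = block % blocks_per_cylinder      # offset inside that cylinder
    surface = within // sectors_per_track
    sector = within % sectors_per_track
    return surface, track, sector


# Hypothetical geometry: 6 surfaces, 400 sectors per track.
triple = logical_to_physical(1000, surfaces=6, sectors_per_track=400)
print(triple)  # (2, 0, 200)
```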
CPU-Memory Gap
[Figure: CPU-memory gap over time, plotted on a logarithmic axis from 1 to 100,000,000; series include SRAM access time and CPU cycle time.]
Memory hierarchy organization
Design of basic memory modules of DRAM and SRAM
Design and working of disk storages
The gap between the speed of the processor and that of the storage devices (DRAM, SRAM, and disk) is increasing with time