Advanced Computer Architecture - Lecture 40: Input/Output systems. This lecture will cover the following: RAID and I/O system design; redundant array of inexpensive disks; I/O benchmarks; I/O system design; service accomplishment; service interruption; network attached storages and reliability;...
CS 704
Advanced Computer Architecture
Lecture 40
Input Output Systems
(RAID and I/O System Design)
Prof Dr M Ashraf Chughtai
MAC/VU-Advanced Computer Architecture Lecture 40 Input/Output System (3)
Last time we compared the performance of disk storage and flash memory
We noticed that flash is six times faster than the disk for reads, and the disk is six times faster than flash for data writes
– Then we discussed the trends in I/O interconnects: networks, channels, and backplanes
– Networks offer a message-based narrow pathway for distributed processors over long distances
The backplanes offer a memory-mapped wide pathway for centralized processing over short distances
The interconnects are implemented via buses
The buses are classified in two major categories: the I/O bus and the CPU-Memory bus
The channels are implemented using I/O buses, and the backplanes using CPU-Memory buses
Then we discussed the bus transaction protocols, which specify the sequence of events and timing requirements in transferring information as synchronous or asynchronous communication
We also discussed bus arbitration protocols ― the protocols by which a device that wishes to communicate reserves the bus when multiple devices need bus access
Here, we noticed that the bus arbitration
schemes usually try to balance two factors:
Bus priority: the device with the highest priority should be serviced first
Fairness: every device that wants to use the bus is guaranteed to get the bus eventually
The three bus arbitration schemes are:
Daisy Chain Arbitration
Centralized Parallel Arbitration
Distributed Arbitration
Storage I/O Performance
Now having discussed the basic types of
storage devices and the ways to interconnect them to the CPU, we are going to look into the ways to evaluate the performance of storage I/O systems
We know that the prime objective of a storage device is to remember the original information; if the device crashes and loses that information, it is not reliable
The reliability of a system can be improved
by using the following four methods
Reliability Improvement
Fault Avoidance – prevent fault occurrence by
construction
Fault Tolerance – providing service complying
with the service specification
by redundancy
Error Removal – minimizing the presence of
errors by verification
Error Forecasting – to estimate the presence,
creation and consequence
of errors by evaluation
Reliability, availability and dependability
The performance of storage I/Os is measured
in terms of its reliability, availability and
dependability
These terminologies were defined by Laprie in the paper entitled 'Dependable Computing and Fault Tolerance: Concepts and Terminology', published in the Digest of Papers of the 15th Annual Symposium on Fault-Tolerant Computing (1985)
Laprie defined dependability as the quality of delivered service such that reliance can
justifiably be placed on this service;
where the service delivered by a system is its observed actual behavior and the system
failure occurs when actual behavior deviates from the specified behavior
Note that a user perceives a system alternating between two states of delivered service; these states are:
Service Accomplishment – service is
delivered as specified and
Service Interruption – delivered service is different from the specified service
Quantifying the transitions between service accomplishment and service interruption is the measure of the dependability
Dependability is measured in terms of:
module reliability, which is the measure of continuous service accomplishment;
Measuring Reliability
and, module availability, which is the
measure of the swinging between the accomplishment and interruption states
of delivered service
Now before we discuss the reliable and
dependable designs of the storage I/O let us
understand the terminologies used to measure reliability, availability and dependability
The reliability of a module is the measure of
the time to failure from a reference initial
instant
Measuring Reliability … Cont’d
In other words, we can say the Mean Time To Failure (MTTF) of a storage module, such as a disk, is the measure of its reliability; and
The reciprocal of the MTTF is the rate of
failure; and
the service interruption is measured as the
Mean Time To Repair (MTTR)
Now let us understand, with the help of an example, how we can use these terminologies to measure the availability of a disk subsystem
Measuring Reliability: Example
Consider a disk subsystem comprising the following components
For the given MTTF values of each component, find the system failure rate and hence the system MTTF
Reliability Example … Cont’d
10 disks, each with MTTF = 1,000,000 Hrs
1 SCSI controller with MTTF = 500,000 Hrs
1 SCSI cable with MTTF = 1,000,000 Hrs
1 power supply with MTTF = 200,000 Hrs
1 fan with MTTF = 200,000 Hrs
Solution:
System Failure Rate = 10 × (1/1,000,000) + 1/500,000 + 1/1,000,000 + 1/200,000 + 1/200,000
= 23/1,000,000 per hour
System MTTF = 1/Failure Rate = 1,000,000/23
≈ 43,500 Hrs ≈ 5 years
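The solution above can be checked with a short Python sketch; the component list and MTTF values are those of the example, while the script itself is ours:

```python
# Failure rates of components in series add: lambda_sys = sum(n_i / MTTF_i).
# MTTF values in hours, taken from the example above.
components = [
    (10, 1_000_000),  # 10 disks
    (1, 500_000),     # SCSI controller
    (1, 1_000_000),   # SCSI cable
    (1, 200_000),     # power supply
    (1, 200_000),     # fan
]

failure_rate = sum(n / mttf for n, mttf in components)  # failures per hour
system_mttf = 1 / failure_rate

print(f"Failure rate = {failure_rate * 1_000_000:.0f} per million hours")  # 23
print(f"System MTTF  = {system_mttf:.0f} hours (~{system_mttf / 8760:.0f} years)")
```

Note that 1,000,000/23 is 43,478 hours; the lecture's 43,500 is a rounded figure.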
The availability of a module is the measure of the service accomplishment with respect to the swinging between the two states of accomplishment and interruption
The module availability, therefore, can be quantified as the ratio of the MTTF to the Mean Time Between Failures – MTBF (which is equal to the sum of MTTF and MTTR); i.e.,
Availability = MTTF / (MTTF +MTTR)
= MTTF / MTBF
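The availability formula can be expressed as a small helper; this is a sketch, and the example MTTR of 24 hours is an assumed value, not from the lecture:

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the module delivers its specified service:
    Availability = MTTF / (MTTF + MTTR) = MTTF / MTBF."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Illustrative (assumed) values: a disk with MTTF of 1,000,000 hours
# that takes 24 hours to replace and restore is available almost always.
print(availability(1_000_000, 24))
```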
Network Attached Storages and Reliability
Last time we discussed the disk storages and their interface to the processor using channel and backplane interconnects; and talked about the impact of disk storages and interconnects on the overall performance of the complete computing system
Today we will discuss the network interconnects used to interface multiple processors that are located over long distances and need high-performance storage service
Network Attached Storages and Reliability
A network provides well-defined physical and logical interfaces; i.e., it interconnects separate CPU and storage systems
The networks are capable of sustaining high-bandwidth transfers, and the file-server operating system supports remote file access
Hence, the reliability of network-attached storages is critical, and very high dependability is demanded of them
Network Attached Storage
[Slide: trends enabling network-attached storage]
– Decreasing disk diameters
– Increasing network bandwidth: 3 Mb/s » 10 Mb/s » 50 Mb/s » 100 Mb/s » 1 Gb/s » 10 Gb/s
– Network file services: networks capable of sustaining high-bandwidth transfers
– Network provides well-defined physical and logical interfaces: separate CPU and storage system; OS structures supporting remote file access
Network Attached Storages and Reliability
So, to improve both the availability and performance of storage systems, disk arrays were introduced, which contain many low-cost disks
The bandwidth of a disk array is improved by employing many small disk drives; and
the throughput is increased by having many small arms on small (3.5" – 1.8") disk drives rather than one long arm on a larger (14" – 24") disk; and
Manufacturing Advantages of Disk Arrays
[Slide: disk product families – a disk array needs only one disk design (3.5"), versus four disk designs for conventional systems]
Replace a Small Number of Large Disks with a Large Number of Small Disks! (1988 Disks)
[Slide: 1988 comparison of large disks versus an array of small drives (e.g., 11 W, 1.5 MB/s each); disk arrays have potential for large gains in throughput]
Network Attached Storages and Reliability
Simply spreading the data over many disks forces accesses to go to several disks and hence improves the throughput
The drawback of an array with more devices is that the dependability, and hence the reliability, decreases – generally N devices have 1/N the reliability of one device
Array Reliability: Example
Reliability of N disks = Reliability of 1 Disk ÷ N
Disk system MTTF = 50,000 Hours ÷ 70 disks
≈ 700 hours
Drops from 6 years to 1 month!
However, the dependability can be improved by adding redundant disks to the array to tolerate faults
Arrays without redundancy are too unreliable to be useful
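A minimal sketch of the 1/N reliability rule, using the example's numbers (the function name is ours):

```python
def array_mttf(disk_mttf_hours: float, n_disks: int) -> float:
    """MTTF of an array with no redundancy: any single disk failure
    loses data, so the array's failure rate is N times that of one disk."""
    return disk_mttf_hours / n_disks

# The example above: 70 disks, each with MTTF = 50,000 hours.
mttf = array_mttf(50_000, 70)
print(f"{mttf:.0f} hours ~= {mttf / (24 * 30):.1f} months")
```

The exact value is about 714 hours, which the lecture rounds to 700.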
Subsystem Organization
[Slide: a host adapter connects the host to an array controller, which drives several single-board disk controllers in small-format devices; striping software is off-loaded from the host to the array controller]
Redundant Arrays of Disks
In a disk array, files are "striped" across multiple spindles
Adding redundant disks to achieve high fault tolerance yields high data availability
Here, if a disk fails, its contents are reconstructed from data redundantly stored in the array
However, the drawbacks of redundant disks are:
Capacity penalty to store the redundant data
Bandwidth penalty to update it
System-Level Availability
[Slide: a fully dual-redundant organization – duplicated host I/O controllers and array controllers, with redundant paths down to the recovery groups of disks]
Goal: No Single Points of Failure
With duplicated paths, higher performance can be obtained when there are no failures
Redundant Arrays of Disks
These systems are known as RAID:
Redundant Array of Inexpensive Disks or
Redundant Array of Independent Disks
There exist several different approaches
to include redundant disks in the disk
array
These approaches are usually classified by a numerical value which identifies the RAID level
Each of these techniques has a different level of fault tolerance and a different overhead in redundant disks
Redundant Arrays of Disks
The fault tolerance and overhead in redundant disks, for RAID with 8 disks of user data, is as given below:

Level  Technique                               Fault tolerance  Check disks
0      No Redundancy                           0                0
1      Mirrored                                1                8
2      Memory-Style ECC                        1                4
3      Bit-Interleaved Parity                  1                1
4      Block-Interleaved Parity                1                1
5      Block-Interleaved Distributed Parity    1                1
6      P+Q Redundancy                          2                2
RAID 0 – Non-Redundant Striped
RAID 0 is a disk array without any redundant disk
However, here the data is striped across a set of disks, which makes the collection appear to the software as a single large disk
Note that the name RAID 0 is a misnomer, as there is no redundant disk; but since data striping is used, it is normally referred to as RAID
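As an illustration of striping, a minimal sketch, assuming block-level striping with a stripe unit of one block (the function is hypothetical, not from the lecture):

```python
def raid0_locate(logical_block: int, n_disks: int):
    """Map a logical block number to (disk index, block offset on that
    disk) for simple block-level striping; stripe unit = one block."""
    return logical_block % n_disks, logical_block // n_disks

# With 4 disks, consecutive logical blocks rotate across the array,
# so a large sequential read can be serviced by all disks in parallel.
for b in range(8):
    print(b, raid0_locate(b, 4))
```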
RAID 1: Disk Mirroring/Shadowing
Each disk is fully duplicated onto its "shadow"; targeted for high I/O rates
Whenever data are written to one disk, those data are also written to the redundant disk, forming a recovery group
RAID 1: Disk Mirroring/Shadowing
If a disk fails, the system just goes to the mirror, so the data survive provided only one disk of a mirrored pair fails
It is the most expensive solution: 100% capacity overhead
One logical write = two physical writes
If data worth 4 disks is to be striped and stored on 8 disks, there are two ways to stripe the data
RAID 1: Disk Mirroring/Shadowing
Note that, since 2001, there has been no commercial implementation of RAID 2, so we will not discuss this technique
RAID 3: Bit-Interleaved Parity Disk
Rather than having a complete copy of the
original disk, we can achieve desired
dependability by adding enough redundant
information to restore the lost information on
failure
RAID 3 uses one extra disk, called the parity disk, that holds the check information used in case of failure
RAID 3 acts logically as a single high-capacity, high-transfer-rate disk
The arms are synchronized logically and the spindles rotationally
RAID 3: Bit-Interleaved Parity Disk
[Slide: example bit patterns interleaved across the disks of the array]
RAID 3: Bit-Interleaved Parity Disk
Here, every read or write access goes to all the disks
For every read access, the parity is computed across the recovery group to protect against hard disk failures
Note that for the RAID 3 shown here, there is a 33% capacity cost for parity
However, wider arrays reduce the capacity cost, but decrease expected availability and increase reconstruction time
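The parity mechanism can be sketched in a few lines of Python, assuming the usual XOR parity (the helper names are ours):

```python
from functools import reduce

def parity(blocks):
    """XOR parity across the data disks, computed bytewise."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def reconstruct(surviving_blocks, parity_block):
    """Rebuild the contents of the one failed disk: the XOR of the
    surviving data blocks and the parity block."""
    return parity([*surviving_blocks, parity_block])

data = [b"\xC9", b"\x93", b"\x30"]   # three data disks, one byte each
p = parity(data)

# Lose disk 1; its contents come back from the survivors plus parity.
assert reconstruct([data[0], data[2]], p) == data[1]
```

The same XOR relationship underlies RAID 4 and RAID 5; only the placement and granularity of the parity differ.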
RAID 4: Block-Interleaved Parity and RAID 5: Distributed Block-Interleaved Parity
Both the RAID 4 and RAID 5 levels use the same ratio of data disks to parity disks as RAID 3, but they access data differently
The distribution of data in RAID 4 versus RAID 5 is shown here
In block-interleaved parity RAID 4, a parity block is associated with each row of data blocks, and the parity is kept on a single dedicated disk, identical to RAID 3
So it supports a mixture of small reads and small writes, and large reads and large writes
RAID 4: Block-Interleaved Parity
[Slide: data blocks laid out across the disk columns with logical disk addresses increasing along each stripe of stripe units; the parity block of each stripe (e.g., P5 for D20–D23) resides on the dedicated parity disk]
RAID 4: Block-Interleaved Parity and RAID 5: Distributed Block-Interleaved Parity
However, one drawback of this system is that the parity disk must be updated on every write, which is a bottleneck for back-to-back writes
This bottleneck is resolved in distributed block-interleaved parity RAID 5, where the parity blocks are distributed among all the disks
Note from the RAID 5 organization shown here that the parity associated with each row of data blocks is no longer restricted …
RAID 5: Distributed Block-Interleaved Parity
[Slide: RAID 5 layout in which the parity block of each stripe of stripe units rotates across the disks]
RAID 4: Block-Interleaved Parity and RAID 5: Distributed Block-Interleaved Parity
… to a single disk
Hence, this organization allows multiple writes to occur simultaneously, as long as the stripe units are not located on the same disk
For example:
the 1st write, to block 8, must also access its parity block P2 (i.e., two reads from two disks – the 1st and 3rd disks); and
MAC/VU-Advanced
Computer Architecture Lecture 40 Input / Output System (3) 41
RAID 4 and RAID 5
the 2nd write, to block 5, implies an update of P1 (i.e., two reads from two disks – the 2nd and 4th disks)
Thus, the two writes can occur at the same time, in parallel
Whereas, if you look into the organization of RAID 4, both P1 and P2 are on the same disk (the 5th disk), so the parity disk would be a bottleneck and the two writes could not proceed simultaneously
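The difference in parity placement can be sketched as follows; the RAID 5 rotation shown is one common style of layout, assumed here for illustration, and the function is ours:

```python
def parity_disk(stripe: int, n_disks: int, raid_level: int) -> int:
    """Return the disk (0-based) holding the parity block of a stripe.
    RAID 4: always the last, dedicated parity disk.
    RAID 5 (assumed rotation): parity moves one disk to the left
    on each successive stripe, so it is spread over all disks."""
    if raid_level == 4:
        return n_disks - 1
    return (n_disks - 1 - stripe) % n_disks

# With 5 disks, writes touching different stripes hit different parity
# disks under RAID 5, so they can proceed in parallel; under RAID 4
# every such write queues for the single parity disk.
print([parity_disk(s, 5, 4) for s in range(4)])  # [4, 4, 4, 4]
print([parity_disk(s, 5, 5) for s in range(4)])  # [4, 3, 2, 1]
```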
RAID 4 and RAID 5
In RAID 4 and RAID 5, the parity is stored as blocks and is associated with a set of data
blocks
In RAID 3, every access goes to all the disks, while levels 4 and 5 use smaller accesses, which allow independent accesses to occur in parallel
In RAID 4 and RAID 5, the error-detection information in each sector is checked independently for 'small reads' to see whether the data in one sector are correct