Flynn's Taxonomy and Parallel Architectures (Parallel and Distributed Processing)

PART 1: PARALLEL COMPUTING
Chapter 1 ARCHITECTURES AND TYPES OF PARALLEL COMPUTERS
Chapter 2 COMPONENTS OF PARALLEL COMPUTERS
Chapter 3 INTRODUCTION TO PARALLEL PROGRAMMING
Chapter 4 PARALLEL PROGRAMMING MODELS
Chapter 5 PARALLEL ALGORITHMS
PART 2: PARALLEL PROCESSING OF DATABASES (further reading)
Chapter 6 OVERVIEW OF PARALLEL DATABASES
Chapter 7 PARALLEL QUERY OPTIMIZATION
Chapter 8 OPTIMAL SCHEDULING FOR PARALLEL QUERIES


Thoai Nam


Flynn’s Taxonomy

Classification of Parallel Computers Based on Architectures


Based on notions of instruction and data streams

– SISD (Single Instruction stream, Single Data stream)

– SIMD (Single Instruction stream, Multiple Data streams)

– MISD (Multiple Instruction streams, Single Data stream)

– MIMD (Multiple Instruction streams, Multiple Data streams)
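The four classes follow mechanically from two counts: how many instruction streams and how many data streams a machine processes at once. A minimal sketch of that mapping is given here; the function name `flynn_class` and its integer parameters are illustrative assumptions, not part of the original slides.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Map stream counts to a Flynn class name (hypothetical helper)."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    # 'S' for a single stream, 'M' for multiple streams.
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"


if __name__ == "__main__":
    for counts in [(1, 1), (1, 64), (4, 1), (8, 8)]:
        print(counts, flynn_class(*counts))
```

For example, a conventional sequential machine corresponds to `flynn_class(1, 1)`, while a shared-memory multiprocessor with 8 independent processors corresponds to `flynn_class(8, 8)`.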

Popularity

– MIMD > SIMD > MISD


SISD

– Conventional sequential machines

IS: Instruction Stream, DS: Data Stream, CU: Control Unit, PU: Processing Unit


SIMD

– Vector computers, processor arrays

– Special purpose computations

[Figure: SIMD architecture with distributed memory. PE: Processing Element, LM: Local Memory]


MISD

– Systolic arrays

– Special purpose computations

[Figure: MISD architecture (the systolic array)]


MIMD

– General purpose parallel computers

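[Figure: MIMD architecture with shared memory. Control units CU 1 to CU n send instruction streams (IS) to processing units PU 1 to PU n, which exchange data streams (DS) with the shared memory and I/O]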


Classification based on Architecture


Pipelining: instructions are divided into a number of steps (segments, stages)

At the same time, several instructions can be loaded in the machine, each being executed in a different step


– IF – instruction fetch

– ID – instruction decode and register fetch

– EX – execution and effective address calculation

– MEM – memory access
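The overlap of stages can be made concrete with a small timing table. The sketch below assumes an ideal pipeline (one cycle per stage, no stalls or hazards) and uses only the four stages listed above; the function names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM"]  # stages as listed in the slides


def pipeline_schedule(n_instructions, stages=STAGES):
    """Row i holds the stage occupied by instruction i in each cycle ('' if none)."""
    total_cycles = len(stages) + n_instructions - 1
    table = []
    for i in range(n_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i enters stage s in cycle i + s
        table.append(row)
    return table


def pipelined_cycles(n, k=len(STAGES)):
    """Ideal pipeline: k cycles to fill, then one instruction completes per cycle."""
    return k + n - 1


def unpipelined_cycles(n, k=len(STAGES)):
    """Without overlap, each instruction takes all k steps in turn."""
    return k * n


if __name__ == "__main__":
    for row in pipeline_schedule(3):
        print(" ".join(f"{cell:>3}" for cell in row))
    print(pipelined_cycles(5), "cycles pipelined vs", unpipelined_cycles(5), "unpipelined")
```

Under these assumptions, n instructions on a k-stage pipeline take k + n - 1 cycles instead of k * n, so the benefit grows with the number of instructions in flight.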


– Id, SISAL, Silage, LISP,

– Single-assignment, applicative (functional) languages

– Explicit parallelism


z = (a + b) * c

[Figure: the dataflow representation of an arithmetic expression. Inputs a and b feed a + node; its result and input c feed a * node, which produces z]
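The expression z = (a + b) * c can be evaluated in a data-driven way: each node fires as soon as all of its operands are available, with no program counter deciding the order. The following is a minimal sketch; the graph encoding, the intermediate name `t`, and the function `run_dataflow` are assumptions made for illustration.

```python
import operator

# Each node maps to (operation, names of its inputs). Hypothetical encoding.
GRAPH = {
    "t": (operator.add, ("a", "b")),  # t = a + b
    "z": (operator.mul, ("t", "c")),  # z = t * c
}


def run_dataflow(graph, tokens):
    """Fire every node whose operands are all present until nothing more can fire."""
    values = dict(tokens)
    fired = []
    progress = True
    while progress:
        progress = False
        for name, (op, inputs) in graph.items():
            if name not in values and all(i in values for i in inputs):
                values[name] = op(*(values[i] for i in inputs))
                fired.append(name)
                progress = True
    return values, fired


if __name__ == "__main__":
    values, fired = run_dataflow(GRAPH, {"a": 2, "b": 3, "c": 4})
    print(values["z"], fired)
```

If a token is missing (for example, only `c` is supplied), no node fires, which is the sense in which execution is driven by data availability.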


Execution of instructions is driven by data availability

– What is the difference between this and normal (control flow) computers?

– Time lost waiting for unneeded arguments

– High control overhead

– Difficulty in manipulating data structures


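[Figure: dataflow graph of the example program. In each iteration i, a / node consumes d_i and e_i, followed by a * node (with input f_i) and a + node]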


Execution on a Control Flow Machine

Sequential execution on a uniprocessor in 24 cycles

Assume all the external inputs are available before entering the do loop

+ : 1 cycle, * : 2 cycles, / : 3 cycles

How long will it take to execute this program on a dataflow computer with 4 processors?


Execution on a Dataflow Machine

[Figure: timing chart showing operations a1 to a4, b1 to b4, and c1 to c4 scheduled on the processors]

Data-driven execution on a 4-processor dataflow computer in 9 cycles

Can we further reduce the execution time of this program?
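The 24-cycle and 9-cycle figures can be checked with a small simulation. The program text itself did not survive in this copy, so the sketch below assumes a 4-iteration loop of the form a_i = d_i / e_i, b_i = a_i * f_i, c_i = b_i + c_(i-1); this assumption matches the operand names in the figures and reproduces both cycle counts. The greedy scheduler and all function names are illustrative, not the authors' method.

```python
LATENCY = {"/": 3, "*": 2, "+": 1}  # cycle counts given in the slides


def build_tasks(iterations=4):
    """Assumed loop body: a_i = d_i / e_i; b_i = a_i * f_i; c_i = b_i + c_(i-1)."""
    tasks = {}
    for i in range(1, iterations + 1):
        tasks[f"a{i}"] = ("/", [])  # external inputs are already available
        tasks[f"b{i}"] = ("*", [f"a{i}"])
        deps = [f"b{i}"] + ([f"c{i - 1}"] if i > 1 else [])
        tasks[f"c{i}"] = ("+", deps)
    return tasks


def sequential_cycles(tasks):
    """Control-flow uniprocessor: operations run one after another."""
    return sum(LATENCY[op] for op, _ in tasks.values())


def dataflow_cycles(tasks, processors=4):
    """Greedy data-driven schedule: run each ready operation on the earliest free processor."""
    finish = {}
    free_at = [0] * processors

    def ready_time(name):
        return max((finish[d] for d in tasks[name][1]), default=0)

    pending = set(tasks)
    while pending:
        ready = [t for t in pending if all(d in finish for d in tasks[t][1])]
        name = min(ready, key=lambda t: (ready_time(t), t))
        proc = min(range(processors), key=free_at.__getitem__)
        start = max(ready_time(name), free_at[proc])
        finish[name] = start + LATENCY[tasks[name][0]]
        free_at[proc] = finish[name]
        pending.remove(name)
    return max(finish.values())


if __name__ == "__main__":
    tasks = build_tasks()
    print(sequential_cycles(tasks), dataflow_cycles(tasks, processors=4))
```

In this model the four divisions run in parallel (cycles 1 to 3), then the four multiplications (cycles 4 to 5), and the four additions form a dependency chain of one cycle each (cycles 6 to 9), which is where the remaining serial time comes from.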


Programming model

– Operations performed in parallel on each element of a data structure

– Logically a single thread of control, performing sequential or parallel steps

– Conceptually, a processor associated with each data element
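The data-parallel model above can be illustrated with a single operation applied to every element of a collection. This is only a sketch of the programming model: the helper `data_parallel_map` and the sample operation are hypothetical, and a thread pool stands in for the conceptual one-processor-per-element arrangement (Python threads will not give a real speedup for arithmetic like this).

```python
from concurrent.futures import ThreadPoolExecutor


def data_parallel_map(op, data, workers=4):
    """Apply the same operation to every element; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(op, data))


def scale_and_shift(x):
    """The single operation that each conceptual processor performs on its element."""
    return 2 * x + 1


if __name__ == "__main__":
    print(data_parallel_map(scale_and_shift, [0, 1, 2, 3]))
```

The calling program remains a single logical thread of control: it issues one step (`data_parallel_map`), and the parallelism lives entirely inside that step.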


SIMD Architectural model

– Array of many simple, cheap processors with little


Instruction set includes operations on vectors


A sequential computer connected to a set of identical processing elements that simultaneously perform the same operation on different data

[Figure: front-end computer with program and data memory, CPU, I/O processor, and I/O]


Streams vectors from memory to the CPU


Consists of many fully programmable processors, each capable of executing its own program

Shared address space architecture

Classified into 2 types

– Uniform Memory Access (UMA) Multiprocessors

– Non-Uniform Memory Access (NUMA) Multiprocessors


Uses a central switching mechanism to reach a centralized shared memory

All processors have equal access time to global memory

Tightly coupled system

Problem: cache consistency


Crossbar switching mechanism

[Figure: processors (P) with caches and I/O connected to memory modules (Mem) through a crossbar switch]


Shared-bus switching mechanism

[Figure: processors (P) with caches and I/O connected through a shared bus]


Packet-switched network

[Figure: processors (P) with caches and memory modules (Mem) connected through a network]


[Figure: distributed memory, with processors (P), caches, and memory modules (Mem)]


Current Types of Multiprocessors

PVP (Parallel Vector Processor)

– A small number of proprietary vector processors connected by a high-bandwidth crossbar switch


[Figure: vector processors (VP) and shared memory (SM) connected by a crossbar switch]

VP: Vector Processor

SM: Shared Memory


DSM (Distributed Shared Memory)

[Figure: nodes, each with P/C, LM, NIC, and DIR, connected by a custom-designed network]

MB: Memory Bus

P/C: Microprocessor & Cache

LM: Local Memory

DIR: Cache Directory

NIC: Network Interface Circuitry


Consists of many processors with their own memory

[Figure: nodes, each with a microprocessor and cache (P/C) and its own memory (M)]

P/C: Microprocessor & Cache

M: Memory


Current Types of Multicomputers

MPP (Massively Parallel Processing)

– Total number of processors > 1000


MPP (Massively Parallel Processing)

[Figure: nodes, each with P/C, LM, and NIC on a memory bus (MB), connected by a custom-designed network]

P/C: Microprocessor & Cache

MB: Memory Bus

NIC: Network Interface Circuitry

LM: Local Memory


[Figure: nodes, each with P/C, M, and NIC, connected by a commodity network (Ethernet, ATM, Myrinet, VIA)]

IOB: I/O Bus

NIC: Network Interface Circuitry


[Figure: nodes with P/C units (>= 16), a hub, and an IOC]

P/C: Microprocessor & Cache

MB: Memory Bus

NIC: Network Interface Circuitry

SM: Shared Memory

IOC: I/O Controller

LD: Local Disk


Trend in Parallel Computer
