PART 1: PARALLEL COMPUTING
Chapter 1 ARCHITECTURES AND TYPES OF PARALLEL COMPUTERS
Chapter 2 COMPONENTS OF PARALLEL COMPUTERS
Chapter 3 INTRODUCTION TO PARALLEL PROGRAMMING
Chapter 4 PARALLEL PROGRAMMING MODELS
Chapter 5 PARALLEL ALGORITHMS
PART 2: PARALLEL PROCESSING OF DATABASES (Further reading)
Chapter 6 OVERVIEW OF PARALLEL DATABASES
Chapter 7 PARALLEL QUERY OPTIMIZATION
Chapter 8 OPTIMAL SCHEDULING FOR PARALLEL QUERIES
Thoai Nam
Flynn's Taxonomy
Classification of Parallel Computers Based on Architectures
Based on notions of instruction and data streams
– SISD (Single Instruction stream, Single Data stream)
– SIMD (Single Instruction stream, Multiple Data streams)
– MISD (Multiple Instruction streams, Single Data stream)
– MIMD (Multiple Instruction streams, Multiple Data streams)
Popularity
– MIMD > SIMD > MISD
SISD
– Conventional sequential machines
IS : Instruction Stream DS : Data Stream
CU : Control Unit PU : Processing Unit
SIMD
– Vector computers, processor arrays
– Special purpose computations
PE : Processing Element LM : Local Memory
SIMD architecture with distributed memory
MISD
– Systolic arrays
– Special purpose computations
Figure: MISD architecture (the systolic array)
MIMD
– General purpose parallel computers
Figure: MIMD architecture with shared memory (labels: CU 1 to CU n, PU 1 to PU n, IS, DS, I/O, Shared Memory)
Classification based on Architecture
Instructions are divided into a number of steps (segments, stages)
Several instructions can be in the machine at the same time, each executing in a different step
– IF – instruction fetch
– ID – instruction decode and register fetch
– EX – execution and effective address calculation
– MEM – memory access
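As a rough illustration of why overlapping the stages pays off, the sketch below compares cycle counts with and without pipelining. It assumes an ideal pipeline (one cycle per stage, no stalls or hazards) and uses only the stages listed above; the function names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM"]  # the stages listed on the slide

def unpipelined_cycles(n_instructions, n_stages=len(STAGES)):
    # Each instruction runs through all stages before the next one starts.
    return n_instructions * n_stages

def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    # Assumption: ideal pipeline, one cycle per stage, no stalls.
    # The first instruction needs n_stages cycles; each later one
    # completes one cycle after its predecessor.
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1

def occupancy(n_instructions):
    """Return, for each cycle, which instruction (0-based) is in each stage."""
    table = []
    for cycle in range(pipelined_cycles(n_instructions)):
        row = {}
        for s, stage in enumerate(STAGES):
            instr = cycle - s  # instruction that entered the pipeline s cycles ago
            if 0 <= instr < n_instructions:
                row[stage] = instr
        table.append(row)
    return table
```

For example, `occupancy(3)[1]` shows instruction 1 in IF while instruction 0 is already in ID, which is the overlap the slide describes.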
– Id, SISAL, Silage, LISP
– Single assignment, applicative (functional) language
– Explicit parallelism
z = (a + b) * c
Figure: The dataflow representation of an arithmetic expression (a and b feed the + node; its result and c feed the * node, which produces z)
Execution of instructions is driven by data availability
– What is the difference between this and normal (control flow) computers?
– Time lost waiting for unneeded arguments
– High control overhead
– Difficulty in manipulating data structures
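To make "driven by data availability" concrete, here is a minimal sketch of a data-driven evaluator for the expression z = (a + b) * c. There is no program counter: a node fires as soon as all of its operands hold a value. The graph encoding, the intermediate name `t`, and `run_dataflow` are assumptions made for this illustration, not part of any real dataflow machine.

```python
import operator

# Graph for z = (a + b) * c: output name -> (operation, input names).
GRAPH = {
    "t": (operator.add, ("a", "b")),  # "t" is a hypothetical name for the + node's result
    "z": (operator.mul, ("t", "c")),
}

def run_dataflow(graph, tokens):
    """Fire every node whose operands are available until nothing more can fire."""
    values = dict(tokens)   # tokens that have arrived so far
    fired = []              # order in which nodes actually fired
    progress = True
    while progress:
        progress = False
        for node, (fn, inputs) in graph.items():
            if node not in values and all(i in values for i in inputs):
                values[node] = fn(*(values[i] for i in inputs))
                fired.append(node)
                progress = True
    return values, fired
```

Calling `run_dataflow(GRAPH, {"a": 2, "b": 3, "c": 4})` fires the + node and then the * node. If the token for `b` never arrives, nothing fires at all, whatever order the nodes are listed in.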
Figure: dataflow graph of the example program, built from /, * and + nodes with inputs d2, e2, f2, d3, e3, f3, d4, e4
Execution on a Control Flow Machine
Sequential execution on a uniprocessor in 24 cycles
Assume all the external inputs are available before entering the do loop
+ : 1 cycle, * : 2 cycles, / : 3 cycles
How long will it take to execute this program on a dataflow computer with 4 processors?
Execution on a Dataflow Machine
Figure: schedule of operations a1 to a4, b1 to b4 and c1 to c4 on the processors
Data-driven execution on a 4-processor dataflow computer in 9 cycles
Can we further reduce the execution time of this program?
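The 24-cycle and 9-cycle figures can be checked with a small scheduling sketch. The program text itself is not reproduced on these slides, so the loop body used here is an assumption that matches the graph labels and the cycle counts: for i = 1..4, a_i = d_i / e_i, b_i = a_i * f_i, c_i = b_i + c_(i-1), with c_0 given as an input. The function names are hypothetical, and the scheduler is a simple greedy one, not a model of any particular machine.

```python
LATENCY = {"/": 3, "*": 2, "+": 1}  # cycles per operation, as given on the slide

def build_tasks(n):
    """Return {name: (op, dependencies)} for the assumed loop body."""
    tasks = {}
    for i in range(1, n + 1):
        tasks[f"a{i}"] = ("/", [])                # a_i = d_i / e_i (inputs already available)
        tasks[f"b{i}"] = ("*", [f"a{i}"])         # b_i = a_i * f_i
        deps = [f"b{i}"] + ([f"c{i - 1}"] if i > 1 else [])
        tasks[f"c{i}"] = ("+", deps)              # c_i = b_i + c_(i-1), c_0 is an input
    return tasks

def sequential_cycles(tasks):
    # Control flow on a uniprocessor: one operation after another.
    return sum(LATENCY[op] for op, _ in tasks.values())

def dataflow_cycles(tasks, processors):
    # Greedy data-driven schedule: each cycle, start any operation whose
    # operands are available, as long as a processor is free.
    finish = {}            # task -> cycle at which its result is available
    running = []           # finish times of operations occupying a processor
    pending = dict(tasks)
    t = 0
    while pending or running:
        running = [f for f in running if f > t]   # release processors that are done
        ready = [name for name, (_, deps) in pending.items()
                 if all(d in finish and finish[d] <= t for d in deps)]
        for name in ready[: processors - len(running)]:
            op, _ = pending.pop(name)
            finish[name] = t + LATENCY[op]
            running.append(finish[name])
        t += 1
    return max(finish.values())
```

Under these assumptions the four divisions run in cycles 1 to 3, the four multiplications in cycles 4 to 5, and the chained additions take one cycle each after that. The additions are the serial part, so they are the natural place to look when asking whether the time can be reduced further.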
Programming model
– Operations performed in parallel on each element of a data structure
– Logically a single thread of control, which performs sequential or parallel steps
– Conceptually, a processor associated with each data element
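A minimal sketch of this programming model: a single logical thread of control issues steps, and a parallel step applies one operation to every element. The helper names `parallel_step` and `scale_and_add` are hypothetical, and a thread pool stands in for "one processor per data element"; it illustrates the model and is not expected to give real speedup in Python.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_step(op, *arrays):
    """Apply the same operation at every element position."""
    # One worker per element, mirroring "a processor associated with each data element".
    with ThreadPoolExecutor(max_workers=max(1, len(arrays[0]))) as pool:
        return list(pool.map(op, *arrays))

def scale_and_add(alpha, x, y):
    # Single thread of control: one parallel step followed by one sequential step.
    scaled = parallel_step(lambda xi, yi: alpha * xi + yi, x, y)  # parallel step
    total = sum(scaled)                                           # sequential step
    return scaled, total
```

For instance, `scale_and_add(2, [1, 2, 3], [10, 20, 30])` computes `2 * x[i] + y[i]` for every i in one parallel step and then sums the results sequentially.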
SIMD Architectural model
– Array of many simple, cheap processors with little
Instruction set includes operations on vectors
A sequential computer connected with a set of identical processing elements that simultaneously perform the same operation on different data
Figure: front-end computer (CPU, program and data memory, I/O processor, I/O)
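The processor-array idea can be sketched as a toy model in which a front end broadcasts one instruction at a time and every processing element applies it to its own local memory. The class `ProcessorArray` and its methods are hypothetical names for this illustration; a real processor array executes the step in hardware lockstep, whereas this sketch simply loops over the elements.

```python
class ProcessorArray:
    """Toy model: a front end broadcasting one instruction to identical PEs."""

    def __init__(self, local_data):
        # One processing element per entry; each PE sees only its own local memory.
        self.local_memory = [{"x": value} for value in local_data]

    def broadcast(self, dest, op, *srcs):
        # Every PE executes the same instruction on its own data.
        for mem in self.local_memory:
            mem[dest] = op(*(mem[s] for s in srcs))

    def gather(self, name):
        # The front end reads one variable back from every PE.
        return [mem[name] for mem in self.local_memory]
```

A short usage example: `arr = ProcessorArray([1, 2, 3, 4])`, then `arr.broadcast("y", lambda x: x * x, "x")` squares each element in place on its own PE, and `arr.gather("y")` collects the results at the front end.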
Stream vector from memory to the CPU
Consists of many fully programmable processors, each capable of executing its own program
Shared address space architecture
Classified into 2 types
– Uniform Memory Access (UMA) Multiprocessors
– Non-Uniform Memory Access (NUMA) Multiprocessors
Uses a central switching mechanism to reach a centralized shared memory
All processors have equal access time to global memory
Tightly coupled system
Problem: cache consistency
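The cache consistency problem can be seen in a deliberately naive sketch: two processors with private caches share one memory, and nothing invalidates or updates a cached copy when another processor writes. The `Cache` class, the write-through policy, and `demo` are assumptions made for this illustration only.

```python
class Cache:
    """Private cache with no coherence protocol (deliberately naive)."""

    def __init__(self, memory):
        self.memory = memory   # shared main memory, modelled as a dict
        self.lines = {}        # locally cached copies

    def read(self, addr):
        if addr not in self.lines:             # miss: fetch from shared memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]                # hit: the copy may be stale

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value              # assumption: write-through to memory

def demo():
    memory = {"x": 0}
    p1, p2 = Cache(memory), Cache(memory)
    first = p1.read("x")     # P1 caches x = 0
    p2.write("x", 1)         # P2 updates x; P1's copy is not invalidated
    second = p1.read("x")    # P1 still reads its old copy
    return first, second, memory["x"]
```

Here P1's second read returns the old value even though shared memory already holds the new one, which is exactly the inconsistency a coherence mechanism has to prevent.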
Crossbar switching mechanism
Figure: processors (P) with caches and I/O, and memory modules (Mem), connected by a crossbar switch
Shared-bus switching mechanism
Figure: processors (P) with caches and I/O, connected by a shared bus
Packet-switched network
Figure: processors (P) with caches, and memory (Mem), connected by a network
Figure: processors (P) with caches and memories (Mem); label: Distributed Memory
Current Types of Multiprocessors
PVP (Parallel Vector Processor)
– A small number of proprietary vector processors connected by a high-bandwidth crossbar switch
Figure: vector processors (VP) connected by a crossbar switch
VP: Vector Processor
SM: Shared Memory
DSM (Distributed Shared Memory)
Figure: nodes, each with P/C, LM, NIC and DIR, connected by a custom-designed network
MB: Memory Bus
P/C: Microprocessor & Cache
LM: Local Memory
DIR: Cache Directory
NIC: Network Interface Circuitry
Consists of many processors with their own memory
P/C: Microprocessor & Cache
M: Memory
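Because each processor owns its memory, cooperation has to happen by exchanging data explicitly. The sketch below assumes message passing between nodes, which the slide does not spell out, and models each node as a thread with a private list and an inbox queue. The names `node` and `distributed_sum` are hypothetical, and the `results` dict exists only to hand the answer back to the caller.

```python
import queue
import threading

def node(rank, inboxes, local_data, results):
    # local_data is private to this node; data moves only through messages.
    partial = sum(local_data)
    if rank == 0:
        total = partial
        for _ in range(len(inboxes) - 1):
            total += inboxes[0].get()   # receive one partial sum per other node
        results["total"] = total        # hand the answer back to the caller
    else:
        inboxes[0].put(partial)         # send this node's partial sum to node 0

def distributed_sum(chunks):
    inboxes = [queue.Queue() for _ in chunks]   # one inbox per node
    results = {}
    threads = [threading.Thread(target=node, args=(rank, inboxes, chunk, results))
               for rank, chunk in enumerate(chunks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results["total"]
```

With `distributed_sum([[1, 2], [3, 4], [5]])`, each node sums only its own chunk, and node 0 combines the partial sums it receives.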
Current Types of Multicomputers
MPP (Massively Parallel Processing)
– Total number of processors > 1000
MPP (Massively Parallel Processing)
Figure: nodes, each with P/C, LM and NIC on a memory bus (MB), connected by a custom-designed network
P/C: Microprocessor & Cache
MB: Memory Bus
NIC: Network Interface Circuitry
LM: Local Memory
Figure: nodes, each with P/C, M and NIC, connected by a commodity network (Ethernet, ATM, Myrinet, VIA)
IOB: I/O Bus
NIC: Network Interface Circuitry
Figure: node with >= 16 P/C, a hub and an IOC
P/C: Microprocessor & Cache
MB: Memory Bus
NIC: Network Interface Circuitry
SM: Shared Memory
IOC: I/O Controller
LD: Local Disk
Trend in Parallel Computer