VIETNAM NATIONAL UNIVERSITY, HANOI
Hanoi - 2006
Contents
List of figures
List of tables
1 List of staff participating in the project
2 Summary of the main results of the scientific research project
2.1 Project title
2.2 Principal investigator
2.3 Main results
2.3.1 Scientific results
2.3.2 Practical results
2.3.3 Training results
2.3.4 Results in strengthening scientific capacity
2.3.5 Use of funding
3 Project content
3.1 Problem statement
3.2 Overview of fast force calculation in molecular dynamics simulation
3.2.1 Fast algorithms in molecular dynamics simulation
3.2.2 The FMM algorithm and its variants
3.2.2.1 The FMM algorithm
3.2.2.2 Anderson's algorithm
3.2.2.3 Makino's algorithm
3.2.3 The GRAPE special-purpose parallel computer and its applications
3.2.4 Implementing fast algorithms on special-purpose hardware
3.2.4.1 Implementing the tree algorithm on GRAPE hardware
3.2.4.2 Implementing the FMM algorithm on MD-ENGINE hardware
3.3 Research content and results
3.3.1 Difficulties to be resolved
3.3.2 Our solution and results
3.4 Discussion
3.5 Conclusions and recommendations
References
Appendix
List of figures
3.1 The main idea of the FMM force calculation algorithm
3.2 Anderson's method
3.3 Makino's P²M² method
3.4 Basic architecture of a GRAPE computer system
3.5 The MDGRAPE-2 computer (PCI), with a peak speed equivalent to 48 GFlops and 4 MDGRAPE-2 chips (each chip has a peak speed equivalent to 16 GFlops)
3.6 The MDGRAPE-2 computer (Compact PCI), with a peak speed equivalent to 192 GFlops and 16 MDGRAPE-2 chips (each chip has a peak speed equivalent to 16 GFlops)
3.7 The GRAPE cluster seen from the front [54]
3.8 The GRAPE cluster seen from the back [54]
3.9 The GRAPE cluster seen from the left [54]
3.10 Comparison of the force calculation time of the FMM and of the direct force calculation algorithm on system I. Curves with circles show the performance of the FMM on MDGRAPE-2. Curves with pentagons show the performance of the FMM on the host. Filled circles and pentagons correspond to high accuracy (p = 5), open ones to low accuracy (p = 1). The curves without symbols, solid and dashed, show the performance of the direct force calculation algorithm on MDGRAPE-2 and on the host, respectively.
3.11 Same as Figure 3.10 but for system II
3.12 Comparison of the force calculation time of the FMM and of the tree algorithm on system I. Curves with circles show the performance of the FMM on MDGRAPE-2. Curves with triangles show the performance of the tree algorithm on MDGRAPE-2. Filled circles and triangles correspond to high accuracy (p = 5), open ones to low accuracy (p = 1).
3.13 Same as Figure 3.12 but for system II
3.14 Comparison of the performance of the two FMM implementations using formula (3.11) (version 1.0) and formula (3.12) (version 2.0). The curves with squares show the performance of version 1.0. The curves with circles show the performance of version 2.0. Filled symbols correspond to high accuracy (p = 5), open symbols to low accuracy (p = 1).
List of tables
1.1 List of staff, collaborators, graduate students and undergraduate students participating in the project
3.1 Computation phases and the corresponding formulas used in the two FMM implementations. The parts in bold are executed on the GRAPE computer.
3.2 Breakdown of the execution time of the computation phases of version 2.0 on system II with N = 1024 x 1024 = 1048576 particles. Note that the phases "Tree construction" and "Construction of neighbor list and interaction list" of the FMM are not described in this report, because they only prepare for the computation and take very little computation time, as the table shows.
3.3 Comparison with the DPMTA 3.1.3 implementation of W. T. Rankin [49]
1 List of staff participating in the project
Table 1.1: List of staff, collaborators, graduate students and undergraduate students participating in the project
Name:
Nguyen Hai Chau (principal investigator)
T. Ebisuzaki
A. Kawai
Vu Boi Hang
Tran Manh Tuong
Do Thi Minh Viet
Nguyen Thi Thuy Linh
Pham Quang Nhat Minh
Le Thi Lan Phuong
Chu Quang Thuy
Academic title and degree:
Dr.
Dr.
Dr.
MSc
Affiliation:
Saitama Institute of Technology, Japan
Faculty of Information Technology, College of Technology, VNU Hanoi
Faculty of Mathematics, Mechanics and Informatics, College of Science, VNU Hanoi
Faculty of Information Technology, College of Technology, VNU Hanoi
Faculty of Information Technology, College of Technology, VNU Hanoi
Faculty of Information Technology, College of Technology, VNU Hanoi
Faculty of Information Technology, College of Technology, VNU Hanoi
Faculty of Information Technology, College of Technology, VNU Hanoi
2 Summary of the main results of the scientific research project
2.1 Project title
High performance computing and its application to molecular dynamics simulation
Project code: QC.05.01
2.2 Principal investigator
Principal investigator: Dr. Nguyen Hai Chau
Institution: College of Technology, Vietnam National University, Hanoi
Address: 144 Xuan Thuy, Cau Giay, Hanoi
2.3.2 Practical results
We have completed a set of programs that experimentally implement the FMM algorithm on the MDGRAPE-2 computer. The results of the project show that the algorithm we designed and implemented achieves high performance, with an accuracy that satisfies the requirements of many applications in molecular dynamics (MD) simulation. The implementation can be integrated with existing MD applications such as NAMD. An MDGRAPE-2 computer is available in Hanoi, so using it for simulation is feasible.
2.3.5 Use of funding
All of the funding allocated to the project has been used.
PRINCIPAL INVESTIGATOR        CONFIRMATION OF THE UNIT
CONFIRMATION OF THE MANAGING AGENCY
Abstract
We have implemented the fast multipole method (FMM) on the special-purpose computer GRAPE (GRAvity piPE). The FMM is one of the fastest approximate algorithms for calculating forces among particles. Its calculation cost scales as O(N), while that of the naive algorithm scales as O(N^2). Here, N is the number of particles in the system. GRAPE is hardware dedicated to the calculation of Coulombic or gravitational forces among particles. Its calculation speed is 100-1000 times faster than that of conventional computers of the same price, though it cannot handle anything but force calculation. We can expect a significant speedup from the combination of the fast algorithm and the fast hardware. However, a straightforward implementation of the algorithm runs on GRAPE at a rather modest speed. This is because of the limited function of the hardware. Since GRAPE can handle particle forces only, just a small fraction of the calculation procedure can be put on it. The rest must be performed on a conventional computer connected to GRAPE. In order to take full advantage of the dedicated hardware, we modified the FMM using the Pseudoparticle Multipole Method and Anderson's method. In the modified algorithm, multipole and local expansions are expressed by distributions of a small number of imaginary particles (pseudoparticles), and thus they can be evaluated by GRAPE. Results of numerical experiments show that GRAPE accelerates the FMM by a factor of 3-60, depending on the accuracy. Its performance exceeds that of the Barnes-Hut treecode on GRAPE at high accuracy (root-mean-square relative force error ~ 10~^), in the case of a close-to-uniform distribution of particles.
Keywords: Molecular dynamics, numerical simulation, fast multipole method, tree algorithm, Anderson's method, pseudoparticle multipole method, special-purpose computer
3 Project content
3.1 Problem statement
Molecular dynamics simulation is one of the most widely used methods in physics and chemistry for studying many-particle systems. It is based on Newton's second law of motion, F = ma, where F is the force acting on a particle and m and a are the mass and acceleration of the particle. From the force acting on each particle, the acceleration of each particle in the system is determined. Solving the equations of motion yields a trajectory that describes the positions, velocities and accelerations of the particles at successive points in time. From this trajectory, the next state or a previous state of the system can be predicted.
From a computational point of view, a molecular dynamics simulation can be described by the following steps:
1. Choose the initial positions of the particles in the system, whose charges are given.
2. Choose a set of initial velocities for the particles. These velocities are usually drawn from the Boltzmann distribution at a given temperature and then normalized so that the total momentum of the whole system is zero.
3. Compute the momentum of each particle from its velocity and mass.
4. Compute the electrostatic (Coulombic) force on each particle.
5. Compute the new positions of the particles after a short time interval, called the time step. This is done by solving the equations of motion based on Newton's second law.
6. Compute the new velocities and accelerations of the particles in the system.
7. Repeat steps 3 to 6.
8. Repeat this process long enough for the system to reach equilibrium.
9. Once the system has reached equilibrium, record the positions of the particles after a fixed number of iterations, usually every 5 to 25 iterations. The list of these coordinates forms the trajectory of the particle system.
10. Continue iterating and recording data until enough data have been collected to produce results with the desired accuracy.
11. Analyze the trajectories to obtain information about the system.
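The steps above can be sketched as a short program. The following is a minimal illustration, not the project's code: it assumes unit masses, a Coulomb constant of 1, a small cubic lattice of alternating charges, and the velocity Verlet integrator (the report does not name an integrator). All function names (`make_system`, `run_md`, and so on) are hypothetical.

```python
import math
import random


def coulomb_forces(pos, charge):
    """Pairwise Coulomb forces by direct summation (Coulomb constant set to 1)."""
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[i][k] - pos[j][k] for k in range(3)]
            r = math.sqrt(d[0] ** 2 + d[1] ** 2 + d[2] ** 2)
            s = charge[i] * charge[j] / r ** 3
            for k in range(3):
                forces[i][k] += s * d[k]  # force on i due to j
                forces[j][k] -= s * d[k]  # equal and opposite force on j
    return forces


def total_momentum(vel, mass):
    return [sum(m * v[k] for m, v in zip(mass, vel)) for k in range(3)]


def remove_net_momentum(vel, mass):
    """Shift all velocities so that the total momentum is zero (step 2)."""
    p = total_momentum(vel, mass)
    total_mass = sum(mass)
    for v in vel:
        for k in range(3):
            v[k] -= p[k] / total_mass


def make_system(n_side, temperature, seed):
    """Steps 1-2: lattice positions, alternating charges, thermal velocities."""
    rng = random.Random(seed)
    pos, vel, mass, charge = [], [], [], []
    for ix in range(n_side):
        for iy in range(n_side):
            for iz in range(n_side):
                pos.append([float(ix), float(iy), float(iz)])
                charge.append(1.0 if (ix + iy + iz) % 2 == 0 else -1.0)
                mass.append(1.0)
                sigma = math.sqrt(temperature)  # unit mass, k_B = 1 (assumed)
                vel.append([rng.gauss(0.0, sigma) for _ in range(3)])
    remove_net_momentum(vel, mass)
    return pos, vel, mass, charge


def run_md(pos, vel, mass, charge, dt, n_steps, record_every):
    """Steps 4-10: velocity Verlet loop that records positions periodically."""
    trajectory = []
    forces = coulomb_forces(pos, charge)
    for step in range(1, n_steps + 1):
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k] / mass[i]  # half kick
                pos[i][k] += dt * vel[i][k]                     # drift
        forces = coulomb_forces(pos, charge)                    # step 4
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k] / mass[i]  # half kick
        if step % record_every == 0:
            trajectory.append([p[:] for p in pos])              # step 9
    return trajectory
```

Because the pair forces are applied with equal magnitude and opposite sign, the total momentum set to zero in step 2 stays zero, up to rounding, throughout the run.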
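The electrostatic force f(r_i) and potential φ(r_i) are given by the following formulas:
\[ \vec{f}(\vec{r}_i) = \sum_{j=1,\, j \neq i}^{N} \frac{q_j (\vec{r}_i - \vec{r}_j)}{r^3} \tag{3.1} \]
and
\[ \phi(\vec{r}_i) = \sum_{j=1,\, j \neq i}^{N} \frac{q_j}{r} \tag{3.2} \]
where N is the number of particles, \vec{r}_j and q_j are the position and charge of particle j, and r is the distance between particles i and j, defined by r = \sqrt{|\vec{r}_i - \vec{r}_j|^2}.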
It is easy to see that, in step 4, the simplest algorithm for computing the electrostatic force evaluates the force between every pair of particles. We therefore have to perform N(N - 1)/2 force evaluations based on equation (3.1). In other words, this simple algorithm (hereafter the direct force calculation algorithm) has computational complexity O(N^2).
Among the computation steps listed above, computing the force on each particle (step 4) is the most expensive. It is therefore necessary to apply and implement algorithms with computational complexity O(N) or O(N log N), and/or to use supercomputers, parallel computers or high-speed special-purpose computers for this task.
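To make the N(N - 1)/2 count concrete, here is a minimal sketch of the direct algorithm for formulas (3.1) and (3.2). The function name `direct_sum` and the returned pair counter are illustrative assumptions, not part of the report.

```python
import math


def direct_sum(pos, charge):
    """Evaluate (3.1) and (3.2) by visiting every pair of particles once."""
    n = len(pos)
    field = [[0.0, 0.0, 0.0] for _ in range(n)]
    potential = [0.0] * n
    pair_count = 0
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[i][k] - pos[j][k] for k in range(3)]
            r = math.sqrt(d[0] ** 2 + d[1] ** 2 + d[2] ** 2)
            inv_r3 = 1.0 / r ** 3
            for k in range(3):
                field[i][k] += charge[j] * d[k] * inv_r3  # term j of f(r_i)
                field[j][k] -= charge[i] * d[k] * inv_r3  # term i of f(r_j)
            potential[i] += charge[j] / r
            potential[j] += charge[i] / r
            pair_count += 1
    return field, potential, pair_count
```

For N particles the inner body runs N(N - 1)/2 times, so doubling N roughly quadruples the work; this is the cost that the fast algorithms and the special-purpose hardware discussed below aim to reduce.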
Our project has two main tasks.
The first task is to study the fundamentals of high performance computing and its application to molecular dynamics simulation. In the part on fundamentals, we study the architectures, environments and tools of high performance computing: Linux PC clusters and several kinds of high-speed special-purpose hardware dedicated to N-body or molecular dynamics simulation.
The second task is to study a specific application of high performance computing. We study and propose a new method for speeding up approximate electrostatic force calculation on the special-purpose parallel computer MDGRAPE-2, in order to accelerate molecular dynamics simulation.
3.2 Overview of fast force calculation in molecular dynamics simulation
The first research task of the project is to study the fundamentals of high performance computing: its theoretical basis, environments, tools and scope of application. We studied these topics and wrote topical reports on an overview of high performance computing, the parallel file system PVFS and the OpenMP programming interface (see the topical reports in the appendix).
The second task is to study a specific application of high performance computing to molecular dynamics simulation: speeding up electrostatic force calculation. Electrostatic force calculation in molecular dynamics simulation is computationally heavy and requires very long computation times. Speeding up force calculation is therefore a topical problem. This research problem combines computer science with physics and chemistry. The main current research directions for speeding up force calculation are:
1. Using "fast" algorithms with complexity O(N) or O(N log N), where N is the number of particles in the molecular dynamics simulation.
2. Using special-purpose computers with very high force calculation speed, for example MDGRAPE (100-1000 times faster at force calculation than a general-purpose computer of the same price) or MD-Engine.
3. Combining the two directions above.
The application research task of the project falls into the third direction. We study how to implement the fast multipole algorithm on the special-purpose computer MDGRAPE-2 and propose a new method for speeding up approximate force calculation.
Below we present, in turn, the methods for speeding up force calculation in molecular dynamics simulation along the three directions listed above.
3.2.1 Fast algorithms in molecular dynamics simulation
In classical molecular dynamics simulations (that is, simulations without quantum calculations), computing the forces between particles takes the most time, about 95% of the total run time of the program. The simplest force calculation algorithm, called the direct algorithm, has complexity O(N^2). When N is large, the force calculation time is therefore very long, and molecular dynamics simulations with millions or tens of millions of particles take a great deal of time, even when GRAPE is used. For this reason, many studies have proposed algorithms with complexity O(N) or O(N log N) that compute forces approximately with controllable accuracy.
In 1985, A. Appel first proposed a hierarchical force calculation algorithm with complexity O(N log N) [3]. Building on Appel's result, in 1986 P. Hut and J. Barnes developed the tree algorithm, with complexity O(N log N) [5]. This algorithm quickly became widely used in astrophysical simulation because of its simplicity and efficiency. In 1987, L. Greengard and V. Rokhlin developed the fast multipole algorithm (FMM) for approximate force calculation in two dimensions [19]. This is a very complicated algorithm, especially the formulas that translate multipole expansions and Taylor expansions, and so it took ten more years, until 1997, for Greengard and Rokhlin to publish the first three-dimensional version of the algorithm [20]. The FMM has computational complexity O(N). It is often used in molecular dynamics simulations of systems with a very large number of particles. Below we describe the FMM and its variants in more detail.
3.2.2 The FMM algorithm and its variants
3.2.2.1 The FMM algorithm
The FMM is the first algorithm that allows approximate electrostatic force calculation for molecular dynamics simulation with computational complexity O(N). The two-dimensional FMM was developed by L. Greengard and V. Rokhlin in 1987. The three-dimensional algorithm was published in 1997. Because the FMM is very complicated, in this report we only recall its main idea. A detailed description of the algorithm can be found in the papers of Greengard and Rokhlin [19, 20].
Figure 3.1: The main idea of the FMM force calculation algorithm (diagram labels: multipole expansion, M2L, local expansion)
Recall that the direct force calculation algorithm computes the force between every pair of particles; in other words, it has computational complexity O(N^2). The main idea of the FMM, in contrast, is to compute the interactions between groups of particles and then approximate the force and potential on each particle using multipole expansions and Taylor expansions. The FMM has five main phases. Figure 3.1 illustrates the idea of the FMM. M2M, M2L and L2L are the first three phases of the algorithm, with the following meaning: M2M is the multipole-to-multipole translation, M2L is the multipole-to-Taylor translation, and L2L is the Taylor-to-Taylor translation. These phases are followed by the "near" and "far" force calculation phases. The near force is computed as in the direct force calculation algorithm. The far force is obtained by taking partial derivatives of the potential produced by the L2L phase. We denote the near and far force calculation phases by F_near and F_far, respectively. Of the five phases, M2L, F_near and F_far are the most time-consuming.
Although the FMM achieves computational complexity O(N), implementations of the FMM do not reach high performance because of the complexity of the algorithm, in particular of the translation formulas in the M2M, M2L and L2L phases. The execution time of the FMM becomes lower than that of the direct force calculation algorithm only when the number of particles N is fairly large, about 65535 or more. For this reason, many variants of the FMM aim to reduce the complexity of implementing the FMM in order to achieve higher performance. Some typical variants are described below.
3.2.2.2 Anderson's algorithm
Anderson [1] proposed a variant of the FMM. The main advantages of this method are its simplicity and efficiency. Anderson does not use the translation formulas for multipole expansions and Taylor expansions. Instead, he proposed new, simpler formulas.
Anderson's method is based on Poisson's formula, which solves the boundary value problem for the Laplace equation. We summarize Poisson's method below and, on that basis, present Anderson's method.
Given the potential on a sphere of radius a, the value of the potential Φ at a point r with spherical coordinates r = (r, φ, θ) is given by formulas (3.3) and (3.4).
Note that spherical coordinates are used in formulas (3.3) and (3.4). In these formulas, Φ(as) is the given value of the potential on the sphere; S is the domain of integration, here the sphere of radius 1 centered at the origin (the unit sphere); and P_n is the Legendre polynomial.
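For reference, Poisson's formula as it is usually stated in Anderson's method is sketched below. The notation is an assumption here and may differ in detail from the report's formulas (3.3) and (3.4): \vec{s} is a point on the unit sphere S, r = |\vec{r}|, and the first form is taken for points outside the sphere, the second for points inside.

```latex
% Outer form (r > a): potential outside the sphere of radius a
\Phi(\vec{r}) = \frac{1}{4\pi} \int_{S} \sum_{n=0}^{\infty} (2n+1)
  \left(\frac{a}{r}\right)^{n+1}
  P_n\!\left(\frac{\vec{s}\cdot\vec{r}}{r}\right) \Phi(a\vec{s})\, ds

% Inner form (r < a): potential inside the sphere of radius a
\Phi(\vec{r}) = \frac{1}{4\pi} \int_{S} \sum_{n=0}^{\infty} (2n+1)
  \left(\frac{r}{a}\right)^{n}
  P_n\!\left(\frac{\vec{s}\cdot\vec{r}}{r}\right) \Phi(a\vec{s})\, ds
```

The two forms differ only in the radial factor, (a/r)^{n+1} versus (r/a)^n, which is what lets the same set of surface values serve for both an outer and an inner expansion.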
To use formulas (3.3) and (3.4) in place of the multipole and Taylor expansion formulas, Anderson proposed a "discrete" version of (3.3) and (3.4). He discretized the right-hand sides of these formulas by replacing the integral with a finite sum and the domain S with a finite set of points on the sphere. This point set is chosen using the concept of a spherical t-design proposed by Hardin and Sloane [24]. The definition of a spherical t-design is as follows.
A set P = {P_1, ..., P_K} of K points on the unit sphere Ω_d = S^{d-1} = {x = (x_1, ..., x_d) ∈ R^d : x · x = 1} is called a spherical t-design if the identity
\[ \int_{\Omega_d} f(x)\, d\mu(x) = \frac{1}{K} \sum_{i=1}^{K} f(P_i) \]
holds for all polynomials f of degree at most t, where μ is the uniform measure on Ω_d normalized to total measure 1.
Note that, for general t, the optimal point set for a spherical t-design (that is, the set with the smallest number of points K) is still unknown. Nevertheless, Hardin and Sloane were able to determine spherical t-designs empirically. The coordinates of the spherical t-designs found by Hardin and Sloane can be downloaded from
http://www.research.att.com/~njas/sphdesigns/
Figure 3.2: Anderson's method (diagram labels: potential values, outer expansion, inner expansion)
Using spherical t-designs, Anderson derived two discretized versions of formulas (3.3) and (3.4): an outer expansion, valid for r > a, and an inner expansion, valid for r < a. In these formulas, w_i are the weights and p is the number of terms retained when the series is truncated in the discretization. Hereafter we call p the expansion order. Figure 3.2 illustrates the idea of Anderson's method.
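A minimal sketch of the discretized outer expansion may help. It assumes the commonly quoted form Φ(r) ≈ Σ_i w_i Σ_{n=0..p} (2n+1) (a/r)^(n+1) P_n(s_i · r/r) Φ(a s_i), with equal weights w_i = 1/K, and it uses the six vertices of the regular octahedron, which form a spherical 3-design, instead of the Hardin-Sloane tables. The function names are hypothetical.

```python
import math

# Vertices of the regular octahedron: a spherical 3-design with equal weights.
OCTAHEDRON = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def legendre(n, x):
    """Legendre polynomial P_n(x) by the three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def sample_potential(particles, a):
    """Potential of the point charges at the K sample points a * s_i."""
    values = []
    for s in OCTAHEDRON:
        point = [a * c for c in s]
        values.append(sum(q / math.dist(point, x) for q, x in particles))
    return values


def outer_expansion(values, a, p, target):
    """Discretized outer expansion of order p, evaluated at a point with r > a."""
    r = math.sqrt(sum(c * c for c in target))
    w = 1.0 / len(OCTAHEDRON)  # equal weights, assumed to sum to 1
    total = 0.0
    for s, phi in zip(OCTAHEDRON, values):
        cos_gamma = sum(sc * tc for sc, tc in zip(s, target)) / r
        for n in range(p + 1):
            total += w * (2 * n + 1) * (a / r) ** (n + 1) * legendre(n, cos_gamma) * phi
    return total
```

With only six sample points the expansion order is limited to p = 1; higher orders need t-designs with more points, such as those tabulated by Hardin and Sloane. The only data carried from the particles to the far field are the K potential values on the sphere.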
3.2.2.3 Makino's algorithm
Makino [39] proposed the pseudoparticle multipole method (Pseudo-Particle Multipole Method, abbreviated P²M², or the pseudoparticle method). P²M² uses formulas that are variants of the multipole expansion formulas. The benefits of this method are its simplicity and, moreover, the fact that the P²M² formulas can be evaluated on the special-purpose computer GRAPE.
The main idea of P²M² is to use a small number of pseudoparticles to represent multipole expansions. In other words, the method approximates the potential due to the particles in the system by the potential due to pseudoparticles, where the number of pseudoparticles is much smaller than the number of real particles. As a simple example, 100 weights of 1 kg can be replaced by two weights of 50 kg to obtain the same total weight of 100 kg. Figure 3.3 illustrates the idea of Makino's P²M² method.
Figure 3.3: Makino's P²M² method (diagram label: pseudoparticles)
Makino's idea is quite similar to Anderson's. Both methods use a finite number of discrete quantities to approximate the potential due to the particles. The main difference is that Anderson uses a finite number of potential values, whereas Makino uses a finite number of pseudoparticles.
The P²M² method is described as follows. The distribution of the pseudoparticles (that is, their positions and charges) must be determined so that it matches the coefficients of the multipole expansion. In the simplest approach, we have to solve a system of equations by inverting the multipole expansion formula. This approach is suitable only for multipole expansions of order p ≤ 2 [30].
If the expansion order is p > 2, however, inverting the multipole expansion formula is very difficult. For this case Makino gave a simple solution. He fixed the positions of the pseudoparticles according to a spherical t-design and solved only for the charges of the pseudoparticles. This approach leads to a system of linear equations, although the number of pseudoparticles increases. Overall, Makino's approach simplifies the problem considerably.
With this approach, Makino derived the outer expansion formula of the pseudoparticle method, formula (3.8), which involves the angle between r_i and the position vector R_j of pseudoparticle j. The derivation of formula (3.8) is given in [39].
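The following sketch illustrates the pseudoparticle idea numerically. It assumes the form usually quoted for Makino's outer expansion, Q_j = Σ_i q_i Σ_{l=0..p} ((2l+1)/K) (r_i/a)^l P_l(cos γ_ij), where γ_ij is the angle between r_i and R_j; whether this matches the report's formula (3.8) term by term is an assumption. The pseudoparticles are placed on the octahedron vertices (a spherical 3-design, enough for p = 1), and all names are hypothetical.

```python
import math

# Octahedron vertices scaled by a give the pseudoparticle positions R_j.
OCTAHEDRON = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def legendre(n, x):
    """Legendre polynomial P_n(x) by the three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def pseudoparticle_charges(particles, a, p):
    """Charges Q_j of K pseudoparticles on a sphere of radius a (order p)."""
    k_points = len(OCTAHEDRON)
    charges = [0.0] * k_points
    for q, x in particles:
        r_i = math.sqrt(sum(c * c for c in x))
        for j, s in enumerate(OCTAHEDRON):
            # cos of the angle between r_i and R_j; irrelevant when r_i = 0
            cos_gamma = sum(sc * xc for sc, xc in zip(s, x)) / r_i if r_i > 0.0 else 0.0
            for l in range(p + 1):
                charges[j] += q * (2 * l + 1) / k_points * (r_i / a) ** l * legendre(l, cos_gamma)
    return charges


def point_charge_potential(charges, points, target):
    """Plain Coulomb potential of point charges: the operation GRAPE provides."""
    return sum(q / math.dist(pt, target) for q, pt in zip(charges, points))
```

The far potential of the real particles is then approximated by an ordinary point-charge sum over the six pseudoparticles, which is why this representation suits hardware that can only evaluate particle-particle interactions.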
3.2.3 The GRAPE special-purpose parallel computer and its applications
Figure 3.4: Basic architecture of a GRAPE computer system (diagram labels: host computer; positions, charges)
The special-purpose parallel computer GRAPE was built at the Institute of Physical and Chemical Research of Japan (RIKEN [59]) to compute the forces or potentials between particles. It has a multi-pipeline architecture. GRAPE is not a stand-alone computer. To use GRAPE we need a system consisting of a conventional computer, for example an IBM PC (hereafter the host), and a GRAPE computer. We call this system a GRAPE system.
The simplest GRAPE system consists of one host and one GRAPE connected through a link such as a PCI or Compact PCI bus. GRAPE performs all force and potential calculations, and the host performs all other calculations. The block diagram of a GRAPE system is shown in Figure 3.4. Figures 3.5 and 3.6 show the special-purpose computer MDGRAPE-2, a member of the GRAPE family, in its PCI and Compact PCI versions.
To compute forces or potentials, the host sends the positions and masses (for gravitational interaction) or charges (for electrostatic interaction) of the particles to GRAPE, and then receives the computed forces or potentials back from GRAPE. GRAPE computes forces and potentials about 100-1000 times faster than conventional computers of the same price. For example, the Compact PCI version of MDGRAPE-2 has a force calculation speed equivalent to 192 GFlops. The function of GRAPE is described in detail below.
The basic function of GRAPE is to compute the force f(r_i) acting on particle i at position r_i, and the potential φ(r_i) associated with f(r_i). Although there are many versions of GRAPE for different applications, such as molecular dynamics simulation or astrophysics, this basic function is the same in all of them.
The force f(r_i) and the potential φ(r_i) are given by the formulas:
\[ \vec{f}(\vec{r}_i) = \sum_{j=1}^{N} \frac{q_j (\vec{r}_i - \vec{r}_j)}{r_s^3} \tag{3.9} \]
\[ \phi(\vec{r}_i) = \sum_{j=1}^{N} \frac{q_j}{r_s} \tag{3.10} \]
where N is the number of particles, \vec{r}_j and q_j are the position and charge of particle j, and r_s is the softened distance (used in astrophysical applications) between particles i and j, defined by r_s = \sqrt{|\vec{r}_i - \vec{r}_j|^2 + \varepsilon^2}, where ε is the softening parameter.
To compute the force f(r_i), the host must send GRAPE the data r_i, r_j, q_j, ε and N. GRAPE computes the force f(r_i) for all i and then sends the results back to the host. The potential φ(r_i) is computed in the same way.
In molecular dynamics simulation there is no softening parameter, that is, ε = 0. Equations (3.9) and (3.10) then reduce to (3.1) and (3.2).
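The function described above can be mimicked in software. The sketch below is a stand-in for the hardware, not a GRAPE API: the name `grape_like_kernel` and its argument layout are assumptions. It evaluates the softened force and potential on a set of target positions r_i due to a separate set of source positions r_j with charges q_j.

```python
import math


def grape_like_kernel(targets, sources, source_charges, eps):
    """Softened force and potential at each target due to all sources."""
    forces, potentials = [], []
    for ri in targets:
        f = [0.0, 0.0, 0.0]
        phi = 0.0
        for rj, qj in zip(sources, source_charges):
            d = [ri[k] - rj[k] for k in range(3)]
            rs2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2] + eps * eps
            if rs2 == 0.0:
                continue  # eps = 0 and coincident points: skip the self term
            rs = math.sqrt(rs2)
            phi += qj / rs
            for k in range(3):
                f[k] += qj * d[k] / (rs2 * rs)
        forces.append(f)
        potentials.append(phi)
    return forces, potentials
```

Keeping targets and sources as separate lists reflects the data the host sends (r_i, r_j, q_j, ε, N), and it is what allows the same operation to be reused when the sources are pseudoparticles rather than real particles.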
Each host can be connected to several GRAPE computers to increase the force calculation speed, and GRAPE systems can be connected to one another to form a GRAPE cluster with very high computing speed. Figures 3.7, 3.8 and 3.9 show a GRAPE cluster named MDM [54], seen from the front, from the back and from the left. This cluster has a speed equivalent to 78 TFlops.
Figure 3.7: The GRAPE cluster seen from the front [54]
In 2005, the successor of MDGRAPE-2, MDGRAPE-3 [61], was successfully prototyped. Each MDGRAPE-3 chip has a speed equivalent to 165 GFlops at a clock frequency of 250 MHz and 200 GFlops at 300 MHz (10 to 12 times faster than MDGRAPE-2). In June 2006, an MDGRAPE-3 cluster for simulations in protein structure prediction was completed and named Protein Explorer. Protein Explorer is the first computer in the world to exceed a computing speed of 1 Petaflops; at 1.4 PFlops it is about 3 times faster than the IBM Blue Gene, the top-ranked computer in the 2006 Top500 list [56]. Because Protein Explorer is a special-purpose computer, however, it is not ranked in the Top500.
In the next section we present existing studies and results on implementing fast algorithms on special-purpose hardware, which form the basis for the research task of the project.
Figure 3.8: The GRAPE cluster seen from the back [54]
3.2.4 Implementing fast algorithms on special-purpose hardware
3.2.4.1 Implementing the tree algorithm on GRAPE hardware
In 1999, J. Makino and A. Kawai developed the P²M² method and, for the first time, implemented the tree algorithm on GRAPE, with very good results. From 2000 to 2004, J. Makino and A. Kawai continued to make new developments in accelerating the tree algorithm. Their software implementation accelerates the tree algorithm considerably. For simulations that require low accuracy, GRAPE speeds up the tree algorithm by a factor of about 10, and for simulations that require high accuracy, by a factor of up to about 60.
3.2.4.2 Implementing the FMM algorithm on MD-ENGINE hardware
In 2003, T. Amisaki, S. Toyoda, H. Miyagawa and K. Kitamura implemented the FMM on MD-ENGINE, hardware similar to MDGRAPE-2 [2]. The results, however, were not entirely satisfactory. The speedup of the FMM on MD-ENGINE is fairly limited. The main reason is that the authors accelerated only the direct force calculation part F_near of the FMM and could not accelerate the M2L phase and the far force phase F_far, two of the three most time-consuming phases of the FMM (see Section 3.2.2.1).
Figure 3.9: The GRAPE cluster seen from the left [54]
3.3 Research content and results
3.3.1 Difficulties to be resolved
Our research task is to find a way to speed up electrostatic force calculation for molecular dynamics simulation. This task belongs to the third research direction described in Section 3.2.
In the FMM, the phases M2L, F_near and F_far are the most time-consuming. GRAPE must therefore be used to accelerate these phases. Computing the F_near phase on GRAPE is straightforward, because this phase uses formula (3.1). To accelerate the FMM, we therefore have to solve the following tasks:
1. Accelerate the M2L phase. This is not as easy as accelerating the F_near phase, because M2L uses formulas that translate multipole expansions into Taylor expansions. None of these expansion formulas can be evaluated on GRAPE.
2. Accelerate the F_far phase. In the original FMM of Greengard and Rokhlin, F_far is computed by taking partial derivatives of the potential obtained in the L2L phase. The force formula for the F_far phase is f = -∇Φ, where f is the force and Φ is the potential. This formula cannot be evaluated on GRAPE either. Similarly, the formula for F_far in Anderson's variant cannot be evaluated directly on GRAPE. Makino has formulas that allow multipole expansions to be evaluated on GRAPE, but they apply only to the M2M and M2L phases of the FMM (when used in combination with Anderson's method).
3.3.2 Our solution and results
To solve the first task, we apply Makino's P²M² method to the M2M phase and then apply Anderson's method to the M2L phase. The M2M phase is then computed on the host and the M2L phase on GRAPE. The next phase, L2L, is carried out using Anderson's inner expansion formula (3.4).
The F_far phase: in our first implementation (version 1.0), F_far is computed from the derivative of the potential, formula (3.11) [9].
The computation of the F_far phase on GRAPE is realized in our second implementation (version 2.0) [10, 11]. We first found a new formula analogous to Anderson's inner expansion formula. We call it the inner P²M² expansion formula, formula (3.12).
Table 3.1: Computation phases and the corresponding formulas used in the two FMM implementations. The parts in bold are executed on the GRAPE computer.
We set up two GRAPE systems. The first system (system I) has one MDGRAPE-2 card in the Compact PCI version (64 pipelines, peak speed equivalent to 192 GFlops) and a Compaq DS20E host (Alpha 21264, 667 MHz). The second system (system II) has one MDGRAPE-2 card in the PCI version (16 pipelines, peak speed equivalent to 48 GFlops) and a host with a 2.2 GHz Intel Pentium 4 on an Intel D850 motherboard. We tested the FMM implemented on GRAPE at low force accuracy (expansion order p = 1) and high force accuracy (p = 5), with a nearly uniform distribution of particles in a cube. The experimental results are shown in Figures 3.10, 3.11, 3.12 and 3.13 and in Table 3.2. The results for system I are given in Figures 3.10 and 3.12. The results for system II are given in Figures 3.11 and 3.13 and in Table 3.2.
Figure 3.10: Comparison of the force calculation time of the FMM and of the direct force calculation algorithm on system I. Curves with circles show the performance of the FMM on MDGRAPE-2. Curves with pentagons show the performance of the FMM on the host. Filled circles and pentagons correspond to high accuracy (p = 5), open ones to low accuracy (p = 1). The curves without symbols, solid and dashed, show the performance of the direct force calculation algorithm on MDGRAPE-2 and on the host, respectively.
Figure 3.11: Same as Figure 3.10 but for system II
Using formula (3.12), we were able to compute the F_far stage on GRAPE. With this formula, our version 2.0 outperforms version 1.0 by a factor of 2 (at low force accuracy) to 5 (at high force accuracy). The experimental results are shown in Figure 3.14.
We compared the performance of our FMM implementation with another implementation by T. Wrankin [49] (Distributed Parallel Multipole Tree Algorithm, DPMTA, version 3.1.3, downloadable from http://www.ee.duke.edu/~wrankin/Dpmta/) on system II. The comparison is given in Table 3.3. With GRAPE, our FMM is about 10 times faster than DPMTA. Without GRAPE, it is about 1.1 to 1.4 times slower than DPMTA.
Figure 3.12: Force calculation time of the FMM compared with the tree algorithm on system I. Curves with circles show the performance of the FMM on MDGRAPE-2. Curves with triangles show the performance of the tree algorithm on MDGRAPE-2. Filled circles and triangles correspond to high accuracy (p = 5), open ones to low accuracy (p = 1).
Table 3.2: Breakdown of the execution time of the calculation stages of version 2.0 on system II with N = 1024 x 1024 = 1048576 particles. Note that the stages "Tree construction" and "Building neighbor lists and interaction lists" of the FMM are not described in this report, because they only prepare for the calculation and take very little time, as the table shows.
Accuracy
Tree construction
Building neighbor lists
and interaction lists
0.06 0.22
0.01 0.16 0.0004
0.17 0.01
0.78 8.57 3.92
13.27
14.78
High 1.03
0.08 5.92
0.21 4.78 0.18
5.17 0.34
0.97 17.37 9.48
27.82
40.36
On the host
Low 1.02
1.89 0.26
0.36
0
0
0.36 0.05
2.31 5.97
133.88
0
0
133.88 4.11
Table 3.3: Comparison with Wrankin's DPMTA 3.1.3 implementation [49]
Our version 2.0
with GRAPE without GRAPE
2.9 16.4 64.0
34.1 196.5 878.8
Figure 3.13: Same as Figure 3.12, but for system II. The horizontal axis is the number of particles N.
3.4 Discussion
We obtained the following main results in research project QC.05.01:
• We surveyed high-performance computing and parallel computing; studied the OpenMP programming environment and the parallel virtual file system PVFS; and studied methods and tools for fast force calculation in molecular dynamics simulation, in particular the fast multipole method (FMM) and the special-purpose parallel computer MDGRAPE-2.
• We successfully implemented the FMM on the special-purpose computer GRAPE. With MDGRAPE-2 and our implementation method, the performance of the FMM increases by a factor of 3 (at low accuracy) to 60 (at high accuracy).
• We found a new formula (3.12) that moves the entire calculation of the near-field force F_near and the far-field force F_far onto GRAPE, overcoming the performance limitation of some implementations on similar hardware [2].
• The FMM on GRAPE outperforms the tree algorithm on GRAPE when the interaction force must be computed with high accuracy.
filled symbols correspond to high accuracy (p = 5), open symbols to low accuracy (p = 1)
• Compared with other implementations of the FMM, our algorithm running on GRAPE is about 10 times faster.
The data in Table 3.2 show that the data transfer rate over the bus between GRAPE and the host is the main factor limiting the speed of the FMM on GRAPE, while the computing speed of GRAPE itself is exploited efficiently. Future speedups therefore depend not only on faster GRAPE and host machines but also on faster data transfer between the host and GRAPE, for example by using newer buses such as PCI-X or PCI Express.
Thanks to the new formula, our implementation method can be carried over to a cluster of GRAPE machines to obtain a parallel implementation with much higher performance.
We will continue to study the integration of the FMM with a molecular dynamics simulation package such as NAMD [33, 36], so that GRAPE can be used for problems in physics.
3.5 Conclusions and recommendations
Research on computational methods for molecular dynamics simulation is an interdisciplinary field. Developing it requires close cooperation among researchers from different disciplines.
Developing and using special-purpose computers [45, 38] in fields such as molecular dynamics simulation and N-body simulation in astrophysics has been a research trend in high-performance computing for about 15 years. This trend is growing strongly and has the following main advantages and disadvantages. The main advantage is low cost: the price paid per MFlops is lower than with general-purpose supercomputers [29], while the performance achieved by special-purpose computers is often many times higher than that of general-purpose computers. The main disadvantage is that the range of problems that can be solved on special-purpose machines is limited. However, the problems that can be solved on GRAPE are not restricted to molecular dynamics or N-body simulation. Other problems in plasma physics [50], boundary integral equations [48] and the Poisson-Boltzmann equation [25] have also been solved on GRAPE.
GRAPE computers are currently available at the Institute for Nuclear Science and Technology (Hoang Quoc Viet road, Nghia Do, Cau Giay district, Hanoi), which belongs to the Vietnam Atomic Energy Institute, and can be used free of charge. Several well-known molecular dynamics packages such as AMBER [53] and DL-POLY [55] already support GRAPE hardware. Under current conditions in Vietnam, using high-speed supercomputers is usually not feasible because the initial investment and the operating costs are very high. Special-purpose computers for the problems listed above are a useful way to obtain high computing speed at low cost.
References
[1] C. R. Anderson, An implementation of the fast multipole method without multipoles, SIAM J. Sci. Stat. Comput. 13 (4), 1992, 923-947.
[2] T. Amisaki, S. Toyoda, H. Miyagawa, K. Kitamura, Development of hardware accelerator for molecular dynamics simulations: a computation board that calculates nonbonded interactions in cooperation with fast multipole method, Journal of Computational Chemistry 24, 2003, 582-592.
[3] A. Appel, An efficient program for many-body simulation, SIAM J. Sci. Stat. Comput. 6 (1), 1985, 85-103.
[4] J. E. Barnes, A modified tree code: Don't laugh; It runs, Journal of Computational Physics 87, 1990, 161-170.
[5] J. E. Barnes, P. Hut, A hierarchical O(N log N) force-calculation algorithm, Nature 324, 1986, 446-449.
[6] R. Buyya, High performance cluster computing: Architectures and systems, volume 1, Prentice-Hall PTR, 1999.
[7] R. Buyya, High performance cluster computing: Programming and applications, volume 2, Prentice-Hall PTR, 1999.
[8] R. Chandra, L. Dagum, D. Kohr, D. Maydan, J. McDonald, R. Menon, Parallel programming in OpenMP, Morgan Kaufmann Publishers, 2001.
[9] N. H. Chau, A. Kawai, T. Ebisuzaki, Implementation of fast multipole algorithm on special-purpose computer MDGRAPE-2, Proceedings of the 6th World Multiconference on Systemics, Cybernetics and Informatics SCI2002, Orlando, Florida, USA, July 14-18, 2002, 477-481.
[10] N. H. Chau, A. Kawai, T. Ebisuzaki, Special-purpose computer accelerated fast multipole method, submitted to Journal of Computer Science and Cybernetics,
[12] N. H. Chau, A. Kawai, T. Ebisuzaki, A new implementation of fast multipole algorithm on special-purpose computer MDGRAPE-2, Proceedings of the Annual Meeting of the Molecular Simulation Society of Japan, Niigata, Dec. 16-18, 2002.
[13] J. Dongarra, I. Foster, G. Fox, W. Gropp, K. Kennedy, L. Torczon, A. White, Sourcebook of parallel computing, Morgan Kaufmann Publishers, 2003.
[14] L. Fereira, G. Kettmann, A. Thomasch, E. Silcocks, J. Chen, J.-C. Daunois, J. Ihamo, M. Harada, S. Hill, W. Bernocchi, Linux HPC cluster installation, IBM Redbooks, 2001.
[15] T. Fukushige, A. Kawai, J. Makino, Structure of dark matter halos from hierarchical clustering. III. Shallowing of the inner cusp, Astrophysical Journal 606, 2004, 625-634.
[16] T. Fukushige, J. Makino, A. Kawai, Constructing PC-GRAPE cluster using GRAPE-6A, submitted to Publications of the Astronomical Society of Japan.
[17] S. L. Graham, M. Snir, C. A. Patterson, Getting up to speed: The future of supercomputing, The National Academies Press, 2005.
[18] A. Grama, A. Gupta, G. Karypis, V. Kumar, Introduction to parallel computing, 2nd edition, Addison Wesley, 2003.
[19] L. Greengard, V. Rokhlin, A fast algorithm for particle simulations, Journal of Computational Physics 73, 1987, 325-348.
[20] L. Greengard, V. Rokhlin, A new version of the fast multipole method for the Laplace equation in three dimensions, Acta Numerica 6, 1997, 229-269.
[21] W. Gropp, E. Lusk, T. Sterling, Beowulf cluster computing with Linux, 2nd edition, The MIT Press, 2003.
[22] W. Gropp, E. Lusk, A. Skjellum, Using MPI, The MIT Press, 1999.
[23] J. M. Haile, Molecular dynamics simulation: Elementary methods, Wiley & Sons Inc., 1997.
[24] R. H. Hardin, N. J. A. Sloane, McLaren's improved snub cube and other new spherical designs in three dimensions, Discrete and Computational Geometry 15, 1996, 429-441.
[25] S. Hofinger, Solving the Poisson-Boltzmann equation with the specialized computer chip MD-GRAPE-2, Journal of Computational Chemistry 26 (11), 2005, 1148-1154.
[26] Y. Hu, S. L. Johnsson, A data-parallel implementation of O(N) hierarchical N-body methods, Intl. J. of Supercomput. Appl. and High Perf. Comput. 10 (1), 1996, 3-40.
[27] C. Hughes, T. Hughes, Parallel and distributed programming using C++, Addison Wesley, 2003.
[28] Z. Juhasz, P. Kacsuk, D. Kranzlmüller, Distributed and parallel systems: Cluster and grid computing, Springer Science & Business Media Inc., 2005.
[29] A. Kawai, T. Fukushige, J. Makino, $7.0/Mflops astrophysical N-body simulation with treecode on GRAPE-5, Supercomputing 99, June 20-25, Grecotel Imperial Hotel Rhodes, Greece.
[30] A. Kawai, J. Makino, Pseudoparticle multipole method: A simple method to implement a high-accuracy treecode, The Astrophysical Journal 550, 2001, L143-L146.
[31] A. Kawai, J. Makino, High-accuracy treecode based on pseudoparticle multipole method, Proceedings of the 208th Symposium of the International Astronomical Union, Tokyo, Japan, July 10-13, 2001, 305-314.
[32] A. Kawai, J. Makino, T. Ebisuzaki, Performance analysis of high-accuracy tree code based on the pseudoparticle multipole method, The Astrophysical Journal Supplement 151, 2004, 13-33.
[33] P. Lakshminarasimhulu, J. D. Madura, A cell multipole based domain decomposition algorithm for molecular dynamics simulation of systems of arbitrary shape, Computer Physics Communications 144, 2002, 141-153.
[34] A. L. Lastovetsky, Parallel Computing on Heterogeneous Networks, John Wiley & Sons, 2003.
[35] A. R. Leach, Molecular modelling: Principles and applications, 2nd edition, Prentice-Hall, 2001.
[36] J. A. Lupo, Z. Q. Wang, A. M. McKenney, R. Pachter, W. Mattson, A large scale molecular dynamics simulation code using the fast multipole algorithm (FMD): performance and application, Journal of Molecular Graphics and Modelling 21, 2002, 89-99.
[37] J. Makino, Treecode with a special-purpose processor, Publ. Astron. Soc. Japan 43, 1991, 621-638.
[38] J. Makino, M. Taiji, Scientific Simulations with Special-Purpose Computers - The GRAPE Systems, Chichester: John Wiley & Sons, 1998.
[39] J. Makino, Yet another fast multipole method without multipoles - pseudoparticle multipole method, Journal of Computational Physics 151, 1999, 910-920.
[40] R. S. Morrison, Cluster computing: Architectures, operating systems, parallel processing & programming languages, GNU General Public License, 2003.
[41] T. Narumi, A. Kawai, T. Koishi, An 8.61 Tflop/s molecular dynamics simulation for NaCl with a special-purpose computer: MDM, Proceedings of SC2001, Denver, Colorado, USA, November 10-16, 2001, in CD-ROM.
[42] W. H. Press, S. A. Teukolsky, W. T. Vetterling, B. P. Flannery, Numerical recipes in C - The art of scientific computing, 2nd edition, Cambridge University Press, 1992.
[43] T. Sterling, Beowulf cluster computing with Windows, The MIT Press, 2002.
[44] R. Susukita, T. Ebisuzaki, B. G. Elmegreen, H. Furusawa, K. Kato, A. Kawai, Y. Kobayashi, T. Koishi, G. D. McNiven, T. Narumi, K. Yasuoka, Hardware accelerator for molecular dynamics: MDGRAPE-2, Computer Physics Communications 155, 2003, 115-131.
[45] D. Sugimoto, Y. Chikada, J. Makino, T. Ito, T. Ebisuzaki, M. Umemura, A special-purpose computer for gravitational many-body problems, Nature 345, 1990, 33-35.
[46] M. Taiji, T. Narumi, Y. Ohno, Protein Explorer: A Petaflops Special-Purpose Computer for Molecular Dynamics Simulations, Genome Informatics 13, 2002, 461-462.
[47] The RedHat cluster manager installation and administration guide, RedHat Inc., 2002.
[48] T. Takahashi, A. Kawai, T. Ebisuzaki, Accelerating boundary integral equation method using a special-purpose computer, International Journal for Numerical Methods in Engineering 66 (3), 2005, 529-548.
[49] W. T. Wrankin, J. A. Board, A Portable Distributed Implementation of the Parallel Multipole Tree Algorithm, Proceedings of the Fourth IEEE International Symposium on High Performance Distributed Computing (HPDC 95), The Ritz Carlton, Pentagon City, Virginia, USA, August 1-4, 1995, 17-22.
[50] Y. Yatsuyanagi, T. Ebisuzaki, Y. Kiwamoto, T. Hatori, T. Kato, Simulations of diocotron instability using a special-purpose computer, MDGRAPE-2, Phys. Plasmas 10, 2003, 3188-3195.
[51] L. Ying, G. Biros, D. Zorin, A kernel-independent adaptive fast multipole algorithm in two and three dimensions, Journal of Computational Physics 196, 2004, 591-626.
[52] L. Ying, G. Biros, D. Zorin, H. Langston, A new parallel kernel-independent fast multipole method, Proceedings of the ACM/IEEE SC2003, Phoenix, Arizona, USA, November 15-21, 2003.
[53] http://amber.scripps.edu/
[54] http://atlas.riken.jp/radin/
Appendix
The appendix contains: 02 papers from the project submitted to the Journal of Computer Science and Cybernetics and to the VNU Journal of Science; 01 abstract of a talk given at the International Workshop on High Performance Computing, held at the University of Science, Vietnam National University, Hanoi, on November 23-24, 2005; 01 survey report and 02 topical reports on parallel computing written by the project members; and 03 cover pages of undergraduate theses carried out along the research direction of the project.
A new formulation for fast calculation of far field force in molecular dynamics simulations
Nguyen Hai Chau
College of Technology, Vietnam National University, Hanoi, Vietnam
Email: chaunh@vnu.edu.vn
Abstract
We have developed a new formulation for fast calculation of the far-field force of the fast multipole method (FMM) in molecular dynamics simulations. The FMM is a linear-time algorithm for force calculation in molecular dynamics simulations. GRAPE is a special-purpose computer dedicated to Coulombic force calculation. It runs 100-1000 times faster than a conventional computer of the same price. However, the FMM cannot be implemented directly on GRAPE. We have succeeded in implementing the FMM on GRAPE and developed a new formulation for far-field force calculation. Numerical tests show that the FMM using our new formulation on GRAPE is approximately 2-5 times faster than the FMM using the conventional far-field formulation.
Summary
This paper presents a new computational method for speeding up the far-field force calculation of the FMM in molecular dynamics simulations. We propose a new formula that accelerates the FMM on the special-purpose parallel computer GRAPE. The FMM computes interaction forces in a system of N particles with linear computational complexity. GRAPE is a family of special-purpose parallel computers dedicated to calculating Coulomb electrostatic or gravitational forces, 100 to 1000 times faster than ordinary PCs. However, GRAPE cannot directly evaluate the far-field force expression of the existing conventional methods, which take the partial derivatives of the potential function. We have found new formulas that make the far-field force calculation possible on GRAPE. Experiments show that the FMM on GRAPE with our new method is 2 times (at low accuracy) to 5 times (at high accuracy) faster than the FMM on GRAPE with the conventional far-field force calculation.
1 Introduction
Molecular dynamics (MD) simulations often require a high calculation cost. The most intensive part of MD is the calculation of the Coulombic force among particles (i.e. atoms and ions). In the naive direct-summation algorithm, the cost of the force calculation scales as O(N^2), where N is the number of particles. In order to reduce this cost, fast algorithms such as the Barnes-Hut treecode [4] and the fast multipole method [8] have been designed. The calculation costs of these algorithms are O(N log N) and O(N), respectively. These fast algorithms are widely used in the field of MD simulation [14] [15].
There exists another approach to accelerating the force calculation: using hardware dedicated to the calculation of inter-particle forces. GRAPE (GRAvity PipE) [20] [17] is one of the most widely used hardware systems of that kind. Figure 1 shows the basic structure of a GRAPE system. It consists of a GRAPE processor board and a general-purpose computer (hereafter the host computer).
The primary function of GRAPE is to calculate the force f(r_i) exerted on particle i at position r_i, and the potential φ(r_i) associated with f(r_i). Although there are several variants of GRAPE for different applications such as astrophysics and MD, the basic functions of these hardware devices are mostly the same.
The force f(r_i) and the potential φ(r_i) are expressed as
$$\vec f(\vec r_i) = \sum_{j=1}^{N} q_j \frac{\vec r_i - \vec r_j}{r_s^3}, \qquad (1)$$
$$\phi(\vec r_i) = \sum_{j=1}^{N} \frac{q_j}{r_s}, \qquad (2)$$
where N is the number of particles to handle, r_j and q_j are the position and the charge of particle j, and r_s is the softened distance between particles i and j, defined as r_s^2 = |r_i - r_j|^2 + ε^2, where ε is the softening parameter.
In order to calculate the force f(r_i), the relevant data r_i, r_j, q_j, ε and N are sent from the host computer to GRAPE. GRAPE then calculates f(r_i) for every i and sends it back to the host. The potential φ(r_i) is calculated in the same manner.
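The direct summation that GRAPE performs in hardware can be sketched on a general-purpose machine as follows. This is only an illustrative NumPy sketch under assumptions: the function name `direct_sum` is hypothetical, the force is taken as the sum of q_j (r_i - r_j) / r_s^3 with r_s^2 = |r_i - r_j|^2 + ε^2, and the self-interaction term (j = i) is skipped, which may differ from what the actual hardware does.

```python
import numpy as np

def direct_sum(pos, q, eps):
    """Softened Coulomb force and potential at every particle, O(N^2) direct summation.

    pos: (N, 3) positions, q: (N,) charges, eps: softening parameter.
    Assumed convention: force_i = sum_j q_j (r_i - r_j) / r_s^3, phi_i = sum_j q_j / r_s.
    """
    d = pos[:, None, :] - pos[None, :, :]        # d[i, j] = r_i - r_j
    rs2 = np.sum(d * d, axis=-1) + eps**2        # softened squared distance
    np.fill_diagonal(rs2, 1.0)                   # placeholder to avoid 0-division on the diagonal
    inv_rs = 1.0 / np.sqrt(rs2)
    np.fill_diagonal(inv_rs, 0.0)                # skip the self-interaction j = i
    phi = inv_rs @ q
    force = np.einsum("ij,ijk->ik", q[None, :] * inv_rs**3, d)
    return force, phi
```

For two particles one unit apart the result can be checked by hand, which is a quick way to confirm the sign convention before comparing against hardware output.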
Figure 1: Basic structure of a GRAPE system. The host computer sends positions and charges to GRAPE, and GRAPE returns forces.
A typical GRAPE system performs force calculation 100-1000 times faster than conventional computers of the same price do. For small-N systems, the combination of a simple direct-summation algorithm and GRAPE is the fastest and simplest calculation scheme. Fast algorithms are not very effective at such a small N. However, for large-N systems, O(N^2) direct summation becomes expensive, even with GRAPE hardware. The combination of a fast algorithm and fast hardware will deliver extremely high performance for large N. Makino et al. [16] have successfully implemented a modified treecode [3] on GRAPE, and achieved a speedup by a factor of 30-50.
Implementation of the FMM on dedicated hardware of a similar kind (MD-ENGINE) has been reported, but its performance is rather modest [1]. This is mainly because of a hardware limitation: since dedicated hardware can calculate only the particle force, it cannot handle multipole and local expansions. Therefore only a small fraction of the calculation procedure of the FMM can be performed on such hardware, and the speedup remains rather modest. An outstanding problem is how to perform a large fraction, or all, of the FMM's calculation procedure on GRAPE.
We have implemented the FMM on GRAPE and achieved a significant speedup [5]. However, we had not succeeded in moving the far-field calculation part of the FMM to GRAPE. This limits the performance of the FMM on GRAPE.
In this paper we describe our new formulation to speed up the far-field force calculation, a significant part of the FMM's calculation on GRAPE. The remaining parts of the paper are organized as follows. Section 2 gives a summary of the FMM and related algorithms. In section 3, we describe the implementation of our FMM code and its limitation. Section 4 presents our new formulation. Results of numerical tests are shown in section 5. Section 6 compares our code with other codes. Section 7 summarizes.
2 FMM and related algorithms
In this section we give a brief description of the FMM (section 2.1) and of two related algorithms, namely Anderson's method (section 2.2) and the pseudoparticle multipole method (section 2.3).
2.1 FMM
The FMM is an approximate algorithm to calculate forces among particles. In the case of a close-to-uniform distribution, its computational complexity is O(N). This scaling is achieved by approximating the force using the multipole and local expansion technique. The algorithm was initially presented for the two-dimensional case [8], and then extended to three dimensions [9].
Figure 2 shows the schematic idea of force approximation in the FMM. The force from a group of distant particles is approximated by a multipole expansion. At an observation point, the multipole expansion is converted to a local expansion. The local expansion is evaluated by each particle around the observation point. A hierarchical tree structure is used for grouping the particles [8, 9].
Figure 2: Schematic idea of force approximation in the FMM. A multipole expansion is converted to a local expansion (M2L).
2.2 Anderson's method
Anderson [2] proposed a variant of the FMM using a new formulation of the multipole and local expansions. His method is based on Poisson's formulae. In order to use these formulae as replacements for the multipole and local expansions, Anderson proposed discrete versions of them as follows. When the potential on the surface of a sphere of radius a is given, the potential Φ at position r = (r, φ, θ) is expressed as
$$\Phi(\vec r) \approx \sum_{i=1}^{K} \sum_{n=0}^{p} (2n+1) \left(\frac{a}{r}\right)^{n+1} P_n\!\left(\frac{\vec s_i \cdot \vec r}{r}\right) \Phi(a \vec s_i)\, w_i \qquad (3)$$
for r > a (outer expansion), and
$$\Phi(\vec r) \approx \sum_{i=1}^{K} \sum_{n=0}^{p} (2n+1) \left(\frac{r}{a}\right)^{n} P_n\!\left(\frac{\vec s_i \cdot \vec r}{r}\right) \Phi(a \vec s_i)\, w_i \qquad (4)$$
for r < a (inner expansion). The function P_n denotes the n-th Legendre polynomial, s_i are the K sampling points on the unit sphere, w_i are constant weight values and p is the number of untruncated terms. Hereafter we refer to p as the expansion order.
Anderson's method uses Eq. (3) and (4) for the M2M and L2L transitions, respectively. The procedures of the other stages are the same as those of the original FMM. Note that Anderson used spherical t-designs [10] to obtain Eq. (3) and (4). Examples of spherical t-designs are available at http://www.research.att.com/~njas/sphdesigns/
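To make the inner expansion concrete, the following sketch evaluates a discrete Poisson formula of the form Φ(r) ≈ Σ_i Σ_n (2n+1) (r/a)^n P_n(s_i · r / r) Φ(a s_i) w_i for r < a. It is a minimal illustration under stated assumptions, not the implementation used in this work: the function names are hypothetical, the weights are normalized to sum to 1, and a Gauss-Legendre product rule on the sphere is substituted for the spherical t-designs used by Anderson, simply because it can be generated in a few lines.

```python
import numpy as np
from numpy.polynomial import legendre

def sphere_quadrature(n_theta=8):
    """Product rule on the unit sphere: Gauss-Legendre in cos(theta), uniform in phi.

    Returns unit vectors s of shape (K, 3) and weights w of shape (K,) that sum to 1.
    (Assumption: this replaces a spherical t-design for illustration only.)
    """
    x, wx = legendre.leggauss(n_theta)            # nodes in cos(theta); wx sums to 2
    n_phi = 2 * n_theta
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    ct, ph = np.meshgrid(x, phi, indexing="ij")
    st = np.sqrt(1.0 - ct**2)
    s = np.stack([st * np.cos(ph), st * np.sin(ph), ct], axis=-1).reshape(-1, 3)
    w = np.repeat(wx / 2.0, n_phi) / n_phi
    return s, w

def inner_expansion(r_vec, a, s, w, phi_sphere, p):
    """Potential at r_vec (0 < |r_vec| < a) from samples phi_sphere taken at a * s."""
    r = np.linalg.norm(r_vec)
    u = s @ r_vec / r                              # cosine of the angle between s_i and r_vec
    kernel = np.zeros(len(w))
    for n in range(p + 1):
        c = np.zeros(n + 1)
        c[n] = 1.0                                 # coefficient vector that selects P_n
        kernel += (2 * n + 1) * (r / a) ** n * legendre.legval(u, c)
    return np.sum(kernel * phi_sphere * w)
```

A simple check is to sample the potential of a point charge located outside the sphere and compare the expansion at an interior point with the exact value 1/|r - r_source|.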
2.3 Pseudoparticle multipole method
Makino [18] proposed the pseudoparticle multipole method (P²M²), yet another formulation of the multipole expansion. The advantage of his method is that the expansion can be evaluated using GRAPE.
The basic idea of P²M² is to use a small number of pseudoparticles to express the multipole expansions. In other words, this method approximates the potential field of physical particles by the field generated by a small number of pseudoparticles. This idea is very similar to that of Anderson's method. Both methods use discrete quantities to approximate the potential field of the original distribution of the particles. The difference is that P²M² uses a distribution of point charges, while Anderson's method uses potential values. In the case of P²M², the potential is expressed by point charges as given below, and thus it can be evaluated using GRAPE:
$$Q_j = \sum_{i=1}^{N} q_i \sum_{l=0}^{p} \frac{2l+1}{K} \left(\frac{r_i}{a}\right)^{l} P_l(\cos\gamma_{ij}), \qquad (5)$$
where Q_j is the charge of the j-th of K pseudoparticles placed on a sphere of radius a, r_i = (r_i, φ, θ) is the position of physical particle i with charge q_i, and γ_ij is the angle between r_i and the position vector R_j of the j-th pseudoparticle [18].
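As an illustration of the pseudoparticle idea, the sketch below assigns charges Q_j = Σ_i q_i Σ_l (2l+1)/K (r_i/a)^l P_l(cos γ_ij) to pseudoparticles on a sphere and then evaluates their far-field potential as a plain sum of point-charge terms, which is the kind of sum GRAPE can evaluate. The assumptions are labeled: the function names are hypothetical, the formula is the one given in Makino's paper as we understand it, and the 12 vertices of an icosahedron (a spherical 5-design) are chosen as pseudoparticle positions with expansion order p = 2 purely to keep the example short.

```python
import numpy as np
from numpy.polynomial import legendre

def icosahedron_points():
    """12 icosahedron vertices on the unit sphere (a spherical 5-design)."""
    g = (1.0 + np.sqrt(5.0)) / 2.0
    pts = []
    for s1 in (-1.0, 1.0):
        for s2 in (-1.0, 1.0):
            pts += [(0.0, s1, s2 * g), (s1, s2 * g, 0.0), (s2 * g, 0.0, s1)]
    pts = np.array(pts)
    return pts / np.linalg.norm(pts, axis=1, keepdims=True)

def pseudoparticle_charges(pos, q, a, p, dirs):
    """Charges Q_j of pseudoparticles at a * dirs[j] that represent particles (pos, q).

    Assumes every particle lies inside the sphere of radius a and away from the origin.
    """
    K = len(dirs)
    r = np.linalg.norm(pos, axis=1)                  # |r_i|
    cos_gamma = (pos @ dirs.T) / r[:, None]          # shape (N, K)
    Q = np.zeros(K)
    for l in range(p + 1):
        c = np.zeros(l + 1)
        c[l] = 1.0                                   # coefficient vector that selects P_l
        term = (2 * l + 1) / K * (r[:, None] / a) ** l * legendre.legval(cos_gamma, c)
        Q += q @ term
    return Q

def potential(points, charges, target):
    """Point-charge potential at target: sum of charges / distance."""
    return np.sum(charges / np.linalg.norm(points - target, axis=1))
```

In this sketch the pseudoparticle charges sum to the total physical charge, and the potential of the 12 pseudoparticles at a distant point agrees closely with the potential of the original particles.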
3 Implementation of the FMM on GRAPE
In this section, we briefly describe our implementation on GRAPE [5]. The FMM consists of five stages, namely, tree construction, M2M transition, M2L conversion, L2L transition, and force evaluation. The force-evaluation stage consists of near-field and far-field evaluation parts.
In the case of the original FMM, only the near-field part of the force-evaluation stage can be performed on GRAPE. In our implementation (hereafter code A), we modified the original FMM so that GRAPE can handle the M2L conversion stage, which is the most time consuming. Table 1 summarizes the mathematical expressions and operations used at each calculation stage. In the following we describe the stages of code A.
Table 1: Mathematical expressions and operations used in our implementation of code A [5]. Bold parts run on GRAPE.
Stage               Original [9]               Code A (section 3)
M2M                 multipole expansion        P²M²
M2L                 M2L conversion formula
L2L                 local expansion
Near field force
Far field force                                Eq. (6)
The tree construction stage is unchanged. It is performed in the same way as in the original FMM.
At the M2M transition stage, we compute the positions and charges of pseudoparticles, instead of forming a multipole expansion as in the original FMM. This process is done entirely on the host computer.
The M2L conversion stage is done on GRAPE. The difference from the original FMM is that we do not use the formula that converts a multipole expansion to a local expansion. We directly calculate the potential values due to the pseudoparticles.
The L2L transition is done in the same way as in Anderson's method, using Eq. (4).
The near-field contribution is calculated directly by evaluating the particle-particle force. GRAPE handles this part.
Using Eq. (4), we obtain the far-field potential on a particle at position r. Consequently, the far-field force is calculated using the derivative of Eq. (4):
where u = s_i · r / r. All the calculation at this stage is done on the host computer.
With the modifications to the original FMM described above, we have succeeded in putting the bottleneck, namely the M2L conversion stage, on GRAPE. The overall calculation of the FMM is significantly accelerated. Now the most expensive part is the far-field force evaluation, and a new bottleneck appears: Eq. (6) is complicated, and evaluating it takes a rather large fraction of the overall calculation time [5].
If we can convert a set of potential values into a set of pseudoparticles at marginal calculation cost, the force from those pseudoparticles can be evaluated on GRAPE, and the new bottleneck will no longer exist. For this conversion, we have newly developed a conversion procedure (hereafter A2P conversion), presented in section 4.