
Fortran 2003/2008

Pekka Manninen Sami Saarinen David Henty

September 11-13, 2012

PRACE Advanced Training Centre

CSC – IT Center for Science Ltd, Finland


All material (C) 2012 by the authors.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License, http://creativecommons.org/licenses/by-nc-sa/3.0/

11.00-12.00 Exercises
13.00-13.45 Exercises
14.00-14.45 Other handy Fortran features
15.00-16.00 Exercises

"Fortran 95/2003" is the current de facto standard. The latest standard is Fortran 2008 (approved 2010).

INTRINSIC SQRT   ! Fortran standard provides many commonly used functions

! Command line interface. Ask a number and read it in
WRITE (*,*) 'Give a value (number) for x:'
READ (*,*) x
y=x**2+1   ! Power function and addition arithmetic
WRITE (*,*) 'given value for x:', x
WRITE (*,*) 'computed value of x**2 + 1:', y
! Print the square root of the argument y to screen
WRITE (*,*) 'computed value of SQRT(x**2 + 1):', SQRT(y)

Compiling and linking

Variables

IMPLICIT NONE
INTEGER :: n0
REAL :: a, b
REAL :: r1=0.0
COMPLEX :: c
COMPLEX :: imag_number=(0.1, 1.0)
CHARACTER(LEN=80) :: place
CHARACTER(LEN=80) :: name='James Bond'
LOGICAL :: test0 = .TRUE.
LOGICAL :: test1 = .FALSE.

Constants are defined with the PARAMETER clause; they cannot be altered.

DO WHILE (x > 0)
  totalsum = totalsum + x
  READ(*,*) x
END DO

WRITE(*,*) 'm:', m, ' n:', n
WRITE(*,*) 'Greatest common divisor: ', m
ELSE
  WRITE(*,*) 'Negative value entered'
END IF positive_check

Array syntax

Array syntax allows for less explicit DO loops

INTEGER, PARAMETER :: M = 4, N = 5
REAL (kind = 8) :: A(M,N), x(N), y(M)
INTEGER :: I, J
y(:) = 0
OUTER_LOOP : do J = 1, N
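The array-syntax slide is cut off at the start of its outer loop. As a minimal sketch of the idea, assuming the slide was building the matrix-vector product y = A*x (the fill values and the name `y_loop` are illustrative and not from the handout), the explicit double loop and the array-syntax form can be compared like this:

```fortran
program array_syntax_demo
  implicit none
  integer, parameter :: M = 4, N = 5
  real(kind=8) :: A(M,N), x(N), y(M), y_loop(M)
  integer :: I, J

  ! Fill A and x with simple, reproducible values (illustrative only)
  do J = 1, N
     do I = 1, M
        A(I,J) = real(I + J, kind=8)
     end do
     x(J) = real(J, kind=8)
  end do

  ! Explicit DO loops: y_loop = A*x, accumulated column by column
  y_loop(:) = 0
  outer_loop: do J = 1, N
     inner_loop: do I = 1, M
        y_loop(I) = y_loop(I) + A(I,J) * x(J)
     end do inner_loop
  end do outer_loop

  ! Array syntax: the inner loop collapses into one whole-column statement
  y(:) = 0
  do J = 1, N
     y(:) = y(:) + A(:,J) * x(J)
  end do

  print *, 'explicit loops:', y_loop
  print *, 'array syntax:  ', y
  print *, 'max difference vs MATMUL:', maxval(abs(y - matmul(A, x)))
end program array_syntax_demo
```

The intrinsic `MATMUL` expresses the same product in a single call; it is used here only as a cross-check.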

... of array with respect to each of its dimension
COUNT (L_array [,dim]) returns the count of elements
MINVAL / MAXVAL (array [,dim] [, mask]) return the minimum/maximum value in a given array [along specified dimension] [, under mask]
MINLOC / MAXLOC (array [, mask]) return a vector of location(s) [, under mask], where the minimum/maximum value(s) is/are found

Array intrinsic functions

INTEGER :: M, N
REAL :: X(M,N), V(N)
PRINT *,SIZE(X), SIZE(V)    ! M*N, N
PRINT *,SHAPE(X)            ! M, N
PRINT *,SIZE(SHAPE(X))      ! 2
PRINT *,COUNT(X >= 0)
PRINT *,ALL(X >= 0, DIM=1)
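The inquiry and reduction intrinsics listed above can be tried on a small array. The following is a sketch only; the 2x3 test values are made up for illustration and do not come from the handout:

```fortran
program array_intrinsics_demo
  implicit none
  integer, parameter :: M = 2, N = 3
  real :: X(M,N), V(N)

  ! Column-major fill: X(1,1)=1, X(2,1)=-2, X(1,2)=3, ...
  X = reshape([1.0, -2.0, 3.0, 4.0, -5.0, 6.0], shape(X))
  V = [10.0, 20.0, 30.0]

  print *, size(X), size(V)     ! total element counts: M*N and N
  print *, shape(X)             ! extents: M, N
  print *, size(shape(X))       ! rank of X
  print *, count(X >= 0)        ! number of non-negative elements
  print *, all(X >= 0, dim=1)   ! one logical per column
  print *, minval(X), maxval(X) ! extreme values over the whole array
  print *, maxloc(X)            ! (row, column) of the maximum
  print *, maxval(X, dim=2)     ! maximum of each row
end program array_intrinsics_demo
```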

Call convention
res = func(ARGS)

Subroutine
SUBROUTINE sub(arguments)
  [declarations]
  [statements]
END SUBROUTINE sub

call test(s,result)
...

Declaration

REAL FUNCTION deg(f, x)
  IMPLICIT NONE
  INTRINSIC ATAN
  REAL, EXTERNAL :: f
  REAL, INTENT(IN) :: x
  deg = 45*f(x)/ATAN(1.0)
END FUNCTION deg
END PROGRAM degtest

  INTEGER, INTENT(IN) :: n
  REAL, INTENT(OUT), DIMENSION(n) :: x
END SUBROUTINE g05faf
END INTERFACE
REAL, DIMENSION(:), INTENT(OUT) :: table
CALL g05faf(-1.0, 1.0, SIZE(table), table)
END SUBROUTINE nag_rand

Defining an interface for the g05faf subroutine of the NAG library (generates a set of random numbers)

Modular programming
Modularity means dividing a program into small ...

INTEGER, SAVE :: n, ntot
REAL, SAVE :: abstol, reltol
END MODULE commons

Visibility of objects
Variables and procedures in modules can be PRIVATE or PUBLIC
- PUBLIC = visible for all program units using the module (default)
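To make the PUBLIC/PRIVATE distinction concrete, here is a small sketch. The module name `commons_demo`, the routine `add_samples` and the tolerance values are hypothetical; only the variable names `n`, `ntot`, `abstol` and `reltol` echo the module fragment in the handout:

```fortran
module commons_demo
  implicit none
  private                          ! hide everything unless listed as public
  integer, save :: n = 0, ntot = 0
  real, save :: abstol = 1.0e-6    ! illustrative value, not from the slides
  real, save :: reltol = 1.0e-3    ! illustrative value, not from the slides
  public :: ntot, abstol, reltol, add_samples
contains
  subroutine add_samples(k)
    integer, intent(in) :: k
    n = n + 1                      ! private call counter
    ntot = ntot + k                ! public running total
  end subroutine add_samples
end module commons_demo

program visibility_demo
  use commons_demo
  implicit none
  call add_samples(3)
  call add_samples(4)
  print *, 'ntot =', ntot, ' abstol =', abstol
  ! print *, n    ! would not compile: n is PRIVATE to the module
end program visibility_demo
```

Making PRIVATE the module default and listing the public names explicitly is one common design choice; the handout's default (everything PUBLIC) is equally valid Fortran.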

A real number between -10^100 < x < 10^100, accurate to 12 decimals

WRITE(*,*) KIND(a), HUGE(a), TINY(a), RANGE(a), PRECISION(a)
WRITE(*,*) KIND(b), HUGE(b), TINY(b), RANGE(b), PRECISION(b)

EPSILON(A): the smallest positive number that, added to 1.0, returns a number just greater than 1.0
PRECISION(A): decimal precision

... the programmer
- Equivalent to structs in C programming language
- Consists of other data types
- including other derived types
Derived type is defined in the variable declaration section
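The kind inquiry functions and derived types described above can be combined in one short program. This is a sketch under assumptions: the kind is requested with `SELECTED_REAL_KIND(12, 100)` to match the "12 decimals, exponent range 100" wording, and the type `particle` with its components is a hypothetical example, not taken from the handout:

```fortran
program kinds_and_types_demo
  implicit none
  ! Kind able to hold at least 12 decimal digits and exponents up to 10**100
  integer, parameter :: rk = selected_real_kind(12, 100)

  ! A derived type groups components of other types, like a C struct
  type :: particle                ! hypothetical example type
     character(len=8) :: label
     real(kind=rk)    :: position(3)
     integer          :: charge
  end type particle

  real           :: a
  real(kind=rk)  :: b
  type(particle) :: p

  write(*,*) kind(a), huge(a), tiny(a), range(a), precision(a)
  write(*,*) kind(b), huge(b), tiny(b), range(b), precision(b)
  write(*,*) epsilon(b)

  ! Structure constructor, then component access with %
  p = particle('electron', [0.0_rk, 0.5_rk, 1.0_rk], -1)
  p%position(1) = 2.0_rk
  write(*,*) trim(p%label), p%position, p%charge
end program kinds_and_types_demo
```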

  TYPE(MY_TYPE) :: NEWVAR
  NEWVAR = VAR
  PRINT *,NEWVAR
END SUBROUTINE CALC
END SUBROUTINE SUB

END SUBROUTINE SUB

SUBROUTINE CALC (NEWVAR)
  USE TYPEMOD, ONLY : MY_TYPE
  TYPE(MY_TYPE) :: NEWVAR
  PRINT *,NEWVAR
END SUBROUTINE CALC

MODULE TYPEMOD
  TYPE MY_TYPE
    INTEGER :: FOO
  END TYPE MY_TYPE
  PUBLIC :: MY_TYPE

Real (exponential forms, auto-scaling): Fw.d; Ew.d, Ew.dEe; 1P,Gw.d
  WRITE(*,'(F7.4)') R
  WRITE(*,'(E12.3E4)') R
  WRITE(*,'(1P,G20.13)') R
Character: A, Aw
  WRITE(*,'(A)') C
Logical: Lw
  WRITE(*,'(L2)') L

... a terminal screen or reading from a keyboard
Differences ...

Opening & closing a file
CLOSE([unit=]iu [, options])
For example:
OPEN(10, file='output.dat', status='new')
CLOSE(unit=10, status='keep')
The first parameter is the unit number
The keyword unit= can be omitted
The unit numbers 0, 5 and 6 are predefined

  ! Do something with the file 'foo.dat'
ENDIF

File writing and reading
Writing to and reading from a file is done by giving the ...

Unformatted I/O
Write to a sequential binary file

... outside world
- Opening and closing a file
- Data reading & writing
Use unformatted (binary) I/O for all except text files
Stream I/O
Internal I/O
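The file I/O steps above (open, write, close, check existence, read back) can be put together as follows. This is a sketch, not the handout's own code: the file names are illustrative, `status='replace'` is used instead of `'new'` so the program can be rerun, and `newunit=` (Fortran 2008) is used in place of a fixed unit number such as 10:

```fortran
program file_io_demo
  implicit none
  integer, parameter :: n = 5
  integer :: i, iu, ios
  real :: r(n), r_back(n)
  logical :: exists

  r = [(real(i) / 3.0, i = 1, n)]

  ! Formatted (text) output, one value per line
  open(newunit=iu, file='output.dat', status='replace', action='write')
  do i = 1, n
     write(iu, '(I4,1X,E12.3E4)') i, r(i)
  end do
  close(iu, status='keep')

  ! Unformatted (binary) output: one record holding the whole array
  open(newunit=iu, file='output.bin', form='unformatted', status='replace')
  write(iu) r
  close(iu)

  ! Check that the file exists before reading the binary data back
  inquire(file='output.bin', exist=exists)
  if (exists) then
     open(newunit=iu, file='output.bin', form='unformatted', &
          status='old', action='read', iostat=ios)
     if (ios == 0) then
        read(iu) r_back
        close(iu)
        print *, 'max round-trip error:', maxval(abs(r - r_back))
     end if
  end if
end program file_io_demo
```

The binary round trip preserves every bit of the data, which is one reason to prefer unformatted I/O for everything except files meant to be read as text.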

  REAL, INTENT(INOUT) :: a, b
  REAL :: temp
  temp = a; a = b; b = temp
END SUBROUTINE
SUBROUTINE swap_char(a, b)
  CHARACTER, INTENT(INOUT) :: a, b
  CHARACTER :: temp
  temp = a; a = b; b = temp
END SUBROUTINE
END MODULE swapmod

Generic procedures example

PROGRAM switch
  USE swapmod
  IMPLICIT NONE
  CHARACTER :: n,s
  REAL :: x,y
  n = 'J'
  s = 'S'
  x=10
  y=20
  PRINT *,x,y
  PRINT *,n,s
  CALL swap(n,s)
  CALL swap(x,y)
  PRINT *,x,y
  PRINT *,n,s
END PROGRAM

Special attributes for procedures: RECURSIVE
Recursion means calling a procedure within itself
Triggered via RECURSIVE keyword

RECURSIVE FUNCTION factorial(n) RESULT(fac)
  INTEGER, INTENT(IN) :: n
  INTEGER :: fac
  IF (n==0) THEN
    fac=1
  ELSE
    fac=n*factorial(n-1)
  END IF
END FUNCTION factorial

Special attributes for procedures: PURE
PURE keyword indicates that the function is free of side effects
- Such as a change in value of an input argument or global variable
Intrinsic functions are always pure
No (external) I/O is allowed in PURE procedures
Pure procedure must specify intents of all its arguments
The motivation is efficiency: enables more aggressive compiler optimization and parallelization with e.g. OpenMP
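The handout gives code for generic and recursive procedures but none for PURE. A minimal sketch of a pure function follows; the names `pure_demo` and `scaled_norm` and the test data are hypothetical. It also shows one place where purity is required by the language, a procedure reference inside a Fortran 2008 `DO CONCURRENT` construct:

```fortran
module pure_demo
  implicit none
contains
  ! PURE: every argument has an explicit INTENT and there are no side effects
  pure function scaled_norm(v, scale) result(r)
    real, intent(in) :: v(:)
    real, intent(in) :: scale
    real :: r
    r = scale * sqrt(sum(v * v))
  end function scaled_norm
end module pure_demo

program pure_usage
  use pure_demo
  implicit none
  real :: pts(3,4), norms(4)
  integer :: j

  pts = reshape([(real(j), j = 1, 12)], shape(pts))

  ! Procedures referenced inside DO CONCURRENT must be pure; the construct
  ! tells the compiler that the iterations are independent
  do concurrent (j = 1:4)
     norms(j) = scaled_norm(pts(:,j), 0.5)
  end do

  print *, norms
end program pure_usage
```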

REAL, DIMENSION(n,n) :: a, b, c
REAL, DIMENSION(n) :: t, u, v
...

Towards Fortran 2008

See http://fortranwiki.org/fortran/show/Fortran+2008+status

Parallel Programming with Fortran Coarrays

Delivered at PRACE Advanced Training Centre,

CSC IT Center for Science Ltd, Finland,

September 13, 2012

David Henty, Alan Simpson (EPCC)

Harvey Richardson, Bill Long (Cray)

Tutorial Overview

The Fortran Programming Model in context

Basic coarray features

Programming models for HPC

The challenge is to efficiently map a problem to the architecture we have
  Take advantage of all computational resources
  Manage distributed memories etc.
  Optimal use of any communication networks
The HPC industry has long experience in parallel programming
  Vector, threading, data-parallel, message-passing etc.
We would like to have models or combinations that are
  efficient
  safe
  easy to learn and use

Why consider new programming models?

Next-generation architectures bring new challenges:
  Very large numbers of processors with many cores
  Complex memory hierarchy
  even today (2011) we are at 500k cores
Parallel programming is hard, need to make this simpler
Some of the models we currently use are
  bolt-ons to existing languages as APIs or directives
  hard to program for underlying architecture
  unable to scale due to overheads
So, is there an alternative to the models prevalent today?
  Most popular are OpenMP

Shared Memory Directives

Multiple threads share global memory

Most common variant: OpenMP

Program loop iterations distributed to threads, more recent task features

Each thread has a means to refer to private objects within a parallel context

Terminology

Thread, thread team

Implementation

Threads map to user threads running on one SMP node

Extensions to distributed memory not so successful

OpenMP is a good model to use within a node


Cooperating Processes Models

Message Passing, MPI

Remote side of communication does not participate
Can test for completion
Barriers and collectives
Popular on Cray and SGI hardware, also Blue Gene version
To make sense needs hardware support for low-latency RDMA-type operations

New shared data structures

shared pointers to distributed data (block or cyclic)

pointers to shared data local to a thread

Synchronization

Language constructs to divide up work on shared data

upc_forall() to distribute iterations of for() loop

Extensions for collectives

Both commercial and open source compilers available

Cray, HP, IBM

Berkeley UPC (from LBL), GCC UPC


Fortran 2008 coarray model

Example of a Partitioned Global Address Space (PGAS) model
Set of participating processes like MPI
Participating processes have access to local memory via standard program mechanisms
Access to remote memory is directly supported by the language
  Type checking
  Opportunity to optimize communication
  No penalty for local memory access
  Single-sided programming model more natural for some algorithms and a good match for modern networks with RDMA

Fortran coarrays Basic Features

Coarray Fortran

"Coarrays were designed to answer the question: What is the smallest change required to convert Fortran into a robust and efficient parallel language? The answer: a simple syntactic extension. It looks and feels like Fortran and requires Fortran programmers to learn only a few new rules."

John Reid, ISO Fortran Convener

Some History

Introduced in current form by Numrich and Reid in 1998 as a simple extension to Fortran 95 for parallel processing

Many years of experience, mainly on Cray hardware

A set of core features is now part of the Fortran standard ISO/IEC 1539-1:2010

Additional features are expected to be published in a Technical Specification in due course

How Does It Work?

SPMD - Single Program, Multiple Data

single program replicated a fixed number of times

Each replication is called an image

Images are executed asynchronously

execution path may differ from image to image

some situations cause images to synchronize

Images access remote data using coarrays

Normal rules of Fortran apply
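The SPMD model can be seen in a few lines. The sketch below uses only the standard intrinsics `this_image()` and `num_images()`; how the program is built and launched is implementation-specific (with gfortran, for instance, the option `-fcoarray=single` builds a one-image version):

```fortran
program hello_images
  implicit none
  integer :: me, n

  me = this_image()     ! index of this image, 1..num_images()
  n  = num_images()     ! total number of images in the run

  ! Every image executes the same program; the path may differ per image
  if (me == 1) then
     print *, 'Image', me, 'of', n, ': I take the coordinating branch'
  else
     print *, 'Image', me, 'of', n, ': I take the worker branch'
  end if
end program hello_images
```

Because images run asynchronously, the order of the output lines is not defined.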


What are coarrays?

Arrays or scalars that can be accessed remotely
  images can access data objects on any other image

Additional Fortran syntax for coarrays
  Specifying a codimension declares a coarray
  real, dimension(10), codimension[*] :: x
  real :: x(10)[*]
  these are equivalent declarations of an array x of size 10 on each image
  x is now remotely accessible
  coarrays have the same size on each image!
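As a small runnable sketch of coarray declarations and access (the variable names `y` and `counter` and the assigned values are illustrative assumptions), the two declaration styles, a scalar coarray, and one remote read look like this:

```fortran
program coarray_declarations
  implicit none
  ! Two equivalent ways to declare a coarray of 10 reals on each image
  real, dimension(10), codimension[*] :: x
  real :: y(10)[*]
  ! A scalar coarray: one integer per image
  integer :: counter[*]

  x = real(this_image())   ! purely local assignment, no coindex needed
  y = 2.0 * x
  counter = this_image()

  sync all                 ! make sure every image has defined its data

  ! Square brackets select the image; here image 1's copy of y is read
  print *, 'image', this_image(), 'sees y(1) on image 1 =', y(1)[1]
end program coarray_declarations
```

References without square brackets always mean the local copy, so existing serial code keeps its meaning.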

Be careful when updating coarrays:
  If we get remote data, was it valid?
  Could another process send us data and overwrite something we have not yet used?
  How do we know that data sent to us has arrived?

Fortran provides synchronisation statements
  For example, barrier for synchronisation of all images:
  sync all

Do not make assumptions about execution timing on images unless executed after synchronisation

Note there is implicit synchronisation at program start

Making remote references

We used a loop over images

do image = 2,num_images()
  x[image] = x
end do

Note that array indexing within the coindex is not allowed, so we can not write

x[2:num_images()] = x ! illegal
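Putting the remote-reference loop together with the synchronisation it needs gives a simple broadcast from image 1. This is a sketch; the array size and values are illustrative:

```fortran
program broadcast_demo
  implicit none
  real :: x(4)[*]
  integer :: image

  x = 0.0
  if (this_image() == 1) x = [1.0, 2.0, 3.0, 4.0]

  sync all   ! all images have initialised x before image 1 overwrites it

  if (this_image() == 1) then
     ! One remote assignment per target image; x[2:num_images()] = x
     ! would be illegal because a coindex cannot be an array section
     do image = 2, num_images()
        x(:)[image] = x(:)
     end do
  end if

  sync all   ! remote writes are complete before anyone reads x

  print *, 'image', this_image(), 'has x =', x
end program broadcast_demo
```

Without the first `sync all`, a slow image could reset `x` to zero after image 1 had already written to it.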

You need to implement your view of global data from the local coarrays as Fortran does not provide the global view
  You can be flexible with the coindexing (see later)
  You can use any access pattern you wish

ca(1:4)[1] ca(1:4)[2] ca(1:4)[3] ca(1:4)[4]

integer :: ca(4)[*]
do image=1,num_images()
  print *,ca(:)[image]
end do

1D cyclic data access

coarray declarations remain unchanged

but we use a cyclic access pattern
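The code for the cyclic pattern is missing from the extracted slide. One possible sketch, assuming global element g is stored on image mod(g-1, num_images())+1 at local index (g-1)/num_images()+1 (this mapping and the names `nlocal`, `g`, `local` are assumptions, not the handout's code):

```fortran
program cyclic_access
  implicit none
  integer, parameter :: nlocal = 4
  integer :: ca(nlocal)[*]
  integer :: g, image, local, nglobal

  nglobal = nlocal * num_images()

  ! Cyclic layout: each image stores the global indices it owns,
  ! i.e. this_image(), this_image()+num_images(), ...
  do local = 1, nlocal
     ca(local) = (local - 1) * num_images() + this_image()
  end do

  sync all   ! every image has filled ca before image 1 reads remotely

  if (this_image() == 1) then
     do g = 1, nglobal
        image = mod(g - 1, num_images()) + 1
        local = (g - 1) / num_images() + 1
        ! Each stored value should equal its own global index g
        if (ca(local)[image] /= g) error stop 'cyclic mapping is inconsistent'
     end do
     print *, 'all', nglobal, 'elements found in cyclic order'
  end if
end program cyclic_access
```

The declaration is the same as for the block layout; only the index arithmetic that maps a global index to (local index, image) changes.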

code execution on images is independent
  programmer has to control execution using synchronisation

synchronise before accessing coarrays
  ensure content is not updated from remote images before you can use it

synchronise after accessing coarrays
  ensure new content is available to all images

implicit synchronisation after variable declarations at first executable statement
  guarantees coarrays exist on all images when your first program statement is executed

We will revisit this topic later

! implicit synchronisation at the first executable statement
sync all   ! ensure all images set local maximum
if (this_image() == 1) then
  do image = 2, num_images()
    maximum = max(maximum, maximum[image])
  end do
  do image = 2, num_images()
    maximum[image] = maximum
  end do
end if
sync all   ! ensure all images have copy of maximum value

Example2: Calculate density of primes

integer, dimension(nimages) :: nprimes[*]
real density
start = (this_image()-1) * n/num_images() + 1
end = start + n/num_images() - 1
nprimes(this_image())[1] = num_primes(start,end)
sync all

if (this_image()==1) then
  nprimes(1)=sum(nprimes)
  density=real(nprimes(1))/n
  print *,"Calculating prime density on", &
       & num_images(),"images"
  print *,nprimes(1),'primes in',n,'numbers'
  write(*,'(" density is ",2Pf0.2,"%")')density
  write(*,'(" asymptotic theory gives ", &
       & 2Pf0.2,"%")')1.0/(log(real(n))-1.0)
end if

Example2: Calculate density of primes


Calculating prime density on 2 images

664580 primes in 10000000 numbers

density is 6.65%

asymptotic theory gives 6.61%

Launching a coarray program

The Fortran standard does not specify how a program is launched

The number of images may be set at compile, link or run-time

A compiler could optimize for a single image

Observations so far on coarrays

Natural extension, easy to learn

Makes parallel parts of program obvious (syntax)

Part of Fortran language (type checking, etc)

No mapping of data to buffers (or copying) or creation of complex types (as we might have with MPI)

Compiler can optimize for communication

More observations later


Exercise Session 1

Look at the Exercise Notes document for full details

number of images

Extend the simple Fortran code provided in order to perform operations on parts of a picture using coarrays

Backup Slides

HPF model

High Performance Fortran (HPF)

Data Parallel programming model
Single thread of control
Arrays can be distributed and operated on in parallel
Loosely synchronous
Parallelism mainly from Fortran 90 array syntax, FORALL and intrinsics
This model was popular on SIMD hardware (AMT DAP, Connection Machines) but was extended to clusters, where the control thread is replicated

More Coarray Features


Overview

Multiple Dimensions and Codimensions

Allocatable Coarrays and Components of Coarray

images

P(m,n) Variables/arrays

P(m,n)[*]

P(m,n)[k,*]

2D Data

assemble rather than distribute

Can assemble a 2D data structure from 1D arrays

global access: ca(3,1)[2,2]

local access: ca(3,1)

Coarray Subscripts

Fortran arrays defined by rank, bounds and shape
  integer, dimension(10,4) :: array
  rank 2
  lower bounds 1, 1; upper bounds 10, 4
  shape [10, 4]

Coarray Fortran adds corank, cobounds and coshape
  integer :: array(10,4)[3,*]
  corank 2
  lower cobounds 1, 1; upper cobounds 3, m
  coshape [3, m]
  m would be ceiling(num_images()/3.0)

Multiple Codimensions

Coarrays with multiple Codimensions:

character :: a(4)[2, *] !2D grid of images

for 4 images, grid is 2x2; for 16 images, grid is 2x8

real :: b(8,8,8)[10,5,*] !3D grid of images

8x8x8 local array; with 150 images, grid is 10x5x3

integer :: c(6,5)[0:9,0:*] !2D grid of images

lower cobounds [0, 0]; upper cobounds [9, n]

useful if you want to interface with MPI or want C like coding

Sum of rank and corank must not exceed 15

Flexibility with cobounds
  can set all but final upper cobound as required
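The cobounds and the position of an image in the image grid can be queried at run time. The sketch below reuses the declaration of `c` from the text and the standard intrinsics `this_image(coarray)`, `lcobound`, `ucobound` and `image_index`; the variable name `cosub` is an assumption made for the example:

```fortran
program codimension_demo
  implicit none
  integer :: c(6,5)[0:9,0:*]   ! declaration taken from the text above
  integer :: cosub(2)

  ! Cosubscripts of this image within c's image grid
  cosub = this_image(c)

  if (this_image() == 1) then
     print *, 'lower cobounds:', lcobound(c)
     print *, 'upper cobounds:', ucobound(c)
  end if

  ! image_index maps cosubscripts back to the linear image number
  print *, 'image', this_image(), 'has cosubscripts', cosub, &
           'and image_index gives', image_index(c, cosub)
end program codimension_demo
```

The final upper cobound reported by `ucobound` depends on the number of images in the run, which is why it is written as `*` in the declaration.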
