Fortran 2003/2008
Pekka Manninen Sami Saarinen David Henty
September 11-13, 2012
PRACE Advanced Training Centre
CSC – IT Center for Science Ltd, Finland
All material (C) 2012 by the authors.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License, http://creativecommons.org/licenses/by-nc-sa/3.0/
11.00-12.00 Exercises
13.00-13.45 Exercises
14.00-14.45 Other handy Fortran features
15.00-16.00 Exercises
Fortran 95/2003 is the current de facto standard
The latest standard is Fortran 2008 (approved 2010), a
  INTRINSIC SQRT  ! Fortran standard provides many commonly used functions
  ! Command line interface. Ask a number and read it in
  WRITE (*,*) 'Give a value (number) for x:'
  READ (*,*) x
  y=x**2+1  ! Power function and addition arithmetic
  WRITE (*,*) 'given value for x:', x
  WRITE (*,*) 'computed value of x**2 + 1:', y
  ! Print the square root of the argument y to screen
  WRITE (*,*) 'computed value of SQRT(x**2 + 1):', SQRT(y)
Compiling and linking
  IMPLICIT NONE
  INTEGER :: n0
  REAL :: a, b
  REAL :: r1=0.0
  COMPLEX :: c
  COMPLEX :: imag_number=(0.1, 1.0)
  CHARACTER(LEN=80) :: place
  CHARACTER(LEN=80) :: name='James Bond'
  LOGICAL :: test0 = .TRUE.
  LOGICAL :: test1 = .FALSE.
Variables
Constants defined with the PARAMETER clause they cannot be
  DO WHILE (x > 0)
    totalsum = totalsum + x
    READ(*,*) x
  WRITE(*,*)'m:', m,' n:', n
  WRITE(*,*)'Greatest common divisor: ',m
  ELSE
    WRITE(*,*)'Negative value entered'
  END IF positive_check
Array syntax
Array syntax allows for less explicit DO loops
  INTEGER, PARAMETER :: M = 4, N = 5
  REAL (kind = 8) :: A(M,N) , x(N), y(M)
  INTEGER :: I , J
  y( : ) = 0
  OUTER_LOOP : do J = 1, N
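The loop fragment above breaks off after the DO statement. As a minimal sketch of where such a loop typically leads, the program below assumes the body accumulates a matrix-vector product column by column; the loop body, the program name matvec_demo and the test data are illustrative assumptions, not taken from the slides.

```fortran
PROGRAM matvec_demo
  IMPLICIT NONE
  INTEGER, PARAMETER :: M = 4, N = 5
  REAL (kind = 8) :: A(M,N), x(N), y(M)
  INTEGER :: J

  ! Illustrative test data (assumption, not from the slides)
  A = 1.0d0
  x = 2.0d0

  ! Column-oriented product: array syntax replaces the inner DO loop over rows
  y(:) = 0
  OUTER_LOOP : DO J = 1, N
     y(:) = y(:) + A(:,J) * x(J)
  END DO OUTER_LOOP

  ! Cross-check against the MATMUL intrinsic; the difference should be zero here
  PRINT *, 'y =', y
  PRINT *, 'max |y - MATMUL(A,x)| =', MAXVAL(ABS(y - MATMUL(A, x)))
END PROGRAM matvec_demo
```

With this test data every element of y should come out as 10, since each row sums five products of 1 and 2.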
of array with respect to each of its dimension
COUNT (L_array [,dim]) returns the count of elements
MINVAL / MAXVAL (array [,dim] [, mask]) return the minimum/maximum value in a given array [along specified dimension] [, under mask]
MINLOC/MAXLOC (array [, mask]) return a vector of location(s) [, under mask], where the minimum/maximum value(s) is/are found
Array intrinsic functions
  INTEGER :: M, N
  REAL :: X(M,N), V(N)
  PRINT *,SIZE(X), SIZE(V) ! M*N, N
  PRINT *,SHAPE(X) ! M, N
  PRINT *,SIZE(SHAPE(X)) ! 2
  PRINT *,COUNT(X >= 0)
  PRINT *,ALL(X >= 0, DIM=1)
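To make the intrinsics listed above concrete, here is a small self-contained sketch. The array contents and the program name intrinsics_demo are assumptions chosen so that the results are easy to check by hand; the comments give the values, not the exact list-directed output layout, which is compiler dependent.

```fortran
PROGRAM intrinsics_demo
  IMPLICIT NONE
  INTEGER, PARAMETER :: M = 2, N = 3
  REAL :: X(M,N), V(N)

  ! Column-major fill: columns are (1,-2), (3,4), (-5,6)
  X = RESHAPE([1.0, -2.0, 3.0, 4.0, -5.0, 6.0], [M, N])
  V = 0.0

  PRINT *, SIZE(X), SIZE(V)        ! 6 and 3 (M*N and N)
  PRINT *, SHAPE(X)                ! 2 and 3 (M and N)
  PRINT *, SIZE(SHAPE(X))          ! 2 (the rank)
  PRINT *, COUNT(X >= 0)           ! 4 non-negative elements
  PRINT *, ALL(X >= 0, DIM=1)      ! F T F (one result per column)
  PRINT *, MINVAL(X), MAXVAL(X)    ! -5.0 and 6.0
  PRINT *, MAXLOC(X)               ! 2 and 3 (row and column of the maximum)
END PROGRAM intrinsics_demo
```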
Call convention
  res = func(ARGS)
Subroutine
  SUBROUTINE sub(arguments)
    [declarations]
    [statements]
  END SUBROUTINE sub
  call test(s,result)
  ...
Declaration
REAL FUNCTION deg(f, x)
  IMPLICIT NONE
  INTRINSIC ATAN
  REAL, EXTERNAL :: f
  REAL, INTENT(IN) :: x
  deg = 45*f(x)/ATAN(1.0)
END FUNCTION deg
END PROGRAM degtest
    INTEGER, INTENT(IN) :: n
    REAL, INTENT(OUT), DIMENSION(n) :: x
  END SUBROUTINE g05faf
END INTERFACE
REAL, DIMENSION(:), INTENT(OUT) :: table
CALL g05faf(-1.0, 1.0, SIZE(table), table)
END SUBROUTINE nag_rand
Defining an interface for the g05faf subroutine of the NAG library (generates a set of random numbers)
Modular programming
Modularity means dividing a program into small
  INTEGER, SAVE :: n, ntot
  REAL, SAVE :: abstol, reltol
END MODULE commons
Visibility of objects
Variables and procedures in modules can be PRIVATE or PUBLIC
PUBLIC = visible for all program units using the module (default)
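A minimal sketch of module visibility, assuming a hypothetical module counter_mod that is not part of the course material: the module state is PRIVATE and only two procedures are exported as PUBLIC.

```fortran
MODULE counter_mod
  IMPLICIT NONE
  PRIVATE                         ! hide everything unless listed as PUBLIC
  INTEGER, SAVE :: ncalls = 0     ! private module state
  PUBLIC :: bump, get_calls
CONTAINS
  SUBROUTINE bump()
    ncalls = ncalls + 1
  END SUBROUTINE bump
  INTEGER FUNCTION get_calls()
    get_calls = ncalls
  END FUNCTION get_calls
END MODULE counter_mod

PROGRAM visibility_demo
  USE counter_mod
  IMPLICIT NONE
  CALL bump()
  CALL bump()
  ! ncalls itself is not accessible here; only the PUBLIC procedures are
  PRINT *, 'calls so far:', get_calls()
END PROGRAM visibility_demo
```

Switching the module default to PRIVATE and exporting names one by one is a design choice; leaving the default PUBLIC, as the slide notes, exposes everything to program units that USE the module.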
A real number between -10^100 and 10^100, accurate to 12 decimal digits
  WRITE(*,*) KIND(a), HUGE(a), TINY(a), RANGE(a), PRECISION(a)
  WRITE(*,*) KIND(b), HUGE(b), TINY(b), RANGE(b), PRECISION(b)
EPSILON(A) The smallest positive number added to 1.0 returns a number just greater than 1.0
PRECISION(A) Decimal precision
the programmer
Equivalent to structs in C programming language
Consists of other data types including other derived types
Derived type is defined in the variable declaration section
    TYPE(MY_TYPE) :: NEWVAR
    NEWVAR = VAR
    PRINT *,NEWVAR
  END SUBROUTINE CALC
END SUBROUTINE SUB
END SUBROUTINE SUB
SUBROUTINE CALC (NEWVAR)
  USE TYPEMOD, ONLY : MY_TYPE
  TYPE(MY_TYPE) :: NEWVAR
  PRINT *,NEWVAR
END SUBROUTINE CALC
MODULE TYPEMOD
  TYPE MY_TYPE
    INTEGER :: FOO
  END TYPE MY_TYPE
  PUBLIC :: MY_TYPE
exponential form,
auto-scaling)
           Fw.d            WRITE(*,'(F7.4)') R
           Ew.d, Ew.dEe    WRITE(*,'(E12.3E4)') R
           1P,Gw.d         WRITE(*,'(1P,G20.13)') R
Character  A, Aw           WRITE(*,'(A)') C
Logical    Lw              WRITE(*,'(L2)') L
a terminal screen or reading from a keyboard
Difference
  CLOSE([unit=]iu [, options])
For example:
  OPEN(10, file= 'output.dat', status='new')
  CLOSE(unit=10, status='keep')
Opening & closing a file
The first parameter is the unit number
The keyword unit= can be omitted
The unit number 0, 5 and 6 are predefined
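The OPEN and CLOSE statements above can be combined with formatted WRITE and READ as in the following sketch. It is an illustration under assumptions: the program name file_demo and the data written are made up, and status='replace' is used instead of 'new' so that the example can be rerun without the OPEN failing on an existing file.

```fortran
PROGRAM file_demo
  IMPLICIT NONE
  INTEGER :: ios, i
  REAL :: r

  ! Write three records to unit 10 (status='replace' is an assumption, see text)
  OPEN(10, file='output.dat', status='replace', action='write', iostat=ios)
  IF (ios /= 0) STOP 'could not open output.dat'
  DO i = 1, 3
     WRITE(10, '(I5,F7.4)') i, SQRT(REAL(i))
  END DO
  CLOSE(unit=10, status='keep')

  ! Reopen the same file and echo its contents to the screen
  OPEN(10, file='output.dat', status='old', action='read', iostat=ios)
  IF (ios /= 0) STOP 'could not reopen output.dat'
  DO
     READ(10, '(I5,F7.4)', iostat=ios) i, r
     IF (ios /= 0) EXIT            ! end of file or read error
     WRITE(*, '(I5,1X,E12.3E4)') i, r
  END DO
  CLOSE(10)
END PROGRAM file_demo
```

Checking iostat after OPEN and READ keeps the program from aborting on a missing file or at end of file.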
  ! Do something with the file 'foo.dat'
ENDIF
File writing and reading
Writing to and reading from a file is done by giving the
Unformatted I/O
Write to a sequential binary file
outside world
Opening and closing a file
Data reading & writing
Use unformatted (binary) I/O for all except text files
Stream I/O
Internal I/O
    REAL, INTENT(INOUT) :: a, b
    REAL :: temp
    temp = a; a = b; b = temp
  END SUBROUTINE
  SUBROUTINE swap_char(a, b)
    CHARACTER, INTENT(INOUT) :: a, b
    CHARACTER :: temp
    temp = a; a = b; b = temp
  END SUBROUTINE
END MODULE swapmod
Generic procedures example
PROGRAM switch
  USE swapmod
  IMPLICIT NONE
  CHARACTER :: n,s
  REAL :: x,y
  n = 'J'
  s = 'S'
  x=10
  y=20
  PRINT *,x,y
  PRINT *,n,s
  CALL swap(n,s)
  CALL swap(x,y)
  PRINT *,x,y
  PRINT *,n,s
END PROGRAM
Special attributes for procedures: RECURSIVE
Recursion means calling a procedure within itself
Triggered via RECURSIVE keyword
  RECURSIVE FUNCTION factorial(n) RESULT(fac)
    INTEGER, INTENT(IN) :: n
    INTEGER :: fac
    IF (n==0) THEN
      fac=1
    ELSE
      fac=n*factorial(n-1)
    END IF
  END FUNCTION factorial
Special attributes for procedures: PURE
PURE keyword indicates that the function is free of side effects
Such as a change in value of an input argument or global variable
Intrinsic functions are always pure
No (external) I/O is allowed in PURE procedures
Pure procedure must specify the intents of all its arguments
The motivation is efficiency: Enables more aggressive compiler optimization and parallelization with e.g. OpenMP
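The PURE rules above can be seen in a short sketch. The module pure_demo_mod and the function square_plus_one are hypothetical names used only for illustration; DO CONCURRENT (Fortran 2008) is used here because any procedure referenced inside it is required to be pure.

```fortran
MODULE pure_demo_mod
  IMPLICIT NONE
CONTAINS
  PURE FUNCTION square_plus_one(x) RESULT(y)
    REAL, INTENT(IN) :: x       ! arguments of a pure function must be INTENT(IN)
    REAL :: y
    y = x**2 + 1.0              ! no I/O, no change to arguments or global data
  END FUNCTION square_plus_one
END MODULE pure_demo_mod

PROGRAM pure_demo
  USE pure_demo_mod
  IMPLICIT NONE
  REAL :: a(4), b(4)
  INTEGER :: i
  a = [1.0, 2.0, 3.0, 4.0]
  ! Iterations are independent, so the compiler is free to reorder them
  DO CONCURRENT (i = 1:4)
     b(i) = square_plus_one(a(i))
  END DO
  PRINT *, b                    ! values 2, 5, 10, 17
END PROGRAM pure_demo
```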
  REAL, DIMENSION(n,n) :: a, b, c
  REAL, DIMENSION(n) :: t, u, v
  ...
Towards Fortran 2008
See http://fortranwiki.org/fortran/show/Fortran+2008+status
Parallel Programming
with
Fortran Coarrays
Delivered at PRACE Advanced Training Centre,
CSC IT Center for Science Ltd, Finland,
September 13, 2012
David Henty, Alan Simpson (EPCC)
Harvey Richardson, Bill Long (Cray)
Tutorial Overview
The Fortran Programming Model in context
Basic coarray features
Programming models for HPC
The challenge is to efficiently map a problem to the architecture we have
Take advantage of all computational resources
Manage distributed memories etc
Optimal use of any communication networks
The HPC industry has long experience in parallel programming
Vector, threading, data-parallel, message-passing etc
We would like to have models or combinations that are
efficient
safe
easy to learn and use
Why consider new programming models?
Next-generation architectures bring new challenges:
Very large numbers of processors with many cores
Complex memory hierarchy
even today (2011) we are at 500k cores
Parallel programming is hard, need to make this simpler
Some of the models we currently use are
bolt-ons to existing languages as APIs or directives
Hard to program for underlying architecture
unable to scale due to overheads
So, is there an alternative to the models prevalent today?
Most popular are OpenMP
Shared Memory Directives
Multiple threads share global memory
Most common variant: OpenMP
Program loop iterations distributed to threads,
more recent task features
Each thread has a means to refer to private objects
within a parallel context
Terminology
Thread, thread team
Implementation
Threads map to user threads running on one SMP node
Extensions to distributed memory not so successful
OpenMP is a good model to use within a node
Cooperating Processes Models
Message Passing, MPI
Remote side of communication does not participate
Can test for completion
Barriers and collectives
Popular on Cray and SGI hardware, also Blue Gene version
To make sense needs hardware support for low-latency RDMA-type operations
New shared data structures
shared pointers to distributed data (block or cyclic)
pointers to shared data local to a thread
Synchronization
Language constructs to divide up work on shared data
upc_forall() to distribute iterations of for() loop
Extensions for collectives
Both commercial and open source compilers available
Cray, HP, IBM
Berkeley UPC (from LBL), GCC UPC
Fortran 2008 coarray model
Example of a Partitioned Global Address Space (PGAS)
model
Set of participating processes like MPI
Participating processes have access to local memory
via standard program mechanisms
Access to remote memory is directly supported by the language
Type checking
Opportunity to optimize communication
No penalty for local memory access
Single-sided programming model more natural for some algorithms
and a good match for modern networks with RDMA
Fortran coarrays Basic Features
Coarray Fortran
"Coarrays were designed to answer the question:
What is the smallest change required to convert Fortran
into a robust and efficient parallel language?
The answer: a simple syntactic extension
It looks and feels like Fortran and requires
Fortran programmers to learn only a few new rules."
John Reid, ISO Fortran Convener
Some History
Introduced in current form by Numrich and Reid in 1998 as a
simple extension to Fortran 95 for parallel processing
Many years of experience, mainly on Cray hardware
A set of core features are now part of the Fortran standard
ISO/IEC 1539-1:2010
Additional features are expected to be published in a
Technical Specification in due course
How Does It Work?
SPMD - Single Program, Multiple Data
single program replicated a fixed number of times
Each replication is called an image
Images are executed asynchronously
execution path may differ from image to image
some situations cause images to synchronize
Images access remote data using coarrays
Normal rules of Fortran apply
What are coarrays?
Arrays or scalars that can be accessed remotely
images can access data objects on any other image
Additional Fortran syntax for coarrays
Specifying a codimension declares a coarray
these are equivalent declarations of an array x
of size 10 on each image
x is now remotely accessible
coarrays have the same size on each image!
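The two declaration forms referred to on this slide did not survive in the text. The sketch below shows the two standard Fortran 2008 ways of declaring a coarray of size 10 on each image; the element type real, the second name y (needed so both forms can coexist in one program) and the program name coarray_decl are assumptions. It needs a compiler with coarray support enabled.

```fortran
program coarray_decl
  implicit none
  ! Two equivalent ways to give an array of size 10 a codimension
  real, dimension(10), codimension[*] :: x
  real :: y(10)[*]

  x = real(this_image())    ! local access, no coindex needed
  y = 0.0
  sync all                  ! make sure every image has set x

  if (this_image() == 1) then
     y(:) = x(:)[num_images()]   ! remote read from the last image
     print *, 'image 1 read', y(1), 'from image', num_images()
  end if
end program coarray_decl
```

Square brackets mark the remote reference; without them the same name refers to the local copy.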
Be careful when updating coarrays:
If we get remote data was it valid?
Could another process send us data and overwrite something we have not yet used?
How do we know that data sent to us has arrived?
Fortran provides synchronisation statements
For example, barrier for synchronisation of all images:
sync all
do not make assumptions about execution timing on images unless executed after synchronisation
Note there is implicit synchronisation at program start
Making remote references
We used a loop over images
  do image = 2,num_images()
    x[image] = x
  end do
Note that array indexing within the coindex is not allowed
so we can not write
  x[2:num_images()] = x ! illegal
You need to implement your view of global data from the local
coarrays as Fortran does not provide the global view
You can be flexible with the coindexing (see later)
You can use any access pattern you wish
ca(1:4)[1] ca(1:4)[2] ca(1:4)[3] ca(1:4)[4]
integer :: ca(4)[*]
do image=1,num_images()
print *,ca(:)[image]
end do
1D cyclic data access
coarray declarations remain unchanged
but we use a cyclic access pattern
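One way to picture the cyclic access pattern is as a mapping from a global index to an image and a local index. The sketch below is an illustration under assumptions: the variable names g, img and loc and the program name cyclic_demo are made up, and each image stores the global indices it owns so the mapping can be checked in the output.

```fortran
program cyclic_demo
  implicit none
  integer :: ca(4)[*]
  integer :: g, nglobal, img, loc

  ! Each image fills its local part with the global indices it owns
  do loc = 1, 4
     ca(loc) = (loc - 1) * num_images() + this_image()
  end do
  sync all                                     ! all local parts are now set

  if (this_image() == 1) then
     nglobal = 4 * num_images()
     do g = 1, nglobal
        img = mod(g - 1, num_images()) + 1    ! owner image (cyclic)
        loc = (g - 1) / num_images() + 1      ! position within that image
        print *, 'global', g, '-> ca(', loc, ')[', img, '] =', ca(loc)[img]
     end do
  end if
end program cyclic_demo
```

Each printed line should show the stored value equal to the global index, which confirms the declaration is unchanged and only the indexing arithmetic differs from a block layout.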
code execution on images is independent
programmer has to control execution using synchronisation
synchronise before accessing coarrays
ensure content is not updated from remote images before
you can use it
synchronise after accessing coarrays
ensure new content is available to all images
implicit synchronisation after variable declarations at first
executable statement
guarantees coarrays exist on all images when your first
program statement is executed
We will revisit this topic later
if (this_image() == 1) then
  do image = 2, num_images()
    maximum = max(maximum, maximum[image])
  end do
  do image = 2, num_images()
    maximum[image] = maximum
  end do
end if
sync all
implicit synchronisation ensure all images set local maximum
ensure all images have copy of maximum value
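Putting the fragment and its annotations together, a complete version might look like the sketch below. The way each image sets its local maximum (from its image index) and the program name max_reduce are assumptions made to keep the example self-contained; the placement of the first sync all follows the annotation "ensure all images set local maximum".

```fortran
program max_reduce
  implicit none
  real :: maximum[*]
  integer :: image

  ! Assumption: the local value is simply derived from the image index
  maximum = real(this_image())
  sync all                         ! ensure all images set local maximum

  if (this_image() == 1) then
     ! Gather: image 1 reads every remote value
     do image = 2, num_images()
        maximum = max(maximum, maximum[image])
     end do
     ! Broadcast: image 1 writes the result back to every image
     do image = 2, num_images()
        maximum[image] = maximum
     end do
  end if
  sync all                         ! ensure all images have copy of maximum value

  print *, 'image', this_image(), 'sees maximum', maximum
end program max_reduce
```

With this test data every image should report a maximum equal to the number of images.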
integer, dimension(nimages) :: nprimes[*]
real density
start = (this_image()-1) * n/num_images() + 1
end = start + n/num_images() - 1
nprimes(this_image())[1] = num_primes(start,end)
sync all
Example2: Calculate density of primes
if (this_image()==1) then
  nprimes(1)=sum(nprimes)
  density=real(nprimes(1))/n
  print *,"Calculating prime density on", &
    & num_images(),"images"
  print *,nprimes(1),'primes in',n,'numbers'
  write(*,'(" density is ",2Pf0.2,"%")')density
  write(*,'(" asymptotic theory gives ", &
    & 2Pf0.2,"%")')1.0/(log(real(n))-1.0)
end if
Example2: Calculate density of primes
Calculating prime density on 2 images
664580 primes in 10000000 numbers
density is 6.65%
asymptotic theory gives 6.61%
Launching a coarray program
The Fortran standard does not specify how a program is
launched
The number of images may be set at compile, link or run-time
A compiler could optimize for a single image
Observations so far on coarrays
Natural extension, easy to learn
Makes parallel parts of program obvious (syntax)
Part of Fortran language (type checking, etc)
No mapping of data to buffers (or copying) or creation of
complex types (as we might have with MPI)
Compiler can optimize for communication
More observations later
Exercise Session 1
Look at the Exercise Notes document for full details
number of images
Extend the simple Fortran code provided in order to perform
operations on parts of a picture using coarrays
Backup Slides HPF model
High Performance Fortran (HPF)
Data Parallel programming model
Single thread of control
Arrays can be distributed and operated on in parallel
Loosely synchronous
Parallelism mainly from Fortran 90 array syntax, FORALL and intrinsics
This model popular on SIMD hardware (AMT DAP, Connection Machines) but extended to clusters where control thread is replicated
More Coarray Features
Parallel Programming with Fortran Coarrays
Overview
Multiple Dimensions and Codimensions
Allocatable Coarrays and Components of Coarray
images
P(m,n) Variables/arrays
P(m,n)[*]
P(m,n)[k,*]
2D Data
assemble rather than distribute
Can assemble a 2D data structure from 1D arrays
global access: ca(3,1)[2,2]
local access: ca(3,1)
Coarray Subscripts
Fortran arrays defined by rank, bounds and shape
  integer, dimension(10,4) :: array
  rank 2
  lower bounds 1, 1; upper bounds 10, 4
  shape [10, 4]
Coarray Fortran adds corank, cobounds and coshape
  integer :: array(10,4)[3,*]
  corank 2
  lower cobounds 1, 1; upper cobounds 3, m
  coshape [3, m]
  m would be ceiling(real(num_images())/3)
Multiple Codimensions
Coarrays with multiple Codimensions:
character :: a(4)[2, *] !2D grid of images
for 4 images, grid is 2x2; for 16 images, grid is 2x8
real :: b(8,8,8)[10,5,*] !3D grid of images
8x8x8 local array; with 150 images, grid is 10x5x3
integer::c(6,5)[0:9,0:*] !2D grid of images
lower cobounds [ 0, 0 ]; upper cobounds [ 9,n]
useful if you want to interface with MPI or want C like coding
Sum of rank and corank should not exceed 15
Flexibility with cobounds
can set all but final upper cobound as required
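The cobounds described above can be inspected at run time. The sketch below reuses the declarations from the slides and queries them with the Fortran 2008 intrinsics lcobound, ucobound and this_image; the program name codim_demo and the output text are illustrative assumptions. The final upper cobound reported depends on how many images the program is launched with.

```fortran
program codim_demo
  implicit none
  integer :: array(10,4)[3,*]       ! corank 2, lower cobounds 1, 1
  integer :: c(6,5)[0:9,0:*]        ! corank 2, lower cobounds 0, 0

  if (this_image() == 1) then
     print *, 'images:          ', num_images()
     print *, 'lower cobounds:  ', lcobound(array)
     print *, 'upper cobounds:  ', ucobound(array)
     print *, 'c lower cobounds:', lcobound(c)
  end if
  sync all

  ! this_image(array) returns this image's cosubscripts in the [3,*] grid
  print *, 'image', this_image(), 'has cosubscripts', this_image(array)
end program codim_demo
```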