
OpenSBLI: A framework for the automated derivation and parallel execution of finite difference solvers on a range of computer architectures


DOCUMENT INFORMATION

Title: OpenSBLI: a framework for the automated derivation and parallel execution of finite difference solvers on a range of computer architectures
Authors: Christian T. Jacobs, Satya P. Jammy, Neil D. Sandham
Institution: University of Southampton
Subject: Computational Science
Type: Journal article
Year of publication: 2017
City: Southampton
Pages: 12


journal homepage: www.elsevier.com/locate/jocs

Christian T. Jacobs∗, Satya P. Jammy, Neil D. Sandham

Article info

Keywords:

Abstract

Exascale computing will feature novel and potentially disruptive hardware architectures. Exploiting these to their full potential is non-trivial. Numerical modelling frameworks involving finite difference methods are currently limited by the 'static' nature of the hand-coded discretisation schemes and repeatedly may have to be re-written to run efficiently on new hardware. In contrast, OpenSBLI uses code generation to derive the model's code from a high-level specification. Users focus on the equations to solve, whilst not concerning themselves with the detailed implementation. Source-to-source translation is used to tailor the code and enable its execution on a variety of hardware.

© 2016 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

1 Introduction

High Performance Computing (HPC) systems and architectures are evolving rapidly. Traditional single processor-based CPU clusters are moving towards multi-core/multi-threaded CPUs. At the same time new architectures based on many-core processors such as graphics processing units (GPUs) and Intel's Xeon Phi are emerging as important systems and further developments are expected with energy-efficient designs from ARM and IBM. According to the IT industry, such advances are expected to deliver compute hardware capable of exascale performance (i.e. 10^18 floating-point operations per second) by 2018 [1]. Yet many frameworks aimed at computational/numerical modelling are currently not ready to exploit such new and potentially disruptive technologies.

Traditional approaches to numerical model development involve the production of static, hand-written code to perform the numerical discretisation and solution of the governing equations. Normally this is written in a language such as C or Fortran that is considerably less abstract when compared to a near-mathematical domain specific language. Explicitly inserting the necessary calls to MPI or OpenMP libraries enables the execution of the code on multi-core or multi-thread hardware. However, should a user wish to run the code on alternative platforms such as GPUs, they would likely need to re-write large sections of the code, including calls to new libraries such as CUDA or OpenCL, and optimise it for that particular hardware backend [2]. As HPC hardware evolves, an increasing burden faced by computational scientists becomes apparent; in order to keep up with trends in HPC, not only must a model developer be a domain specialist in their area of study, but also an expert in numerical algorithms, software engineering, and parallel computing paradigms [3,4].

One way to address this issue is to introduce a separation of concerns using high-level abstractions, such as domain specific languages (DSLs) and active libraries [4–8]. This paradigm shift allows a domain specialist to describe their problem as a high-level, near-mathematical specification. The task of taking this specification and transforming it into executable computer code can then be handled in the subsequent abstraction layer; unlike the traditional approach of hand-writing the C/Fortran code that discretises the governing equations, this layer generates the code automatically from the problem specification. Finally, the generated code can be readily targeted towards a specific hardware platform through source-to-source translation. Hence, domain specialists focus on the equations they wish to solve and the setup of their problem, whilst the parallel computing experts can introduce support for

http://dx.doi.org/10.1016/j.jocs.2016.11.001


have to undergo a fundamental re-write if the desired backend changes. Use of such strategies can have significant benefits for the productivity of both the user and developer, by removing the need to spend time re-writing code and/or the problem specification [5].

Given the motivation for the use of automated solution techniques, in this paper we present a new framework, OpenSBLI, for the automated derivation and parallel execution of finite difference-based models. This is an open-source release of the recent developments in the SBLI codebase developed at the University of Southampton, involving the replacement of SBLI's Fortran-based core with flexible Python-based code generation capabilities, and the coupling of SBLI to the OPS active library [9–12] which targets the generated code towards a particular backend using source-to-source translation. Currently, OpenSBLI can generate OPS-compliant C code to discretise and solve the governing equations, using arbitrary-order central finite difference schemes and a choice of either the forward Euler scheme or a third-order Runge-Kutta time-stepping scheme. OpenSBLI then uses OPS to produce code targeted towards different backends. It is worth noting that backend APIs such as OpenMP (version 4.0 and above) are also capable of running on CPU, GPU and Intel Xeon Phi architectures, for example. However, currently OPS has no support for OpenMP version 4.0 and above. Moreover, codes that are written by hand in OpenMP would still potentially need to be re-written if different algorithms or equations were to be considered. Thus, the benefits of code generation still play a crucial role here, regardless of which backend is chosen.

The application of SBLI has so far concentrated on problems in aeronautics and aeroacoustics, in particular looking at shock-boundary layer interactions (see e.g. [13–16] and the references therein for more details). While such applications entail solving the 3D compressible Navier-Stokes equations, in principle other equations expressible in Einstein notation and solved using finite differences are also supported by the new code generation functionality, highlighting another advantage of such a flexible approach to numerical model development. Note also that while OpenSBLI does not yet feature shock-capturing schemes and Large Eddy Simulation models (unlike the legacy SBLI code), these will be implemented in the future as part of the project's roadmap. The main purpose of this initial release is to introduce the algorithmic changes to legacy SBLI's core.

Details of the abstraction and design principles employed by OpenSBLI are given in Section 2. Section 3 details three verification and validation test cases that were used to check the correctness of the implementation. The paper finishes with some concluding remarks in Section 4.

2 Design

Legacy versions of SBLI comprise static hand-written Fortran code, parallelised with MPI, that implements a fourth-order central differencing scheme and a low-storage, third or fourth-order Runge-Kutta timestepping routine. It is capable of solving the compressible Navier-Stokes equations coupled with various turbulence parameterisations (e.g. Large Eddy Simulation models) and diagnostic routines. In contrast, OpenSBLI is written in Python, and by replacing the legacy core with modern code generation techniques, the existing functionality of SBLI is enriched with new flexibility; the compressible Navier-Stokes equations can still be solved in OpenSBLI for the sake of continuity, but the set of equations that can be readily solved essentially becomes a superset of that of the legacy code. Furthermore, the use of the OPS library allows the generated code to easily be targeted towards sequential, MPI, or an MPI+OpenMP hybrid backend (for CPU parallel execution), CUDA and OpenCL (for GPU parallel execution), and OpenACC (for parallel execution on accelerators), without the need to re-write the model code. OPS is readily extensible in terms of new backends, making the code generation technique an attractive way of future-proofing the codebase and preparing the framework for exascale-capable hardware when it arrives. The main achievement of OpenSBLI is the ability to express model equations at a high level with the help of the SymPy library [17], expanding the equations based on the index notation, and coupling this functionality with the generation of OPSC-based model code and also with the OPS library which performs code targeting. OpenSBLI's focus on the generation of computational kernels essentially forms a bridge between the high-level equations and the computational parallel loops ('par-loops') that iterate over the grid points to solve the governing equations.

For any given simulation that is to be performed with OpenSBLI, the problem (comprising the equations to be solved, the grid to solve them on, their associated boundary and initial conditions, etc.) must be defined in a setup file, which is nothing but a Python file which instantiates the various relevant components of the OpenSBLI framework. All components follow the principle of object-oriented design, and each class is explained in detail throughout the subsections that follow. An overview of the class relationships is also provided in Fig. 1.


2.1 Equation specification

In a similar fashion to other problem solving environments such as OpenFOAM [18], Firedrake [4], FEniCS [5,6], OPESCI-FD [19], Devito [20,21], deal.II [22] and FreeFEM++ [23], OpenSBLI comprises a high-level interface for specifying the differential equations that are to be solved. These equations (and any accompanying formulas for temperature-dependent viscosity, for example) can be expressed in Einstein notation, also known as index notation. The adoption of such an abstraction is advantageous since it removes the need for the user to expand the equations by hand, which can be an error-prone task. Furthermore, much like the Devito domain specific language (DSL) [20,21] for finite difference stencil compilation, OpenSBLI makes use of the SymPy symbolic algebra library that supplies the basic components required for the modelling functionality that has been implemented in the present work. This functionality includes the automatic expansion of indices based on their contraction structure, such that repeated indices are expanded into a sum about that index, and the implementation of various types of differential operator.

2.1.1 Expressing

Consider the conservation of mass equation

\[ \frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x_j}\left[\rho u_j\right] = 0, \qquad (1) \]

where u_j is the jth component of the velocity vector u, ρ is the density field, and x_j is the coordinate field in the jth dimension. In an OpenSBLI problem setup file, the user would specify this as a string, giving the left-hand side and right-hand side of the equation in the following format:

mass = "Eq(Der(rho,t), -Conservative(rho*u_j,x_j))"

The functions Der and Conservative here are OpenSBLI-specific derivative operators, each defined in their own class derived from SymPy's Function class. Other high-level interfaces such as OpenFOAM offer similar differential operators such as div and grad, for example [18]. General derivatives are represented using the Der operator, whereas the Conservative operator ensures that the derivative will not be expanded using the product rule. A skew-symmetric form of the derivative is also available using the Skew function, discussed later in Section 3.3. All of these are essentially 'handler'/placeholder objects that OpenSBLI uses for spatial/temporal discretisation after parsing and expanding the equations about the Einstein indices. Special functions such as the Kronecker delta function and the Levi-Civita symbol are also available, derived from SymPy's LeviCivita and KroneckerDelta classes in order to handle Einstein expansion; these too are expanded later by OpenSBLI.
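As a rough illustration of the placeholder idea, the sketch below defines two undefined SymPy functions and parses the mass equation string with them. This is not OpenSBLI's implementation: the empty class bodies and the use of `local_dict` are assumptions made for the example, and the names `u_j` and `x_j` assume the underscore index convention described for EinsteinTerm.

```python
from sympy import Function
from sympy.parsing.sympy_parser import parse_expr


class Der(Function):
    """Placeholder for a general derivative (illustrative, not OpenSBLI's class)."""


class Conservative(Function):
    """Placeholder for a derivative kept in conservative form (illustrative)."""


mass = "Eq(Der(rho,t), -Conservative(rho*u_j,x_j))"

# Names that SymPy does not know (rho, t, u_j, x_j) become plain Symbols;
# Der and Conservative resolve to the placeholder classes above.
parsed = parse_expr(mass, local_dict={"Der": Der, "Conservative": Conservative})

print(type(parsed).__name__)      # Equality
print(parsed.lhs.func.__name__)   # Der
```

Because the placeholders have no evaluation rule, they stay in the expression tree untouched until a later stage decides how to discretise them.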

2.1.2 Parsing

Once all of the governing equations have been expressed by the user in string format, they are collected together in OpenSBLI's Problem class (see Appendix A). This class also accepts substitutions, formulas, and constants. For long equations, such optional substitutions (such as the definition of the stress tensor) can be written as a separate string (in the same way as the governing equations) to allow better equation readability, and then automatically substituted into the equations (such as the conservation of momentum and energy equations) at expansion-time instead of performing such error-prone manipulations by hand. The constitutive equations which define a relationship between the prognostic and non-prognostic variables are given as formulas, for example temperature-dependent viscosity relations, and an equation of state for pressure. The constants are the spatially and temporally independent variables which are represented as strings. Upon instantiation of the Problem class, the process is invoked to transform the equations into their final expanded form.

For each equation in string form, a new OpenSBLI Equation object is created. During its initialisation, SymPy's parse_expr function converts the equation string into a SymPy Eq data type. Any of the OpenSBLI derivative operators such as Der and Conservative (currently in string format) are replaced by actual instances of the Der and Conservative classes. Similarly, any substitutions given in the Problem are parsed and substituted directly into the expression using SymPy's xreplace function. All other terms in the parsed expression are represented by OpenSBLI's EinsteinTerm class, derived from SymPy's Symbol class, which contains its own methods and attributes for determining/expanding Einstein indices. For example, the class's initialisation method __init__ splits up the term u_j where there are underscore markers, and stores the Einstein index _j in a list as a SymPy Idx object. The get_expanded method later replaces the alphabetical Einstein indices with actual numerical indices, replacing _j with 0 and 1, in the 2D case. Finally, any constants in the Problem object are also represented as an EinsteinTerm object, but are flagged as constant terms in OpenSBLI, so that they are not spatially or temporally-dependent. The coordinate vector components x_j (and the time term t) are a special case of an EinsteinTerm; these are marked with an is_coordinate flag so that, during the expansion phase, the EinsteinTerms are made dependent on the coordinate field (and time, if appropriate) to ensure that differentiation is performed correctly.
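The underscore-splitting and index-numbering steps described above can be sketched in plain Python. The function names `split_einstein_term` and `expand_term` are hypothetical and work on strings only; OpenSBLI's EinsteinTerm class operates on SymPy objects instead.

```python
def split_einstein_term(name):
    """Split 'u_j' into ('u', ['j']); a name without underscores has no indices."""
    base, *indices = name.split("_")
    return base, indices


def expand_term(name, ndim):
    """Replace each alphabetical index by 0..ndim-1, e.g. 'u_j' -> ['u0', 'u1'] in 2D."""
    base, indices = split_einstein_term(name)
    expanded = [base]
    for _ in indices:
        # Each additional index multiplies the number of components by ndim.
        expanded = [f"{prefix}{d}" for prefix in expanded for d in range(ndim)]
    return expanded


print(expand_term("u_j", 2))      # ['u0', 'u1']
print(expand_term("tau_i_j", 2))  # ['tau00', 'tau01', 'tau10', 'tau11']
print(expand_term("rho", 3))      # ['rho']
```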

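2.1.3 Expanding

After the parsing and substitution stage, the equations are expanded about repeated indices. Note that this process is performed by OpenSBLI, although various SymPy classes underpin the functionality. Following the example, (1) would be expanded as

\[ \frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x_0}\left[\rho u_0\right] + \frac{\partial}{\partial x_1}\left[\rho u_1\right] = 0. \qquad (2) \]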

OpenSBLI loops over each EinsteinTerm stored in the parsed Equation object, and maps it to a SymPy Indexed object. For example, the term u_k would first be mapped to u[k]. The index k in the term is then expanded over 0, ..., d−1 (where d is the dimension of the problem) by replacing it with each integer dimension, yielding a SymPy MutableDenseNDimArray array of size d (for a vector function, or d×d for a tensor of rank 2) of expanded variables which is stored as a class attribute. For example, expanding the vector u[k] yields the expansion array [u0, u1] in 2D. Upon expansion, the terms are also made spatially-dependent (i.e. indexed by x0, x1, x2 coordinates, depending on the dimension) and, if applicable, temporally-dependent (i.e. indexed also by t). The only exceptions to this are constants such as the Reynolds number Re. The expansion array from the previous example then becomes [u0[x0, x1, t], u1[x0, x1, t]] (and [x0, x1] for the constant coordinate field).

Each equation is expanded by locating any repeated indices and then summing over them as appropriate. For example, after mapping each EinsteinTerm (e.g. u_k) to an Indexed object (e.g. u[k]), the mass equation is represented internally as

Eq(Der(rho,t), -Conservative(rho*u[k],x[k]))

Since the index k is repeated, the expansion arrays are used to expand this expression to

Eq(Der(rho[x0,x1,t],t),
-Conservative(rho[x0,x1,t]*u0[x0,x1,t],x0[x0,x1,t])
-Conservative(rho[x0,x1,t]*u1[x0,x1,t],x1[x0,x1,t]))


Fig. 2. The regular grid of solution points upon which the governing equations

Finally, the Der and Conservative functions are applied, with the expression becoming

Eq(Derivative(rho[x0,x1,t],t),
-Derivative(rho[x0,x1,t]*u0[x0,x1,t],x0)
-Derivative(rho[x0,x1,t]*u1[x0,x1,t],x1))

which is equivalent to (2). Similar expansion can also be applied for any other equations involving e.g. diagnostic fields. Note how the calls to Der and Conservative have been replaced by calls to SymPy's Derivative class (which in turn uses SymPy's diff function); while it is SymPy that handles the differentiation, it is OpenSBLI that handles the exact formulation of the derivative (i.e. OpenSBLI has ensured that the derivative has not been expanded using the product rule here).
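The difference between keeping a derivative in conservative form and letting the product rule apply can be seen with SymPy alone. In this sketch, ordinary SymPy functions of (x0, x1, t) stand in for OpenSBLI's Indexed objects, which is an assumption made for brevity; the variable names are illustrative.

```python
import sympy as sp

x0, x1, t = sp.symbols("x0 x1 t")
coords = (x0, x1)
rho = sp.Function("rho")(x0, x1, t)
u = [sp.Function(f"u{d}")(x0, x1, t) for d in range(2)]

# Sum over the repeated index. Derivative(...) stays unevaluated, so the
# product rho*u_d is not split by the product rule.
flux_divergence = sum(sp.Derivative(rho * u[d], coords[d]) for d in range(2))
mass = sp.Eq(sp.Derivative(rho, t), -flux_divergence)

# Calling doit() is what would apply the product rule to each term.
expanded = flux_divergence.doit()

print(mass)
print(expanded)
```

The unevaluated form contains two Derivative objects, one per dimension, whereas the evaluated form contains four, one for each factor of each product.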

Any nested derivatives are also handled here. It is not currently possible to specify, for example, diff(diff(u_j, x_i), x_j) using SymPy's diff function directly because the fact that u_j is dependent on x_i and x_j is not taken into account. In contrast, the use of Der and EinsteinTerms like u_j in OpenSBLI allows the derivative to be computed correctly since the terms are made dependent through the use of Indexed objects as previously described. OpenSBLI users must instead use the Der function Der(Der(u_j, x_i), x_j). For each nested derivative (or nested function in general), the inner function is evaluated first along with all other non-nested functions. Only then is the outer function applied.
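The problem with differentiating plain symbols can be reproduced in a few lines. The sketch uses a SymPy function of the coordinates as a stand-in for the dependent terms; OpenSBLI itself achieves the dependence through Indexed objects, as described above.

```python
import sympy as sp

# Plain symbols carry no dependence on the coordinates, so differentiating
# one with respect to another gives zero.
u_j, x_i, x_j = sp.symbols("u_j x_i x_j")
naive = sp.diff(sp.diff(u_j, x_i), x_j)

# Making the velocity component an explicit function of the coordinates
# keeps the nested derivative alive.
x0, x1 = sp.symbols("x0 x1")
u0 = sp.Function("u0")(x0, x1)
nested = sp.diff(sp.diff(u0, x0), x1)

print(naive)   # 0
print(nested)
```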

For the purposes of debugging, OpenSBLI includes a LatexWriter class that takes the expanded equations as input and writes them out in LaTeX format so developers can more easily spot errors, for example where indices have been expanded incorrectly.

2.2 Grid

The governing equations are discretised on a regular grid of solution points that span the domain of interest; an example is provided in Fig. 2. All grid-related functionality is handled by the Grid class, which must be instantiated by the user in the problem setup file. The dimensionality of the problem d, the number of points in each dimension, and the grid spacing must all be supplied. A problem of dimension d would generate a grid of N_{x_0} × ··· × N_{x_{d−1}} solution points in total, where N_{x_i} represents the user-defined number of grid points in direction x_i.

For the sake of looping over each solution point and computing the necessary derivatives via the finite difference method, each (non-constant) term is processed further by OpenSBLI; the index of each spatial coordinate (e.g. x0) is mapped onto an index over the grid points in that spatial direction (e.g. i0) which will iterate from 0 to N_{x_i}−1 (for a given direction x_i) when the computational kernel is eventually generated.

In addition to the solution points within the physical domain, a set of halo points (or 'ghost' points), which border the outer-most grid points, are also created automatically depending on the boundary conditions and the spatial order of accuracy. These halo points are necessary to ensure that the derivatives near the boundary can be computed with the same stencil as the 'inner' points. The exact number of halo points required therefore depends on the number of stencil points; for example, in Fig. 2 the stencil for a second-order central difference (using 3 points in each direction) would require one halo point at each end of the domain. The values that these halo points hold depend on the type of boundary condition applied, and this is discussed in more detail in Section 2.6.

Every field/term in the governing equations that is represented by the grid indices holds a so-called 'work array' which essentially contains the field's numerical value at each of the grid points, including the halos. The implementation of initial and boundary conditions is done by accessing and modifying this work array, as will be described in Sections 2.5 and 2.6.

2.3 Computational kernels

The Kernel class defines a sequence of computational steps that should be performed to solve the governing equations. For instance, one kernel may be created to compute the spatial derivative of a field, while another kernel handles the initialisation of the field values based on a given initial condition, and another handles the enforcement of boundary conditions that involve computations. During the instantiation of a kernel, the relevant variables and fields are classified as inputs, outputs and input/outputs (i.e. both an input and an output), and the kernel's range of evaluation (i.e. the range of grid indices over which the kernel is applied) is determined. This helps to minimise data transfer, since only those variables/fields required to perform the computation are passed to the generated kernel code.

2.4 Discretisation schemes

Once a grid is created, the equations are discretised upon that grid. For spatial discretisation purposes OpenSBLI offers a central differencing scheme for first and second-order derivatives; all the stencil coefficients are computed using SymPy, which allows stencils of an arbitrary order of accuracy to be created. For temporal discretisation purposes, OpenSBLI features the (first-order) forward Euler scheme as well as the same low-storage, third-order Runge-Kutta timestepping scheme [24] present in the legacy SBLI code.
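One way SymPy can produce central stencil weights of arbitrary order is its `finite_diff_weights` function. Whether OpenSBLI calls this particular routine is not stated here, so the sketch below should be read as an illustration only; the helper `central_weights` and its stencil-width rule (even order of accuracy, narrowest symmetric stencil, unit grid spacing) are assumptions.

```python
from sympy import Integer, Rational, finite_diff_weights


def central_weights(derivative, accuracy):
    """Exact weights of a central stencil on unit-spaced points about 0.

    Assumes an even order of accuracy and the narrowest symmetric stencil.
    """
    half_width = (derivative + 1) // 2 - 1 + accuracy // 2
    points = [Integer(p) for p in range(-half_width, half_width + 1)]
    # finite_diff_weights returns weights for every derivative order up to
    # `derivative` and every leading subset of points; the last entry uses
    # the full stencil.
    return finite_diff_weights(derivative, points, Integer(0))[derivative][-1]


print(central_weights(1, 2))  # [-1/2, 0, 1/2]
print(central_weights(1, 4))  # [1/12, -2/3, 0, 2/3, -1/12]
print(central_weights(2, 2))  # [1, -2, 1]
```

Keeping the weights as exact rationals, rather than floats, is what later allows them to be evaluated once and stored as constants in the generated code.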

To use a particular scheme, one should instantiate a discretisation scheme derived from the generic base class called Scheme, which essentially stores the finite difference stencil coefficients or the weights used in a particular time-stepping scheme. Spatial and temporal schemes should be instantiated separately.

For the purpose of spatial discretisation, handled by the OpenSBLI SpatialDiscretisation class, an Evaluations object is created for each of the formulas, and the derivatives in the equations. Each object automatically finds and stores the


dependencies A and B). Once all the Evaluations have been created, they are sorted with respect to their dependencies being evaluated (e.g. if B depends on A, then A should be evaluated first). The next step involves defining the range of grid point indices over which each evaluation should be performed, and also assigning a temporary work array for each evaluation. All of the evaluations are then described by a Kernel object (see Section 2.3). It is here, while creating the kernels, that the (continuous) spatial derivatives are automatically replaced by their discrete counterparts. It should be noted that, for the evaluation of formulas, these kernels are fused together if they have no inter-dependencies to avoid race conditions when running on threaded architectures. Finally, to evaluate the residual for the purposes of temporal discretisation, the derivatives in the expanded equations (represented by an Evaluations object) are substituted by their temporary work arrays, and a Kernel is created for evaluating the residual of each equation.

The temporal discretisation, handled by the TemporalDiscretisation class, involves applying the various stages of the time-stepping scheme supplied using the residuals computed by the spatial discretisation process. Similarly, a Kernel object is created for the evaluations in the time-stepping scheme.

2.5 Initial conditions

In order for the prognostic fields to be advanced forward in time, initial conditions can be applied using the GridBasedInitialisation class. This is accomplished in much the same way as specifying equations, but involves assignment of grid variables and work arrays of grid point values. For example, in the simulation setup file the x0 coordinate can be defined using the grid point index and Δx0:

x0 = "Eq(grid.grid_variable(x0), grid.Idx[0]*grid.deltas[0])",

which in turn defines the initial value for each prognostic variable, by assigning this to the array of values at each grid point (also known as the variable's work array), e.g.:

rho = "Eq(grid.work_array(rho), 2.0*sin(x0))"
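Numerically, the two assignments above amount to computing a coordinate from the grid index and spacing, then evaluating the initial profile at every grid point. The following plain-Python sketch mimics that on a small 1D grid; the grid size and spacing are hypothetical values chosen for the example.

```python
import math

n_points = 8                         # hypothetical number of grid points
delta = 2.0 * math.pi / n_points     # hypothetical grid spacing

# Counterpart of the grid variable: coordinate = grid index * spacing.
x0 = [i * delta for i in range(n_points)]

# Counterpart of the work array: initial density at every grid point.
rho = [2.0 * math.sin(x) for x in x0]

print(rho[0])   # 0.0
```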

2.6 Boundary conditions

OpenSBLI currently comprises two types of boundary condition, implemented in the classes PeriodicBoundaryCondition and SymmetryBoundaryCondition. Users may apply different boundary conditions in different directions if they so wish. Periodic boundaries are defined such that, for each prognostic field φ, φ(x_0) = φ(x_N) where N is the number of points in the domain. This condition is achieved via the exchange of halo point data at each end of the domain. Symmetry boundary conditions enforce the condition that φ(x_N) = φ(x_{N−1}) for scalar fields and φ_i(x_N) = −φ_i(x_{N−1}) for vector fields (in the direction i), which is achieved using a computational kernel.

2.7 Input and output

The state of the prognostic fields can be written to disk every n iterations as defined by the user, or only at the end of the simulation. This functionality is handled by the FileIO class. OpenSBLI adopts the HDF5 format [25,26] as it features parallel read/write capabilities and therefore has the potential to overcome the serial input/output bottleneck currently plaguing many large-scale parallel applications [27,28]. Future releases of OpenSBLI will come with the ability to read in mesh files and the state fields from an HDF5 file, enabling the restarting of simulations from 'checkpoints' as well as the assignment of initial conditions that cannot be simply defined by a formula.

2.8 Code generation

OpenSBLI currently generates code in the OPSC language which performs the simulation; this is essentially standard C++ code that includes calls to the OPS library. Such functionality is accomplished using the OpenSBLI OPSCCodePrinter class (derived from SymPy's CCodePrinter class, used to perform the generation of OPSC code statements) and the OPSC class (which agglomerates the literal strings of OPSC statements and kernel functions and writes them to file). The generated code's structure follows a generic template that maps out the order in which the simulation steps/computations are to be called. The template is represented as a multi-line Python string template, with each line containing a place-holder for the code that performs a particular step. Examples include $header which is replaced by any generic boilerplate header code (e.g. #include <stdlib.h> and kernel function prototypes), $initialisation which is replaced by the grid and field setup (e.g. by declaring an OPS block using the ops_decl_block function), and $bc_calls which is replaced by calls to the boundary condition kernel(s). This template can be readily changed to incorporate additional functionality, such as the inclusion of turbulence models. Once all component place-holders have been replaced by OPSC code, the code is written out to disk. For the case of the OPSC language, two files are written; one is a C++ header file containing the computational kernels, and the other is the C++ source file containing various constant definitions (e.g. the timestep size deltat, and the constants of the Butcher tableau for the time-stepping scheme), OPS data structures, and calls to the kernels specified in the header file.
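The place-holder mechanism can be illustrated with Python's standard `string.Template`. The place-holder names below follow the text, but the template layout and the replacement strings are made up for this sketch and do not reproduce OpenSBLI's actual template.

```python
from string import Template

# Place-holder names follow the text; everything else here is illustrative.
template = Template("""$header
int main(int argc, char **argv) {
    $initialisation
    $bc_calls
    return 0;
}
""")

code = template.substitute(
    header="#include <stdlib.h>",
    initialisation="/* grid and field setup goes here */",
    bc_calls="/* boundary condition kernel calls go here */",
)

print(code)
```

Adding a new simulation step then only requires a new place-holder line in the template and a string to substitute for it.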

OpenSBLI's local Python objects (most pertinently, the kernel objects that describe the computations to be performed on the grid) are essentially translated to OPSC data structures and function calls during the preparation of the code. For instance, when declaring computational stencils that define a particular central differencing scheme, the local grid indices stored in the Central scheme object are used to write out an ops_stencil definition during code generation. Similarly, ops_halo structures and calls to ops_halo_transfer are produced to facilitate the implementation of the periodic boundary conditions. All fields are declared as ops_dat datasets; for an example of where these are used, see the function ops_argument_call in the file opsc.py which generates/accumulates calls to the OPS function ops_arg_dat through the use of 'printf'-style string formatting, filling in the 'placeholder' arguments (e.g. %s in Python) with values from the local OpenSBLI objects. Finally, calls to OpenSBLI Kernel objects are represented in OPSC as regular C++ functions (see Fig. 3) which are passed to the ops_par_loop function (see Fig. 4) which executes the function efficiently over the range of grid points within the desired block; OpenSBLI is currently a single-block code so only one block, containing all the grid points, is used. Further details on the OPS data structures and functionality can be found in the work by Reguly et al. [10].

Some optimisations are performed during the code generation stage by OpenSBLI to avoid unnecessary and expensive division operations in the kernels; rational numbers (e.g. finite difference stencil weights that are rational) and constant EinsteinTerms raised to negative powers (e.g. Re^−1) are evaluated and stored (e.g. by over-riding the _print_Rational method in the OPSCCodePrinter class).
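The idea of over-riding `_print_Rational` can be sketched with SymPy's C code printer. This is a simplified stand-in, not the OPSCCodePrinter: the class name `ConstantRationalPrinter`, the `rc0`, `rc1`, ... naming scheme and the dictionary of stored constants are assumptions, and only rationals (not negative powers of constants) are handled.

```python
from sympy import Rational, Symbol

try:
    from sympy.printing.c import C99CodePrinter
except ImportError:  # older SymPy releases use a different module name
    from sympy.printing.ccode import C99CodePrinter


class ConstantRationalPrinter(C99CodePrinter):
    """Print each rational as a named constant instead of a division."""

    def __init__(self, settings=None):
        super().__init__(settings)
        self.rational_constants = {}

    def _print_Rational(self, expr):
        # Reuse the name if this rational has been seen before.
        return self.rational_constants.setdefault(
            expr, "rc%d" % len(self.rational_constants))


printer = ConstantRationalPrinter()
x = Symbol("x")
code = printer.doprint(Rational(2, 3) * x)

# Values to be declared once, outside the kernel, in the generated code.
constants = {name: float(value)
             for value, name in printer.rational_constants.items()}
print(code)
print(constants)
```

The kernel expression then multiplies by a stored constant, and the division happens only once when the constant is defined.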

Once the code generation process is complete, the OPS library is called to target the code towards various backends. These include the sequential code, MPI and hybrid MPI+OpenMP parallelised versions of the code for CPUs, CUDA and OpenCL versions


Fig. 3. Code snippet showing two kernels from a 2D 'method of manufactured solutions' (MMS) simulation (see Section 3.2) using second-order central differences. The first

of the code for GPUs, and an OpenACC version for accelerators. The test cases presented in this paper (see Section 3) consider the sequential, MPI, and CUDA backends. Targeting 'hand-written'/manually-generated model code towards a particular architecture is something that is well-known as a time-consuming, error-prone and often unsustainable activity; often numerical models have to be completely re-written, involving many if-else statements and #ifdef-style pragmas to ensure that the correct branch of the code is followed for a given backend. As the number of backends grows, the code becomes unsustainable. In contrast, with the abstraction introduced here through code generation, support for a new backend only needs to be added to the OPS library; the top-level, abstract definition of the equations and their implementation need not be modified due to the separation of concerns, thereby highlighting one of the key advantages of automated model development.

When comparing the number of lines and the complexity of the code that gets generated by OpenSBLI, another advantage of automated model development becomes clear; in the case of the 3D Taylor-Green vortex test case, the problem specification file containing ∼100 lines generates OPSC code that is approximately 1500 lines long (excluding blank lines and comments). As more parameterisations (e.g. Large Eddy Simulation turbulence models) and diagnostic field computations are added, it is expected that this number would grow even further relative to the number of lines required in the setup file.

3 Verification and Validation

In order to verify the correctness of OpenSBLI and be confident in the ability of the solution algorithms to accurately represent the underlying physics, three representative test cases covering 1, 2 and 3 dimensions were created and are presented here.

3.1 Propagation of a wave

This 1D test case considers the first-order wave equation, given by

\[ \frac{\partial \phi}{\partial t} + c\frac{\partial \phi}{\partial x} = 0, \qquad (3) \]

where φ is the quantity that is transported at constant speed c. The expected behaviour is that an arbitrary initial profile at time t = 0 is displaced by a distance d_t = ct, such that φ(x, t = 0) = φ(x + d_T, t = T) for some finish time T. The constant c was set to 0.5 m s^−1 in this case, and the equation was solved on the line 0 ≤ x ≤ 1 m. Eighth-order central differencing was used to discretise the domain in space in conjunction with a third-order Runge-Kutta scheme for temporal discretisation. The grid spacing Δx was set to 0.001 m, and the timestep size Δt was set to 4×10^−4 s, yielding a Courant number of 0.2. A smooth, periodic initial condition φ(x, t = 0) = sin(2πx) was used, and periodic boundary conditions were enforced at both ends of the domain.

The simulation was run in serial (on an Intel® Core™ i7-4790 CPU) until a finish time of t = 1 s. The initial and final states of the solution field φ are shown in Fig. 5. As desired, the error in the solution is very small at O(10^−10), and provides some confidence in the implementation of the solution method and the periodic boundary conditions.
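The quoted Courant number and the expected displacement of the profile follow directly from the stated parameters, as this short check shows. The helper `exact` is a hypothetical name for the analytical solution, assuming the sin(2πx) initial profile is simply shifted by ct.

```python
import math

c = 0.5        # wave speed (m/s)
dx = 0.001     # grid spacing (m)
dt = 4.0e-4    # timestep size (s)
T = 1.0        # finish time (s)

courant = c * dt / dx      # Courant number, c*dt/dx
displacement = c * T       # distance travelled by the profile by time T


def exact(x, t):
    """Initial profile sin(2*pi*x) shifted by a distance c*t."""
    return math.sin(2.0 * math.pi * (x - c * t))


print(round(courant, 12))   # 0.2
print(displacement)         # 0.5
```

Since the profile has a period of 1 m, a displacement of 0.5 m means the solution at t = 1 s is the initial profile with its sign reversed.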


Fig. 5. Results from the 1D wave propagation simulation. Left: The solution field φ at time t = 0 s and t = 1 s. Right: The error between the analytical solution and the numerical

3.2 Method of manufactured solutions

The method of manufactured solutions (MMS) is a rigorous way to check the correctness of a numerical method's implementation [29–31]. The overall algorithm involves constructing a manufactured solution φ_m for the prognostic variable(s) and substituting this into the governing equation. Since the manufactured solution will not, in general, be the exact solution to the equation, a non-zero residual term will be present. This residual term is then subtracted from the RHS such that the manufactured solution essentially becomes the exact/analytical solution of the modified equation (i.e. the one with the source term). A suite of simulations can then be performed using increasingly fine grids to check that the numerical solution converges to the manufactured solution at the expected rate determined by the discretisation scheme.

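For this test, the 2D advection-diffusion equation (with a source term S) given by

\[ \frac{\partial \phi}{\partial t} + \frac{\partial}{\partial x_j}\left(\phi u_j - k\frac{\partial \phi}{\partial x_j}\right) \]

is considered.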

The constant k is the diffusivity coefficient which is set to 0.75 m^2 s^−1 here. The prescribed field u_i is the ith velocity component, with u_0 = 1.0 m s^−1 and u_1 = −0.5 m s^−1. The prognostic field φ is to be determined and has an initial condition of φ(x, t = 0) = 0. In a similar fashion to the works of [29–31], the manufactured/'analytical' solution φ_m = sin(x_0)cos(x_1) employs a mixture of sine and cosine functions since these are continuous and infinitely differentiable. The SAGE framework [32] was used to symbolically determine the residual/source term S.
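The residual left by the manufactured solution can also be derived with SymPy, used here in place of SAGE purely for illustration. The sketch assumes the operator has the form ∂φ/∂t + ∂/∂x_j(φ u_j − k ∂φ/∂x_j) with the parameter values quoted above; it computes the residual only and takes no position on the sign with which the source term enters the solved equation.

```python
import sympy as sp

x0, x1, t = sp.symbols("x0 x1 t")
coords = (x0, x1)
u = (sp.Integer(1), sp.Rational(-1, 2))   # prescribed velocity components
k = sp.Rational(3, 4)                     # diffusivity coefficient

phi_m = sp.sin(x0) * sp.cos(x1)           # manufactured solution

# Substitute the manufactured solution into the assumed operator.
residual = sp.diff(phi_m, t) + sum(
    sp.diff(phi_m * u[j] - k * sp.diff(phi_m, coords[j]), coords[j])
    for j in range(2)
)
residual = sp.expand(residual)
print(residual)
```

Under these assumptions the result is cos(x0)cos(x1) + sin(x0)sin(x1)/2 + 3 sin(x0)cos(x1)/2, which is non-zero, as the method expects.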

The domain is a 2D square with dimensions 0 ≤ x_0 ≤ 2π m and 0 ≤ x_1 ≤ 2π m such that the manufactured solution is periodic. Furthermore, periodic boundary conditions are applied on all sides of the domain. Six central differencing schemes of order 2, 4, 6, 8, 10 and 12 are considered for the spatial discretisation, and a third-order Runge-Kutta scheme is used throughout to advance the equation in time. To perform the convergence analysis, the grid spacing was halved for each successive case such that Δx = Δy = π/2, π/4, π/8, π/16 and π/32. The timestep size Δt was also halved for each case to maintain a maximum bound of 0.025 on the Courant number; this was purposefully kept small and near-constant to minimise the influence of temporal discretisation error [33]. All

Trang 8

Fig 7.The absolute error (in the L2 norm) between the numerical solution  and

simulationswereruninserial(onanIntel® CoreTMi7-4790CPU)

untilafinishtimeofT=100stoensurethatasteady-statesolution

wasattained

Fig. 6 demonstrates how $\phi$ converges towards the manufactured solution $\phi_m$ as the grid is refined. The convergence rate for each order of the central difference scheme is illustrated in Fig. 7. The anomaly in the twelfth-order convergence plot was likely caused by reaching the limit of machine precision. Overall, these results provide confidence in the correctness of the automatically-generated code/model.
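The slope of a convergence plot can be estimated from errors on successively halved grids as $p \approx \log_2(e_h / e_{h/2})$. The sketch below is not OpenSBLI code and does not reproduce the MMS test; it applies a second-order central difference to $\sin(x)$ on a periodic grid, purely to illustrate how an observed order of convergence is computed. The function names are hypothetical.

```python
import math


def central_diff_error(n):
    """Max error of the 2nd-order central first derivative of sin(x)
    on a periodic grid with n points over [0, 2*pi)."""
    h = 2.0 * math.pi / n
    err = 0.0
    for i in range(n):
        x = i * h
        approx = (math.sin(x + h) - math.sin(x - h)) / (2.0 * h)
        err = max(err, abs(approx - math.cos(x)))
    return err


def observed_orders(errors):
    """Observed order between consecutive grids whose spacing is halved."""
    return [math.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]


# n = 4, 8, ..., 64 points give spacings pi/2, pi/4, ..., pi/32.
errors = [central_diff_error(n) for n in (4, 8, 16, 32, 64)]
orders = observed_orders(errors)

for n, e in zip((4, 8, 16, 32, 64), errors):
    print(f"n = {n:3d}, max error = {e:.6e}")
print("observed orders:", [round(p, 3) for p in orders])
```

The observed order approaches 2 as the grid is refined. For a high-order scheme the same calculation would eventually be polluted by round-off once the error nears machine precision, which matches the explanation offered for the twelfth-order anomaly.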

3.3 3D Taylor-Green vortex

The Taylor-Green vortex is a well-known hydrodynamic problem [34–36] characterised by transition to turbulence, decay of turbulence, and the energy dissipation during its evolution. It is frequently used to evaluate the ability of a numerical method to capture the underlying physical processes. During the initial stages of evolution, the dynamics display structural changes (rolling up, stretching and interaction of the vortices). This process is inviscid in nature. Later the vortices break down and transition into fully-turbulent dynamics. As there are no external forces or turbulence-generating mechanisms, the small-scale structures dissipate all the energy, and the fluid eventually comes to rest [34]. The numerical method employed should be able to capture each of these stages accurately.

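The 3D compressible Navier-Stokes equations were solved in non-dimensional form, written in Einstein notation as

$$\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x_j}\left[\rho u_j\right] = 0, \tag{5}$$

$$\frac{\partial \rho u_i}{\partial t} + \frac{\partial}{\partial x_j}\left[\rho u_i u_j + p\delta_{ij} - \tau_{ij}\right] = 0, \tag{6}$$

and

$$\frac{\partial \rho E}{\partial t} + \frac{\partial}{\partial x_j}\left[\rho E u_j + u_j p - q_j - u_i \tau_{ij}\right] = 0 \tag{7}$$

for the conservation of mass, momentum and energy, respectively.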

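The (dimensionless) quantity $\rho$ is the fluid density, $u_i$ is the $i$th (scalar) component of the velocity vector $\mathbf{u}$, $p$ is the pressure field, and $E$ is the total energy. The components of the stress tensor are given by

$$\tau_{ij} = \frac{1}{Re}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} - \frac{2}{3}\delta_{ij}\frac{\partial u_k}{\partial x_k}\right), \tag{8}$$

where $\delta_{ij}$ is the Kronecker delta function and $Re$ is the Reynolds number. The components of the heat flux term $q$ are given by

$$q_j = \frac{1}{(\gamma - 1)M^2 Pr Re}\frac{\partial T}{\partial x_j}, \tag{9}$$

where $T$ is the temperature field, $\gamma$ is the ratio of specific heats, $M$ is the Mach number, and $Pr$ is the Prandtl number. The various quantities are non-dimensionalised using the reference velocity $u_{ref}$, the reference length $L$, the reference density $\rho_{ref}$, and the reference temperature $T_{ref}$.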

The equation of state linking $p$, $\rho$ and $T$ is defined by

$$p = \frac{1}{\gamma M^2}\rho T, \tag{10}$$

and the total energy is given by

$$E = \frac{p}{\rho(\gamma - 1)} + \frac{1}{2}u_j u_j. \tag{11}$$

The pressure $p$ is non-dimensionalised by $\rho_{ref} u_{ref}^2$.

Central finite difference schemes are non-dissipative and are therefore suitable for accurately capturing turbulent dynamics. However, the lack of dissipation can make the scheme unstable. To improve the stability, a skew-symmetric formulation [37–40] was applied to the convective terms in (5)–(7); the convective term then becomes

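$$\frac{\partial}{\partial x_j}\left[\rho \phi u_j\right] = \frac{1}{2}\left[\phi\frac{\partial}{\partial x_j}\rho u_j + \rho u_j\frac{\partial \phi}{\partial x_j} + \frac{\partial}{\partial x_j}\rho \phi u_j\right], \tag{12}$$

where $\phi$ should be set to 1, $u_i$ and $E$ for the continuity, momentum and energy equations, respectively. It should also be noted that both the convective and viscous terms are discretised using the same spatial order. In all of the simulations performed, the Laplacian in the viscous term is expanded using a finite difference representation of the second derivative (i.e. not treated by successive first derivatives).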
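The difference between the two treatments of the second derivative can be seen on a grid-to-grid ("sawtooth") oscillation. The sketch below is an illustration only, using second-order stencils on a small periodic grid rather than the fourth-order schemes of the simulations; the function names are hypothetical. Applying a central first derivative twice gives a wide stencil that returns zero for the sawtooth mode, so such oscillations would receive no viscous damping, whereas the direct second-derivative stencil responds to them strongly.

```python
def first_derivative(f, h):
    """2nd-order central first derivative on a periodic grid."""
    n = len(f)
    return [(f[(i + 1) % n] - f[(i - 1) % n]) / (2.0 * h) for i in range(n)]


def second_derivative_direct(f, h):
    """Direct 2nd-order stencil: (f[i+1] - 2 f[i] + f[i-1]) / h^2."""
    n = len(f)
    return [(f[(i + 1) % n] - 2.0 * f[i] + f[(i - 1) % n]) / h**2 for i in range(n)]


def second_derivative_successive(f, h):
    """Second derivative obtained by applying the first derivative twice."""
    return first_derivative(first_derivative(f, h), h)


n, h = 8, 1.0
sawtooth = [(-1.0) ** i for i in range(n)]  # +1, -1, +1, ... (n must be even)

direct = second_derivative_direct(sawtooth, h)
successive = second_derivative_successive(sawtooth, h)

print("direct:    ", direct)      # -4 * f[i] / h^2 at every point
print("successive:", successive)  # identically zero
```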

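As per the work of DeBonis [35] and Bull and Jameson [36], the equations were solved in a 3D cube, with $0 \le x_0 \le 2\pi L$, $0 \le x_1 \le 2\pi L$, and $0 \le x_2 \le 2\pi L$. Periodic boundary conditions were applied on all surfaces. The following initial conditions were imposed at time $t = 0$:

$$u_0(x_0, x_1, x_2, t=0) = \sin\left(\frac{x_0}{L}\right)\cos\left(\frac{x_1}{L}\right)\cos\left(\frac{x_2}{L}\right), \tag{13}$$

$$u_1(x_0, x_1, x_2, t=0) = -\cos\left(\frac{x_0}{L}\right)\sin\left(\frac{x_1}{L}\right)\cos\left(\frac{x_2}{L}\right), \tag{14}$$

$$u_2(x_0, x_1, x_2, t=0) = 0, \tag{15}$$

$$p(x_0, x_1, x_2, t=0) = \frac{1}{\gamma M^2} + \frac{1}{16}\left[\cos\left(\frac{2x_0}{L}\right) + \cos\left(\frac{2x_1}{L}\right)\right]\left[2 + \cos\left(\frac{2x_2}{L}\right)\right]. \tag{16}$$

In all the simulations, $Re = 1600$, $Pr = 0.71$, $M = 0.1$, and $\gamma = 1.4$. The reference quantities $L$, $u_{ref}$ and $\rho_{ref}$ were set to 1.0, and the reference temperature $T_{ref}$ was evaluated using the equation of state (10).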
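The initial conditions (13)–(16) can be evaluated pointwise as in the sketch below. This is a plain-Python illustration rather than the OpenSBLI setup file; the function name `taylor_green_initial` and its signature are hypothetical, and the defaults follow the stated values $L = 1$, $M = 0.1$ and $\gamma = 1.4$.

```python
import math


def taylor_green_initial(x0, x1, x2, L=1.0, M=0.1, gamma=1.4):
    """Return (u0, u1, u2, p) of the Taylor-Green vortex at t = 0."""
    u0 = math.sin(x0 / L) * math.cos(x1 / L) * math.cos(x2 / L)
    u1 = -math.cos(x0 / L) * math.sin(x1 / L) * math.cos(x2 / L)
    u2 = 0.0
    p = 1.0 / (gamma * M**2) + (1.0 / 16.0) * (
        math.cos(2.0 * x0 / L) + math.cos(2.0 * x1 / L)
    ) * (2.0 + math.cos(2.0 * x2 / L))
    return u0, u1, u2, p


# Sample point where the pressure perturbation vanishes.
u0, u1, u2, p = taylor_green_initial(math.pi / 2.0, 0.0, 0.0)
print(u0, u1, u2, p)
```

At the sample point the velocity is purely in the $x_0$ direction and the pressure reduces to the background value $1/(\gamma M^2)$.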

A fourth-order accurate central differencing scheme was used to spatially discretise the domain, and a third-order Runge-Kutta timestepping scheme was used to march the equations forward in time. A set of simulations was performed over a range of resolutions, namely $64^3$, $128^3$, $256^3$ and $512^3$ uniformly-spaced grid points. For the $64^3$ case, a non-dimensional time-step size $\Delta t$ of $3.385 \times 10^{-3}$ [35] was used. Each time the number of grid points was doubled, the time-step size was halved to maintain a constant upper bound on the Courant number. The generated code was targeted towards the CUDA backend using OPS and executed on an NVIDIA Tesla K40 GPU until a non-dimensional time of $t = 20$, except for the $512^3$ case; this was targeted towards the MPI backend and run in parallel over 1440 processes on the UK National Supercomputing Service (ARCHER) due to lack of available memory on the GPU, and provided a good example of how the backend can be readily changed.

Fig. 8. Visualisations of the non-dimensional vorticity (z-component) iso-contours, ...
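The resolution and time-step pairing described above can be tabulated with a few lines of Python. This is a sketch under the stated rule only (halve $\Delta t$ each time the number of points per direction doubles, starting from $3.385 \times 10^{-3}$ at $64^3$); the function name is hypothetical.

```python
def time_step_schedule(base_points=64, base_dt=3.385e-3, levels=4):
    """Return (points per direction, total points, dt) for each resolution."""
    schedule = []
    for level in range(levels):
        n = base_points * 2**level
        dt = base_dt / 2**level  # halved whenever n doubles
        schedule.append((n, n**3, dt))
    return schedule


schedule = time_step_schedule()
for n, total, dt in schedule:
    print(f"{n}^3 = {total} grid points, dt = {dt:.6e}")
```

Since the grid spacing scales as $1/n$, keeping the product $n\,\Delta t$ fixed keeps $\Delta t / \Delta x$ fixed, which is what holds the upper bound on the Courant number constant across resolutions.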

The z-component of the vorticity field at various times can be found in Fig. 8. At non-dimensional time $t = 2.5$ vortex evolution and stretching are clearly visible, progressing onto highly turbulent dynamics where the relatively smooth structures roll up and eventually break down at around $t = 9$. This point is characterised by peak enstrophy in the system. The final stage of the simulation features the decay of the turbulent structures such that the enstrophy tends towards its initial value.

Following the definitions of DeBonis [35], the integrals of the kinetic energy

$$E_k = \frac{1}{\rho_{ref}\Omega}\int_\Omega \frac{1}{2}\rho u_j u_j \, d\Omega \tag{17}$$

and enstrophy

$$\varepsilon = \frac{1}{\rho_{ref}\Omega}\int_\Omega \frac{1}{2}\rho\left(\epsilon_{ijk}\frac{\partial u_k}{\partial x_j}\right)^2 d\Omega \tag{18}$$

were computed throughout the simulations. Note that $\Omega$ is the whole domain and $\epsilon_{ijk}$ is the Levi-Civita function. These quantities are shown in Figs. 9 and 10 for the various grid resolutions, and are plotted against the reference data from a spectral element simulation by Wang et al. [41] using a $512^3$ grid for comparison. Fig. 10 highlights the inviscid nature of the Taylor-Green vortex problem for $t < {\sim}3$–$4$. The transition to turbulence occurs from ${\sim}3 < t < 9$ (which is associated with the peak in enstrophy in Fig. 9). Finally, dissipation occurs at $t > 9$. The results show a clear agreement with the reference data, and represent a solid first step towards the validation of OpenSBLI.
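As a small check on the kinetic-energy integral, the sketch below evaluates it for the initial Taylor-Green velocity field on a uniform periodic grid. It is an illustration only, not the diagnostic code used in the simulations: it assumes a uniform density $\rho = \rho_{ref} = 1$ at $t = 0$ (a simplification), and the function name is hypothetical. On a uniform grid the volume integral divided by $\Omega$ reduces to a mean over grid points.

```python
import math


def initial_kinetic_energy(n=16, L=1.0, rho=1.0, rho_ref=1.0):
    """Volume-averaged kinetic energy of the initial Taylor-Green field
    on an n^3 uniform periodic grid over [0, 2*pi*L)^3."""
    h = 2.0 * math.pi * L / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                x0, x1, x2 = i * h, j * h, k * h
                u0 = math.sin(x0 / L) * math.cos(x1 / L) * math.cos(x2 / L)
                u1 = -math.cos(x0 / L) * math.sin(x1 / L) * math.cos(x2 / L)
                # u2 = 0 initially, so it does not contribute.
                total += 0.5 * rho * (u0 * u0 + u1 * u1)
    # (1 / Omega) * sum * h^3 is simply the mean over all grid points.
    return total / (rho_ref * n**3)


ek0 = initial_kinetic_energy()
print(f"E_k(t=0) = {ek0:.12f}")
```

Under these assumptions the result is $1/8$, since each of the two non-zero velocity components contributes a mean square of $1/8$.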

4 Conclusion

Advances in compute hardware are driving a need to change the current state of numerical model development. By developing a new modelling framework based on automated solution techniques, we have effectively future-proofed the core of the SBLI codebase; no longer does a computational scientist need to re-write significant portions of code in order to get it up and running on a new piece of hardware. Instead, the model is derived from a high-level specification independent of the architecture that it will run on, and the underlying code is automatically generated and tailored to a particular backend, the responsibility for which would rest with computer scientists who are experts in parallel programming paradigms. Furthermore, the ease with which the governing equations can be changed is a fundamental advantage of using such abstract


cases, each of which comprised a different set of equations. The discretisation, code generation and code targeting are performed automatically, thereby reducing development costs and potentially avoiding errors, bugs, and non-performant/non-optimal operations. In addition, code that solves the different variants of the same governing equations can be easily generated. For example, in the compressible Navier-Stokes equations, viscosity can be treated either as a constant or as a spatially-varying term. In static, hand-written codes this flexibility comes at the cost of writing different routines for the various formulations, unlike with automated code generation techniques. This is particularly useful when wanting to switch between Cartesian and generalised coordinates. This particular framework also facilitates the fast and efficient switching between different spatial orders of accuracy, and reduces the development time and effort when wishing to try out new numerical formulations of the equations (or a new spatial/temporal scheme) on a wide variety of test cases.

4.1 Future work

Explicit schemes such as the one implemented here can be readily extended to a range of application areas such as computational aeroacoustics, aero-thermodynamics, problems involving shocks, and hypersonic flow. Incompressible flows may also be handled with the explicit, compressible solver in OpenSBLI so long as

Fig. A.11. A cut-down version of the 3D Taylor-Green vortex setup/configuration file (67 lines long including whitespace), showing the key components and classes available.


Date posted: 04/12/2022, 16:02

References

[1] P. Thibodeau, Scientists, IT Community Await Exascale Computers, 2009, http://www.computerworld.com/article/2550451/computer-hardware/scientists-it-community-await-exascale-computers.html
[2] F. Rathgeber, G.R. Markall, L. Mitchell, N. Loriant, D.A. Ham, C. Bertolli, P.H. Kelly, PyOP2: a high-level framework for performance-portable simulations on unstructured meshes, in: High Performance Computing, Networking Storage and Analysis, SC Companion, IEEE Computer Society, 2012, pp. 1116–1123.
[3] C.T. Jacobs, M.D. Piggott, Firedrake-Fluids v0.1: numerical modelling of shallow water flows using an automated solution framework, Geoscientif. Model Dev. 8 (3) (2015) 533–547, http://dx.doi.org/10.5194/gmd-8-533-2015
[4] F. Rathgeber, D.A. Ham, L. Mitchell, M. Lange, F. Luporini, A.T.T. McRae, G.-T. Bercea, G.R. Markall, P.H.J. Kelly, Firedrake: automating the finite element method by composing abstractions, ACM Trans. Math. Soft., http://arxiv.org/abs/1501.01809
[6] A. Logg, K.-A. Mardal, G.N. Wells, et al., Automated Solution of Differential Equations by the Finite Element Method, Springer, 2012, http://dx.doi.org/10.1007/978-3-642-23099-8
[7] M.S. Alnæs, A. Logg, K.B. Ølgaard, M.E. Rognes, G.N. Wells, Unified form language: a domain-specific language for weak formulations of partial differential equations, ACM Trans. Math. Soft. 40 (2) (2014).
[9] M. Giles, I. Reguly, G. Mudalige, OPS C++ User's Manual, University of Oxford, 2015, http://www.oerc.ox.ac.uk/projects/ops
[10] I.Z. Reguly, G.R. Mudalige, M.B. Giles, D. Curran, S. McIntosh-Smith, The OPS domain specific abstraction for multi-block structured grid computations, in: Proceedings of the 2014 Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, IEEE Computer Society, 2014, pp. 58–67, http://dx.doi.org/10.1109/WOLFHPC.2014.7
[11] G.R. Mudalige, I.Z. Reguly, M.B. Giles, A.C. Mallinson, W.P. Gaudin, J.A. Herdman, Performance analysis of a high-level abstractions-based hydrocode on future computing systems, in: Proceedings of the 5th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS '14), held in conjunction with IEEE/ACM Supercomputing 2014 (SC'14), 2014.
[12] S.P. Jammy, G.R. Mudalige, I.Z. Reguly, N.D. Sandham, M. Giles, Block-structured compressible Navier-Stokes solution using the OPS high-level abstraction, Int. J. Comput. Fluid Dyn. 30 (2016) 450–454, http://dx.doi.org/10.1080/10618562.2016.1243663
[13] E. Touber, N.D. Sandham, Large-eddy simulation of low-frequency unsteadiness in a turbulent shock-induced separation bubble, Theor. Comput. Fluid Dyn. 23 (2) (2009) 79–107, http://dx.doi.org/10.1007/s00162-009-0103-z
[14] N. De Tullio, N.D. Sandham, Direct numerical simulation of breakdown to turbulence in a Mach 6 boundary layer over a porous surface, Phys. Fluids 22 (9) (2010), http://dx.doi.org/10.1063/1.3481147
[15] J. Redford, N.D. Sandham, G.T. Roberts, Numerical simulations of turbulent spots in supersonic boundary layers: effects of Mach number and wall temperature, Prog. Aerospace Sci. 52 (2012) 67–79, http://dx.doi.org/10.1016/j.paerosci.2011.08.002
[16] B. Wang, N.D. Sandham, W. Hu, W. Liu, Numerical study of oblique shock-wave/boundary-layer interaction considering sidewall effects, J. Fluid Mech. 767 (2015) 526–561, http://dx.doi.org/10.1017/jfm.2015.58
[19] T. Sun, OPESCI-FD: Automatic Code Generation Package for Finite Difference Models, Master's thesis, Imperial College London, 2016, https://arxiv.org/abs/1605.06381
[20] N. Kukreja, M. Louboutin, F. Vieira, F. Luporini, M. Lange, G. Gorman, Devito: automated fast finite difference computation, in: Proceedings of the Sixth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, 2016, https://arxiv.org/abs/
