Procedia Computer Science 101 (2016), Pages 8–17

YSC 2016, 5th International Young Scientist Conference on Computational Science

doi: 10.1016/j.procs.2016.11.003

Peer-review under responsibility of the organizing committee of the scientific committee of the 5th International Young Scientist Conference on Computational Science.

© 2016 The Authors. Published by Elsevier B.V.

Fast parallel integration for three dimensional Discontinuous Petrov Galerkin method

Maciej Woźniak¹, Marcin Łoś¹, Maciej Paszyński¹, and Leszek Demkowicz²

¹ AGH University of Science and Technology, Kraków, Poland
macwozni@agh.edu.pl, los@agh.edu.pl, paszynsk@agh.edu.pl

² The University of Texas at Austin, Austin, Texas, U.S.A.
leszek@ices.utexas.edu

Abstract

The Finite Element Method comes with the challenge of constructing test functions that provide better stability. The Discontinuous Petrov-Galerkin method constructs optimal test functions “on the fly”. However, this method comes with a relatively high computational cost. In this paper we show a parallelization method that reduces the computation time.

Keywords: Finite Element Method, Discontinuous Petrov-Galerkin, parallel, shared memory

In this paper we present a parallelization of the algorithm that generates the element matrices for the Discontinuous Petrov-Galerkin (DPG) method [3, 4, 5]. The DPG method is a new, rapidly growing method for solving numerical problems. It enables automatic control of the stability of numerical formulations. We have parallelized the element routines of the hp3d DPG code developed by the group of Prof. Demkowicz. We are aware of other parallel FEM packages supporting adaptive computations for DPG, including Camellia [14] and DUNE-DPG [10]. However, the hp3d framework is unique in the following ways:

• It supports hexahedral, tetrahedral, prism and pyramid elements in 3D. To the best of our knowledge, Camellia supports only hexahedral elements, and DUNE-DPG supports triangular elements only.

• It enables parallel anisotropic refinements over a computational domain meshed with different kinds of finite elements, including tetrahedral, hexahedral, prism and pyramid elements. The Camellia and DUNE-DPG packages do not allow anisotropic refinements, and thus exponential convergence of the numerical solution is not possible there.

• Camellia and DUNE-DPG do not support complex H(curl) discretizations. Our framework will enable parallel automatic hp-adaptive computations for different classes of problems, including H1, H(div) and H(curl).

The preliminary work presented in this paper concerns the parallelization of the element matrix generation for the elliptic problem. Our future work will involve the parallelization of the H(div) and H(curl) element routines.

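Let us focus on the model elliptic problem. In the Sobolev space

\[ H^1(\Omega) = \{ u \in L^2(\Omega) : D^\alpha u \in L^2(\Omega),\ |\alpha| \le 1,\ \operatorname{tr} u = 0 \text{ on } \partial\Omega \} \tag{1} \]

we introduce the classical weak formulation of the Poisson problem in $H^1(\Omega)$. We seek $u \in H^1(\Omega)$ such that

\[ \int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx \quad \forall v \in H^1(\Omega). \tag{2} \]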
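For readers who want the step from the strong form to the weak form spelled out, a short derivation follows. It assumes the strong form is $-\Delta u = f$ in $\Omega$ with $u = 0$ on $\partial\Omega$, which is the standard Poisson problem but is not written out explicitly in the text.

```latex
\begin{align*}
-\Delta u &= f \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \partial\Omega
  && \text{(assumed strong form)} \\
-\int_\Omega \Delta u \, v \, dx &= \int_\Omega f v \, dx
  && \text{(multiply by a test function } v \text{ and integrate)} \\
\int_\Omega \nabla u \cdot \nabla v \, dx
  - \int_{\partial\Omega} \frac{\partial u}{\partial n} \, v \, ds &= \int_\Omega f v \, dx
  && \text{(integrate by parts)} \\
\int_\Omega \nabla u \cdot \nabla v \, dx &= \int_\Omega f v \, dx
  && \text{(since } \operatorname{tr} v = 0 \text{ on } \partial\Omega \text{)}
\end{align*}
```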

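We may also express the above problem in abstract notation:

\[ b(u, v) = l(v) \quad \forall v \in V, \tag{3} \]

where in our model problem we have

\[ b(u, v) = \int_\Omega \nabla u \cdot \nabla v \, dx, \tag{4} \]

\[ l(v) = \int_\Omega f v \, dx. \tag{5} \]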

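We project the weak problem onto the finite dimensional subspace $V_h \subset H^1(\Omega)$:

\[ \int_\Omega \nabla u_h \cdot \nabla v_h \, dx = \int_\Omega f v_h \, dx \quad \forall v_h \in V_h \subset H^1(\Omega). \tag{6} \]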
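To make the Galerkin projection concrete, the sketch below assembles and solves a one-dimensional analogue: $-u'' = f$ on $(0, 1)$ with $u(0) = u(1) = 0$, discretized with piecewise-linear hat functions on a uniform mesh. The 1D setting, the function name `solve_poisson_1d` and the simple load quadrature are illustrative assumptions; the paper itself works with 3D elements in hp3d.

```python
def solve_poisson_1d(n_elements, f):
    """Galerkin FEM for -u'' = f on (0, 1) with u(0) = u(1) = 0, linear elements.

    Returns the values of u_h at the interior nodes x_i = i * h, i = 1..n-1.
    """
    h = 1.0 / n_elements
    n = n_elements - 1                     # interior nodes = unknowns
    # Stiffness matrix b(e_j, e_i) = int e_j' e_i' dx is tridiagonal
    diag = [2.0 / h] * n
    off = [-1.0 / h] * (n - 1)
    # Load vector l(e_i) = int f e_i dx, approximated by h * f(x_i)
    rhs = [h * f((i + 1) * h) for i in range(n)]
    # Thomas algorithm: forward elimination ...
    for i in range(1, n):
        m = off[i - 1] / diag[i - 1]
        diag[i] -= m * off[i - 1]
        rhs[i] -= m * rhs[i - 1]
    # ... and back substitution
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - off[i] * u[i + 1]) / diag[i]
    return u


# Constant load f = 1; the exact solution is u(x) = x (1 - x) / 2
u_h = solve_poisson_1d(4, lambda x: 1.0)
print(u_h)
```

For this symmetric, coercive problem the trial and test spaces coincide and stability is automatic; the discussion that follows concerns problems where that is not the case.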

The mathematical theory concerning the stability of numerical methods for the general weak formulation (2) is based on the famous “Babuška-Brezzi condition” (BBC), developed in the years 1971-1974 by Ivo Babuška and Franco Brezzi [13, 12, 9]. The condition states that the problem (2) is stable when

\[ \sup_{v \in V} \frac{|b(u, v)|}{\|v\|_V} \ge \gamma \|u\|_U \quad \forall u \in U. \tag{7} \]

However, the inf-sup condition in the above form concerns the abstract formulation, where we consider all test functions $v \in V$ and look for the solution $u \in U$ (e.g. $U = V$). The above condition is also satisfied if we restrict ourselves to the space of trial functions $u_h \in U_h$:

\[ \sup_{v \in V} \frac{|b(u_h, v)|}{\|v\|_V} \ge \gamma \|u_h\|_{U_h}. \tag{8} \]

However, if we use test functions from the finite dimensional test space $V_h = \operatorname{span}\{v_h\}$,

\[ \sup_{v_h \in V_h} \frac{|b(u_h, v_h)|}{\|v_h\|_{V_h}} \ge \gamma_h \|u_h\|_{U_h}, \tag{9} \]

we have no guarantee that the supremum in (9) is equal to the original supremum in (7), since we have restricted $V$ to $V_h$. The optimality of the method depends on the quality of the polynomial test functions defining the space $V_h = \operatorname{span}\{v_h\}$ and on how far they are from realizing the supremum in (7). Many scientists have spent their lives constructing test functions that provide better stability of the method for a given class of problems [11, 7, 8, 1]. They have developed several techniques for stabilizing different kinds of problems. In 2010 the DPG method was proposed; a modern summary of the method is given in [2]. The key idea of the DPG method is to construct the optimal test functions “on the fly”, element by element. The DPG method automatically guarantees the numerical stability of difficult computational problems, thanks to the automatic selection of the optimal basis functions.
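The loss of stability when the test space is restricted follows from a one-line inequality. The sketch below assumes that $V_h \subset V$ carries the norm inherited from $V$.

```latex
\[
V_h \subset V
\quad\Longrightarrow\quad
\sup_{v_h \in V_h} \frac{|b(u_h, v_h)|}{\|v_h\|_{V}}
\;\le\;
\sup_{v \in V} \frac{|b(u_h, v)|}{\|v\|_{V}} .
\]
```

The discrete supremum can therefore be strictly smaller than the continuous one, so a stability constant valid for the full test space need not carry over to the discrete test space.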

The DPG method can be derived in one of the following three ways [2, 6]:

a) Minimum residual method

b) Petrov-Galerkin method with optimal test functions

c) Special mixed method

The minimum residual method is a good way to illustrate the idea of the DPG method; however, it results in an optimal test function problem that is as expensive as the original problem itself. The construction of the DPG method with the Petrov-Galerkin approach is relatively difficult, so we will only use it here as a tool for making the optimization problem local over each finite element. The special mixed method formulation of the DPG method is the most suitable for efficient implementation, since it results in a modification of the classical finite element method. We will present this method to discuss the implementation issues.

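Ad a) Let us start with the derivation of the DPG method using the minimum residual method. For our weak problem (2) we construct the operator

\[ B : U \to V' \tag{10} \]

such that

\[ \langle Bu, v \rangle_{V' \times V} = b(u, v), \tag{11} \]

so we can reformulate the problem as

\[ Bu - l = 0. \tag{12} \]

We wish to minimize the residual

\[ u_h = \arg\min_{w_h \in U_h} \frac{1}{2} \| B w_h - l \|^2_{V'}. \tag{13} \]

We introduce the Riesz operator, which is an isometric isomorphism,

\[ R_V : V \ni v \mapsto (v, \cdot)_V \in V'. \tag{14} \]

We can project the problem back to $V$:

\[ u_h = \arg\min_{w_h \in U_h} \frac{1}{2} \| R_V^{-1} (B w_h - l) \|^2_{V}. \tag{15} \]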

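The minimum is attained at $u_h$ when the Gâteaux derivative is equal to 0 in all directions:

\[ \left( R_V^{-1} (B u_h - l),\ R_V^{-1} (B\, \delta u_h) \right)_V = 0 \quad \forall\, \delta u_h \in U_h. \tag{16} \]

Using again the definition of the Riesz map we get

\[ \langle B u_h - l,\ R_V^{-1} B\, \delta u_h \rangle = 0 \quad \forall\, \delta u_h \in U_h, \tag{17} \]

which is equivalent to our original residual problem

\[ b(u_h, v_{\delta u_h}) = l(v_{\delta u_h}) \tag{18} \]

with optimal test functions

\[ v_{\delta u_h} = R_V^{-1} B\, \delta u_h \quad \text{for each trial function } \delta u_h. \tag{19} \]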

In other words, with the help of the Riesz operator, it is possible to construct the optimal test functions [2]. However, with the traditional weak formulation, the numerical solution of this optimization problem is as expensive as the solution of the original problem itself [2]. In order to make the test function optimization problem local over individual finite elements, we need to break the test spaces and reformulate the weak formulation.
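In discrete form, computing an optimal test function amounts to solving a linear system with the Gram matrix of the test inner product, and the resulting Petrov-Galerkin system coincides with the normal equations of the minimum residual problem. The sketch below checks this on a tiny example. The matrices `G`, `B` and the vector `l` are made-up numbers, the helper names are hypothetical, and the pure-Python elimination is there only to keep the example self-contained.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= m * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def dpg_solve(G, B, l):
    """Petrov-Galerkin solve with optimal test functions v_j = G^{-1} B e_j."""
    m, n = len(B), len(B[0])
    # V[j] holds the coefficients of the optimal test function for trial function j
    V = [solve(G, [B[i][j] for i in range(m)]) for j in range(n)]
    # Petrov-Galerkin system: A[i][j] = v_i^T B e_j, rhs[i] = v_i^T l
    A = [[sum(V[i][k] * B[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    rhs = [sum(V[i][k] * l[k] for k in range(m)) for i in range(n)]
    return solve(A, rhs)


# Illustrative data: 3 test functions, 2 trial functions
G = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 4.0]]   # Gram matrix (SPD)
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                   # b(trial_j, test_i)
l = [1.0, 2.0, 4.0]                                        # l(test_i)
u = dpg_solve(G, B, l)
print(u)
```

Because the Gram matrix has the size of the full test space, this solve is as expensive as the original problem unless the test space is broken so that the Gram matrix decouples element by element.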

Ad b) In other words, we now switch to the Petrov-Galerkin method, where we use the original trial functions and broken test functions. In order to allow a local, element-wise solution of the test function optimization problem, we derive the weak formulation with broken test spaces. We introduce the space of broken test functions

\[ H^1(\Omega_h) = \{ u \in L^2(\Omega) : u|_K \in H^1(K) \ \ \forall K \in T_h \} \tag{20} \]

and we introduce the weak formulation with broken test functions. We seek $u \in H^1(\Omega)$ as well as fluxes $t \in \operatorname{tr}_{\Gamma_h} H(\operatorname{div}, \Omega)$ such that

\[ \sum_K \int_K \nabla u \cdot \nabla v \, dx + \sum_K \int_{\partial K} t v \, ds = \int_\Omega f v \, dx \quad \text{for all } v \in H^1(\Omega_h). \tag{21} \]

The term $\sum_K \int_{\partial K} t v \, ds$ collects the boundary integrals of all elements. Everything is now computed element-wise and summed up.

In particular, the boundary term can be rewritten face by face,

\[ \sum_K \int_{\partial K} t v \, ds = \sum_f \int_f t \, [v] \, ds, \tag{22} \]

where $f$ denotes the faces of the elements, $n_f$ denotes the face normal vector, and

\[ [v] = \begin{cases} v & \text{on faces located on } \partial\Omega, \\ v^{+} - v^{-} & \text{on faces located inside } \Omega \end{cases} \tag{23} \]

(for element faces located on the boundary of the domain we just take the trace of $v$, but for element faces located inside the domain we consider the difference of the traces $v^{+}$ and $v^{-}$ coming from the two elements sharing the common face, with the two sides distinguished by the face normal $n_f$). Over a single element, we now have the following contributions: element frontal matrices $\int_K \nabla u \cdot \nabla v \, dx + \int_{\partial K} t v \, ds$ and right-hand sides $\int_K f v \, dx$. Notice that the first term comes from the integration inside the element, and the second term results from the integration over the boundary of the element. Again, we express the above weak formulation with broken test functions in the following abstract form

\[ b((u, t), v) = l(v) \quad \forall v \in H^1(\Omega_h), \tag{24} \]

where

\[ b((u, t), v) = \sum_K \int_K \nabla u \cdot \nabla v \, dx + \sum_K \int_{\partial K} t v \, ds, \qquad l(v) = \int_\Omega f v \, dx. \tag{25} \]

When we compare the standard weak formulation and the formulation with broken spaces, we can see that the second one has more unknowns, namely $(u, t)$. For such a weak formulation with broken test functions, we can automatically compute the optimal test functions. This is the main idea of the DPG (Discontinuous Petrov-Galerkin) method: to construct on the fly, element by element, the optimal test functions that ensure the numerical stability of the method. However, the implementation of this Petrov-Galerkin method is relatively difficult, and we will instead switch to the formulation of DPG as a special mixed method.

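Ad c) We now focus on the formulation of DPG as a special mixed method, with the error representation function [6]. The error representation function, given by

\[ \Psi = R_V^{-1} \left( B(u_h, t) - l \right), \tag{26} \]

allows us to develop an alternative formulation of the DPG method: find $\Psi \in V$, $u_h \in U_h$, $t \in \operatorname{tr}_{\Gamma_h} H(\operatorname{div}, K)$ such that

\[ (\Psi, v)_V - b((u_h, t), v) = -l(v) \quad \forall v \in V, \tag{27} \]

\[ b((\delta u_h, 0), \Psi) = 0 \quad \forall\, \delta u_h \in U_h, \tag{28} \]

\[ b((0, \delta t), \Psi) = 0 \quad \forall\, \delta t \in \operatorname{tr}_{\Gamma_h} H(\operatorname{div}, K). \tag{29} \]

Based on the above formulation, we can now construct the element matrices for this problem in the DPG method:

\[ \begin{bmatrix} G & -B_1 & -B_2 \\ B_1^T & 0 & 0 \\ B_2^T & 0 & 0 \end{bmatrix} \begin{bmatrix} \Psi \\ u \\ t \end{bmatrix} = \begin{bmatrix} -l \\ 0 \\ 0 \end{bmatrix} \tag{30} \]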
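One common way to use a saddle-point system with the block structure of (30), with Gram matrix $G$ and coupling blocks $B_1$, $B_2$, is to eliminate $\Psi$ on the element level. The manipulation below is standard linear algebra shown as a sketch; the text does not state that hp3d performs the elimination in exactly this form.

```latex
\begin{align*}
G\Psi - B_1 u - B_2 t = -l
  &\;\Longrightarrow\;
  \Psi = G^{-1}\left( B w - l \right),
  \qquad
  B = \begin{bmatrix} B_1 & B_2 \end{bmatrix},
  \quad
  w = \begin{bmatrix} u \\ t \end{bmatrix}, \\
B^T \Psi = 0
  &\;\Longrightarrow\;
  B^T G^{-1} B \, w = B^T G^{-1} l .
\end{align*}
```

The condensed system is symmetric and positive semi-definite by construction, which is why the cost of integrating and inverting $G$ dominates the element computations.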

The formulation of DPG as a special mixed method results in the structure of the element local matrices presented in equation (30), for the case of approximation in $H^1(K)$ spaces. The analogous formulations for $H(\operatorname{div}, K)$ and $H(\operatorname{curl}, K)$ approximations result in a similar structure of the element local matrix, but with a different distribution of basis functions and degrees of freedom over element vertices, edges, faces and interiors. In the following estimates we assume three dimensional hexahedral finite elements with vertex, edge, face and interior nodes. Notice that $u \approx \sum_i u_i e_i$ is approximated over the element with polynomials of order $p$ from the space $u_h \in U_h \subset H^1(K)$. This means that we have one degree of freedom per vertex node, $p - 1$ degrees of freedom per edge node, $(p - 1)^2$ degrees of freedom per face node and $(p - 1)^3$ degrees of freedom per interior node. The traces $t \approx \sum_{i=1,\ldots,O(p^2)} t_i f_i$ are approximated with polynomials of order $p$ from the space $\operatorname{tr}_{\Gamma_h} H(\operatorname{div}, K)$, on the boundary of the elements only. This means that we have one degree of freedom per vertex node, $p - 1$ degrees of freedom per edge node, and $(p - 1)^2$ degrees of freedom per face node. The error representation function $\Psi \approx \sum_{i=1,\ldots,O((p+\Delta p)^3)} \Psi_i e_i$ is approximated with polynomials of order $p + \Delta p$ (from the enriched space), also forming a subspace of $H^1(K)$. This means that we have one degree of freedom per vertex node, $p + \Delta p - 1$ degrees of freedom per edge node, $(p + \Delta p - 1)^2$ degrees of freedom per face node and $(p + \Delta p - 1)^3$ degrees of freedom per interior node.
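The per-node counts above can be added up for one hexahedron, which has 8 vertices, 12 edges, 6 faces and 1 interior node. The sketch below does this bookkeeping; the function name and the choice $\Delta p = 1$ in the usage line are illustrative assumptions, not values taken from the text.

```python
def hex_dof_counts(p, dp):
    """Degrees of freedom on one hexahedron, following the per-node counts in the text."""
    def h1(q):
        # vertices + edges + faces + interior
        return 8 + 12 * (q - 1) + 6 * (q - 1) ** 2 + (q - 1) ** 3

    def trace(q):
        # boundary only: vertices + edges + faces
        return 8 + 12 * (q - 1) + 6 * (q - 1) ** 2

    return {"u": h1(p), "t": trace(p), "psi": h1(p + dp)}


counts = hex_dof_counts(p=3, dp=1)   # cubic polynomials, assumed enrichment dp = 1
print(counts)
```

The $H^1$ count collapses to $(q + 1)^3$, and the trace count to $(q + 1)^3 - (q - 1)^3$, which makes the $O(p^3)$ and $O(p^2)$ scalings explicit.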

Summing up:

• $G$ is the Gram matrix, and it is a block-diagonal matrix.

• $u \approx \sum_i u_i e_i$ is approximated with polynomials of order $p$, which means that over the 3D element there are $O(p^3)$ unknowns related to $u$, since they are defined over element vertices, edges, faces and interiors.

• $t \approx \sum_i t_i f_i$ is approximated with polynomials of order $p$, which means that over the 3D element there are $O(p^2)$ unknowns related to $t$, since the fluxes are defined only on the element boundary (vertices, edges and faces).

• $\Psi \approx \sum_i \Psi_i e_i$ is approximated with polynomials of order $p + \Delta p$, which means that over the 3D element there are $O((p + \Delta p)^3)$ unknowns related to $\Psi$.

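Thus, the Gram matrix is square, of size $O((p + \Delta p)^3 \times (p + \Delta p)^3)$; the matrix $B_1$, which couples $\Psi$ with $u$, is rectangular, of size $O((p + \Delta p)^3 \times p^3)$; and the matrix $B_2$, which couples $\Psi$ with $t$, is rectangular, of size $O((p + \Delta p)^3 \times p^2)$. The cost of generating the Gram matrix is of the order of $O(p^9 + \Delta p^9)$ using Gaussian quadrature, since each of its $O((p + \Delta p)^6)$ entries is accumulated over $O((p + \Delta p)^3)$ Gauss points.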
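A back-of-the-envelope operation count shows how quickly this cost grows with the polynomial order. The sketch below assumes $p + \Delta p + 1$ Gauss points per direction and a tensor-product enriched space with $(p + \Delta p + 1)^3$ test functions; both are illustrative assumptions, and the function name is hypothetical.

```python
def gram_cost_estimate(p, dp):
    """Rough multiply-add count for integrating the Gram matrix on one hexahedron.

    Assumes (p + dp + 1) Gauss points per direction and (p + dp + 1)**3
    enriched test functions (illustrative assumptions, not from the paper).
    """
    n_gauss = (p + dp + 1) ** 3      # quadrature points
    n_test = (p + dp + 1) ** 3       # rows and columns of the Gram matrix
    return n_gauss * n_test * n_test


for p in (1, 2, 3, 4):
    print(p, gram_cost_estimate(p, dp=1))
```

Even for a single element the count runs into millions of operations for cubic polynomials, which is what makes the element integration worth parallelizing.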

In general, the generation of the DPG matrices involves nested loops, starting from the Gauss integration points, through test basis functions, to trial basis functions.

The generation of the matrices involves the Gram matrix and the so-called extended HH matrices.

      use omp_lib
c     loop over integration points
!$OMP PARALLEL DO
!$OMP& DEFAULT(SHARED)
!$OMP& PRIVATE(l,xi,wa,weight,k1,v1,dv1,k2,v2,dv2,k,gradH,
!$OMP& nrdofH,shapH,x,dxdxi,zfval,iflag,rjac,dxidx)
!$OMP& FIRSTPRIVATE(nrdofHH)
!$OMP& REDUCTION(+:BLOADH)
!$OMP& REDUCTION(+:AP)
!$OMP& REDUCTION(+:STIFFHH)
      do l=1,nint
c       prepare common data (e.g. integration points, weights)
c       first loop through enriched H1 test functions
        do k1=1,nrdofHH
c         compute the RHS entry
          BLOADH(k1) = BLOADH(k1) + zfval*v1*weight
c         second loop through enriched H1 test functions
          do k2=1,nrdofHH
            dv2(1:3) = gradHH(1,k2)*dxidx(1,1:3)
     &               + gradHH(2,k2)*dxidx(2,1:3)
     &               + gradHH(3,k2)*dxidx(3,1:3)
c           -- GRAM MATRIX -- (stored in triangular format)
c           determine index in triangular format
            k = nk(k1,k2)
c           the source line is cut off after dv1(3)*dv2(3);
c           the closing parenthesis and *weight are assumed
            AP(k) = AP(k) + (dv1(1)*dv2(1)+dv1(2)*dv2(2)
     &                    +  dv1(3)*dv2(3))*weight
          enddo
c         loop through H1 trial functions
          do k2=1,nrdofH
            v2 = shapH(k2)
            dv2(1:3) = gradH(1,k2)*dxidx(1,1:3)
     &               + gradH(2,k2)*dxidx(2,1:3)
     &               + gradH(3,k2)*dxidx(3,1:3)
c           Poisson equation
            STIFFHH(k1,k2) = STIFFHH(k1,k2)
     &        + (dv1(1)*dv2(1)+dv1(2)*dv2(2)+dv1(3)*dv2(3))*weight
          enddo
        enddo
      enddo
!$OMP END PARALLEL DO

We have parallelized the two external loops, through Gauss integration points and through rows (test functions).

Figure 1: Execution time of the parallel integration algorithm for a single DPG element, for an increasing number of cores. 3D hexahedral element with cubic polynomials.

Figure 2: Parallel efficiency of the parallel integration algorithm for a single DPG element. 3D hexahedral element with cubic polynomials.

Figure 3: Parallel speedup of the parallel integration algorithm for a single DPG element. 3D hexahedral element with cubic polynomials.
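The reduction pattern of the OpenMP loop can be mimicked in a few lines of Python: each worker integrates a disjoint chunk of quadrature points into its own private copy of the element matrix, and the private copies are summed at the end, which is the role of the REDUCTION(+:...) clauses. The sketch below is illustrative only. It uses a 1D element with shape functions $x^{k+1}$ on $(0, 1)$ and a midpoint rule in place of hp3d's hexahedral shape functions and Gauss quadrature, all names are hypothetical, and Python threads are not expected to reproduce the speedups reported in the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def grad(k, x):
    """Derivative of the illustrative shape function x**(k + 1)."""
    return (k + 1) * x ** k


def integrate_chunk(points, weight, ndof):
    """Accumulate a private stiffness matrix over one chunk of quadrature points."""
    local = [[0.0] * ndof for _ in range(ndof)]
    for x in points:                       # loop over integration points
        for k1 in range(ndof):             # loop over test functions
            d1 = grad(k1, x)
            for k2 in range(ndof):         # loop over trial functions
                local[k1][k2] += d1 * grad(k2, x) * weight
    return local


def parallel_stiffness(ndof, nint, nworkers):
    """Split the quadrature points among workers, then sum the private matrices."""
    weight = 1.0 / nint
    points = [(i + 0.5) * weight for i in range(nint)]   # midpoint rule on (0, 1)
    chunks = [points[w::nworkers] for w in range(nworkers)]
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        partials = list(pool.map(lambda c: integrate_chunk(c, weight, ndof), chunks))
    total = [[0.0] * ndof for _ in range(ndof)]
    for part in partials:                  # the "+" reduction
        for k1 in range(ndof):
            for k2 in range(ndof):
                total[k1][k2] += part[k1][k2]
    return total


serial = parallel_stiffness(ndof=3, nint=200, nworkers=1)
threaded = parallel_stiffness(ndof=3, nint=200, nworkers=4)
```

Keeping one private matrix per worker avoids write conflicts at the price of extra memory and a final summation, the same trade-off an array reduction makes in OpenMP.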

We conclude our paper with the parallel implementation of the DPG element matrix generator.

We focused on the simple Poisson problem, with the pseudo-code described in the previous section. The numerical experiments were performed for hexahedral DPG elements with cubic polynomials, integrated on a Linux cluster node equipped with 14 cores.

We observe the efficiency dropping from about 70% on 8 cores, through 60% on 11 cores, down to 50% on 14 cores. The corresponding speedup reaches up to 6.5 on 10 cores (see Figures 1-3). All computations were performed on an 8 × 8 × 4 element mesh.
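Speedup and efficiency are related by efficiency = speedup / cores, so the reported figures can be cross-checked directly. The sketch below also evaluates the Karp-Flatt metric, an estimate of the serial fraction implied by a measured speedup; that metric is an addition for illustration and is not used in the paper.

```python
def efficiency(speedup, cores):
    """Parallel efficiency = speedup / number of cores."""
    return speedup / cores


def karp_flatt(speedup, cores):
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


# Reported value: speedup of 6.5 on 10 cores
print(efficiency(6.5, 10))
print(karp_flatt(6.5, 10))
```

A serial fraction that grows with the number of cores would point to overheads such as the final reduction of the private matrices rather than to inherently sequential work.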

In this paper we presented the scalability of a parallel OpenMP integration of DPG matrices. The parallel integrator was obtained through OpenMP parallelization of the sequential 3D DPG code developed for the model Poisson problem by the group of Prof. Demkowicz. We observe a speedup of up to 6.5 on 10 cores. Future work will include domain decomposition based parallelization of the DPG code, with hybrid parallelism using MPI at the level of individual elements and OpenMP at the level of matrices.

The work of MW was supported by Dean's grant no. 15.11.230.270.

References

[1] Franco Brezzi, Marie-Odile Bristeau, Leopoldo P. Franca, Michel Mallet, and Gilbert Rogé. A relationship between stabilized finite element methods and the Galerkin method with bubble functions. Computer Methods in Applied Mechanics and Engineering, 96:117–129, 1992.

[2] Leszek Demkowicz and Jay Gopalakrishnan. An overview of the DPG method. In Recent Developments in Discontinuous Galerkin Finite Element Methods for Partial Differential Equations, IMA Volumes in Mathematics and its Applications, 157:149–180, 2014.

[3] Leszek Demkowicz and Jay Gopalakrishnan. A class of discontinuous Petrov-Galerkin methods. Part I: The transport equation. Computer Methods in Applied Mechanics and Engineering, 199:1558–1572, 2010.

[4] Leszek Demkowicz and Jay Gopalakrishnan. A class of discontinuous Petrov-Galerkin methods. Part II: Optimal test functions. Numerical Methods for Partial Differential Equations, 27:70–105, 2011.

[5] Leszek Demkowicz, Jay Gopalakrishnan, and Antti Niemi. A class of discontinuous Petrov-Galerkin methods. Part III: Adaptivity. Applied Numerical Mathematics, 62:396–427, 2012.

[6] Truman Ellis, Leszek Demkowicz, and Jesse Chan. Locally conservative discontinuous Petrov-Galerkin finite elements for fluid problems. Computers & Mathematics with Applications, 68:1530–1549, 2014.

[7] Leopoldo P. Franca, Sérgio L. Frey, and Thomas J.R. Hughes. Stabilized finite element methods: I. Application to the advective-diffusive model. Computer Methods in Applied Mechanics and Engineering, 95:253–276, 1992.

[8] Leopoldo P. Franca and Sérgio L. Frey. Stabilized finite element methods: II. The incompressible Navier-Stokes equations. Computer Methods in Applied Mechanics and Engineering, 99:209–233, 1992.

[9] Franco Brezzi. On the existence, uniqueness and approximation of saddle-point problems arising from Lagrange multipliers. ESAIM: Mathematical Modelling and Numerical Analysis - Modélisation Mathématique et Analyse Numérique, 8:129–151, 1974.

[10] F. Gruber, A. Klewinghaus, and O. Mula. The DUNE-DPG library for solving PDEs with discontinuous Petrov-Galerkin finite elements. http://arxiv.org/abs/1602.08338, 2016.

[11] Thomas J.R. Hughes, Guglielmo Scovazzi, and Tayfun Tezduyar. Stabilized methods for compressible flows. Journal of Scientific Computing, 43:343–368, 2010.

[12] Ivo Babuška. Error bounds for finite element method. Numerische Mathematik, 16:322–333, 1971.

[13] Leszek Demkowicz. Babuška ↔ Brezzi. Technical Report 0608, The University of Texas at Austin, 2006.

[14] Nathan Roberts. Camellia: A software framework for discontinuous Petrov-Galerkin methods. https://github.com/CamelliaDPG/Camellia, 2016.
