Rigorous Numerics for ODE’s
Joseph Galante September 2008
1 Abstract
We will give an overview of how computers solve ODE’s numerically in a nonrigorous fashion and examine the sources of error. We will then introduce the tools of interval arithmetic and of Taylor Models to outline how a rigorous ODE solver can be implemented on a computer system.
2 Introduction
Many (real world) problems are described by ODE’s that are too complex to estimate by hand and understand in detail. However, rigorously verified information about solutions must be known in order to complete proofs; for example, tracking a solution to see if it connects two regions, or examining the shape of a ball of initial conditions as the flow acts on it. Computers can be used to quickly give numerical simulations of a particular solution. However, most ODE solvers (for example Runge-Kutta methods) only guarantee that the computer generated solution is accurate to $O(h^n)$, where $h$ and $n$ depend on the method, and the big-O hides unknown derivatives from the ODE in question. This is unacceptable for serious proof, since we have no assurance that the computer solution and the actual solution agree after some long time, except for the heuristic "pick $h$ small and $n$ large".
It is helpful to have a model method to understand the sources of error a computer can make. For this we use Euler’s Method. The idea is simple.
Suppose we have the initial value problem
$$\begin{cases} \dot{x} = f(x) \\ x(0) = x_0 \end{cases} \tag{1}$$
Throughout this paper, we will assume that solutions exist, are unique, and are defined for all time, and that $f$ is sufficiently smooth (either $C^\infty$ or real analytic). We specify a fixed step size $h$ for the Euler Method. If $x(t)$ is a solution to the IVP, then we have from Taylor’s Theorem
$$x(t+h) = x(t) + h\dot{x}(t) + O(h^2) \approx x(t) + h f(x(t)) \tag{2}$$
Euler’s Method forgets the remainder and makes a linear approximation at each time step to give
$$\begin{cases} t_i = t_{i-1} + h \\ x_i = x_{i-1} + h f(x_{i-1}) \end{cases} \tag{3}$$
Each step of the Euler Method makes an error of $O(h^2)$, which for $h$ small isn’t too bad. The small errors made by disregarding the $O(h^2)$ term cause the method to track a slightly different solution at each timestep. Nearby solutions usually behave similarly; however, after many steps these small errors can accumulate and destroy the method’s usefulness by jumping to a solution which has different behavior from the one desired.
Figure 1: Truncation errors can lead to tracking the wrong solution
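The Euler iteration (3) can be sketched in a few lines of Python. This is only an illustration of the nonrigorous method: the test problem $\dot{x} = x$, $x(0) = 1$ (with exact solution $e^t$) and the function name `euler` are assumptions chosen for this example and do not come from the text.

```python
import math

def euler(f, x0, h, steps):
    """Nonrigorous explicit Euler: x_i = x_{i-1} + h * f(x_{i-1})."""
    x = x0
    for _ in range(steps):
        x = x + h * f(x)
    return x

# Hypothetical test problem: x' = x, x(0) = 1, so that x(1) = e.
f = lambda x: x
errors = []
for h in (0.1, 0.01, 0.001):
    steps = round(1.0 / h)              # integrate up to t = 1
    approx = euler(f, 1.0, h, steps)
    errors.append(abs(approx - math.e))
    print(h, approx, errors[-1])
```

The printed errors shrink as $h$ shrinks, but nothing in the computation itself certifies how large the error is; that is the gap the rest of the paper addresses.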
3 Interval Arithmetic
When working on a computer, there is another source of error which must be accounted for: floating point error. Differential equations usually have solutions which require real numbers to represent, however a computer is incapable of representing a general real number. The set of "Representable Numbers" is the set of numbers which our computer can perform computations with. This obviously depends on the computer’s architecture and software, however most computers adopt IEEE standards which specify such representable numbers. We assume that we have adopted such a standard. For IEEE double precision, the relative gap between adjacent representable numbers is approximately $10^{-16}$, which is usually called the machine-$\epsilon$. When performing computations, all calculations are subject to errors introduced by this discrepancy: to the computer, $3 = 3 + \frac{\epsilon}{2} = 3 - \frac{\epsilon}{2}$.
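The discrepancy can be observed directly. The short Python check below assumes IEEE double precision floats (which is what CPython uses on essentially all current platforms) and reads the machine-$\epsilon$ from `sys.float_info`.

```python
import sys

# Machine epsilon for doubles: the gap between 1.0 and the next larger double.
eps = sys.float_info.epsilon
print(eps)

# Perturbations of size eps/2 are invisible next to 3.0: both sums round back to 3.0.
print(3.0 + eps / 2 == 3.0)
print(3.0 - eps / 2 == 3.0)
```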
A well known trick to get around these difficulties is to use so-called interval arithmetic. If $x$ is a real number, then on a computer we can represent $x$ as the interval $[a,b]$, where $a$ and $b$ are machine representable numbers with $a \le x \le b$. Rules may be developed to handle basic operations. For example, if $x$ is represented by $[a,b]$ and $y$ is represented by $[c,d]$, then to compute $x+y$ we compute $[a+c, b+d]$. We can then perform a 'round' which ensures that the endpoints are still machine representable and still enclose the true sum. This corresponds roughly to representing $x+y$ as $[a+c-\epsilon, b+d+\epsilon]$. The other operations of subtraction, multiplication, and division, as well as the concepts of 'round' and representable number, can be made completely rigorous; this is done in [KM].
As an added benefit, we gain the operations of union and intersection. For example, if we are computing a specific quantity that we know is positive, then we may take the interval arithmetic calculation and intersect the interval with $[0,\infty)$. As a downside, we lose the concept of equality: $x=y$ becomes $x-y=0$, which says compute the interval $x-y$, then check whether zero lies in it (for equal quantities the result looks like $[-\epsilon,\epsilon]$). Equality on a computer is only good up to the size of machine-$\epsilon$. Another downside from a practical standpoint is that interval operations take at least twice as long as conventional computer arithmetic; however, we gain mathematical rigor.
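The rules above can be sketched as a small Python class. This is an illustrative sketch rather than the construction in [KM]: the class name `Interval` and its methods are hypothetical, and the 'round' is implemented by stepping each endpoint outward by one representable number with `math.nextafter` (Python 3.9 or later), which is a simple but slightly pessimistic choice. Overflow and division are not handled.

```python
import math
from dataclasses import dataclass

def down(v):
    """Next representable double below v (outward rounding of a lower endpoint)."""
    return math.nextafter(v, -math.inf)

def up(v):
    """Next representable double above v (outward rounding of an upper endpoint)."""
    return math.nextafter(v, math.inf)

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(down(self.lo + other.lo), up(self.hi + other.hi))

    def __sub__(self, other):
        return Interval(down(self.lo - other.hi), up(self.hi - other.lo))

    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(down(min(p)), up(max(p)))

    def intersect(self, other):
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        if lo > hi:
            raise ValueError("empty intersection")
        return Interval(lo, hi)

    def contains(self, v):
        return self.lo <= v <= self.hi

x = Interval(0.1, 0.1)
y = Interval(0.2, 0.2)
total = x + y                      # encloses the exact sum of the two doubles
maybe_equal = (x - x).contains(0)  # "equality" means zero lies in the difference
clipped = Interval(-1.0, 2.0).intersect(Interval(0.0, math.inf))
print(total, maybe_equal, clipped)
```

Each operation costs several floating point operations plus the rounding steps, which illustrates why interval arithmetic is slower than ordinary arithmetic.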
Returning to our problem of rigorous numerics for ODE’s, interval arithmetic can help to produce a rigorous solver. We must first reformulate the problem as an 'Interval Value Problem'.
$$\begin{cases} \dot{x} = f(x) \\ x(0) \in I \end{cases} \tag{4}$$
where now $I$ is some small interval of initial conditions, $x$ is now made up of intervals instead of reals, and operations are performed via interval arithmetic. From a dynamical systems perspective, we are seeking to transport a ball of initial conditions under the flow of the ODE.
We can solve the 'IVP' using an intervalized Euler Method which uses interval arithmetic. However, we can do slightly more. The other source of error in Euler’s Method is the error introduced by truncating the $O(h^2)$ term. Suppose we have information which allows us to bound this error, so that the $O(h^2)$ term lies in $E$, where $E$ is some interval. Then we can make a rigorous solver by doing
$$\begin{cases} t_i = t_{i-1} + h \\ x_i = x_{i-1} + h f(x_{i-1}) + E \end{cases} \tag{5}$$
where operations are carried out via interval arithmetic. As we have included both the floating point errors and the truncation errors, we have produced a method which rigorously solves the IVP. However, in practice this method is useless. It turns out that (with the exception of a few special cases) the intervals which contain the true solution grow very quickly due to the repeated addition of the interval term $E$. For example, if $E=[-1,1]$ (a seemingly reasonable bound on an error term), then after two steps we have accumulated $E+E=[-1,1]+[-1,1]=[-2,2]$. In the literature, this is known as the wrapping effect. Iterating, by the $n$th step we will have a bound for the solution at least as bad as $[-n,n]$. This being completely unacceptable, we pursue an idea which greatly refines interval methods.
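The growth of the enclosure can be seen in a small experiment with the interval Euler step (5). Everything specific here is an assumption made for illustration: the test problem $\dot{x} = -x$ with initial interval $[0.9, 1.1]$, the step size, and in particular the interval `E`, which is a hypothetical placeholder for a truncation bound and is not derived from the ODE. Intervals are stored as plain `(lo, hi)` tuples and rounded outward with `math.nextafter`.

```python
import math

def down(v):
    return math.nextafter(v, -math.inf)

def up(v):
    return math.nextafter(v, math.inf)

def iadd(x, y):
    """Interval sum with outward rounding."""
    return (down(x[0] + y[0]), up(x[1] + y[1]))

def iscale(c, x):
    """Interval times a scalar c of either sign, with outward rounding."""
    p, q = c * x[0], c * x[1]
    return (down(min(p, q)), up(max(p, q)))

def interval_euler_step(x, h, f_interval, E):
    """One step of x_i = x_{i-1} + h f(x_{i-1}) + E in interval arithmetic."""
    return iadd(iadd(x, iscale(h, f_interval(x))), E)

f_interval = lambda x: (-x[1], -x[0])   # interval extension of f(x) = -x
x = (0.9, 1.1)                          # interval of initial conditions
h = 0.1
E = (-6e-3, 6e-3)                       # hypothetical truncation bound

widths = []
for _ in range(50):                     # integrate up to t = 5
    x = interval_euler_step(x, h, f_interval, E)
    widths.append(x[1] - x[0])
print(widths[0], widths[-1], x)
```

The true flow of $\dot{x} = -x$ contracts the initial interval, yet the computed enclosure widens at every step, both from the repeated addition of `E` and because interval arithmetic treats the two occurrences of $x_{i-1}$ in (5) as unrelated.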
4 Taylor Models
Definition: Suppose $f(x)$ is $C^{n+1}$ in an open domain $D$. We define an $n$-th order Taylor Model about $x_0 \in D$ as a pair $(P,I)$, where $P$ is the $n$th order Taylor Polynomial of $f$ about $x_0$, and $I$ is some interval such that for all $x \in D$ we have $f(x) \in P(x-x_0) + I$.
Notice that since $f$ is in $C^{n+1}$, Taylor’s theorem gives that, on a sufficiently small domain, the size of $I$ shrinks as $n$ grows. Hence the definition is nothing more than a clever statement that smooth functions behave like their Taylor Polynomial approximations (up to some small error) inside a sufficiently small neighborhood.
Rules for 'Taylor Model Arithmetic' have been developed in [BM1]. For example, suppose $T_1 = (P_1, I_1)$ and $T_2 = (P_2, I_2)$ are $n$th order Taylor Models about $x_0 \in D$. Then we have
$$T_1 + T_2 = (P_1 + P_2,\; I_1 + I_2) \tag{6}$$
$$T_1 \cdot T_2 = (P_1 \cdot P_2 - P_h,\; B(P_h) + B(P_1)\cdot I_2 + B(P_2)\cdot I_1 + I_1 \cdot I_2) \tag{7}$$
where $P_h$ is the polynomial made up of all terms of order $(n+1)$ or larger in $P_1 \cdot P_2$, and $B(P)$ is a bound on the polynomial $P$. Notice that we can obtain a bound on any polynomial by simply performing interval evaluation on the domain in which it is defined. Other arithmetic operations, truncation, and the notion of an antiderivative can also be defined.
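Rules (6) and (7) can be sketched for one-variable Taylor Models about $x_0 = 0$ on the domain $[-h,h]$. This is a minimal illustration, not the implementation of [BM1]: the class `TaylorModel` and the helper `bound` are hypothetical names, both operands are assumed to have the same order and domain, and floating point roundoff in the coefficients is ignored, so the sketch shows the bookkeeping rather than a fully rigorous package.

```python
def iadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def imul(a, b):
    p = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(p), max(p))

def bound(coeffs, h):
    """B(P): termwise interval evaluation of sum coeffs[k] * x^k on [-h, h]."""
    lo = hi = 0.0
    for k, c in enumerate(coeffs):
        if k == 0:
            t = (c, c)                                     # constant term
        elif k % 2 == 0:
            t = (min(0.0, c * h**k), max(0.0, c * h**k))   # x^k >= 0 for even k
        else:
            t = (-abs(c) * h**k, abs(c) * h**k)
        lo, hi = lo + t[0], hi + t[1]
    return (lo, hi)

class TaylorModel:
    def __init__(self, coeffs, rem, h):
        self.coeffs = list(coeffs)   # P(x) = sum coeffs[k] x^k, order n = len - 1
        self.rem = rem               # remainder interval I as (lo, hi)
        self.h = h                   # domain is [-h, h]

    def __add__(self, other):        # rule (6)
        c = [a + b for a, b in zip(self.coeffs, other.coeffs)]
        return TaylorModel(c, iadd(self.rem, other.rem), self.h)

    def __mul__(self, other):        # rule (7)
        n = len(self.coeffs) - 1
        full = [0.0] * (2 * n + 1)
        for i, a in enumerate(self.coeffs):
            for j, b in enumerate(other.coeffs):
                full[i + j] += a * b
        low = full[:n + 1]                       # P1*P2 - Ph
        high = [0.0] * (n + 1) + full[n + 1:]    # Ph: terms of order >= n+1
        rem = bound(high, self.h)
        rem = iadd(rem, imul(bound(self.coeffs, self.h), other.rem))
        rem = iadd(rem, imul(bound(other.coeffs, self.h), self.rem))
        rem = iadd(rem, imul(self.rem, other.rem))
        return TaylorModel(low, rem, self.h)

# First-order model of f(x) = x on [-0.5, 0.5] with zero remainder.
T = TaylorModel([0.0, 1.0], (0.0, 0.0), 0.5)
square = T * T
print(square.coeffs, square.rem)
```

For this input the first-order polynomial part of $x^2$ vanishes and the whole function is captured by the remainder interval, which encloses the true range of $x^2$ on the domain.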
One advantage in working with Taylor Models is that bounds for most common functions are known and can be computed automatically with a computer. Bounds for polynomials, trigonometric, exponential, and logarithmic functions, as well as operations like $1/x$ and $\sqrt{x}$ (all referred to as intrinsic functions) have been computed in [BM1]. All of these quantities are known explicitly, and for the domain $[-h,h]$ have remainders that scale like $O(h^{n+1})$. Since most complicated expressions (i.e. the ones we care about) are made from composing these simple known quantities together, we can get nice Taylor Models with remainder intervals that scale like $O(h^{n+1})$. This is known as
The Fundamental Theorem of Taylor Model Arithmetic (FTTMA). Suppose that the function $f$ is described by an $n$th order Taylor Model $(P_f, I_f)$ on its domain $D$. Let $g$ be a function which is composed of finitely many elementary operations and intrinsic functions, and suppose $g$ is defined on the range of $f$. Let $(P,I)$ be the Taylor Model which arises by plugging $(P_f, I_f)$ into $g$ and evaluating using Taylor Model arithmetic. Then $(P,I)$ is a Taylor Model for $g \circ f$. Furthermore, if the remainder interval $I_f$ scales like $O(h^{n+1})$, then so does the remainder interval $I$.
A proof of this theorem, as well as a detailed list of intrinsic functions and their Taylor Models, is found in [BM1]. It basically amounts to induction on each operation performed by $g$. This theorem is important since it allows us to think of Taylor Models as data objects which we can move around and manipulate in a computer without risking losing control over the size of the remainder bound.
Notice that zeroth-order Taylor Model Arithmetic is simply interval arithmetic. Higher order Taylor Models, however, give us much more. Consider the function $g(x)=x-x$. If we feed $g$ the function $f(x)=x$ on $[-1,1]$ and use interval arithmetic to bound the answer, then we get $g(f(x)) \in [-2,2]$ since $[-1,1]-[-1,1]=[-2,2]$, which is hardly a tight bound. Now suppose instead we use Taylor Models to represent $f(x)$ as $x+I$, where $x \in [-1,1]$ and $I=[-\epsilon,\epsilon]$ (machine precision). Then we get $(x+I)-(x+I)=(x-x)+(I-I)=[-2\epsilon,2\epsilon]$. This is quite a dramatic improvement over intervals. Essentially, by using a Taylor Model we are storing the higher order information about the shape of the range of $f$, which can be manipulated and cancelled to gain better bounds. Due to this extra storage of information, one of the primary disadvantages of Taylor Models is that they are slow and require a lot of storage space in comparison to interval methods. Somewhat efficient methods for handling this have been developed in [BM1]. (A nice trick is that polynomial coefficients below the machine-$\epsilon$ don’t need to be stored; they only need to be thrown into the remainder bound.) In practice, however, $n=5$ is usually good enough in terms of both speed and accuracy for your average problem.
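The $x - x$ comparison can be reproduced with a few lines of Python. The helpers `isub` and `tm_sub` are hypothetical names for this sketch; intervals are `(lo, hi)` tuples, and a first-order Taylor Model is stored as a coefficient list plus a remainder interval.

```python
import sys

eps = sys.float_info.epsilon

def isub(a, b):
    """Interval subtraction: [a0, a1] - [b0, b1] = [a0 - b1, a1 - b0]."""
    return (a[0] - b[1], a[1] - b[0])

def tm_sub(p1, i1, p2, i2):
    """Taylor Model subtraction: subtract polynomials exactly, remainders as intervals."""
    return [a - b for a, b in zip(p1, p2)], isub(i1, i2)

# Plain interval arithmetic forgets that both operands are the same x.
X = (-1.0, 1.0)
interval_result = isub(X, X)

# Taylor Model of f(x) = x: polynomial 0 + 1*x with remainder [-eps, eps].
P, I = [0.0, 1.0], (-eps, eps)
P_diff, I_diff = tm_sub(P, I, P, I)

print(interval_result)   # wide enclosure of x - x
print(P_diff, I_diff)    # zero polynomial plus a tiny remainder
```

The polynomial parts cancel symbolically, so only the machine-precision remainders are left to be combined as intervals.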
There is a scripting language called COSY, developed by Martin Berz and Kyoko Makino (currently at MSU), which has Taylor Models as built-in objects that can automatically perform all of the operations described above. Additionally, COSY has some more advanced theoretical tricks to obtain the bounds $B(P)$. It also has a full interval arithmetic package built in and can convert between Taylor Models, intervals, and reals to the extent that it makes sense to do so. All intrinsic functions on Taylor Models are built in, and bounds can be instantly obtained by working with them. COSY is currently available free of charge for academic use (under some restrictions) at www.cosyinfinity.org. As an added bonus, COSY has been 'verified' by other rigorous computer arithmetic packages. [BM1] [BM2] [BM3] [MB1] [M1] [RMB1]
5 Schauder’s Theorem and Verified Integration
Taylor Models have another operation which can be defined in a natural way, the operation of antidifferentiation. If $(P,I)$ is a Taylor Model, then we define the antidifferentiation operator
$$\partial_i^{-1}(P,I) = \left( \int_0^{x_i} P_{n-1}(x)\,dx,\; (B(P - P_{n-1}) + I)\cdot B(x_i) \right) \tag{8}$$
where $P_{n-1}$ is the $(n-1)$-th degree truncation of $P$, and $x_i \in [a_i, b_i]$. Notice this operation is easy to compute, since integration of a polynomial is just manipulation of its coefficients. The bounds in the remainder term are easily computed by interval evaluation of the polynomial piece and interval operations. This is the key to implementing a rigorous ODE solver.
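Formula (8) can be sketched for a one-variable Taylor Model on the domain $[0,h]$, so that $B(x) = [0,h]$. The function name `antiderivative` is hypothetical, and exact rational arithmetic (`fractions.Fraction`) is used so that no floating point rounding enters this small example. The input data, a second-order model of $e^x$ on $[0, 1/2]$ with remainder $[0, 1/25]$, is an assumption chosen for illustration; the remainder is a valid (not tight) bound, since $e^{1/2} (1/2)^3 / 6 < 1/25$.

```python
import math
from fractions import Fraction as F

def iadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def imul(a, b):
    p = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(p), max(p))

def antiderivative(coeffs, rem, h):
    """Apply (8) to the Taylor Model (coeffs, rem) of order n on [0, h]."""
    n = len(coeffs) - 1
    # Integrate the truncation P_{n-1} termwise: c_k x^k -> c_k/(k+1) x^(k+1).
    new_coeffs = [F(0)] + [F(c) / (k + 1) for k, c in enumerate(coeffs[:n])]
    # B(P - P_{n-1}): bound the single degree-n term c_n x^n on [0, h].
    top = coeffs[n] * h**n
    b_top = (min(F(0), top), max(F(0), top))
    # Remainder: (B(P - P_{n-1}) + I) * B(x), with B(x) = [0, h].
    new_rem = imul(iadd(b_top, rem), (F(0), h))
    return new_coeffs, new_rem

h = F(1, 2)
P = [F(1), F(1), F(1, 2)]        # 1 + x + x^2/2
I = (F(0), F(1, 25))
Q, J = antiderivative(P, I, h)
print(Q, J)

# Sanity check at x = h: the true antiderivative is e^x - 1.
gap = math.exp(0.5) - 1 - float(Q[1] * h + Q[2] * h**2)
print(J[0] <= gap <= J[1])
```

Note that the order of the model is preserved: the degree-$n$ term of $P$ is not integrated into the polynomial but bounded and moved into the remainder.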
Recall that every ODE of the form (1) can be written as an integral equation $x(t) = x(0) + \int_0^t f(x(s))\,ds$ for $t \le h$. We define the Picard operator
$$A_f(x)(t) = x(0) + \int_0^t f(x(s))\,ds$$
$A_f$ is a map from $C^0([0,h])$ into itself, and fixed points of $A_f$ correspond to solutions of our IVP. (Note $A_f$ is continuous because we are assuming that $f$ is continuous.) We have a theorem which gives us the existence of such a fixed point.
Schauder’s Theorem. Let $A$ be a continuous operator on the Banach Space $X$. Let $M \subset X$ be compact and convex, and let $A(M) \subset M$. Then $A$ has a fixed point in $M$.
We are going to apply this theorem to a subset of $X = C^0([0,h])$ which contains all Taylor Models, with $A = A_f$. This approach was originally carried out in [BM4] and we follow it here.
We start by finding a large $Y \subset X$ which is agreeable to our analysis on Taylor Models and to which we can apply the Schauder Theorem. Let $(P,I)$ be a Taylor Model depending on both time and the initial condition $x_0$. Then we define the set $M_{(P,I)}$ so that $M_{(P,I)} \subset X = C^0([0,h])$ and for $x \in M_{(P,I)}$ we have
$$x(0) = x_0 \tag{10}$$
$$x(t) \in P + I \quad \forall t \in [0,h],\ \forall x_0 \tag{11}$$
$$|x(t') - x(t'')| \le k|t' - t''| \quad \forall t', t'' \in [0,h],\ \forall x_0 \tag{12}$$
The last condition is a Lipschitz condition for existence/uniqueness of solutions to the ODE. We take $k$ to be some bound on the function $f$. Define $Y$ as
$$Y = \bigcup_{(P,I)} M_{(P,I)} \tag{13}$$
So $Y$ will contain all Taylor Models. Now if $M \subset Y$ and if $x_1, x_2 \in M$, then $a x_1 + (1-a)x_2 \in M$ for all $a \in [0,1]$, since $(a x_1 + (1-a) x_2)(0) = x_0$, $a x_1 + (1-a)x_2$ is also Lipschitz with constant $k$, and $a x_1 + (1-a)x_2$ will be in the same Taylor Model as $x_1$ and $x_2$ (due to FTTMA). Hence $M$ is convex. Some point-set topology and an application of the Ascoli-Arzela Theorem shows that $M$ is compact. Finally, note that $A$ maps $Y$ into itself, since $(A_f(x))(0) = x_0$, and $A_f(x)$ is continuous due to the integral and Lipschitz continuous with constant $k$ because $f$ is bounded by $k$. Lastly, since $A$ is made up of intrinsic functions, FTTMA gives that $A$ maps Taylor Models to Taylor Models, i.e. $Y$ into $Y$.
To apply Schauder’s Theorem, we must then find a Taylor Model $(P,I)$ so that $A(P+I) \subset P+I$. Then the fixed point, i.e. the solution of the ODE, will be contained in the Taylor Model. Notice that if $I$ is small, then we will have succeeded in closely modeling the solution with the polynomial part. Finding such a Taylor Model is relatively easy computationally. Start with the zero polynomial, and repeatedly iterate it through $A$, each time disregarding terms of order $(n+1)$ or higher.
Claim. After $(n+1)$ steps, this will produce an $n$th degree polynomial invariant under $A$ (up to terms of order $(n+1)$ and higher).
Proof. We will show that after $k$ applications of $A$ to the zero polynomial, all terms of degree $(k-1)$ or less will be fixed. Since the Taylor Model is of degree $n$, applying $A$ $(n+1)$ times will produce the result.
Let $P = A^k(0)$. Then it suffices to show that $A(P + O(t^k)) = P + O(t^k)$. We proceed by induction on $k$.
Basis: $k=1$. $P = A(0) = x_0$, and
$$A(x_0 + O(t)) = x_0 + \int_0^t f(x_0 + O(\tau))\,d\tau = x_0 + O(t) \tag{14}$$
since all terms in the integral will pick up at least a factor of $t$ after the integration. Now assume the result holds true for $k$. We will show it holds for $(k+1)$. Let $Q = A(P) = A^{k+1}(0)$. By the inductive hypothesis, $Q = R + S + O(t^{k+1})$, where $R$ is the degree $k-1$ polynomial such that $P = R + O(t^k)$ (hence $R$ is fixed under iterates of $A$), and $S$ is the polynomial composed of the degree $k$ terms in $Q$.
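$$\begin{aligned}
A(Q) &= A(R + S + O(t^{k+1})) = x_0 + \int_0^t f(R + S + O(\tau^{k+1}))\,d\tau && (15)\\
&= x_0 + \int_0^t \left[ f(R+S) + f'(R+S)\cdot O(\tau^{k+1}) + \frac{f''(R+S)}{2}\, O(\tau^{k+1})^2 + \cdots \right] d\tau && (16)\\
&= A(R + O(t^k)) + \int_0^t O(\tau^{k+1})(\text{stuff})\,d\tau && (17)\\
&= R + (k\text{th order terms}) + O(t^{k+1}) && (19)
\end{aligned}$$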
We have used the Taylor Series expansion of $f$ in terms of its argument $x$, and the inductive hypothesis. Our claim will be complete if the ($k$th order terms) $= S$. Notice that since $\deg(S)=k$ we have
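$$\begin{aligned}
\int_0^t f(R+S)\,d\tau &= \int_0^t \left[ f(R) + f'(R)\,S + \frac{f''(R)}{2}\, S^2 + \cdots \right] d\tau && (20)\\
&= \int_0^t f(R)\,d\tau + \int_0^t S\cdot(\text{stuff})\,d\tau && (21)\\
&= \int_0^t f(R)\,d\tau + O(t^{k+1}) && (22)
\end{aligned}$$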
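i.e. the ($k$th order terms) are actually independent of $S$, since all the terms in $S$ get integrated and land in the $O(t^{k+1})$. Now
$$\begin{aligned}
A(P) &= x_0 + \int_0^t f(P)\,d\tau && (23)\\
&= x_0 + \int_0^t f(R + (P-R))\,d\tau && (24)\\
&= x_0 + \int_0^t \left[ f(R) + f'(R)(P-R) + \cdots \right] d\tau && (25)\\
&= A(R) + \int_0^t (P-R)(\text{stuff})\,d\tau && (26)
\end{aligned}$$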
since $\deg(P-R)=k$. So we have $A(Q) = A(R) + O(t^{k+1})$ and $A(R) = A(P) + O(t^{k+1}) = A(A^k(0)) + O(t^{k+1}) = Q + O(t^{k+1})$, so we must have that $A(Q) = Q + O(t^{k+1}) = R + S + O(t^{k+1})$, which implies that ($k$th order terms) $= S$.
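The truncated Picard iteration of the claim can be tried out on a concrete example. The sketch below is an illustration under stated assumptions: the test problem $\dot{x} = x^2$, $x(0) = 1$ (whose solution $1/(1-t)$ has Taylor series $1 + t + t^2 + \cdots$), the order $n = 3$, and all function names are chosen for this example. Polynomials in $t$ are coefficient lists of exact rationals.

```python
from fractions import Fraction

def poly_mul_trunc(p, q, n):
    """Product of two polynomials, discarding terms of order n+1 or higher."""
    out = [Fraction(0)] * (n + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= n:
                out[i + j] += a * b
    return out

def picard_step(p, x0, f_poly, n):
    """A(p)(t) = x0 + integral_0^t f(p(tau)) dtau, truncated to degree n."""
    g = f_poly(p, n)   # coefficients of f(p) up to degree n
    # Integrate termwise; the degree-n term of g would land at order n+1 and is dropped.
    result = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(g[:n])]
    result[0] += x0
    return result

def invariant_polynomial(x0, f_poly, n):
    """Iterate the zero polynomial through A exactly (n+1) times."""
    p = [Fraction(0)] * (n + 1)
    history = []
    for _ in range(n + 1):
        p = picard_step(p, x0, f_poly, n)
        history.append(p)
    return p, history

square = lambda p, n: poly_mul_trunc(p, p, n)   # f(x) = x^2

P, history = invariant_polynomial(Fraction(1), square, 3)
for k, p in enumerate(history, start=1):
    print(k, [str(c) for c in p])

# One more application of A leaves P unchanged up to the truncation order.
print(picard_step(P, Fraction(1), square, 3) == P)
```

In the printed history, the $k$th iterate already agrees with the final polynomial in all terms of degree $k-1$ or less, which is the statement proved by the induction above.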
Using this algorithm, we can generate a polynomial invariant under $A$. To complete the application of Schauder’s Theorem, we must find a Taylor Model which is invariant under $A$. Let $P$ be the $A$-invariant $n$th degree polynomial. We desire an interval $I$ so that $A(P+I) \subset P+I$, i.e. $A(P+I) - P \subset I$. We have $A(P+I) = x_0 + \int_0^t f(P+I)\,d\tau$. By FTTMA, $f(P+I)$ will be a Taylor Model. We can decompose it as $f(P+I) = Q + R + \hat{I}$, where $Q$ is all terms of degree $(n-1)$ or less, $R$ is all degree $n$ terms, and $\hat{I}$ is the remainder. Since $A(P) = P + O(t^{n+1})$, and since $\deg(R)=n$ means that $R$ will integrate to an $(n+1)$ order term, we must have that $P = x_0 + \int_0^t Q\,d\tau$. Thus the other terms will contribute only to the remainder, i.e.
$$A(P+I) - P \subset \int_0^t (R + \hat{I})\,d\tau$$
We want to better understand this relation, since $\hat{I}$ depends upon $I$. We