Steven H. Isabelle and Gregory W. Wornell, "Nonlinear Maps," CRC Press LLC, 2000. <http://www.engnetbase.com>
Nonlinear Maps
Steven H. Isabelle
Massachusetts Institute of Technology

Gregory W. Wornell
Massachusetts Institute of Technology
72.1 Introduction
72.2 Eventually Expanding Maps and Markov Maps
     Eventually Expanding Maps
72.3 Signals From Eventually Expanding Maps
72.4 Estimating Chaotic Signals in Noise
72.5 Probabilistic Properties of Chaotic Maps
72.6 Statistics of Markov Maps
72.7 Power Spectra of Markov Maps
72.8 Modeling Eventually Expanding Maps with Markov Maps
References
72.1 Introduction
One-dimensional nonlinear systems, although simple in form, are applicable in a surprisingly wide variety of engineering contexts. As models for engineering systems, their richly complex behavior has provided insight into the operation of, for example, analog-to-digital converters [1], nonlinear oscillators [2], and power converters [3]. As realizable systems, they have been proposed as random number generators [4] and as signal generators for communication systems [5,6]. As analytic tools, they have served as mirrors for the behavior of more complex, higher dimensional systems [7,8,9]. Although one-dimensional nonlinear systems are, in general, hard to analyze, certain useful classes of them are relatively well understood. These systems are described by the recursion
x[n] = f(x[n − 1])   (72.1a)
y[n] = g(x[n]) ,   (72.1b)
initialized by a scalar initial condition x[0], where f(·) and g(·) are real-valued functions that describe the evolution of a nonlinear system and the observation of its state, respectively. The dependence of the sequence x[n] on its initial condition is emphasized by writing x[n] = f^n(x[0]), where f^n(·) represents the n-fold composition of f(·) with itself.
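As a concrete illustration, the following minimal Python sketch iterates the recursion (72.1) for caller-supplied f(·) and g(·). The function names and parameter values are illustrative, not from the original text.

```python
import numpy as np

def iterate_map(f, g, x0, n_samples):
    """Generate y[n] = g(x[n]) with x[n] = f(x[n-1]), as in (72.1)."""
    x, y = x0, np.empty(n_samples)
    for n in range(n_samples):
        y[n] = g(x)   # observe the current state
        x = f(x)      # advance the state
    return y

# Example: a symmetric tent map observed directly (g = identity)
beta = 1.9
y = iterate_map(lambda x: beta - 1.0 - beta * abs(x), lambda x: x, 0.3, 1000)
```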
Without further restrictions on the form of f(·) and g(·), this class of systems is too large to explore easily. However, systems and signals corresponding to certain "well-behaved" maps f(·) and observation functions g(·) can be rigorously analyzed. Maps of this type often generate chaotic signals—loosely speaking, bounded signals that are neither periodic nor transient—under easily verifiable conditions. These chaotic signals, although completely deterministic, are in many ways analogous to stochastic processes. In fact, one-dimensional chaotic maps illustrate in a relatively simple setting that the distinction between deterministic and stochastic signals is sometimes artificial and can be profitably emphasized or deemphasized according to the needs of an application. For instance, problems of signal recovery from noisy observations are often best approached with a deterministic emphasis, while certain signal generation problems [10] benefit most from a stochastic treatment.
72.2 Eventually Expanding Maps and Markov Maps
Although signal models of the form (72.1) have simple, one-dimensional state spaces, they can behave in a variety of complex ways that model a wide range of phenomena. This flexibility comes at a cost, however; without some restrictions on its form, this class of models is too large to be analytically tractable. Two tractable classes of models that appear quite often in applications are eventually expanding maps and Markov maps.
72.2.1 Eventually Expanding Maps
Eventually expanding maps—which have been used to model sigma-delta modulators [11], switching power converters [3], other switched flow systems [12], and signal generators [6,13]—have three defining features: they are piecewise smooth, they map the unit interval to itself, and they have some iterate with slope that is everywhere greater than unity. Maps with these features generate time series that are chaotic, but on average well behaved. For reference, the formal definition is as follows, where the restriction to the unit interval is convenient but not necessary:
DEFINITION 72.1 A nonsingular map f : [0, 1] → [0, 1] is called eventually expanding if

1. There is a set of partition points 0 = a_0 < a_1 < · · · < a_N = 1 such that, restricted to each of the intervals V_i = [a_{i−1}, a_i), called partition elements, the map f(·) is monotonic, continuous, and differentiable.

2. The function 1/|f′(x)| is of bounded variation [14]. (In some definitions, this smoothness condition on the reciprocal of the derivative is replaced with a more restrictive bounded slope condition, i.e., there exists a constant B such that |f′(x)| < B for all x.)

3. There exist a real λ > 1 and an integer m such that

|d/dx f^m(x)| ≥ λ

wherever the derivative exists. This is the eventually expanding condition; a numerical check of it is sketched below.
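Condition 3 can be probed numerically by the chain rule, (f^m)′(x) = f′(x[m−1]) · · · f′(x[0]), along sampled orbits. The following hedged Python sketch (the function names are illustrative) tests the condition at randomly drawn points; it samples rather than proves the bound:

```python
import numpy as np

def is_eventually_expanding(f, df, m, lam=1.0 + 1e-9, n_test=10000, seed=0):
    """Sample-based check of |d/dx f^m(x)| >= lam > 1 on [0, 1].

    A heuristic sketch: it tests randomly sampled points rather than proving
    the bound everywhere, and assumes df returns f'(x) where it exists.
    """
    rng = np.random.default_rng(seed)
    for x in rng.uniform(0.0, 1.0, n_test):
        deriv = 1.0
        for _ in range(m):
            deriv *= df(x)   # chain rule: (f^m)'(x) = f'(x[m-1]) ... f'(x[0])
            x = f(x)
        if abs(deriv) < lam:
            return False
    return True
```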
Every eventually expanding map can be expressed in the form

f(x) = Σ_{i=1}^{N} f_i(x) χ_i(x) ,   (72.2)

where each f_i(·) is continuous, monotonic, and differentiable on the interior of the ith partition element, and the indicator function χ_i(x) is defined by

χ_i(x) = { 1,  x ∈ V_i
         { 0,  otherwise .   (72.3)
This class is broad enough to include, for example, discontinuous maps and maps with discontinuous or unbounded slope. Eventually expanding maps also include a class that is particularly amenable to analysis—the Markov maps.
Markov maps are analytically tractable and broadly applicable to problems of signal estimation, signal generation, and signal approximation. They are defined as eventually expanding maps that are piecewise-linear and have some extra structure.
DEFINITION 72.2 A map f : [0, 1] → [0, 1] is an eventually expanding, piecewise-linear, Markov map if f is an eventually expanding map with the following additional properties:

1. The map is piecewise-linear, i.e., there is a set of partition points 0 = a_0 < a_1 < · · · < a_N = 1 such that, restricted to each of the intervals V_i = [a_{i−1}, a_i), called partition elements, the map f(·) is affine, i.e., the functions f_i(·) on the right side of (72.2) are of the form

f_i(x) = s_i x + b_i .

2. The map has the Markov property that partition points map to partition points, i.e., for each i, f(a_i) = a_j for some j.
Every Markov map can be expressed in the form

f(x) = Σ_{i=1}^{N} (s_i x + b_i) χ_i(x) ,   (72.4)

where s_i ≠ 0 for all i. Fig. 72.1 shows the Markov map

f(x) = { (1 − a)x/a + a,   0 ≤ x ≤ a
       { (1 − x)/(1 − a),  a < x ≤ 1 ,   (72.5)

which has partition points {0, a, 1} and partition elements V_1 = [0, a) and V_2 = [a, 1).
FIGURE 72.1: An example of a piecewise-linear Markov map with two partition elements.
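A hedged Python sketch of (72.5) follows (the parameter value a = 0.4 is an arbitrary illustrative choice, not from the text). Note how the partition points {0, a, 1} map onto partition points, which is the Markov property of Definition 72.2:

```python
a = 0.4  # illustrative parameter, 0 < a < 1

def f_markov(x):
    """The two-branch piecewise-linear Markov map of Eq. (72.5)."""
    return (1 - a) * x / a + a if x < a else (1 - x) / (1 - a)

# Partition points map to partition points: f(0) = a, f(a) = 1, f(1) = 0.
print([f_markov(p) for p in (0.0, a, 1.0)])
```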
Markov maps generate signals with two useful properties: they are, when suitably quantized, indistinguishable from signals generated by Markov chains; and they are close, in a sense, to signals generated by more general eventually expanding maps [15]. These two properties lead to applications of Markov maps for generating random numbers and approximating other signals. The analysis underlying these types of applications depends on signal representations that provide insight into the structure of chaotic signals.
72.3 Signals From Eventually Expanding Maps
There are several general representations for signals generated by eventually expanding maps. Each provides different insights into the structure of these signals and proves useful in different applications. First, and most obviously, a sequence generated by a particular map is completely determined by (and is thus represented by) its initial condition x[0]. This representation allows certain signal estimation problems to be recast as problems of estimating the scalar initial condition. Second, and less obviously, the quantized signal y[n] = g(x[n]), for n ≥ 0, generated by (72.1) with g(·) defined by

g(x) = i ,  x ∈ V_i ,   (72.6)

uniquely specifies the initial condition x[0] and hence the entire state sequence x[n]. Such quantized sequences y[n] are called the symbolic dynamics associated with f(·) [7]. Certain properties of a map, such as the collection of initial conditions leading to periodic points, are most easily described in terms of its symbolic dynamics. Finally, a hybrid representation of x[n], combining the initial condition and symbolic representations,

H[N] = {g(x[0]), . . . , g(x[N]), x[N]} ,

is often useful.
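The sketch below computes the symbolic sequence of (72.6) and the hybrid representation H[N] for a map on [0, 1]; the helper name and the bisection-based quantizer are illustrative choices, not from the original text.

```python
import bisect

def symbolic_and_hybrid(f, interior_points, x0, N):
    """Return (y[0..N], x[N]): the symbolic dynamics of (72.6) and the
    hybrid representation H[N] = {y[0], ..., y[N], x[N]}."""
    x, y = x0, []
    for n in range(N + 1):
        y.append(bisect.bisect_right(interior_points, x) + 1)  # x in V_{y[n]}
        if n < N:
            x = f(x)
    return y, x

# Example with the Markov map (72.5); the only interior partition point is a.
a = 0.4
f = lambda x: (1 - a) * x / a + a if x < a else (1 - x) / (1 - a)
y, xN = symbolic_and_hybrid(f, [a], 0.3, 10)
```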
72.4 Estimating Chaotic Signals in Noise
The hybrid signal representation described in the previous section can be applied to a classical signal processing problem—estimating a signal in white Gaussian noise. For example, suppose the problem is to estimate a chaotic sequence x[n], n = 0, . . . , N − 1, from the noisy observations

r[n] = x[n] + w[n],  n = 0, . . . , N − 1 ,   (72.7)

where w[n] is a stationary, zero-mean white Gaussian noise sequence with variance σ_w², and x[n] is generated by iterating (72.1) from an unknown initial condition. Because w[n] is white and Gaussian, the maximum likelihood estimation problem is equivalent to the constrained minimum distance problem
minimize_{x[n] : x[i] = f(x[i−1])}   ε[N] = Σ_{k=0}^{N} (r[k] − x[k])²   (72.8)

and to the scalar problem

minimize_{x[0] ∈ [0, 1]}   ε[N] = Σ_{k=0}^{N} (r[k] − f^k(x[0]))² .   (72.9)
Thus, the maximum-likelihood problem can, in principle, be solved by first estimating the initial condition, then iterating (72.1) to generate the remaining estimates However, the initial condition is often difficult to estimate directly because the likelihood function (72.9), which is highly irregular with fractal characteristics, is unsuitable for gradient-descent type optimization [16] Another solution divides the domain off (·) into subintervals and then solves a dynamic programming problem [17]; however, this solution is, in general, suboptimal and computationally expensive
Although the maximum likelihood problem described above need not, in general, have a computationally efficient recursive solution, it does have one when, for example, the map f(·) is a symmetric tent map of the form

f(x) = β − 1 − β|x| ,  x ∈ [−1, 1] ,   (72.10)
with parameter 1 < β ≤ 2 [5]. This algorithm solves for the hybrid representation of the initial condition, from which an estimate of the entire signal can be determined. The hybrid representation is of the form

H[N] = {y[0], . . . , y[N], x[N]} ,

where each y[i] takes one of two values which, for convenience, we define as y[i] = sgn(x[i]). Since each y[n] can independently take one of two values, there are 2^N feasible solutions to this problem, and a direct search for the optimal solution is thus impractical even for moderate values of N.
The resulting algorithm has computational complexity that is linear in the length of the observation, N. This efficiency is the result of a special separation property possessed by the map [10]: given y[0], . . . , y[i − 1] and y[i + 1], . . . , y[N], the estimate of the parameter y[i] is independent of y[i + 1], . . . , y[N]. The algorithm is as follows. Denoting by φ̂[n|m] the ML estimate of any sequence φ[n] given r[k] for 0 ≤ k ≤ m, the ML solution is of the form
x̂[n|n] = [ (β² − 1) β^{2n} r[n] + (β^{2n} − 1) x̂[n|n − 1] ] / ( β^{2(n+1)} − 1 )   (72.11)

ŷ[n|N] = sgn( x̂[n|n] )   (72.12)

where x̂[n|n − 1] = f(x̂[n − 1|n − 1]), the initialization is x̂[0|0] = r[0], and the function L_β(x̂[n|n]), defined by

L_β(x) = { β − 1,  x ≥ β − 1
         { x,      x ∈ (−1, β − 1)
         { −1,     x ≤ −1 ,   (72.13)
serves to restrict the ML estimates to the interval x ∈ (−1, β − 1). The smoothed estimates x̂_ML[n|N] are obtained by converting the hybrid representation to the initial condition and then iterating the estimated initial condition forward.
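A compact Python sketch of this recursion follows. It implements (72.11)–(72.13) as reconstructed above (the denominator of (72.11) follows from recursive least squares weighting and should be checked against [5]); names and parameter values are illustrative.

```python
import numpy as np

def ml_tent_estimates(r, beta):
    """Recursive ML estimator for the symmetric tent map (72.10) observed in
    white Gaussian noise, per Eqs. (72.11)-(72.13) as reconstructed above."""
    tent = lambda x: beta - 1.0 - beta * abs(x)
    L = lambda x: min(max(x, -1.0), beta - 1.0)       # the clamp L_beta of (72.13)
    N = len(r) - 1
    x_filt = L(r[0])                                  # initialization x̂[0|0] = r[0]
    y = [np.sign(x_filt)]
    for n in range(1, N + 1):
        x_pred = tent(x_filt)                         # x̂[n|n-1] = f(x̂[n-1|n-1])
        b2n = beta ** (2 * n)
        x_filt = L(((beta**2 - 1) * b2n * r[n] + (b2n - 1) * x_pred)
                   / (beta ** (2 * (n + 1)) - 1))     # Eq. (72.11), clamped
        y.append(np.sign(x_filt))                     # Eq. (72.12)
    return y, x_filt   # hybrid representation {ŷ[0], ..., ŷ[N], x̂[N|N]}
```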
72.5 Probabilistic Properties of Chaotic Maps
Almost all waveforms generated by a particular eventually expanding map have the same average behavior [18], in the sense that the time average
h̄(x[0]) = lim_{n→∞} (1/n) Σ_{k=0}^{n−1} h(x[k]) = lim_{n→∞} (1/n) Σ_{k=0}^{n−1} h( f^k(x[0]) )   (72.15)
exists and is essentially independent of the initial condition x[0] for sufficiently well-behaved functions h(·). This result, which is reminiscent of results from the theory of stationary stochastic processes [19], forms the basis for a probabilistic interpretation of chaotic signals, which in turn leads to analytic methods for characterizing their time-average behavior.
To explore the link between chaotic and stochastic signals, first consider the stochastic process generated by iterating (72.1) from a random initial condition x[0] with probability density function p_0(·). Denote by p_n(·) the density of the nth iterate x[n]. Although, in general, the members of the sequence p_n(·) will differ, there can exist densities, called invariant densities, that are time-invariant, i.e.,
p_0(·) = p_1(·) = · · · = p_n(·) = · · · = p(·) .   (72.16)
When the initial condition x[0] is chosen randomly according to an invariant density, the resulting stochastic process is stationary [19] and its ensemble averages depend on the invariant density. Even when the initial condition is not random, invariant densities play an important role in describing the time-average behavior of chaotic signals. This role depends on, among other things, the number of invariant densities that a map possesses.
A general one-dimensional nonlinear map may possess many invariant densities. For example, eventually expanding maps with N partition elements have at least one and at most N invariant densities [20]. However, maps can often be decomposed into collections of maps, each with only one invariant density [19], and little generality is lost by concentrating on maps with only one invariant density. In this special case, the results that relate the invariant density to the average behavior of chaotic signals are more intuitive.
The invariant density, although introduced through the device of a random initial condition, can also be used to study the behavior of individual signals. Individual signals are connected to ensembles of signals, which correspond to random initial conditions, through a classical result due to Birkhoff, which asserts that the time average h̄(x[0]) defined by Eq. (72.15) exists whenever f(·) has an invariant density. When f(·) has only one invariant density, the time average is independent of the initial condition for almost all (with respect to the invariant density p(·)) initial conditions and equals
lim_{n→∞} (1/n) Σ_{k=0}^{n−1} h(x[k]) = lim_{n→∞} (1/n) Σ_{k=0}^{n−1} h( f^k(x[0]) ) = ∫ h(x) p(x) dx ,   (72.17)
where the integral is performed over the domain of f(·) and where h(·) is measurable.

Birkhoff's theorem leads to a relative frequency interpretation of time averages of chaotic signals. To see this, consider the time average of the indicator function χ̃_{[s−ε,s+ε]}(x), which is zero everywhere but in the interval [s − ε, s + ε], where it is equal to unity. Using Birkhoff's theorem with Eq. (72.17) yields
lim_{n→∞} (1/n) Σ_{k=0}^{n−1} χ̃_{[s−ε,s+ε]}(x[k]) = ∫ χ̃_{[s−ε,s+ε]}(x) p(x) dx   (72.18)

= ∫_{s−ε}^{s+ε} p(x) dx   (72.19)

≈ 2ε p(s) ,   (72.20)
where Eq. (72.20) follows from Eq. (72.19) when ε is small and p(·) is sufficiently smooth. The time average (72.18) is exactly the fraction of time that the sequence x[n] takes values in the interval [s − ε, s + ε]. Thus, from (72.20), the value of the invariant density at any point s is approximately proportional to the relative frequency with which x[n] takes values in a small neighborhood of the point. Motivated by this relative frequency interpretation, the probability that an arbitrary function
h(x[n]) falls into an arbitrary set A can be defined by

Pr{h(x) ∈ A} = lim_{n→∞} (1/n) Σ_{k=0}^{n−1} χ̃_A( h(x[k]) ) .   (72.21)
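This relative frequency interpretation is easy to test numerically. The sketch below histograms a long orbit of the asymmetric tent map introduced later in this section; its uniform invariant density and non-dyadic slopes make it a convenient, numerically robust example, and all parameter values are illustrative:

```python
import numpy as np

a = 0.4                                          # illustrative parameter, 0 < a < 1
f = lambda x: x / a if x < a else (1.0 - x) / (1.0 - a)

x, orbit = np.sqrt(2) - 1.0, np.empty(200000)    # arbitrary irrational x[0]
for n in range(orbit.size):
    orbit[n] = x
    x = f(x)

# The normalized histogram approximates the invariant density, here p(x) = 1:
hist, _ = np.histogram(orbit, bins=10, range=(0.0, 1.0), density=True)
print(hist.round(2))                             # every bin is close to 1.0
```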
Using this definition of probability, it can be shown that for any Markov map the symbol sequence y[n] defined in Section 72.3 is indistinguishable from a Markov chain, in the sense that

Pr{y[n] | y[n − 1], . . . , y[0]} = Pr{y[n] | y[n − 1]}   (72.22)

holds for all n [21]. The first-order transition probabilities can be shown to be of the form

Pr( y[n] | y[n − 1] ) = |V_{y[n]}| / ( |s_{y[n−1]}| |V_{y[n−1]}| ) ,   (72.23)
where the s_i are the slopes of the map f(·), as in Eq. (72.4), and |V_{y[n]}| denotes the length of the interval V_{y[n]}. As an example, consider the asymmetric tent map

f(x) = { x/a,              0 ≤ x ≤ a
       { (1 − x)/(1 − a),  a < x ≤ 1 ,
with parameter in the range 0 < a < 1 and a quantizer g(·) of the form (72.6). The previous results establish that y[n] = g(x[n]) is equivalent to a sample sequence from the Markov chain with transition probability matrix

P = [ a   1 − a
      a   1 − a ] ,
where [P]_ij = Pr{y[n] = j | y[n − 1] = i}. Thus, the symbolic sequence appears to have been generated by independent flips of a biased coin with the probability of heads, say, equal to a. When the parameter takes the value a = 1/2, this corresponds to a sequence of independent equally likely bits. Thus, a sequence of Bernoulli random variables can be constructed from a deterministic sequence x[n]. Based on this remarkable result, a circuit that generates statistically independent bits for cryptographic applications has been designed [4].
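A numerical sanity check of this equivalence follows, with the illustrative parameter a = 0.4 (a value other than 1/2 makes the bias visible). The empirical symbol frequencies behave like independent flips of a coin with Pr{y[n] = 1} = a:

```python
import numpy as np

a = 0.4
f = lambda x: x / a if x < a else (1.0 - x) / (1.0 - a)

x, y = np.sqrt(2) - 1.0, np.empty(200000, dtype=int)
for n in range(y.size):
    y[n] = 1 if x < a else 2             # the quantizer g of (72.6)
    x = f(x)

print((y == 1).mean())                   # ~ a, the "heads" probability
print((y[1:] == 1)[y[:-1] == 1].mean())  # ~ a: independent of the previous symbol
print((y[1:] == 1)[y[:-1] == 2].mean())  # ~ a
```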
Some of the deeper probabilistic properties of chaotic signals depend on the integral (72.17), which in turn depends on the invariant density. For some maps, invariant densities can be determined explicitly. For example, the tent map (72.10) with β = 2 has invariant density

p(x) = { 1/2,  −1 ≤ x ≤ 1
       { 0,    otherwise ,

as can be readily verified using elementary results from the theory of derived distributions of functions of random variables [22]. More generally, all Markov maps have invariant densities that are piecewise-constant functions of the form

p(x) = Σ_{i=1}^{N} c_i χ_i(x) ,

where the c_i are real constants that can be determined from the map's parameters [23]. This makes Markov maps especially amenable to analysis.
72.6 Statistics of Markov Maps
The transition probabilities computed above may be viewed as statistics of the sequence x[n]. These statistics, which are important in a variety of applications, have the attractive property that they are defined by integrals having, for Markov maps, readily computable, closed-form solutions. This property holds more generally—Markov maps generate sequences for which a large class of statistics can be determined in closed form. These analytic solutions have two primary advantages over empirical solutions computed by time averaging: they circumvent some of the numerical problems that arise when simulating the long sequences of chaotic data that are necessary to generate reliable averages; and they often provide insight into aspects of chaotic signals, such as dependence on a parameter, that could not be easily determined by empirical averaging.
Statistics that can be readily computed include correlations of the form

R_{f;h_0,h_1,...,h_r}[k_1, . . . , k_r] = lim_{L→∞} (1/L) Σ_{n=0}^{L−1} h_0(x[n]) h_1(x[n + k_1]) · · · h_r(x[n + k_r])   (72.24)

= ∫ h_0(x) h_1( f^{k_1}(x) ) · · · h_r( f^{k_r}(x) ) p(x) dx ,   (72.25)
where the h_i(·)'s are suitably well-behaved but otherwise arbitrary functions, the k_i's are nonnegative integers, the sequence x[n] is generated by Eq. (72.1), and p(·) is the invariant density. This class of statistics includes as important special cases the autocorrelation function and all higher-order moments of the time series. Of primary importance in determining these statistics is a linear transformation called the Frobenius-Perron (FP) operator, which enters into the computation of these correlations in two ways. First, it suggests a method for determining an invariant density. Second, it provides a "change of variables" within the integral that leads to simple expressions for correlation statistics.
The definition of the FP operator can be motivated by using the device of a random initial condition x[0] with density p_0(x), as in Section 72.5. The FP operator P_f describes the time evolution of this initial probability density. More precisely, it relates the initial density to the densities p_n(·) of the random variables x[n] = f^n(x[0]) through the equation

p_n(x) = P_f^n p_0(x) ,   (72.26)

where P_f^n denotes the n-fold self-composition of P_f. This definition of the FP operator, although phrased in terms of its action on probability densities, can be extended to all integrable functions. This extended operator, which is also called the FP operator, is linear and continuous. Its properties are closely related to the statistical structure of signals generated by chaotic maps (see [9] for a thorough discussion of these issues). For example, the evolution equation (72.26) implies that an invariant density of a map is a fixed point of its FP operator, that is, it satisfies

P_f p(x) = p(x) .   (72.27)
This relation can be used to determine explicitly the invariant densities of Markov maps [23], which may in turn be used to compute more general statistics.
Using the change of variables property of the FP operator, the correlation statistic (72.25) can be expressed as the ensemble average

R_{f;h_0,h_1,...,h_r}[k_1, . . . , k_r] = ∫ h_r(x) P_f^{k_r−k_{r−1}} { h_{r−1}(x) · · · P_f^{k_2−k_1} { h_1(x) P_f^{k_1} { h_0(x) p(x) } } · · · } dx .   (72.29)
Although such integrals are, for general one-dimensional nonlinear maps, difficult to evaluate, closed-form solutions exist when f(·) is a Markov map—a development that depends on an explicit expression for the FP operator.
The FP operator of a Markov map has a simple, finite-dimensional matrix representation when it operates on certain piecewise-polynomial functions. Any function of the form

h(x) = Σ_{i=0}^{K} Σ_{j=1}^{N} a_{ij} x^i χ_j(x)

can be represented by an N(K + 1)-dimensional coordinate vector with respect to the basis

{θ_1(x), θ_2(x), . . . , θ_{N(K+1)}(x)} = {χ_1(x), . . . , χ_N(x), xχ_1(x), . . . , xχ_N(x), . . . , x^K χ_1(x), . . . , x^K χ_N(x)} .   (72.30)
The action of the FP operator on any such function can be expressed as a matrix-vector product: when the coordinate vector of h(x) is h, the coordinate vector of q(x) = P_f h(x) is

q = P_K h ,

where P_K is the square, N(K + 1)-dimensional, block upper-triangular matrix

P_K = [ P_00  P_01  · · ·  P_0K
        0     P_11  · · ·  P_1K
        ⋮            ⋱     ⋮
        0     0     · · ·  P_KK ]   (72.31)
and where each nonzero N × N block is of the form

P_ij = C(j, i) P_0 B^{j−i} S^j ,  j ≥ i ,   (72.32)

with C(j, i) the binomial coefficient. The N × N matrices B and S are diagonal, with elements B_ii = −b_i and S_ii = 1/s_i, respectively, while P_0 = P_00 is the N × N matrix with elements

[P_0]_ij = { 1/|s_j|,  i ∈ I_j
           { 0,        otherwise ,   (72.33)

where I_j = {i : V_i ⊆ f(V_j)} indexes the partition elements covered by the image of V_j.
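The following sketch assembles P_K directly from (72.31)–(72.33) for the Markov map (72.5); K = 1 and a = 0.4 are illustrative choices, not values from the text, and the branch slopes and intercepts are read off (72.5):

```python
import numpy as np
from math import comb

a, K, N = 0.4, 1, 2
s = np.array([(1 - a) / a, -1.0 / (1 - a)])   # branch slopes s_i of (72.5)
b = np.array([a, 1.0 / (1 - a)])              # branch intercepts b_i
S, B = np.diag(1.0 / s), np.diag(-b)          # S_ii = 1/s_i, B_ii = -b_i
P0 = np.array([[0.0,           1.0 - a],      # [P0]_ij = 1/|s_j| for i in I_j:
               [a / (1.0 - a), 1.0 - a]])     # f(V1) = V2, f(V2) = [0, 1]

PK = np.zeros((N * (K + 1), N * (K + 1)))
for i in range(K + 1):
    for j in range(i, K + 1):                 # nonzero blocks: j >= i
        PK[i*N:(i+1)*N, j*N:(j+1)*N] = (
            comb(j, i) * P0
            @ np.linalg.matrix_power(B, j - i)
            @ np.linalg.matrix_power(S, j))   # Eq. (72.32)
print(PK)
```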
The invariant density of a Markov map, which is needed to compute the correlation statistic (72.25), can be determined as the solution of an eigenvector problem. It can be shown that such invariant densities are piecewise-constant functions, so that the fixed point equation (72.27) reduces to the matrix expression

P_0 p = p .
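For instance, the sketch below recovers the two density levels of the invariant density of the map (72.5) as the eigenvector of P_0 at eigenvalue 1 (again with the illustrative value a = 0.4):

```python
import numpy as np

a = 0.4
P0 = np.array([[0.0,           1.0 - a],     # FP matrix of (72.33) for map (72.5)
               [a / (1.0 - a), 1.0 - a]])

vals, vecs = np.linalg.eig(P0)
p = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])  # eigenvector at eigenvalue 1
p /= p[0] * a + p[1] * (1.0 - a)                     # normalize: p integrates to 1
print(p)   # density levels on V1 = [0, a) and V2 = [a, 1)
```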
Due to the properties of the matrix P_0, this equation always has a solution that can be chosen to have nonnegative components. It follows that the correlation statistic (72.29) can always be expressed as
R_{f;h_0,h_1,...,h_r}[k_1, . . . , k_r] = g_1^T M g_2 ,   (72.34)

where M is a basis correlation matrix with elements

[M]_ij = ∫ θ_i(x) θ_j(x) dx   (72.35)
and the g_i are the coordinate vectors of the functions

g_1(x) = h_r(x)   (72.36)

g_2(x) = P_f^{k_r−k_{r−1}} { h_{r−1}(x) · · · P_f^{k_2−k_1} { h_1(x) P_f^{k_1} { h_0(x) p(x) } } · · · } .   (72.37)
By the previous discussion, the coordinate vectors g_1 and g_2 can be determined using straightforward matrix-vector operations. Thus, expression (72.34) provides a practical way of exactly computing the integral (72.29), and reveals some important statistical structure of signals generated by Markov maps.
72.7 Power Spectra of Markov Maps
An important statistic in the context of many engineering applications is the power spectrum. The power spectrum associated with a Markov map is defined as the Fourier transform of its autocorrelation sequence

R_xx[k] = ∫ x f^k(x) p(x) dx ,   (72.38)

which is the special case of (72.25) with h_0(x) = h_1(x) = x.
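As a closing illustration, the sketch below estimates R_xx[k] by time averaging along one long orbit of the map (72.5), which Birkhoff's theorem justifies, and then takes its Fourier transform; all parameter choices are illustrative.

```python
import numpy as np

a = 0.4
f = lambda x: (1 - a) * x / a + a if x < a else (1 - x) / (1 - a)

x, n_samp = np.sqrt(2) - 1.0, 1 << 17
orbit = np.empty(n_samp)
for n in range(n_samp):
    orbit[n] = x
    x = f(x)

# Time-average estimate of the autocorrelation sequence (72.38) ...
lags = 64
Rxx = np.array([np.mean(orbit[:n_samp - k] * orbit[k:]) for k in range(lags)])
# ... and the power spectrum as its Fourier transform.
spectrum = np.abs(np.fft.rfft(Rxx))
print(Rxx[:5].round(4))
```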