Random variables X1 and X2 are said to be independent if the relation
P(X1 ∈ S1, X2 ∈ S2) = P(X1 ∈ S1) P(X2 ∈ S2)     (20.2.5.7)
holds for any measurable sets S1 and S2.
THEOREM 1. Random variables X1 and X2 are independent if and only if
FX1,X2(x1, x2) = FX1(x1) FX2(x2).
THEOREM 2. Random variables X1 and X2 are independent if and only if the characteristic function of the bivariate random variable (X1, X2) is equal to the product of the characteristic functions of X1 and X2,
fX1,X2(x1, x2) = fX1(x1) fX2(x2).
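For a discrete pair, relation (20.2.5.7) applied to the single-point sets S1 = {x1i}, S2 = {x2j} says that the joint probability mass function must factor into the product of the marginals. A minimal numerical sketch in Python with NumPy (the pmf table below is hypothetical and chosen to factorize):

import numpy as np

# Joint pmf of (X1, X2) as a matrix: p[i, j] = P(X1 = x1[i], X2 = x2[j]).
# Hypothetical example constructed as a product, so the pair is independent.
p = np.outer([0.2, 0.5, 0.3], [0.4, 0.6])

# Marginal pmfs.
p1 = p.sum(axis=1)   # P(X1 = x1[i])
p2 = p.sum(axis=0)   # P(X2 = x2[j])

# X1 and X2 are independent iff p[i, j] = p1[i] * p2[j] for all i, j
# (the discrete counterpart of FX1,X2 = FX1 * FX2).
print(np.allclose(p, np.outer(p1, p2)))   # True for this table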
20.2.5-4 Numerical characteristics of bivariate random variables.
The expectation of a function g(X1, X2) of a bivariate random variable (X1, X2) is defined by the formula
E{g(X1, X2)} = Σ_i Σ_j g(x1i, x2j) pij   in the discrete case,
E{g(X1, X2)} = ∫_{–∞}^{+∞} ∫_{–∞}^{+∞} g(x1, x2) p(x1, x2) dx1 dx2   in the continuous case,
(20.2.5.8)
if these expressions exist in the sense of absolute convergence; otherwise, one says that
E{g(X1, X2)} does not exist.
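The discrete case of formula (20.2.5.8) translates directly into code. A minimal sketch in Python with NumPy; the value grids and the joint pmf table are hypothetical:

import numpy as np

# Values taken by X1 and X2 and the joint pmf p[i, j] (hypothetical numbers, total mass 1).
x1 = np.array([0.0, 1.0, 2.0])
x2 = np.array([-1.0, 1.0])
p = np.array([[0.10, 0.15],
              [0.20, 0.25],
              [0.05, 0.25]])

# E{g(X1, X2)} = sum over i, j of g(x1[i], x2[j]) * p[i, j]  (discrete case of (20.2.5.8)).
def expectation(g):
    X1, X2 = np.meshgrid(x1, x2, indexing="ij")
    return np.sum(g(X1, X2) * p)

print(expectation(lambda a, b: a * b))        # E{X1 X2}
print(expectation(lambda a, b: (a + b)**2))   # E{(X1 + X2)^2}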
The moment of order r1 + r2 of a two-dimensional random variable (X1, X2) about a point (a1, a2) is defined as the expectation E{(X1 – a1)^{r1} (X2 – a2)^{r2}}.
If a1 = a2 = 0, then the moment of order r1 + r2 of a two-dimensional random variable (X1, X2) is called simply the moment, or the initial moment. The initial moment of order r1 + r2 is usually denoted by αr1,r2; i.e.,
αr1,r2 = E{X1^{r1} X2^{r2}}.
The first initial moments are the expectations of the random variables X1 and X2; i.e.,
α1,0 = E{X1^1 X2^0} = E{X1}   and   α0,1 = E{X1^0 X2^1} = E{X2}.
The point (E{X1}, E{X2}) on the OXY-plane characterizes the position of the random point (X1, X2), which spreads about the point (E{X1}, E{X2}). Obviously, the first central moments are zero. The second initial moments are given by the formulas
α2,0 = α2(X1),   α0,2 = α2(X2),   α1,1 = E{X1 X2}.
If a1 = E{X1} and a2 = E{X2}, then the moment of order r1 + r2 of the bivariate random variable (X1, X2) is called the central moment. The central moment of order r1 + r2 is usually denoted by μr1,r2; i.e.,
μr1,r2 = E{(X1 – E{X1})^{r1} (X2 – E{X2})^{r2}}.
The second central moments are of special interest and have special names and notation:
λ11 = μ2,0 = Var{X1},   λ22 = μ0,2 = Var{X2},
λ12 = λ21 = μ1,1 = E{(X1 – E{X1})(X2 – E{X2})}.
The first two of these moments are the variances of the respective random variables, and the third is called the covariance; it is considered below.
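The initial moments αr1,r2 and the central moments μr1,r2 of a discrete pair can be computed in the same way as the expectation above. A sketch in Python with NumPy, reusing the hypothetical pmf table from the previous sketch:

import numpy as np

x1 = np.array([0.0, 1.0, 2.0])
x2 = np.array([-1.0, 1.0])
p = np.array([[0.10, 0.15],
              [0.20, 0.25],
              [0.05, 0.25]])
X1, X2 = np.meshgrid(x1, x2, indexing="ij")

def alpha(r1, r2):
    # initial moment alpha_{r1,r2} = E{X1^r1 X2^r2}
    return np.sum(X1**r1 * X2**r2 * p)

m1, m2 = alpha(1, 0), alpha(0, 1)    # first initial moments E{X1}, E{X2}

def mu(r1, r2):
    # central moment mu_{r1,r2} = E{(X1 - E{X1})^r1 (X2 - E{X2})^r2}
    return np.sum((X1 - m1)**r1 * (X2 - m2)**r2 * p)

print(mu(2, 0), mu(0, 2), mu(1, 1))  # Var{X1}, Var{X2}, and the covariance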
20.2.5-5 Covariance and correlation of two random variables.
The covariance (correlation moment, or mixed second central moment) Cov(X1, X2) of random variables X1 and X2 is defined as the central moment of order (1 + 1):
Cov(X1, X2) = μ1,1 = E{(X1 – E{X1})(X2 – E{X2})}.     (20.2.5.9)
Properties of the covariance:
1. Cov(X1, X2) = Cov(X2, X1).
2. Cov(X, X) = Var{X}.
3. If the random variables X1 and X2 are independent, then Cov(X1, X2) = 0. If Cov(X1, X2) ≠ 0, then the random variables X1 and X2 are dependent.
4. If Y1 = a1X1 + b1 and Y2 = a2X2 + b2, then Cov(Y1, Y2) = a1a2 Cov(X1, X2).
5. Cov(X1, X2) = E{X1 X2} – E{X1} E{X2}.
6. |Cov(X1, X2)| ≤ √(Var{X1} Var{X2}). Moreover, |Cov(X1, X2)| = √(Var{X1} Var{X2}) if and only if the random variables X1 and X2 are linearly dependent.
7. Var{X1 + X2} = Var{X1} + Var{X2} + 2 Cov(X1, X2).
If Cov(X1, X2) = 0, then the random variables X1 and X2 are said to be uncorrelated; if Cov(X1, X2) ≠ 0, then they are correlated. Independent random variables are always uncorrelated, but uncorrelated random variables are not necessarily independent.
Example 1. Suppose that we throw two dice. Let X1 be the number of spots on top of the first die, and let X2 be the number of spots on top of the second die. We consider the random variables Y1 = X1 + X2 and Y2 = X1 – X2 (the sum and difference of the points obtained). Then
Cov(Y1, Y2) = E{(X1 + X2 – E{X1 + X2})(X1 – X2 – E{X1 – X2})}
            = E{(X1 – E{X1})^2 – (X2 – E{X2})^2} = Var{X1} – Var{X2} = 0,
since X1 and X2 are identically distributed and hence Var{X1} = Var{X2}. But Y1 and Y2 are obviously dependent; for example, if Y1 = 2, then one necessarily has Y2 = 0.
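Example 1 can be verified by enumerating all 36 equally likely outcomes of the two dice. A short sketch in Python with NumPy; the covariance is computed via property 5, Cov = E{Y1 Y2} – E{Y1} E{Y2}:

import numpy as np

# All 36 equally likely outcomes (x1, x2) of two fair dice, as in Example 1.
x1, x2 = np.meshgrid(np.arange(1, 7), np.arange(1, 7), indexing="ij")
y1, y2 = (x1 + x2).ravel(), (x1 - x2).ravel()   # Y1 = X1 + X2, Y2 = X1 - X2

cov = np.mean(y1 * y2) - np.mean(y1) * np.mean(y2)
print(cov)                          # 0.0: Y1 and Y2 are uncorrelated

# ...but not independent: given Y1 = 2, the only possible value of Y2 is 0.
print(np.unique(y2[y1 == 2]))       # [0]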
The covariance of random variables X1 and X2 characterizes both the degree of their dependence on each other and their spread around the point (E{X1}, E{X2}). The covariance of X1 and X2 has dimension equal to the product of the dimensions of X1 and X2. Along with the covariance of X1 and X2, one often uses the correlation ρ(X1, X2), which is a dimensionless normalized quantity. The correlation (or correlation coefficient) of random variables X1 and X2 is the ratio of the covariance of X1 and X2 to the product of their standard deviations,
ρ(X1, X2) = Cov(X1, X2) / (σX1 σX2).
The correlation of random variables X1 and X2 indicates the degree of linear dependence between the variables. If ρ(X1, X2) = 0, then there is no linear relation between the random variables, but there may well be some other, nonlinear relation between them.
Properties of the correlation:
1. ρ(X1, X2) = ρ(X2, X1).
2. ρ(X, X) = 1.
3. If random variables X1 and X2 are independent, then ρ(X1, X2) = 0. If ρ(X1, X2) ≠ 0, then the random variables X1 and X2 are dependent.
4. If Y1 = a1X1 + b1 and Y2 = a2X2 + b2 with a1a2 ≠ 0, then ρ(Y1, Y2) = ρ(X1, X2) for a1a2 > 0 and ρ(Y1, Y2) = –ρ(X1, X2) for a1a2 < 0.
5. |ρ(X1, X2)| ≤ 1. Moreover, |ρ(X1, X2)| = 1 if and only if the random variables X1 and X2 are linearly dependent.
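That ρ(X1, X2) = 0 does not exclude a nonlinear relation between the variables is easy to see numerically. A small Monte Carlo sketch in Python with NumPy; the pair X2 = X1^2 with X1 standard normal is a hypothetical example chosen only for illustration:

import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=100_000)
x2 = x1**2                       # completely determined by X1, but not linearly

cov = np.mean(x1 * x2) - np.mean(x1) * np.mean(x2)
rho = cov / (np.std(x1) * np.std(x2))
print(round(rho, 3))             # close to 0: no linear relation, yet X1 and X2 are dependent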
20.2.5-6 Conditional distributions.
The joint distribution of random variables X1 and X2 determines the conditional distribution of one of the random variables given that the other random variable takes a certain value (or lies in a certain interval). If the joint distribution is discrete, then the conditional distributions of X1 and X2 are also discrete. The conditional distributions are described by the formulas
P1|2(x1i | x2j) = P(X1 = x1i | X2 = x2j) = P(X1 = x1i, X2 = x2j) / P(X2 = x2j) = pij / pX2,j,
P2|1(x2j | x1i) = P(X2 = x2j | X1 = x1i) = P(X1 = x1i, X2 = x2j) / P(X1 = x1i) = pij / pX1,i,
i = 1, ..., m;  j = 1, ..., n.     (20.2.5.11)
The probabilities P1|2(x1i | x2j), i = 1, ..., m, define the conditional probability mass function of the random variable X1 given X2 = x2j; and the probabilities P2|1(x2j | x1i), j = 1, ..., n, define the conditional probability mass function of the random variable X2 given X1 = x1i. These conditional probability mass functions have the properties of usual probability mass functions; for example, the sum of the probabilities in each of them is equal to one:
Σ_i P1|2(x1i | x2j) = Σ_j P2|1(x2j | x1i) = 1.
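For a joint distribution given as a table, the conditional probability mass functions (20.2.5.11) are obtained by dividing the table by the corresponding marginal probabilities. A minimal sketch in Python with NumPy (hypothetical table):

import numpy as np

# Hypothetical joint pmf: p[i, j] = P(X1 = x1[i], X2 = x2[j]).
p = np.array([[0.10, 0.15],
              [0.20, 0.25],
              [0.05, 0.25]])

p_x1 = p.sum(axis=1)             # marginal pmf of X1
p_x2 = p.sum(axis=0)             # marginal pmf of X2

P1_given_2 = p / p_x2            # column j is the pmf of X1 given X2 = x2[j]
P2_given_1 = p / p_x1[:, None]   # row i is the pmf of X2 given X1 = x1[i]

print(P1_given_2.sum(axis=0))    # each column sums to 1
print(P2_given_1.sum(axis=1))    # each row sums to 1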
If the joint distribution is continuous, then the conditional distributions of the random variables X1 and X2 are also continuous and are described by the conditional probability density functions
p1|2(x1 | x2) = pX1,X2(x1, x2) / pX2(x2),   p2|1(x2 | x1) = pX1,X2(x1, x2) / pX1(x1).     (20.2.5.12)
The conditional distributions of the random variables X1 and X2 can also be described by the conditional cumulative distribution functions
FX2(x2 | X1 = x1) = P(X2 < x2 | X1 = x1),
FX1(x1 | X2 = x2) = P(X1 < x1 | X2 = x2).     (20.2.5.13)
The total probability formulas for the cumulative distribution functions of continuous random variables have the form
FX2(x2) = ∫_{–∞}^{+∞} FX2(x2 | X1 = x1) pX1(x1) dx1,
FX1(x1) = ∫_{–∞}^{+∞} FX1(x1 | X2 = x2) pX2(x2) dx2.     (20.2.5.14)
THEOREM ON MULTIPLICATION OF DENSITIES. The joint probability density function of two random variables is equal to the product of the probability density function of one random variable by the conditional probability density function of the other random variable, given the value of the first random variable:
pX1,X2(x1, x2) = pX2(x2) p1|2(x1 | x2) = pX1(x1) p2|1(x2 | x1).     (20.2.5.15)
Bayes’ formulas:
P1|2(x1i | x2j) = P(X1 = x1i) P2|1(x2j | x1i) / Σ_i P(X1 = x1i) P2|1(x2j | x1i),
P2|1(x2j | x1i) = P(X2 = x2j) P1|2(x1i | x2j) / Σ_j P(X2 = x2j) P1|2(x1i | x2j);     (20.2.5.16)
p1|2(x1 | x2) = pX1(x1) p2|1(x2 | x1) / ∫_{–∞}^{+∞} pX1(x1) p2|1(x2 | x1) dx1,
p2|1(x2 | x1) = pX2(x2) p1|2(x1 | x2) / ∫_{–∞}^{+∞} pX2(x2) p1|2(x1 | x2) dx2.     (20.2.5.17)
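The first of the discrete Bayes formulas (20.2.5.16) is illustrated below with a minimal sketch in Python with NumPy; the prior pmf of X1 and the conditional pmf of X2 given X1 are hypothetical numbers:

import numpy as np

prior = np.array([0.3, 0.7])            # P(X1 = x1_i)
P2_given_1 = np.array([[0.9, 0.1],      # P(X2 = x2_j | X1 = x1_0)
                       [0.2, 0.8]])     # P(X2 = x2_j | X1 = x1_1)

# Posterior pmf of X1 given the observed value X2 = x2_j (first formula in (20.2.5.16)).
j = 0
posterior = prior * P2_given_1[:, j]
posterior /= posterior.sum()            # denominator: sum_i P(X1 = x1_i) P2|1(x2_j | x1_i)
print(posterior)                        # [0.6585..., 0.3414...]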
20.2.5-7 Conditional expectation. Regression.
The conditional expectation of a discrete random variable X2, given X1 = x1 (where x1 is a possible value of the random variable X1), is defined to be the sum of the products of the possible values of X2 by their conditional probabilities,
E{X2 | X1 = x1} = Σ_j x2j P2|1(x2j | x1).     (20.2.5.18)
For continuous random variables,
E{X2 | X1 = x1} = ∫_{–∞}^{+∞} x2 p2|1(x2 | x1) dx2.     (20.2.5.19)
Properties of the conditional expectation:
1. If random variables X and Y are independent, then their conditional expectations coincide with the unconditional expectations; i.e., E{Y | X = x} = E{Y} and E{X | Y = y} = E{X}.
2. E{f(X)h(Y) | X = x} = f(x) E{h(Y) | X = x}.
3. Additivity of the conditional expectation:
E{Y1 + Y2 | X = x} = E{Y1 | X = x} + E{Y2 | X = x}.
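Formula (20.2.5.18) for a discrete pair amounts to a weighted average over each row of the conditional pmf table. A sketch in Python with NumPy for a hypothetical joint pmf; the last line checks the identity E{E{X2 | X1}} = E{X2} (the law of total expectation, a standard fact not listed above):

import numpy as np

x2 = np.array([-1.0, 1.0])
p = np.array([[0.10, 0.15],               # hypothetical joint pmf p[i, j]
              [0.20, 0.25],
              [0.05, 0.25]])

P2_given_1 = p / p.sum(axis=1, keepdims=True)   # conditional pmf of X2 given X1 = x1[i]
cond_exp = P2_given_1 @ x2                      # E{X2 | X1 = x1[i]} by (20.2.5.18)
print(cond_exp)

p_x1 = p.sum(axis=1)
print(np.sum(cond_exp * p_x1), np.sum(p * x2))  # both equal E{X2}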
A function g2(X1) is called the best mean-square approximation to a random variable X2 if the expectation E{[X2 – g2(X1)]^2} takes the least possible value; the function g2(x1) is called the mean-square regression of X2 on X1.
The conditional expectation E{X2 | X1} is a function of X1,
E{X2 | X1} = g2(X1).     (20.2.5.20)
It is called the regression function of X2 on X1 and is the mean-square regression of X2 on X1.
In the majority of cases, it suffices to approximate the regression (20.2.5.20) by the linear function
ĝ2(X1) = α + β21X1 = E{X2} + β21(X1 – E{X1}).
Here the coefficient β21 = ρ12 σX2/σX1 is called the regression coefficient of X2 on X1 (ρ12 = ρ(X1, X2)). The number σX2^2 (1 – ρ12^2) is called the residual variance of the random variable X2 with respect to the random variable X1; this number characterizes the error arising if X2 is replaced by the linear function ĝ2(X1) = α + β21X1.
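For sample data, β21 and the residual variance can be estimated from the sample standard deviations and the sample correlation. A Monte Carlo sketch in Python with NumPy on a hypothetical linearly related pair (the model X2 = 2 X1 + noise is an assumption made only for illustration):

import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=10_000)
x2 = 2.0 * x1 + rng.normal(scale=0.5, size=10_000)   # hypothetical model

s1, s2 = np.std(x1), np.std(x2)
rho12 = np.corrcoef(x1, x2)[0, 1]

beta21 = rho12 * s2 / s1                 # regression coefficient of X2 on X1
resid_var = s2**2 * (1.0 - rho12**2)     # residual variance of X2 with respect to X1
print(beta21, resid_var)                 # roughly 2.0 and 0.25 for this model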
Remark 1. The regression (20.2.5.20) can be approximated more precisely by a polynomial of degree k > 1 (parabolic regression of order k) or by other nonlinear functions (exponential regression, logarithmic regression, etc.).
Remark 2. If X2 is taken for the independent variable, then we obtain the mean-square regression
E{X1 | X2} = g1(X2)
of X1 on X2 and the linear regression
ĝ1(X2) = E{X1} + β12(X2 – E{X2}),   β12 = ρ12 σX1/σX2,
of X1 on X2.
Remark 3. All regression lines pass through the point (E{X1}, E{X2}).
20.2.5-8 Distribution function of a multivariate random variable.
The probability P(X1 < x1, ..., Xn < xn), treated as a function of a point x = (x1, ..., xn) of the n-dimensional space and denoted by
FX(x) = F(x) = P(X1 < x1, ..., Xn < xn),     (20.2.5.21)
is called the multiple (or joint) distribution function of the n-dimensional random vector X = (X1, ..., Xn).
Properties of the joint distribution function of a random vector X:
1. F(x) is a nondecreasing function of each of the arguments.
2. If at least one of the arguments x1, ..., xn is equal to –∞, then the joint distribution function is equal to zero.
3. The m-dimensional distribution function of the subsystem of m < n random variables X1, ..., Xm can be determined by setting the arguments corresponding to the remaining random variables Xm+1, ..., Xn equal to +∞,
FX1,...,Xm(x1, ..., xm) = FX(x1, ..., xm, +∞, ..., +∞).
(The m-dimensional distribution function FX1,...,Xm(x1, ..., xm) is usually called the marginal distribution function.)
4. The function FX(x) is left continuous in each of the arguments.
An n-dimensional random variable X is said to be discrete if each of the random variables X1, X2, ..., Xn is discrete. The distribution of a subsystem X1, ..., Xm of random variables and the conditional distributions are defined as in Paragraphs 20.2.5-6 and 20.2.5-7.
An n-dimensional random variable X is said to be continuous if its distribution function F(x) can be written in the form
F(x) = ∫_{–∞}^{x1} ... ∫_{–∞}^{xn} p(y) dy,     (20.2.5.22)
where dy = dy1 ... dyn and the function p(x), called the multiple (or joint) probability function of the random variables X1, ..., Xn, is piecewise continuous. The joint probability function can be expressed via the joint distribution function by the formula
p(x) = ∂^n FX(x) / (∂x1 ... ∂xn);     (20.2.5.23)
i.e., the joint probability function is the nth mixed partial derivative (one differentiation in each of the arguments) of the joint distribution function.
Formulas (20.2.5.22) and (20.2.5.23) establish a one-to-one correspondence (up to sets of probability zero) between the joint probability functions and the joint distribution functions of continuous multivariate random variables. The differential p(x) dx is called a probability element. The joint probability function of n random variables X1, X2, ..., Xn has the same properties as the joint probability function of two random variables X1 and X2 (see Paragraph 20.2.1-4). The marginal probability functions and conditional probability functions obtained from a continuous n-dimensional probability distribution are defined precisely as in Paragraphs 20.2.1-4 and 20.2.1-8.
Remark 1. The distribution of a system of two or more multivariate random variables X1 = (X11, X12, ...) and X2 = (X21, X22, ...) is the joint distribution of all the variables X11, X12, ...; X21, X22, ...; etc.
Remark 2. A joint distribution can be discrete in some of the random variables and continuous in the others.
20.2.5-9 Numerical characteristics of multivariate random variables.
The expectation of a function g(X) of a multivariate random variable X is defined by the formula
E{g(X)} = Σ_{i1} ... Σ_{in} g(x1i1, ..., xnin) pi1i2...in   in the discrete case,
E{g(X)} = ∫_{–∞}^{+∞} ... ∫_{–∞}^{+∞} g(x) p(x) dx   in the continuous case,
(20.2.5.24)
if these expressions exist in the sense of absolute convergence; otherwise, one says that E{g(X)} does not exist.
The moment of order r1 + ... + rn of a random variable X about a point (a1, ..., an) is defined as the expectation E{(X1 – a1)^{r1} ... (Xn – an)^{rn}}.
For a1 = ... = an = 0, the moment of order r1 + ... + rn of an n-dimensional random variable X is called the initial moment and is denoted by
αr1...rn = E{X1^{r1} ... Xn^{rn}}.
The first initial moments are the expectations of the coordinates X1, ..., Xn. The point (E{X1}, ..., E{Xn}) in the space R^n characterizes the position of the random point (X1, ..., Xn), which spreads about the point (E{X1}, ..., E{Xn}). The first central moments are naturally zero.
If a1 = E{X1}, ..., an = E{Xn}, then the moment of order r1 + ... + rn of the n-dimensional random variable X is called the central moment and is denoted by
μr1...rn = E{(X1 – E{X1})^{r1} ... (Xn – E{Xn})^{rn}}.
The second central moments have the following notation:
λij = λji = E{(Xi – E{Xi})(Xj – E{Xj})} =
    Var{Xi} = σi^2   for i = j,
    Cov(Xi, Xj)      for i ≠ j.     (20.2.5.25)
The moments λij given by relation (20.2.5.25) determine the covariance matrix (matrix of moments) [λij]. Obviously, the covariance matrix is real and symmetric; its determinant det[λij] is called the generalized variance of the n-dimensional distribution. The correlations
ρij = ρ(Xi, Xj) = Cov(Xi, Xj) / (σXi σXj) = λij / √(λii λjj)   (i, j = 1, 2, ..., n)     (20.2.5.26)
determine the correlation matrix [ρij] of the n-dimensional distribution, provided that all the variances Var{Xi} are nonzero. Obviously, the correlation matrix is real and symmetric. The quantity det[ρij] is called the spread coefficient.
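The covariance matrix, the generalized variance, the correlation matrix, and the spread coefficient can all be estimated from a sample. A sketch in Python with NumPy; the three-dimensional normal model and its covariance matrix are hypothetical:

import numpy as np

rng = np.random.default_rng(2)
X = rng.multivariate_normal(mean=[0, 0, 0],
                            cov=[[2.0, 0.8, 0.3],
                                 [0.8, 1.0, 0.5],
                                 [0.3, 0.5, 1.5]],
                            size=50_000)          # one observation per row

Lam = np.cov(X, rowvar=False)         # covariance matrix [lambda_ij]
gen_var = np.linalg.det(Lam)          # generalized variance det[lambda_ij]

R = np.corrcoef(X, rowvar=False)      # correlation matrix [rho_ij]
spread = np.linalg.det(R)             # spread coefficient det[rho_ij]
print(gen_var, spread)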
20.2.5-10 Regression.
A function g1(X2, ..., Xn) is called the best mean-square approximation to a random variable X1 if the expectation E{[X1 – g1(X2, ..., Xn)]^2} takes the least possible value. The function g1(x2, ..., xn) is called the mean-square regression of X1 on X2, ..., Xn.
The conditional expectation E{X1 | X2, ..., Xn} is a function of X2, ..., Xn,
E{X1 | X2, ..., Xn} = g1(X2, ..., Xn).     (20.2.5.27)
It is called the regression function of X1 on X2, ..., Xn and is the mean-square regression of X1 on X2, ..., Xn.
In the majority of cases, it suffices to approximate the regression (20.2.5.27) by the linear function
ĝi = E{Xi} + Σ_{j≠i} βij (Xj – E{Xj}).     (20.2.5.28)
Relation (20.2.5.28) determines the linear regression of Xi on the other n – 1 variables. The regression coefficients βij are determined by the relation
βij = –Λij / Λii,
where the Λij are the entries of the inverse of the covariance matrix. The measure of correlation between Xi and the other n – 1 variables is the multiple correlation coefficient
ρ(Xi, ĝi) = √(1 – 1/(λii Λii)).
The residual of Xi with respect to the other n – 1 variables is defined as the random variable Δi = Xi – ĝi. It satisfies the relations
Cov(Δi, Xj) = 0 for j ≠ i,   Cov(Δi, Xi) = Var{Δi}  (the residual variance).
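The regression coefficients βij = –Λij/Λii, the multiple correlation coefficient, and the residual variance can be computed directly from the inverse of the covariance matrix. A sketch in Python with NumPy for a hypothetical 3×3 covariance matrix; the identity Var{Δi} = 1/Λii used in the last line is a standard consequence of the relations above, not a formula stated in the text:

import numpy as np

# Hypothetical covariance matrix [lambda_ij] of (X1, X2, X3).
Lam = np.array([[2.0, 0.8, 0.3],
                [0.8, 1.0, 0.5],
                [0.3, 0.5, 1.5]])
Prec = np.linalg.inv(Lam)             # entries Lambda_ij of the inverse

i = 0                                 # regress X1 on the other variables
beta = -Prec[i] / Prec[i, i]          # beta_ij = -Lambda_ij / Lambda_ii (the i-th entry is not used)

mult_corr = np.sqrt(1.0 - 1.0 / (Lam[i, i] * Prec[i, i]))   # multiple correlation coefficient
resid_var = 1.0 / Prec[i, i]                                # residual variance Var{Delta_i}
print(beta, mult_corr, resid_var)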
20.2.5-11 Characteristic functions.
The characteristic function of a random variable X is defined as the expectation of the random variable exp(i Σ_{j=1}^{n} tj Xj):
fX(t) = f(t) = E{exp(i Σ_{j=1}^{n} tj Xj)},
where t = (t1, ..., tn) and i is the imaginary unit, i^2 = –1.
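The defining expectation can be estimated by a sample average. A Monte Carlo sketch in Python with NumPy for a hypothetical standard normal vector, compared with its known characteristic function exp(–|t|^2/2):

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100_000, 2))      # sample of a bivariate standard normal vector

def char_fun(t):
    # f_X(t) = E{exp(i * sum_j t_j X_j)}, estimated by a sample average
    return np.mean(np.exp(1j * X @ np.asarray(t)))

t = np.array([0.5, -1.0])
print(char_fun(t))                     # approximately exp(-|t|^2 / 2) plus small sampling error
print(np.exp(-0.5 * np.dot(t, t)))     # exact value for the standard normal vector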