Random Variables and Distributions

Part of the document Mathematical Statistics for Applied Econometrics (pages 71-76)

Now we have established the existence of a probability measure based on a specific form of σ-algebras called Borel fields. The question is, can we extend this rather specialized formulation to broader groups of random variables?

Of course, or this would be a short textbook. As a first step, let’s take the simple coin-toss example. In the case of a coin there are two possible outcomes (heads or tails). These outcomes completely specify the sample space. To add a little structure, we construct a random variable X that can take on two values, X = 0 or 1 (as depicted in Table 2.6). If X = 1 the coin toss resulted in a head, while if X = 0 the coin toss resulted in a tail. Next, we define each outcome based on an event space ω:

P(X = 1) = P({ω ∈ Ω : X(ω) = 1}) = P([H])

P(X = 0) = P({ω ∈ Ω : X(ω) = 0}) = P([T]). (3.18)

In this case the physical outcome of the experiment is either a head (ω = heads) or a tail (ω = tails). These events are “mapped into” number space – the measure of the event is either a zero or a one.

The probability function is then defined by the random event ω. Defining ω as a uniform random variable from our original example, one alternative would be to define the function as

X(ω) = 1 if ω ≤ 0.50. (3.19)

This definition results in the standard 50-50 result for a coin toss. However, it admits more general formulations. For example, if we let

X(ω) = 1 if ω ≤ 0.40 (3.20)

the probability of heads becomes 40 percent.
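The mapping in Equation 3.20 is easy to check by simulation. The sketch below is not from the text; the function name `X`, the seed, and the draw count are illustrative. It draws uniform values of ω and applies the rule X(ω) = 1 if ω ≤ 0.40:

```python
import random

def X(omega, p_heads=0.40):
    # Map the underlying uniform draw omega into {0, 1}:
    # heads (X = 1) when omega <= p_heads, tails (X = 0) otherwise.
    return 1 if omega <= p_heads else 0

random.seed(42)
draws = [X(random.random()) for _ in range(100_000)]
share_heads = sum(draws) / len(draws)
print(round(share_heads, 2))  # close to 0.40
```

Changing `p_heads` back to 0.50 recovers the standard 50-50 coin of Equation 3.19.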

Given this intuition, the next step is to formally define a random variable.

Three alternative definitions should be considered:

Definition 3.4. A random variable is a function from a sample space S into the real numbers.

Definition 3.5. A random variable is a variable that takes values according to a certain probability.

Definition 3.6. A random variable is a real-valued function defined over a sample space.

In this way a random variable is an abstraction. We assumed that there was a random variable defined on some sample space like flipping a coin. The flipping of the coin is an outcome in an abstract space (i.e., a Borel set).

S = {s_1, s_2, ..., s_n}. (3.21)

We then assign a numeric value to each outcome in this set:

X : S → R^1, x_i = X(s_i)    or    X(ω) : Ω → R^1, x_i = X(ω_i). (3.22)

There are two ways of looking at this transformation. First, the Borel set is simply defined as the real number line (remember that the real number line is a valid Borel set). Alternatively, we can view the transformation as a two-step mapping. For example, a measure can be used to define the quantity of wheat produced per acre. Thus, we are left with two measures of the same phenomenon – the quantity of wheat produced per acre and the probability of producing that quantity of wheat. In either case, the probability function (or measure) is then defined based on that random variable as

P_X(X = x_i) = P({s_i ∈ S : X(s_i) = x_i})

P(X(ω) = x_i) = P({ω ∈ Ω : X(ω) = x_i}). (3.23)

Using either justification, for the rest of this text we are simply going to define a random variable as either discrete (x_i = 1, 2, ..., N) or real valued (x ∈ (−∞, ∞)).

3.2.1 Discrete Random Variables

Several of the examples used thus far in the text have been discrete random variables. For example, the coin toss is a simple discrete random variable where the outcome can take on a finite number of values – X = {Tails, Heads} or in numeric form X = {0, 1}. Using this intuition, we can then define a discrete random variable as

Definition 3.7. A discrete random variable is a variable that takes a countable number of real numbers with certain probability.

In addition to defining random variables as either discrete or continuous, we can also define random variables as either univariate or multivariate. Consider the dice rolls presented in Table 3.2. Anna rolled two six-sided dice (one blue and one red) while Alex rolled one eight-sided die and one six-sided die.

Conceptually, the pair of dice rolled by each individual is a bivariate discrete random variable as defined in Definition 3.8.

Definition 3.8. A bivariate discrete random variable is a variable that takes a countable number of bivariate points on the plane with certain probability.

For example, the pair {2, 1} is the tenth outcome of Anna’s rolls. In most board games the sum of the outcomes of the two dice is the important number – the number of spaces moved in Monopoly™. However, in other games the outcome may be more complex. For example, the outcome may be whether a player suffers damage defined by whether the eight-sided die is greater than three, while the amount of damage suffered is determined by the six-sided

TABLE 3.2
Anna and Alex’s Dice Rolls

              Anna              Alex
Roll     Blue    Red    Eight-Sided  Six-Sided
  1        6      4          5           6
  2        6      3          8           9
  3        5      4          1           1
  4        5      3          7           6
  5        3      5          5           1
  6        5      1          4           5
  7        3      6          6           1
  8        4      4          5           1
  9        5      2          5           5
 10        2      1          2           5
 11        4      2          6           4
 12        2      5          3           1
 13        5      3          1           4
 14        3      2          1           3
 15        1      4          6           6
 16        2      3          6           6
 17        3      3          3           2
 18        5      4          2           3
 19        3      3          1           3
 20        6      6          7           2

die. Thus, we may be interested in defining a secondary random variable (the number of spaces moved as the sum of the results of the blue and red dice, or the amount of damage suffered by a character of a board game based on a more complex protocol) based on the outcomes of the bivariate random variables. However, at the most basic level we are interested in a bivariate random variable.
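The secondary random variable described above can be computed directly from the data in Table 3.2. The sketch below (illustrative; the variable names are my own) takes Anna’s twenty rolls and forms the Monopoly-style move as the sum of the blue and red dice:

```python
# Anna's 20 rolls of a blue and a red six-sided die, transcribed from Table 3.2.
anna_rolls = [
    (6, 4), (6, 3), (5, 4), (5, 3), (3, 5), (5, 1), (3, 6), (4, 4), (5, 2), (2, 1),
    (4, 2), (2, 5), (5, 3), (3, 2), (1, 4), (2, 3), (3, 3), (5, 4), (3, 3), (6, 6),
]

# Secondary random variable: the number of spaces moved, i.e. the sum of the pair.
moves = [blue + red for blue, red in anna_rolls]
print(moves[9])  # the tenth outcome {2, 1} gives a move of 3 spaces
```

The same pattern handles the more complex protocol: any function of the bivariate outcome defines a new random variable.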

3.2.2 Continuous Random Variables

While discrete random variables are important in some econometric applications, most econometric applications are based on continuous random variables such as the price of consumption goods or the quantity demanded and supplied in the marketplace. As discussed in Chapter 2, defining a continuous random variable as some subset of the real number line complicates the definition of probability. Because the number of real numbers in any subset of the real number line is infinite, the standard counting definition of probability used by the frequency approach presented in Equation 2.2 implies a zero probability. Hence, it is necessary to develop probability using the concept of a probability density function (or simply the density function) as presented in Definition 3.9.

Definition 3.9. If there is a non-negative function f(x) defined over the whole line such that

P(x_1 ≤ X ≤ x_2) = ∫_{x_1}^{x_2} f(x) dx (3.24)

for any x_1 and x_2 satisfying x_1 ≤ x_2, then X is a continuous random variable and f(x) is called its density function.

By the second axiom of probability (see Definition 2.12),

∫_{−∞}^{∞} f(x) dx = 1. (3.25)

The simplest example of a continuous random variable is the uniform distribution

f(x) = 1 if 0 ≤ x ≤ 1; 0 otherwise. (3.26)

Using the definition of the uniform distribution function in Equation 3.26, we can demonstrate that the probability of the continuous random variable defined in Equation 3.24 follows the required axioms for probability. First, f(x) ≥ 0 for all x. Second, the total probability equals one. To see this, consider the integral

∫_{−∞}^{∞} f(x) dx = ∫_{−∞}^{0} f(x) dx + ∫_{0}^{1} f(x) dx + ∫_{1}^{∞} f(x) dx
= ∫_{0}^{1} f(x) dx = ∫_{0}^{1} dx = x|_{0}^{1} + C = (1 − 0) + C. (3.27)

Thus the total value of the integral is equal to one if C = 0.
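As a numerical cross-check of Equation 3.27 (a sketch of my own, not part of the text), a simple midpoint rule recovers a total probability of one. Integrating over a range much wider than [0, 1] stands in for (−∞, ∞), since the uniform density vanishes outside the unit interval:

```python
def f(x):
    # Uniform density from Equation 3.26.
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def integrate(f, a, b, n=100_000):
    # Midpoint-rule approximation of the integral of f over [a, b].
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, -10.0, 10.0)
print(round(total, 3))  # approximately 1.0
```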

The definition of a continuous random variable, like the case of the univariate random variable, can be extended to include the possibility of a bivariate continuous random variable. Specifically, we can extend the univariate uniform distribution in Equation 3.26 to represent the density function for the bivariate outcome {x, y}

f(x, y) = 1 if 0 ≤ x ≤ 1, 0 ≤ y ≤ 1; 0 otherwise. (3.28)

The fact that the density function presented in Equation 3.28 conforms to the axioms of probability is left as an exercise.

Definition 3.10. If there is a non-negative function f(x, y) defined over the whole plane such that

P(x_1 ≤ X ≤ x_2, y_1 ≤ Y ≤ y_2) = ∫_{y_1}^{y_2} ∫_{x_1}^{x_2} f(x, y) dx dy (3.29)

for x_1, x_2, y_1, and y_2 satisfying x_1 ≤ x_2, y_1 ≤ y_2, then (X, Y) is a bivariate continuous random variable and f(x, y) is called the joint density function.
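Definition 3.10 can be illustrated numerically for the bivariate uniform density of Equation 3.28. In the sketch below (my own illustration; the rectangle limits are arbitrary), the probability of a rectangle inside the unit square equals its area:

```python
def f(x, y):
    # Bivariate uniform density from Equation 3.28.
    return 1.0 if (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0) else 0.0

def double_integral(f, x1, x2, y1, y2, n=400):
    # Midpoint rule in each dimension, approximating Equation 3.29.
    hx, hy = (x2 - x1) / n, (y2 - y1) / n
    return sum(
        f(x1 + (i + 0.5) * hx, y1 + (j + 0.5) * hy)
        for i in range(n) for j in range(n)
    ) * hx * hy

# P(0.2 <= X <= 0.5, 0.1 <= Y <= 0.4) should equal the rectangle's area, 0.09.
p = double_integral(f, 0.2, 0.5, 0.1, 0.4)
print(round(p, 4))  # 0.09
```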

Much of the work with distribution functions involves integration. In order to demonstrate a couple of solution techniques, I will work through some examples.

Example 3.11. If f(x, y) = xy exp(−x − y) for x > 0, y > 0 and 0 otherwise, what is P(X > 1, Y < 1)?

P(X > 1, Y < 1) = ∫_{0}^{1} ∫_{1}^{∞} x y e^{−(x+y)} dx dy. (3.30)

First, note that the integral can be separated into two terms:

P(X > 1, Y < 1) = ∫_{1}^{∞} x e^{−x} dx ∫_{0}^{1} y e^{−y} dy. (3.31)

Each of these integrals can be solved using integration by parts:

d(uv) = v du + u dv
v du = d(uv) − u dv
∫ v du = uv − ∫ u dv. (3.32)

In terms of a proper integral we have

∫_{a}^{b} v du = uv|_{a}^{b} − ∫_{a}^{b} u dv. (3.33)

In this case, we have

∫_{1}^{∞} x e^{−x} dx ⇒ v = x, dv = dx; du = e^{−x} dx, u = −e^{−x}

∫_{1}^{∞} x e^{−x} dx = −x e^{−x}|_{1}^{∞} + ∫_{1}^{∞} e^{−x} dx = 2e^{−1} ≈ 0.74. (3.34)

Working on the second part of the integral,

∫_{0}^{1} y e^{−y} dy = −y e^{−y}|_{0}^{1} + ∫_{0}^{1} e^{−y} dy
= −y e^{−y}|_{0}^{1} + (−e^{−y})|_{0}^{1}
= (−e^{−1} + 0) + (−e^{−1} + 1). (3.35)

Putting the two parts together,

P(X > 1, Y < 1) = ∫_{1}^{∞} x e^{−x} dx ∫_{0}^{1} y e^{−y} dy = (0.735)(0.264) = 0.194. (3.36)
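The closed-form answer in Equation 3.36 can be verified numerically. The sketch below (illustrative; the upper limit of 50 truncates the infinite integral, which is harmless since x e^{−x} is negligible beyond that point) approximates both factors of Equation 3.31 by the midpoint rule:

```python
import math

def integrate(f, a, b, n=200_000):
    # Midpoint-rule approximation of the integral of f over [a, b].
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# The two factors of Equation 3.31.
x_part = integrate(lambda x: x * math.exp(-x), 1.0, 50.0)
y_part = integrate(lambda y: y * math.exp(-y), 0.0, 1.0)

print(round(x_part, 3))           # 2/e, about 0.736
print(round(y_part, 3))           # 1 - 2/e, about 0.264
print(round(x_part * y_part, 3))  # about 0.194
```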

Definition 3.12. A T-variate random variable is a variable that takes a countable number of points in T-dimensional Euclidean space with certain probabilities.

Following our development of integration by parts, we have attempted to keep the calculus at an intermediate level throughout this textbook. However, certain symbolic computer programs may be useful to students. Appendix A presents a brief discussion of two such symbolic programs – Maxima (an open source program) and Mathematica (a proprietary program).
