CS6015: Linear Algebra and Random Processes
Lecture 33: Expectation, Variance and their properties, Computing expectation and variance of some known distributions
Learning Objectives
What is expectation?
What is variance?
What are some of their properties?
How do you compute expectation and variance of some standard distributions?
Expectation
00, 0, 1, 2, 3, ..., 36
Does gambling pay off?
Standard payoff: 35:1 if the ball lands on your number
If you play this game 1000 times, how much do you expect to win on average?
(roulette wheel \Omega: 38 slots: 00, 0, 1, 2, ..., 36)
Does gambling pay off?
X: profit
\Omega \to X: every losing slot maps to -1, the winning slot maps to 35
p_X(35) = \frac{1}{38} = 0.026
p_X(-1) = 1 - \frac{1}{38} = 0.974
Does gambling pay off?
If you play this game 1000 times, how much do you expect to win on average?
P(win) = \frac{\#wins}{\#games}
p_X(35) = \frac{1}{38} = 0.026,~~~p_X(-1) = 1 - \frac{1}{38} = 0.974
0.026 = \frac{\#wins}{1000} \implies \#wins = 26
Avg. gain = \frac{1}{1000}(26*35 + 974*(-1)) = -0.064
(Stop gambling!!)
Expectation: the formula
E[X]
E[X] = \frac{1}{1000}(26*35 + 974*(-1))
E[X] = \frac{26}{1000} * 35 + \frac{974}{1000} * (-1)
E[X] = \sum_{x\in\{-1, 35\}}x*p_X(x)
E[X] = p_X(35)*35 + p_X(-1)*(-1)
E[X] = 0.026*35 + 0.974*(-1)
Expectation: the formula
E[X]
E[X] = \sum_{x\in\mathbb{R}_X}x*p_X(x)
The expected value or expectation of a discrete random variable X whose possible values are x_1, x_2, \dots, x_n is denoted by E[X] and computed as
E[X] = \sum_{i=1}^n x_i P(X = x_i) = \sum_{i=1}^n x_i*p_X(x_i)
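The same computation can be checked with a few lines of Python (an illustrative sketch, not part of the original slides); note that the exact roulette probabilities 1/38 and 37/38 give about -0.053 per game, while the rounded values 0.026/0.974 used above give -0.064.

```python
# Sketch: expectation of a discrete RV from its PMF, applied to the roulette profit RV.
def expectation(pmf):
    """E[X] = sum over x of x * p_X(x), with the PMF given as a dict {x: p}."""
    return sum(x * p for x, p in pmf.items())

# Profit per 1-unit bet: win 35 with probability 1/38, lose 1 otherwise.
roulette = {35: 1 / 38, -1: 37 / 38}
print(expectation(roulette))   # ~ -0.0526 (about -0.064 if the rounded 0.026/0.974 are used)
```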
Expectation: Insurance
A person buys a car theft insurance policy of INR 200000 at an annual premium of INR 6000. There is a 2% chance that the car may get stolen.
What is the expected gain of the insurance company at the end of 1 year?
X: profit
X: \{6000, -194000\}
p_X(6000) = 0.98
p_X(-194000) = 0.02
E[X] = \sum_{x \in \mathbb{R}_X} x*p_X(x)
\therefore E[X] = 0.98*6000 + 0.02 * (-194000)
= 2000
Expectation: Insurance
A person buys a car theft insurance policy of INR 200000.
Suppose there is a 10% chance that the car may get stolen.
What should the premium be so that the expected gain is still INR 2000?
X: profit
X: \{x, x - 200000 \}
p_X(x) = 0.90
p_X(x - 200000) = 0.10
E[X] = \sum_{x \in \mathbb{R}_X} x*p_X(x)
\therefore E[X] = 0.9*x + 0.1 * (x - 200000)
\therefore 2000 = 0.9*x + 0.1 * (x - 200000)
\therefore x = 22000
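A minimal sketch of the same premium calculation in Python (not from the slides; the variable names are my own):

```python
# Solve E[profit] = (1 - p)*x + p*(x - cover) = x - p*cover = target for the premium x.
cover, p_theft, target_gain = 200_000, 0.10, 2_000
premium = target_gain + p_theft * cover
print(premium)   # 22000.0
```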
Function of a Random Variable
X: sum of two dice (\Omega has 36 equally likely outcomes)
X = 2: (1,1)
X = 3: (1,2)~(2,1)
X = 4: (1,3)~(2,2)~(3,1)
X = 5: (1,4)~(2,3)~(3,2)~(4,1)
X = 6: (1,5)~(2,4)~(3,3)~(4,2)~(5,1)
X = 7: (1,6)~(2,5)~(3,4)~(4,3)~(5,2)~(6,1)
X = 8: (2,6)~(3,5)~(4,4)~(5,3)~(6,2)
X = 9: (3,6)~(4,5)~(5,4)~(6,3)
X = 10: (4,6)~(5,5)~(6,4)
X = 11: (5,6)~(6,5)
X = 12: (6,6)
Y = g(X)
Y takes values 1, 2, 3:
Y = \begin{cases} 1~~if~~x < 5 \\ 2~~if~~5 \leq x \leq 8 \\ 3~~if~~x > 8 \\ \end{cases}
E[Y] = ?
Y = g(X)
Y = \begin{cases} 1~~if~~x < 5 \\ 2~~if~~5 \leq x \leq 8 \\ 3~~if~~x > 8 \\ \end{cases}
E[Y] = 1*p_Y(1) + 2*p_Y(2) + 3*p_Y(3)
p_Y(1) = \frac{1}{36} + \frac{2}{36} + \frac{3}{36} = \frac{6}{36}
p_Y(2) = \frac{4}{36} + \frac{5}{36} + \frac{6}{36} + \frac{5}{36} = \frac{20}{36}
p_Y(3) = \frac{4}{36} + \frac{3}{36} + \frac{2}{36} + \frac{1}{36} = \frac{10}{36}
\therefore E[Y] = 1*(\frac{1}{36} + \frac{2}{36} + \frac{3}{36})
+ 2*(\frac{4}{36} + \frac{5}{36} + \frac{6}{36} + \frac{5}{36})
+3*(\frac{4}{36} + \frac{3}{36} + \frac{2}{36} + \frac{1}{36}) = \frac{76}{36}
Y = g(X)
p_X(x): PMF of X (sums 2 to 12); p_Y(y): induced PMF of Y
p_Y(1) = \frac{6}{36},~~p_Y(2) = \frac{20}{36},~~p_Y(3) = \frac{10}{36}
Y = g(X)
\therefore E[Y] = 1*(\frac{1}{36} + \frac{2}{36} + \frac{3}{36})
+ 2*(\frac{4}{36} + \frac{5}{36} + \frac{6}{36} + \frac{5}{36})
+ 3*(\frac{4}{36} + \frac{3}{36} + \frac{2}{36} + \frac{1}{36}) = \frac{76}{36}
\therefore E[Y] = 1*p_X(2) + 1 * p_X(3) + 1 * p_X(4)
+ 2*p_X(5) + 2 * p_X(6) + 2 * p_X(7) + 2 * p_X(8)
+ 3*p_X(9) + 3 * p_X(10) + 3 * p_X(11) + 3 * p_X(12)
Y = g(X)
\therefore E[Y] = 1*(\frac{1}{36} + \frac{2}{36} + \frac{3}{36})
+ 2*(\frac{4}{36} + \frac{5}{36} + \frac{6}{36} + \frac{5}{36})
+ 3*(\frac{4}{36} + \frac{3}{36} + \frac{2}{36} + \frac{1}{36}) = \frac{76}{36}
\therefore E[Y] = g(2)*p_X(2) + g(3) * p_X(3) + g(4) * p_X(4)
+ g(5)*p_X(5) + g(6) * p_X(6) + g(7) * p_X(7) + g(8) * p_X(8)
+ g(9)*p_X(9) + g(10) * p_X(10) + g(11) * p_X(11) + g(12) * p_X(12)
Y = \begin{cases} 1~~if~~x < 5 \\ 2~~if~~5 \leq x \leq 8 \\ 3~~if~~x > 8 \\ \end{cases}
\therefore E[Y] = \sum_x g(x)*p_X(x)
E[Y] = \sum_x g(x)*p_X(x) \equiv \sum_y y*p_Y(y)
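Both forms can be checked quickly in Python (an illustrative sketch using exact fractions; the helper names are my own):

```python
from fractions import Fraction as F

# PMF of X = sum of two fair dice: p_X(s) = (6 - |s - 7|)/36 for s = 2, ..., 12
p_X = {s: F(6 - abs(s - 7), 36) for s in range(2, 13)}

def g(x):                          # the Y = g(X) defined on the slide
    return 1 if x < 5 else (2 if x <= 8 else 3)

E_Y_via_X = sum(g(x) * p for x, p in p_X.items())        # sum_x g(x) p_X(x)

p_Y = {}                                                  # induced PMF of Y
for x, p in p_X.items():
    p_Y[g(x)] = p_Y.get(g(x), F(0)) + p
E_Y_via_Y = sum(y * p for y, p in p_Y.items())            # sum_y y p_Y(y)

print(E_Y_via_X, E_Y_via_Y)        # both 19/9 = 76/36
```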
Some properties of expectation
Linearity of expectation
Y = aX + b
E[Y] = \sum_{x \in \mathbb{R}_X} g(x)p_X(x)
= \sum_{x \in \mathbb{R}_X} (ax + b)p_X(x)
= \sum_{x \in \mathbb{R}_X} a*x*p_X(x) +\sum_{x \in \mathbb{R}_X} b*p_X(x)
= a*\sum_{x \in \mathbb{R}_X} x*p_X(x) +b*\sum_{x \in \mathbb{R}_X} p_X(x)
= a*E[X] + b*1~~(\because \sum_{x \in \mathbb{R}_X} p_X(x) = 1)
= aE[X] + b
Expectation of sum of RVs
Given a set of random variables
X_1, X_2, \dots, X_n
E[\sum_{i=1}^{n} X_i] = \sum_{i=1}^n E[X_i]
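A quick Monte Carlo sanity check (illustrative Python sketch, not from the slides): for two dice X_1 and X_2, the sample mean of X_1 + X_2 matches E[X_1] + E[X_2] = 3.5 + 3.5.

```python
import random
random.seed(0)

# Estimate E[X1 + X2] for two fair dice by simulation.
N = 100_000
avg = sum(random.randint(1, 6) + random.randint(1, 6) for _ in range(N)) / N
print(avg)   # ~ 7.0; linearity of expectation needs no independence assumption
```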
Expectation as mean of population
\Omega: n students,~~W: their weights (e.g. 30, 40, 50, 60, \dots)
p_W(w_i) = \frac{1}{n}
E[W] = \sum_{i=1}^n p_W(w_i)*w_i = \frac{1}{n}\sum_{i=1}^n w_i
(centre of gravity)
Expectation as centre of gravity
A patient needs a certain blood group which only 9% of the population has. X: the number of donors screened until a match is found.
p = 0.09
E[X] = ?
E[X] = 1*0.09 + 2 * 0.91*0.09 + 3 * 0.91 ^2 *0.09 + 4 * 0.91^3*0.09 + \dots
= 0.09(1 + 2 * 0.91 + 3*0.91^2 + 4*0.91^3 + \dots)
= 0.09(\frac{a}{1-r} + \frac{dr}{(1-r)^2})~~(a = 1, d = 1, r = 0.91)
= \frac{1}{0.09} = \frac{1}{p} = 11.11
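The same 1/p value can be verified numerically (a sketch in Python; the truncation length is arbitrary):

```python
# E[X] for a geometric RV with p = 0.09, by truncating sum_{k>=1} k * q^(k-1) * p.
p, q = 0.09, 0.91
approx = sum(k * q**(k - 1) * p for k in range(1, 2000))
print(approx, 1 / p)   # both ~ 11.11
```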
Variance of a Random Variable
Variance of a RV
Expectation summarises a random variable
(but does not capture the spread of the RV)
E[X] = 0
E[Y] = 0
E[Z] = 0
X: takes values -1 and +1, each with probability \frac{1}{2}
Y: takes values -100, -50, +50, +100, each with probability \frac{1}{4}
Z: concentrated at 0
Recap
Variance
\sigma^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \mu) ^ 2
Variance
Var(X) = E[(X - E(X))^2]
\sigma^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \mu) ^ 2
Variance
Var(X) = E[(X - E(X))^2]
\sigma^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \mu) ^ 2
E[X] = \mu
\therefore Var(X) = E[(X - \mu)^2] = E[X^2 -2\mu X + \mu^2]
\therefore Var(X) = E[X^2] -E[2\mu X] + E[\mu^2]
\therefore Var(X) = E[X^2] -2\mu E[X] + \mu^2
\therefore Var(X) = E[X^2] -2\mu\cdot\mu + \mu^2
\therefore Var(X) = E[X^2] - \mu^2 = E[X^2] - (E[X])^2
Variance
Var(X) = E[X^2] - (E[X])^2
g(X) = X^2
\therefore E[X^2] = E[g(X)] = \sum_{x} p_X(x)g(x)
= 2^2 * \frac{1}{36} + 3^2 * \frac{2}{36} + 4^2 * \frac{3}{36} + 5^2 * \frac{4}{36} + 6^2 * \frac{5}{36}
+ 7^2 * \frac{6}{36}+ 8^2 * \frac{5}{36}+ 9^2 * \frac{4}{36}+ 10^2 * \frac{3}{36}
+ 11^2 * \frac{2}{36}+ 12^2 * \frac{1}{36}
= 54.83
Variance
Var(X) = E[X^2] - (E[X])^2
= 54.83 - 7^2
= 5.83
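Checked in Python (illustrative sketch reusing the two-dice PMF from earlier):

```python
from fractions import Fraction as F

p_X = {s: F(6 - abs(s - 7), 36) for s in range(2, 13)}     # X = sum of two dice
E_X  = sum(x * p for x, p in p_X.items())                   # 7
E_X2 = sum(x * x * p for x, p in p_X.items())               # 329/6
print(float(E_X2), float(E_X2 - E_X**2))                    # 54.83..., 5.83...
```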
Variance: Mutual Funds
MF1 expected (average) returns: 8%
(12%, 2%, 25%, -9%, 10%)
MF2 expected (average) returns: 8%
(7%, 6%, 9%, 12%, 6%)
E[X] = \frac{1}{5}*12+\frac{1}{5}*2+\frac{1}{5}*25+\frac{1}{5}*(-9)+\frac{1}{5}*10 = 8
E[Y] = \frac{1}{5}*7+\frac{1}{5}*6+\frac{1}{5}*9+\frac{1}{5}*12+\frac{1}{5}*6 = 8
E[X^2] = \frac{1}{5}*12^2+\frac{1}{5}*2^2+\frac{1}{5}*25^2+\frac{1}{5}*(-9)^2+\frac{1}{5}*10^2 = 190.8
E[Y^2] = \frac{1}{5}*7^2+\frac{1}{5}*6^2+\frac{1}{5}*9^2+\frac{1}{5}*12^2+\frac{1}{5}*6^2 = 69.2
Variance: Mutual Funds
MF1 expected (average) returns: 8%
(12%, 2%, 25%, -9%, 10%)
MF2 expected (average) returns: 8%
(7%, 6%, 9%, 12%, 6%)
Var(X) = E[X^2] - (E[X])^2
Var(X) = 190.8 - 8^2 = 126.8
Var(Y) = 69.2 - 8^2 = 5.2
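The two variances can be reproduced with a short Python sketch (treating each year's return as equally likely with probability 1/5, as on the slide):

```python
def mean(v): return sum(v) / len(v)
def var(v):  return mean([x * x for x in v]) - mean(v) ** 2   # E[X^2] - (E[X])^2

mf1 = [12, 2, 25, -9, 10]   # MF1 yearly returns (%)
mf2 = [7, 6, 9, 12, 6]      # MF2 yearly returns (%)
print(mean(mf1), var(mf1))  # 8.0  126.8
print(mean(mf2), var(mf2))  # 8.0  5.2
```

Same expected return, but MF2's returns are far less spread out.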
Properties of Variance
Var(aX + b) = E[(aX + b - E[aX + b])^2]
= E[(aX + b - aE[X] - b)^2]
= E[(a(X - E[X]))^2]
= E[a^2(X - E[X])^2]
= a^2E[(X - E[X])^2]
=a^2Var(X)
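A quick numeric check (illustrative Python sketch with arbitrarily chosen a and b): shifting by b leaves the variance unchanged, while scaling by a multiplies it by a^2.

```python
import random
random.seed(1)

# Var(aX + b) vs a^2 Var(X) for a single fair die, with a = 3, b = 5.
N = 200_000
xs = [random.randint(1, 6) for _ in range(N)]

def var(v):
    m = sum(v) / len(v)
    return sum((x - m) ** 2 for x in v) / len(v)

print(var([3 * x + 5 for x in xs]), 9 * var(xs))   # both ~ 9 * 35/12 = 26.25
```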
Properties of Variance
Var(X + X) = Var (2X) = 2^2Var(X) \neq Var(X) + Var(X)
Variance of sum of random variables
In general, the variance of the sum of random variables is not equal to the sum of the variances of the random variables
except......
Properties of Variance
Independent random variables
P(X = x | Y = y) = P(X = x)~~\forall x \in \mathbb{R}_X, y \in \mathbb{R}_Y
(X and Y are independent)
X: number~on~first~die
Y: sum~of~two~dice
P(Y = 8 ) = \frac{5}{36}
P(Y = 8 | X = 1) = 0 \neq P(Y = 8)
(X and Y are not independent)
Properties of Variance
We say that n random variables
X_1, X_2, \dots, X_n
are independent if
P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n)
= P(X_1=x_1)P (X_2=x_2) \dots P(X_n = x_n)
\forall x_1\in\mathbb{R}_{X_1}, x_2\in\mathbb{R}_{X_2}, \dots, x_n\in\mathbb{R}_{X_n}
Given such n random variables
Var(\sum_{i=1}^{n} X_i) = \sum_{i=1}^n Var(X_i)
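A simulation sketch (Python, not from the slides) of this additivity for two independent dice, where Var(X_1) = Var(X_2) = 35/12:

```python
import random
random.seed(0)

# Var(X1 + X2) for two independent fair dice should be ~ 35/12 + 35/12.
N = 200_000
sums = [random.randint(1, 6) + random.randint(1, 6) for _ in range(N)]
m = sum(sums) / N
v = sum((s - m) ** 2 for s in sums) / N
print(v, 2 * 35 / 12)   # both ~ 5.83
```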
Computing expectation and variance of some standard distributions
Bernoulli random variable
p_X(x) = p^x(1-p)^{(1-x)}
E[X] = \sum_{x=0}^1 x\cdot p_X(x) = 0*(1-p) + 1 * p = p
Var(X) = E[X^2] - (E[X])^2 = \sum_{x=0}^1 x^2 \cdot p_X(x) - p^2 = p - p^2 = p(1-p)
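A direct check from the PMF (illustrative Python sketch with an arbitrary p):

```python
# E[X] and Var(X) for a Bernoulli(p) RV computed straight from the PMF.
p = 0.3
pmf = {0: 1 - p, 1: p}
E  = sum(x * q for x, q in pmf.items())          # p
E2 = sum(x * x * q for x, q in pmf.items())      # also p, since 0^2 = 0 and 1^2 = 1
print(E, E2 - E ** 2, p * (1 - p))               # 0.3, 0.21, 0.21
```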
Geometric random variable: \(E[X]\)
E[X] = \sum_{x=1}^\infty x\cdot p_X(x)
p_X(x) = (1-p)^{(x-1)}p
E[X] = 1*p + 2(1-p)p + 3(1-p)^2p + \dots
(1-p)E[X] = ~~~~~~~~~ + 1(1-p)p + 2(1-p)^2p + \dots
Subtracting eqn 2 from eqn 1:
pE[X] = p + p(1-p) + p(1-p)^2 + p(1-p)^3 + \dots
E[X] = 1 + (1-p) + (1-p)^2 + (1-p)^3 + \dots
E[X] = \frac{1}{1- (1-p)} = \frac{1}{p}
Geometric random variable: \(Var(X)\)
Var(X) = E[X^2] - E[X]^2
E[X^2] = \sum_{i=1}^{\infty} i^2 (1-p)^{(i-1)}p
= \sum_{i=1}^{\infty} i^2 q^{(i-1)}p
= \sum_{i=1}^{\infty} (i - 1 + 1)^2 q^{(i-1)}p
= \sum_{i=1}^{\infty} ((i - 1)^2 + 2(i-1) + 1) q^{(i-1)}p
Substituting j = i - 1:
= \sum_{j=0}^{\infty} j^2q^{j}p
+ 2\sum_{j=0}^{\infty} jq^{j}p
+ \sum_{j=0}^{\infty} q^{j}p
q = 1-p
= q\sum_{j=0}^{\infty} j^2q^{j-1}p
+ 2q\sum_{j=0}^{\infty} jq^{j-1}p
+ \sum_{j=0}^{\infty} q^{j}p
Geometric random variable: \(Var(X)\)
Var(X) = E[X^2] - E[X]^2
E[X^2] = \sum_{i=1}^{\infty} i^2 (1-p)^{(i-1)}p
= q\sum_{j=0}^{\infty} j^2q^{j-1}p
+ 2q\sum_{j=0}^{\infty} jq^{j-1}p
+ \sum_{j=0}^{\infty} q^{j}p
\dots
= qE[X^2] + 2qE[X]+p\frac{1}{1-q}
q = 1 - p,~~~p= 1-q
E[X^2]= qE[X^2] + 2qE[X]+1
(steps on previous slide)
(1-q)E[X^2]= \frac{2q}{p}+1 = \frac{2q+p}{p} = \frac{q + q + p}{p} = \frac{q + 1}{p}
\therefore E[X^2]= \frac{q+1}{p^2}
Var(X) = E[X^2] - E[X]^2 =\frac{q+1}{p^2} - \frac{1}{p^2} =\frac{q}{p^2} =\frac{1-p}{p^2}
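A numerical check of the formula (illustrative Python sketch; the truncation length and p are arbitrary):

```python
# Var(X) = (1 - p)/p^2 for a geometric RV, via truncated series for E[X] and E[X^2].
p, q = 0.2, 0.8
E1 = sum(k * q**(k - 1) * p for k in range(1, 5000))        # ~ 1/p = 5
E2 = sum(k * k * q**(k - 1) * p for k in range(1, 5000))    # ~ (q + 1)/p^2 = 45
print(E2 - E1 ** 2, q / p ** 2)                             # both ~ 20
```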
Learning Objectives
What is expectation?
What is variance?
What are some of their properties?
How do you compute expectation and variance of some standard distributions?
(achieved)