CS6015: Linear Algebra and Random Processes
Lecture 34: Joint distribution, conditional distribution and marginal distribution of multiple random variables
Learning Objectives
What are joint, conditional and marginal pmfs?
What is conditional expectation?
What is the expectation of a function of multiple random variables?
Multiple random variables
[Figure: five binary random variables \(X_1, \dots, X_5\) representing Salinity, Pressure, Temperature, Depth and Density, each taking values 0/1 (0: High, 1: Low), and a binary random variable \(Y\) indicating Oil]
Multiple random variables
Questions of Interest
What is the probability that we will find oil? \(P(Y=0 \mid X_1=x_1, X_2=x_2, X_3=x_3, X_4=x_4, X_5=x_5)\): a conditional probability
What is the probability that everything will be high? \(P(X_1=1, X_2=1, X_3=1, X_4=1, X_5=1, Y=1)\): a joint probability
What is the probability that density will be high? \(P(X_5=1)\): a marginal probability
Understanding the notation
\(P(Y=0 \mid X_1=x_1, X_2=x_2, X_3=x_3, X_4=x_4, X_5=x_5)\)
We have already discussed the conditional distribution of events.
The "event" notation: each of \(Y=0, X_1=x_1, \dots, X_5=x_5\) is an event, so the expression above is the conditional probability of one event given the others.
The "random variable" notation: \(p_{Y|X_1,X_2,X_3,X_4,X_5}(y|x_1,x_2,x_3,x_4,x_5)\), where the random variables appearing after the bar are "given", i.e., their values are fixed.
This is not a new concept - just a change of notation
Understanding the notation
\(p_X(x) = P(X=x)\): marginal pmf
\(p_{X,Y}(x,y) = P(X=x, Y=y)\): joint pmf
\(p_{X|Y}(x|y) = P(X=x \mid Y=y)\): conditional pmf
We will soon see that if we know the joint pmf we can compute the marginal and the conditional
Understanding the notation
\(P(X_1=x_1, X_2=x_2, X_3=x_3, X_4=x_4, X_5=x_5, Y=0)\): joint probability of multiple events
\(p_{X_1,X_2,X_3,X_4,X_5,Y}(x_1,x_2,x_3,x_4,x_5,y)\): joint pmf; with \(n\) binary variables, \(2^n\) different inputs are possible
\(P(Y=0 \mid X_1=x_1, X_2=x_2, X_3=x_3, X_4=x_4, X_5=x_5)\): conditional probability
\(p_{Y|X_1,X_2,X_3,X_4,X_5}(y|x_1,x_2,x_3,x_4,x_5)\): conditional pmf, a function of \(y\) with the other values fixed
\(P(Y=0)\): probability of a single event
\(p_{Y}(y)\): marginal pmf
Example
Toss a fair coin three times.
\(X\): number of heads
\(Y\): position of the first head (\(-1\) if no heads)

Mapping each outcome \(\omega \in \Omega\) to \((X, Y)\):

| \(\omega\) | \(X\) | \(Y\) |
|---|---|---|
| TTT | 0 | -1 |
| TTH | 1 | 3 |
| THT | 1 | 2 |
| THH | 2 | 2 |
| HTT | 1 | 1 |
| HTH | 2 | 1 |
| HHT | 2 | 1 |
| HHH | 3 | 1 |

Joint pmf \(p_{X,Y}(x,y) = P(X=x, Y=y)\) (rows: \(x\), columns: \(y\)):

| \(x \backslash y\) | -1 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | 1/8 | 0 | 0 | 0 |
| 1 | 0 | 1/8 | 1/8 | 1/8 |
| 2 | 0 | 2/8 | 1/8 | 0 |
| 3 | 0 | 1/8 | 0 | 0 |

Can we compute the conditional and marginal distributions from the joint pmf?
Marginal: \(p_{X}(x) = \sum_{y}p_{X,Y}(x,y)\), i.e., summing over all the different ways in which \(X\) can take the value \(x\)
Conditional: \(p_{X|Y}(x|y) = P(X=x \mid Y=y) = \frac{P(X=x,~Y=y)}{P(Y=y)} = \frac{p_{X,Y}(x,y)}{p_{Y}(y)} = \frac{p_{X,Y}(x,y)}{\sum_{x'} p_{X,Y}(x',y)}\)
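These computations are easy to check by brute force. Below is a minimal Python sketch (variable names are my own, not from the lecture) that enumerates the three coin tosses, builds the joint pmf above, and derives the marginal and conditional pmfs from it.

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

# Enumerate the 8 equally likely outcomes of three coin tosses.
joint = defaultdict(Fraction)   # joint[(x, y)] = p_{X,Y}(x, y)
for omega in product("HT", repeat=3):
    x = omega.count("H")                                # X: number of heads
    y = omega.index("H") + 1 if "H" in omega else -1    # Y: position of first head, -1 if none
    joint[(x, y)] += Fraction(1, 8)

# Marginals: p_X(x) = sum over y of p_{X,Y}(x, y), and similarly p_Y(y).
p_X = defaultdict(Fraction)
p_Y = defaultdict(Fraction)
for (x, y), p in joint.items():
    p_X[x] += p
    p_Y[y] += p

# Conditional: p_{X|Y}(x|y) = p_{X,Y}(x, y) / p_Y(y)
p_X_given_Y = {(x, y): p / p_Y[y] for (x, y), p in joint.items()}

print(dict(p_X))            # p_X(0..3) = 1/8, 3/8, 3/8, 1/8
print(p_X_given_Y[(2, 1)])  # p_{X|Y}(2|1) = (2/8) / (4/8) = 1/2
```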
Revisiting the laws
Multiplication/Chain Rule
\(p_{X,Y}(x,y) = p_{X|Y}(x|y)\, p_Y(y)\)
(event notation: \(P(X=x, Y=y) = P(X=x \mid Y=y)P(Y=y)\))
Total Probability Theorem
\(p_{X}(x) = \sum_{y} p_{X|Y}(x|y)\, p_Y(y) = \sum_{y}p_{X,Y}(x,y)\)
(event notation: \(P(X=x) = \sum_i P(X=x \mid Y=y_i)P(Y=y_i)\))
Bayes' Theorem
\(p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_Y(y)} = \frac{p_{X,Y}(x,y)}{\sum_{x'} p_{X,Y}(x',y)} = \frac{p_{Y|X}(y|x)\, p_X(x)}{\sum_{x'} p_{Y|X}(y|x')\, p_X(x')}\)
[Figure: events \(A_1, \dots, A_7\) partitioning the sample space \(\Omega\), with an event \(B\) overlapping them]
Revisiting the laws
Bayes' Theorem
\(\underbrace{p_{X|Y}(x|y)}_{\text{Posterior}} = \frac{\overbrace{p_{Y|X}(y|x)}^{\text{Likelihood}}\ \overbrace{p_X(x)}^{\text{Prior}}}{\sum_{x'} p_{Y|X}(y|x')\, p_X(x')}\)
Revisiting the laws
\sum_{x}\sum_{y}p_{X,Y}(x,y) = 1
\sum_{x}p_{X}(x) = 1
\(\sum_{x}p_{X|Y}(x|y) = 1\) (a conditional pmf is a valid pmf in \(x\))
\(\sum_{y}p_{X|Y}(x|y) \neq 1\) in general (it is not a pmf in \(y\))
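These normalization facts can be verified numerically. A small sketch (with an arbitrary joint pmf of my own choosing, not from the lecture) that builds \(p_{X|Y}\) and checks that it sums to 1 over \(x\) but not, in general, over \(y\):

```python
from fractions import Fraction

# An arbitrary joint pmf p_{X,Y}(x, y) over x, y in {0, 1} (the entries sum to 1).
joint = {(0, 0): Fraction(1, 8), (0, 1): Fraction(5, 8),
         (1, 0): Fraction(1, 8), (1, 1): Fraction(1, 8)}

p_Y = {y: sum(p for (x, y_), p in joint.items() if y_ == y) for y in (0, 1)}
cond = {(x, y): joint[(x, y)] / p_Y[y] for (x, y) in joint}   # p_{X|Y}(x|y)

# Summing over x for fixed y always gives 1: a conditional pmf is a pmf in x.
print([sum(cond[(x, y)] for x in (0, 1)) for y in (0, 1)])    # [1, 1]
# Summing over y for fixed x is not 1 in general.
print([sum(cond[(x, y)] for y in (0, 1)) for x in (0, 1)])    # [4/3, 2/3]
```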
Generalising to more variables
\(p_{X,Y,Z}(x,y,z) = p_X(x)\, p_{Y|X}(y|x)\, p_{Z|X,Y}(z|x,y)\)

Joint distribution \(p_{X,Y,Z}\):

| \(X\) | \(Y\) | \(Z\) | \(p_{X,Y,Z}\) |
|---|---|---|---|
| 0 | 0 | 0 | 1/21 |
| 0 | 0 | 1 | 3/21 |
| 0 | 1 | 0 | 1/21 |
| 0 | 1 | 1 | 7/21 |
| 1 | 0 | 0 | 2/21 |
| 1 | 0 | 1 | 3/21 |
| 1 | 1 | 0 | 2/21 |
| 1 | 1 | 1 | 2/21 |

Conditional distribution \(p_{Z|X,Y}(z|x,y)\):

| \(X\) | \(Y\) | \(Z=0\) | \(Z=1\) |
|---|---|---|---|
| 0 | 0 | 1/4 | 3/4 |
| 0 | 1 | 1/8 | 7/8 |
| 1 | 0 | 2/5 | 3/5 |
| 1 | 1 | 1/2 | 1/2 |

Marginal distribution \(p_Z(z) = \sum_x\sum_y p_{X,Y,Z}(x,y,z)\):

| \(Z\) | \(p_Z(z)\) |
|---|---|
| 0 | 6/21 |
| 1 | 15/21 |
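The marginal and conditional tables can be derived mechanically from the joint table. A minimal Python sketch of that computation (names are my own):

```python
from fractions import Fraction

# Joint pmf p_{X,Y,Z}(x, y, z) from the table above.
joint = {
    (0, 0, 0): Fraction(1, 21), (0, 0, 1): Fraction(3, 21),
    (0, 1, 0): Fraction(1, 21), (0, 1, 1): Fraction(7, 21),
    (1, 0, 0): Fraction(2, 21), (1, 0, 1): Fraction(3, 21),
    (1, 1, 0): Fraction(2, 21), (1, 1, 1): Fraction(2, 21),
}

# Marginal: p_Z(z) = sum over x and y of p_{X,Y,Z}(x, y, z)
p_Z = {z: sum(p for (x, y, z_), p in joint.items() if z_ == z) for z in (0, 1)}
print(p_Z)  # {0: 2/7, 1: 5/7}, i.e. 6/21 and 15/21

# Conditional: p_{Z|X,Y}(z|x, y) = p_{X,Y,Z}(x, y, z) / p_{X,Y}(x, y)
p_XY = {(x, y): sum(p for (x_, y_, _), p in joint.items() if (x_, y_) == (x, y))
        for x in (0, 1) for y in (0, 1)}
p_Z_given_XY = {(x, y, z): p / p_XY[(x, y)] for (x, y, z), p in joint.items()}
print(p_Z_given_XY[(1, 0, 0)])  # 2/5, matching the conditional table
```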
Independence
\(p_{X,Y,Z}(x,y,z) = p_X(x)\, p_{Y|X}(y|x)\, p_{Z|X,Y}(z|x,y)\)
\(X,Y,Z\) are independent if
\(p_{X,Y,Z}(x,y,z) = p_X(x)\, p_{Y}(y)\, p_{Z}(z) \quad \forall x,y,z\)

Example: consider the joint distribution \(p_{X,Y,Z}\)

| \(X\) | \(Y\) | \(Z\) | \(p_{X,Y,Z}\) |
|---|---|---|---|
| 0 | 0 | 0 | 1/20 |
| 0 | 0 | 1 | 3/20 |
| 0 | 1 | 0 | 2/20 |
| 0 | 1 | 1 | 6/20 |
| 1 | 0 | 0 | 1/20 |
| 1 | 0 | 1 | 3/20 |
| 1 | 1 | 0 | 1/20 |
| 1 | 1 | 1 | 3/20 |

Marginal distribution \(p_Z(z)\):

| \(Z\) | \(p_Z(z)\) |
|---|---|
| 0 | 5/20 |
| 1 | 15/20 |

Conditional distribution \(p_{Z|X,Y}(z|x,y)\):

| \(X\) | \(Y\) | \(Z=0\) | \(Z=1\) |
|---|---|---|---|
| 0 | 0 | 1/4 | 3/4 |
| 0 | 1 | 1/4 | 3/4 |
| 1 | 0 | 1/4 | 3/4 |
| 1 | 1 | 1/4 | 3/4 |

Here \(p_{Z|X,Y}(z|x,y)\) equals the marginal \(p_Z(z)\) for every \((x,y)\), so \(Z\) is independent of \((X,Y)\).

Now change the joint distribution while keeping the same marginal \(p_Z\):

| \(X\) | \(Y\) | \(Z\) | \(p_{X,Y,Z}\) |
|---|---|---|---|
| 0 | 0 | 0 | 1/20 |
| 0 | 0 | 1 | 3/20 |
| 0 | 1 | 0 | 2/20 |
| 0 | 1 | 1 | 6/20 |
| 1 | 0 | 0 | 2/20 |
| 1 | 0 | 1 | 2/20 |
| 1 | 1 | 0 | 0/20 |
| 1 | 1 | 1 | 4/20 |

Marginal distribution \(p_Z(z)\):

| \(Z\) | \(p_Z(z)\) |
|---|---|
| 0 | 1/4 |
| 1 | 3/4 |

Conditional distribution \(p_{Z|X,Y}(z|x,y)\):

| \(X\) | \(Y\) | \(Z=0\) | \(Z=1\) |
|---|---|---|---|
| 0 | 0 | 1/4 | 3/4 |
| 0 | 1 | 1/4 | 3/4 |
| 1 | 0 | 1/2 | 1/2 |
| 1 | 1 | 0 | 1 |

Now the conditional \(p_{Z|X,Y}(z|x,y)\) depends on \((x,y)\), so \(Z\) is not independent of \((X,Y)\) even though the marginal \(p_Z\) is unchanged.
Independence
\(X_1,X_2,X_3, \dots, X_n\) are independent if
\(p_{X_1,X_2,X_3, \dots, X_n}(x_1,x_2,x_3, \dots, x_n) = p_{X_1}(x_1)\, p_{X_2}(x_2)\, p_{X_3}(x_3)\dots p_{X_n}(x_n) \quad \forall x_1,x_2,x_3, \dots, x_n\)
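As a sanity check, the sketch below (my own, not from the lecture) tests this factorisation for the first joint table on the Independence slide; even though \(Z\) is independent of \((X,Y)\) there, the three variables are not mutually independent.

```python
from fractions import Fraction
from itertools import product

# First joint table from the Independence slide: p_{X,Y,Z}(x, y, z).
joint = {
    (0, 0, 0): Fraction(1, 20), (0, 0, 1): Fraction(3, 20),
    (0, 1, 0): Fraction(2, 20), (0, 1, 1): Fraction(6, 20),
    (1, 0, 0): Fraction(1, 20), (1, 0, 1): Fraction(3, 20),
    (1, 1, 0): Fraction(1, 20), (1, 1, 1): Fraction(3, 20),
}

def marginal(joint, axis):
    """Marginal pmf of the variable at position `axis` of the key tuple."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], Fraction(0)) + p
    return out

p_X, p_Y, p_Z = (marginal(joint, i) for i in range(3))

# X, Y, Z are independent iff the joint factorises for every (x, y, z).
independent = all(joint[(x, y, z)] == p_X[x] * p_Y[y] * p_Z[z]
                  for x, y, z in product((0, 1), repeat=3))
print(independent)  # False: p(0,0,0) = 1/20 but p_X(0) p_Y(0) p_Z(0) = (12/20)(8/20)(5/20) = 3/50
```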
Expectation: Recap
E[X] = \sum_x xp_X(x)
If we interpret \(p_X(x)\) as the long term relative frequency then \(E[X]\) is the long term average value of \(X\)
E[g(X)] = \sum_x g(x)p_X(x)
What if we have a function of multiple random variables?
Conditional Expectation
E[X|A]
What is the expected value of the sum of two dice given that the second die shows an even number?
\(X\): random variable indicating the sum of the two dice
\(A\): the event that the second die shows an even number
What are we interested in? \(E[X|A] = \sum_x x\, p_{X|A}(x)\)
(compare with the unconditional \(E[X] = \sum_x x\, p_X(x)\))
The sample space \(\Omega\) (36 equally likely outcomes):

| (1, 1) | (1, 2) | (1, 3) | (1, 4) | (1, 5) | (1, 6) |
|---|---|---|---|---|---|
| (2, 1) | (2, 2) | (2, 3) | (2, 4) | (2, 5) | (2, 6) |
| (3, 1) | (3, 2) | (3, 3) | (3, 4) | (3, 5) | (3, 6) |
| (4, 1) | (4, 2) | (4, 3) | (4, 4) | (4, 5) | (4, 6) |
| (5, 1) | (5, 2) | (5, 3) | (5, 4) | (5, 5) | (5, 6) |
| (6, 1) | (6, 2) | (6, 3) | (6, 4) | (6, 5) | (6, 6) |

The event \(A\) (second die shows an even number, 18 outcomes):

| (1, 2) | (1, 4) | (1, 6) |
|---|---|---|
| (2, 2) | (2, 4) | (2, 6) |
| (3, 2) | (3, 4) | (3, 6) |
| (4, 2) | (4, 4) | (4, 6) |
| (5, 2) | (5, 4) | (5, 6) |
| (6, 2) | (6, 4) | (6, 6) |
Given \(A\), the possible values of \(X\) are \(\{3,4,5,6,7,8,9,10,11,12\}\), with conditional pmf
\(p_{X|A} = \left\{\frac{1}{18},\frac{1}{18},\frac{2}{18},\frac{2}{18},\frac{3}{18},\frac{3}{18},\frac{2}{18},\frac{2}{18},\frac{1}{18},\frac{1}{18}\right\}\)
so \(E[X|A] = \sum_x x\, p_{X|A}(x) = 7.5\)
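A short Python check of this number (my own sketch): enumerate the 36 outcomes, keep those in \(A\), and average the sum.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two dice.
outcomes = list(product(range(1, 7), repeat=2))

# Condition on A: the second die shows an even number.
in_A = [(d1, d2) for d1, d2 in outcomes if d2 % 2 == 0]

# E[X | A] = sum over outcomes in A of (d1 + d2) * P(outcome | A)
E_X_given_A = sum(Fraction(d1 + d2, len(in_A)) for d1, d2 in in_A)
print(E_X_given_A)  # 15/2 = 7.5
```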
Conditional Expectation
\(E[g(X)|A] = \sum_x g(x)p_{X|A}(x)\)
\(E[X|A] = \sum_x xp_{X|A}(x)\)
Instead of conditioning on events we can condition on random variables
E[X|Y=y]
= \sum_x xp_{X|Y}(x|y)
E[g(X)|Y=y]
= \sum_x g(x)p_{X|Y}(x|y)
Total Expectation Theorem
E[X] = \sum_x xp_X(x)
[Figure: events \(A_1, \dots, A_n\) partitioning the sample space \(\Omega\)]
\(p_{X}(x) = \sum_{i=1}^n P(A_i)p_{X|A_i}(x)\)
Multiply by \(x\) on both sides and sum over \(x\):
\(\underbrace{\sum_x xp_{X}(x)}_{E[X]} = \sum_x x\sum_{i=1}^n P(A_i)p_{X|A_i}(x) = \sum_{i=1}^n P(A_i)\sum_x x\, p_{X|A_i}(x) = \sum_{i=1}^n P(A_i)E[X|A_i]\)
Instead of conditioning on events we can also condition on random variables:
\(E[X] = \sum_{y} p_Y(y)E[X|Y=y]\)
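Continuing the dice example, the sketch below (again my own) computes \(E[X]\) for the sum of two dice by conditioning on the value of the second die and checks that it matches the direct computation.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # (die 1, die 2), all equally likely

# Direct computation: E[X], where X = sum of the two dice.
E_X = sum(Fraction(d1 + d2, 36) for d1, d2 in outcomes)

# Total expectation theorem: E[X] = sum_y p_Y(y) E[X | Y = y], with Y = value of die 2.
E_X_via_conditioning = Fraction(0)
for y in range(1, 7):
    sums_given_y = [d1 + d2 for d1, d2 in outcomes if d2 == y]
    E_X_given_y = Fraction(sum(sums_given_y), len(sums_given_y))
    E_X_via_conditioning += Fraction(1, 6) * E_X_given_y   # p_Y(y) = 1/6

print(E_X, E_X_via_conditioning)  # both equal 7
```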
Total Expectation Theorem
E[X] = \sum_x xp_X(x)
Example: \(X\) is the time taken, and \(A_1, A_2, A_3\) partition \(\Omega\) with \(P(A_1) = 0.5\), \(P(A_2) = 0.3\), \(P(A_3) = 0.2\)
\(E[X|A_1] = 60\) mins, \(E[X|A_2] = 30\) mins, \(E[X|A_3] = 45\) mins
\(E[X] = \sum_{i=1}^{3}P(A_i)E[X|A_i] = 0.5 \times 60 + 0.3 \times 30 + 0.2 \times 45 = 48\) mins
Expectation: Mult. rand. variables
Example: You lose INR 1 if the number on die 1 (\(X\)) is less than that on die 2 (\(Y\)) and win INR 1 otherwise
\(E[g(X)] = \sum_x g(x)p_X(x)\)
\(g(X,Y) = \begin{cases} -1 & \text{if } X < Y\\ +1 & \text{if } X \geq Y \end{cases}\)
\(E[g(X,Y)] = ?\)
How do you compute this without computing the distribution of \(g(X,Y)\)?
Expectation: Mult. rand. variables
E[g(X)] = \sum_x g(x)p_X(x)
E[g(X,Y)] = \sum_{y} p_Y(y) E [g(X,Y)|Y=y]
= \sum_{y} p_Y(y) E [g(X,y)|Y=y]
= \sum_{y} p_Y(y) \sum_{x} g(x,y)p_{X|Y}(x|y)
= \sum_{x} \sum_{y} p_Y(y) g(x,y)p_{X|Y}(x|y)
= \sum_{x} \sum_{y} g(x,y)p_{X,Y}(x,y)
Expectation: Mult. rand. variables
Example: You lose INR 1 if the number on die 1 is less than that on die 2 and win INR 1 otherwise
\(g(X,Y) = \begin{cases} -1 & \text{if } X < Y\\ +1 & \text{if } X \geq Y \end{cases}\)
\(E[g(X,Y)] = \sum_x\sum_y g(x,y)p_{X,Y}(x,y)\)
\(p_{X,Y}(x,y) = \frac{1}{36}~\forall x,y\)
Of the 36 equally likely outcomes, 21 have \(X \geq Y\) and 15 have \(X < Y\), so
\(E[g(X,Y)] = \frac{21 - 15}{36} = \frac{1}{6}\)
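The value 1/6 can be verified by brute-force enumeration (a small sketch of my own):

```python
from fractions import Fraction
from itertools import product

# E[g(X, Y)] = sum over all 36 equally likely (x, y) of g(x, y) * p_{X,Y}(x, y)
def g(x, y):
    return -1 if x < y else +1   # lose INR 1 if die 1 < die 2, win INR 1 otherwise

expected_gain = sum(Fraction(g(x, y), 36) for x, y in product(range(1, 7), repeat=2))
print(expected_gain)  # 1/6
```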
Expectation: Mult. rand. variables
In general,
E[g(X,Y)] \neq g(E[X],E[Y])
Exception 1: \(g(X,Y) = aX + bY\)
\(E[g(X,Y)] = \sum_x\sum_y g(x,y)p_{X,Y}(x,y)\)
\(= \sum_x\sum_y (ax + by)p_{X,Y}(x,y)\)
\(= a \sum_x x \underbrace{\sum_y p_{X,Y}(x,y)}_{p_X(x)} + b \sum_y y \underbrace{\sum_x p_{X,Y}(x,y)}_{p_Y(y)}\)
\(= a \sum_x x p_X(x) + b \sum_y y p_Y(y)\)
\(= a E[X] + b E[Y] = g(E[X], E[Y])\)
Expectation: Mult. rand. variables
In general,
E[g(X,Y)] \neq g(E[X],E[Y])
Exception 2: \(g(X,Y) = XY\), when \(X, Y\) are independent
\(E[g(X,Y)] = \sum_x\sum_y g(x,y)p_{X,Y}(x,y)\)
\(= \sum_x\sum_y xy\, \underbrace{p_X(x)p_Y(y)}_{\text{by independence}}\)
\(= \sum_x xp_X(x) \sum_y yp_Y(y)\)
\(= E[X]E[Y] = g(E[X],E[Y])\)
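A quick numerical illustration (my own sketch): for two independent dice \(E[XY] = E[X]E[Y]\), but the equality fails for a dependent pair such as \(Y = X\).

```python
from fractions import Fraction
from itertools import product

dice = range(1, 7)

# Independent case: X, Y are two fair dice, p_{X,Y}(x, y) = 1/36.
E_XY_indep = sum(Fraction(x * y, 36) for x, y in product(dice, repeat=2))
E_X = sum(Fraction(x, 6) for x in dice)
print(E_XY_indep, E_X * E_X)   # 49/4 and 49/4: equal

# Dependent case: Y = X (the same die read twice), p_{X,Y}(x, x) = 1/6.
E_XY_dep = sum(Fraction(x * x, 6) for x in dice)
print(E_XY_dep, E_X * E_X)     # 91/6 vs 49/4: not equal
```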
Variances: Mult. rand. variables
Recap,
Var(aX) = a^2 Var(X)
Var(X+a) = Var(X)
In general,
Var(X+Y) \neq Var(X) + Var(Y)
(for example, take \(X = Y\) or \(X = -Y\))
Exception: if \(X\) and \(Y\) are independent, then \(Var(X+Y) = Var(X) + Var(Y)\), as shown below
Variances: Mult. rand. variables
Proof (given: \(X\) and \(Y\) are independent):
Var(X+Y) = E [(X+Y)^2] - (E[X+Y])^2
= E [X^2 + 2XY + Y^2] - (E[X] + E[Y])^2
= E [X^2] + 2E[XY] + E[Y^2] - (E[X]^2 + 2E[X]E[Y] + E[Y]^2)
= E [X^2] + 2E[X]E[Y] + E[Y^2] - E[X]^2 - 2E[X]E[Y] - E[Y]^2
= E [X^2] - E[X]^2 + E[Y^2] - E[Y]^2
= Var(X) + Var(Y)
Where did we use the independence property?
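A numerical check of both the rule and its failure (my own sketch): for two independent dice the variances add, but for \(X = Y\) they do not.

```python
from fractions import Fraction
from itertools import product

dice = range(1, 7)

def var(pmf):
    """Variance of a pmf given as a dict: value -> probability."""
    mean = sum(v * p for v, p in pmf.items())
    return sum((v - mean) ** 2 * p for v, p in pmf.items())

p_die = {v: Fraction(1, 6) for v in dice}

# Independent case: distribution of X + Y for two independent fair dice.
p_sum = {}
for x, y in product(dice, repeat=2):
    p_sum[x + y] = p_sum.get(x + y, Fraction(0)) + Fraction(1, 36)
print(var(p_sum), 2 * var(p_die))     # 35/6 and 35/6: equal

# Dependent case: X = Y, so X + Y = 2X.
p_double = {2 * v: Fraction(1, 6) for v in dice}
print(var(p_double), 2 * var(p_die))  # 35/3 vs 35/6: not equal
```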
Summary of main results
\(E[X] = \sum_x xp_X(x)\): "long term" average
\(E[g(X)] = \sum_x g(x)p_X(x)\): function of a RV
\(E[a X + b] = a E[X] + b\): linearity of expectation
\(Var(X) = E[(X - E[X])^2]\): spread in the data
\(Var(a X + b) = a^2 Var(X)\)
\(E[X|A] = \sum_x xp_{X|A}(x)\): conditioned on an event
\(E[X|Y=y] = \sum_x xp_{X|Y}(x|y)\): conditioned on a RV
\(E[g(X)|A] = \sum_x g(x)p_{X|A}(x)\)
\(E[g(X)|Y=y] = \sum_x g(x)p_{X|Y}(x|y)\)
\(E[X] = \sum_{i=1}^n P(A_i)E[X|A_i]\): total expectation theorem (conditioning on events)
\(E[X] = \sum_{y} p_Y(y)E[X|Y=y]\): total expectation theorem (conditioning on a RV)
\(E[g(X,Y)] = \sum_x\sum_y g(x,y)p_{X,Y}(x,y)\): function of multiple RVs
\(E[g(X,Y)] \neq g(E[X],E[Y])\) in general, but:
\(E[aX+bY] = aE[X] + b E[Y]\) always
\(E[XY] = E[X]E[Y]\) if \(X\) and \(Y\) are independent
\(Var(X+Y) = Var(X) + Var(Y)\) only if \(X\) and \(Y\) are indep.
By Mitesh Khapra