CS6015: Linear Algebra and Random Processes

Lecture 38: Multiple continuous random variables, Bayes' theorem for continuous random variables

A significant part of this lecture is based on the online course offered by Prof. John Tsitsiklis

Learning Objectives

What is a joint pdf?

What is a conditional pdf?

What happens when two continuous random variables are independent?

What is Bayes' Rule for the continuous case?

Two continuous rand. variables

Recap

P(x \leq X \leq x + \delta) \approx f_X(x)\cdot\delta
P(x \leq X \leq x + \delta, y \leq Y \leq y + \delta) \approx f_{X,Y}(x,y)\cdot\delta\cdot\delta
P((X,Y) \in S) = \int \int_S f_{X,Y}(x,y)~dx~dy

\(S\): arbitrary region on the XY plane
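As a rough numerical check of the small-box approximation above, the sketch below compares a Riemann-sum estimate of the box probability with \(f_{X,Y}(x,y)\cdot\delta\cdot\delta\). The density \(f_{X,Y}(x,y) = x + y\) on the unit square is a hypothetical choice made only because it integrates to 1; it does not come from the lecture, and the names `joint_pdf` and `box_probability` are illustrative.

```python
def joint_pdf(x, y):
    # Hypothetical density (assumption): f(x, y) = x + y on the unit square.
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def box_probability(x, y, delta, steps=200):
    # Midpoint Riemann sum of the density over [x, x+delta] x [y, y+delta].
    h = delta / steps
    total = 0.0
    for i in range(steps):
        for j in range(steps):
            total += joint_pdf(x + (i + 0.5) * h, y + (j + 0.5) * h)
    return total * h * h


x, y, delta = 0.3, 0.4, 0.01
exact = box_probability(x, y, delta)
approx = joint_pdf(x, y) * delta * delta
print(exact, approx)  # the two values should agree to within a few percent
```

Shrinking `delta` should bring the ratio of the two numbers closer to 1, which is the sense in which the density is a probability per unit area.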

Joint density function

Recap

f_{X,Y}(x,y) \approx \frac{P(x \leq X \leq x + \delta, y \leq Y \leq y + \delta)}{\delta\cdot\delta}

The joint density function is a probability per unit area.

Laws of probability

f_{X,Y}(x,y) \geq 0
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x,y) dx dy= 1

From joint density to marg. density

Recap

f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y) dy

Recap

p_X(x) = \sum_{y} p_{X,Y}(x,y)
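The marginalization integral above can be sketched numerically. The example assumes a hypothetical joint density \(f_{X,Y}(x,y) = x + y\) on the unit square (not from the lecture), for which the integral over \(y\) works out to \(x + \frac{1}{2}\); `marginal_x` is an illustrative name.

```python
def joint_pdf(x, y):
    # Hypothetical density (assumption): f(x, y) = x + y on the unit square.
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def marginal_x(x, steps=1000):
    # Midpoint-rule approximation of f_X(x) = integral over y of f_{X,Y}(x, y).
    # This density is zero outside 0 <= y <= 1, so only [0, 1] is integrated.
    h = 1.0 / steps
    return sum(joint_pdf(x, (j + 0.5) * h) for j in range(steps)) * h


print(marginal_x(0.3))  # analytically x + 1/2 for this particular density
```

The sum over \(y\) in the discrete case and the integral over \(y\) here play the same role: both collapse the joint description down to a description of \(X\) alone.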

Independence

Two continuous random variables are said to be independent if

f_{X,Y}(x,y) = f_X(x)f_Y(y)

Example:

f_X(x) = \frac{1}{\sqrt{2 \pi}\sigma_1}e^{\frac{-(x-\mu_1)^2}{2\sigma_1^2}}
f_Y(y) = \frac{1}{\sqrt{2 \pi}\sigma_2}e^{\frac{-(y-\mu_2)^2}{2\sigma_2^2}}
\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}
\mathbf{\mu} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}
\mathbf{\Sigma} = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}
f_{X,Y}(x,y) = \frac{1}{2 \pi |\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(\mathbf{x} - \mu)^T\Sigma^{-1}(\mathbf{x} - \mu)}
f_{X,Y}(x,y) = \frac{1}{2 \pi \begin{vmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{vmatrix}^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}^{-1}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}}

Suppose, \(\mu_1 = \mu_2 =0\) and \(\sigma_1 = \sigma_2 = 1 \)

f_X(x) = \frac{1}{\sqrt{2 \pi}}e^{-\frac{x^2}{2}}
f_Y(y) = \frac{1}{\sqrt{2 \pi}}e^{-\frac{y^2}{2}}
f_{X,Y}(x,y) = \frac{1}{2 \pi \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x \\ y \end{bmatrix}^\top\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{2 \pi }e^{-\frac{1}{2}\begin{bmatrix} x \\ y \end{bmatrix}^\top\begin{bmatrix} x \\ y \end{bmatrix}}
= \frac{1}{2 \pi }e^{-\frac{1}{2} (x^2 + y^2)}
= f_X(x)f_Y(y)

Independence

In the general case (arbitrary \(\mu_1, \mu_2\) and \(\sigma_1, \sigma_2 > 0\)):

f_{X,Y}(x,y) = \frac{1}{2 \pi \begin{vmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{vmatrix}^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}^{-1}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}}

f_{X,Y}(x,y) = \frac{1}{2 \pi (\sigma_1^2\cdot\sigma_2^2 - 0\cdot 0)^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \frac{1}{\sigma_1^2} & 0 \\ 0 & \frac{1}{\sigma_2^2} \end{bmatrix}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{2 \pi \sigma_1\cdot\sigma_2}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \frac{1}{\sigma_1^2}(x - \mu_1) \\ \frac{1}{\sigma_2^2}(y - \mu_2) \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{\sqrt{2 \pi} \sqrt{2 \pi} \sigma_1\cdot\sigma_2}e^{-\frac{1}{2} (\frac{(x - \mu_1)^2}{\sigma_1^2} + \frac{(y - \mu_2)^2}{\sigma_2^2}) }
=f_{X}(x)f_{Y}(y)
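The factorization derived above can be spot-checked numerically. The sketch below evaluates the bivariate normal density for a general 2x2 covariance matrix and compares it with the product of the two univariate densities. The parameter values and the function names are illustrative assumptions, and the off-diagonal entry `s12` is included only to show that the product form no longer matches once it is nonzero.

```python
import math


def normal_pdf(x, mu, sigma):
    # Univariate normal density with mean mu and standard deviation sigma.
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def bivariate_normal_pdf(x, y, mu1, mu2, s11, s12, s22):
    # Bivariate normal density with covariance matrix [[s11, s12], [s12, s22]].
    det = s11 * s22 - s12 * s12
    dx, dy = x - mu1, y - mu2
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), using the 2x2 inverse.
    quad = (s22 * dx * dx - 2.0 * s12 * dx * dy + s11 * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


# Illustrative values (assumptions): sigma_1 = 0.5, sigma_2 = 2.
mu1, mu2, sigma1, sigma2 = 1.0, -2.0, 0.5, 2.0
x, y = 0.7, -1.0

product = normal_pdf(x, mu1, sigma1) * normal_pdf(y, mu2, sigma2)
diagonal = bivariate_normal_pdf(x, y, mu1, mu2, sigma1**2, 0.0, sigma2**2)
correlated = bivariate_normal_pdf(x, y, mu1, mu2, sigma1**2, 0.5, sigma2**2)
print(product, diagonal, correlated)
```

With a diagonal covariance matrix the first two numbers should coincide up to rounding, while the third differs.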

Independence

(Figure with axes \(x\) and \(y\) showing \(f_{X,Y}(x,y)\), \(f_X(x)\) and \(f_Y(y)\))

Revisiting the normal distribution

univariate normal distribution, \(n = 1\) random var.

f_X(x) = \frac{1}{\sqrt{2 \pi}\sigma_1}e^{\frac{-(x-\mu_1)^2}{2\sigma_1^2}}

bivariate normal distribution, \(n = 2\) random var.

\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}
\mathbf{\mu} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}
\mathbf{\Sigma} = \text{covariance matrix}
f_{X,Y}(x,y) = \frac{1}{2 \pi |\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(\mathbf{x} - \mu)^T\Sigma^{-1}(\mathbf{x} - \mu)}

when would the two random variables be independent?

multivariate normal distribution, \(n > 2\) rand. var.

f_{X_1,X_2, \dots, X_n}(x_1,x_2, \dots, x_n) = ?

Conditional Probability Density Functions

Recap

P(A|B) = \frac{P(A,B)}{P(B)} \quad (P(B) > 0)
p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_Y(y)} \quad (p_{Y}(y) > 0)

Conditional PDFs

P(x \leq X \leq x + \delta) \approx f_X(x)\cdot\delta

By analogy, we would want

P(x \leq X \leq x + \delta | Y \approx y) \approx f_{X|Y}(x|y)\cdot\delta

We cannot condition on \(Y = y\) directly, as \(P(Y=y) = 0\), so we condition on \(y \leq Y \leq y + \delta\) instead:

P(x \leq X \leq x + \delta | y \leq Y \leq y + \delta)
= \frac{P(x \leq X \leq x + \delta , y \leq Y \leq y + \delta)}{P(y \leq Y \leq y + \delta)}
\approx \frac{f_{X,Y}(x,y)\cdot\delta\cdot\delta}{f_Y(y)\cdot\delta}
= \frac{f_{X,Y}(x,y)}{f_Y(y)}\cdot\delta

Matching the two expressions gives

f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}
(f_Y(y) > 0)

Conditional PDFs

f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}
(f_Y(y) > 0)

Here \(y\) is fixed/given/constant and \(x\) varies: as a function of \(x\), \(f_{X|Y}(x|y)\) is the slice of \(f_{X,Y}(x, y)\) at that \(y\), scaled by \(\frac{1}{f_Y(y)}\).
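To make the conditional pdf concrete, the sketch below fixes a value of \(y\), divides the joint density by \(f_Y(y)\), and checks that the result integrates to 1 over \(x\). It assumes a hypothetical joint density \(f_{X,Y}(x,y) = x + y\) on the unit square, which is not from the lecture; the function names are illustrative.

```python
def joint_pdf(x, y):
    # Hypothetical density (assumption): f(x, y) = x + y on the unit square.
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def marginal_y(y, steps=200):
    # Midpoint-rule approximation of f_Y(y) = integral over x of f_{X,Y}(x, y).
    h = 1.0 / steps
    return sum(joint_pdf((i + 0.5) * h, y) for i in range(steps)) * h


def conditional_x_given_y(x, y):
    # f_{X|Y}(x|y) = f_{X,Y}(x, y) / f_Y(y), defined only where f_Y(y) > 0.
    f_y = marginal_y(y)
    if f_y <= 0.0:
        raise ValueError("f_Y(y) must be positive")
    return joint_pdf(x, y) / f_y


y = 0.25  # the fixed/given value
steps = 200
area = sum(conditional_x_given_y((i + 0.5) / steps, y) for i in range(steps)) / steps
print(area)  # a valid density in x should integrate to about 1
```

Dividing by \(f_Y(y)\) is what turns the slice of the joint density into a proper density in \(x\).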

Examples/Exercises

Buffon's needle problem

Question: A needle of length \(l\) is dropped on a wooden plank of width \(d\) (\(l < d\)). What is the probability that it will intersect with one of the edges? (assume it does not fall completely outside)

Buffon's needle problem

Three possibilities

Due to the symmetry of the problem, case 2 and 3 are similar (you can just turn the plank around)

Buffon's needle problem

\(X\): distance of centre of needle to the nearest edge

\(\Theta\): angle between the needle and the plank edges

(Figure: needle at angle \(\theta\) with its centre at distance \(X\) from the nearest edge; the half-needle extends \(\frac{l}{2}\sin\theta\) towards the edge)

\(X \sim Uniform(0,\frac{d}{2})\), so \(f_X(x)=\frac{2}{d}\)

\(\Theta \sim Uniform(0,\frac{\pi}{2})\), so \(f_\Theta(\theta)=\frac{2}{\pi}\)

\(X, \Theta\) independent, so \(f_{X,\Theta}(x,\theta)= \frac{2}{d} \cdot \frac{2}{\pi}\)

The needle will intersect if

X \leq \frac{l}{2}\sin\theta

Buffon's needle problem

\(f_{X,\Theta}(x,\theta)= \frac{2}{d} \cdot \frac{2}{\pi}\)

The needle will intersect if

X \leq \frac{l}{2}\sin\theta

P(X \leq \frac{l}{2}\sin\theta) = \int_0^{\frac{\pi}{2}}\int_0^{\frac{l}{2}\sin\theta} f_{X,\Theta}(x,\theta) dx d\theta
= \int_0^{\frac{\pi}{2}}\int_0^{\frac{l}{2}\sin\theta} \frac{2}{d} \cdot \frac{2}{\pi} dx d\theta
= \frac{4}{d\pi} \int_0^{\frac{\pi}{2}}\int_0^{\frac{l}{2}\sin\theta} dx d\theta
= \frac{4}{d\pi} \int_0^{\frac{\pi}{2}}\frac{l}{2}\sin\theta d\theta
=\frac{2l}{d\pi}
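The result \(\frac{2l}{d\pi}\) can be sanity-checked by simulation. The sketch below draws \(X\) and \(\Theta\) from the uniform distributions stated above and counts how often \(X \leq \frac{l}{2}\sin\theta\). The function name, sample size, seed, and the values of \(l\) and \(d\) are illustrative assumptions.

```python
import math
import random


def buffon_estimate(l, d, n, seed=0):
    # Monte Carlo estimate of P(X <= (l/2) sin(Theta)), assuming l < d.
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.uniform(0.0, d / 2.0)            # centre-to-nearest-edge distance
        theta = rng.uniform(0.0, math.pi / 2.0)  # angle with the plank edges
        if x <= (l / 2.0) * math.sin(theta):
            hits += 1
    return hits / n


l, d = 1.0, 2.0
estimate = buffon_estimate(l, d, n=200_000)
closed_form = 2.0 * l / (d * math.pi)
print(estimate, closed_form)  # the two should be close for large n
```

Rearranging the closed form as \(\pi = \frac{2l}{d \cdot P}\) suggests that such a simulation also gives a crude estimate of \(\pi\).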

Breaking a stick

Break a stick of length \(l\) twice

\(X\): first break point, \(X \sim Uniform(0,l)\)

\(Y\): second break point, \(Y \sim Uniform(0,X)\)

\(f_X(x)=\frac{1}{l}\)

\(f_{Y|X}(y|x)=\frac{1}{x}\)

\(f_{X,Y}(x,y)= f_{X}(x)f_{Y|X}(y|x)=\frac{1}{l} \cdot \frac{1}{x}\) (\(0 \leq y \leq x \leq l\))

Questions of interest?

\(E[Y|X=x]\)

\(f_Y(y)\)

\(E[Y]\)

Breaking a stick

E[Y|X=x] = \int y f_{Y|X}(y|x) dy
=\int_{0}^{x} y f_{Y|X}(y|x) dy
=\int_{0}^{x} y \frac{1}{x} dy
=\frac{x}{2}

Breaking a stick

f_Y(y) = \int f_{X,Y}(x,y) dx
= \int_y^l \frac{1}{l} \cdot \frac{1}{x} dx
=\frac{1}{l} \log \frac{l}{y}, \quad 0 < y \leq l
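As a hedged check of the marginal density \(\frac{1}{l}\log\frac{l}{y}\), the sketch below integrates it numerically and compares the result with a direct simulation of the two breaks. The function names, the threshold 0.5, the seed and the sample size are illustrative assumptions.

```python
import math
import random


def marginal_y_pdf(y, l):
    # f_Y(y) = (1/l) log(l/y) for 0 < y <= l, and zero elsewhere.
    if 0.0 < y <= l:
        return math.log(l / y) / l
    return 0.0


def integrate_pdf(a, l, steps=20_000):
    # Midpoint rule for the integral of f_Y over (0, a]; midpoints avoid y = 0.
    h = a / steps
    return sum(marginal_y_pdf((k + 0.5) * h, l) for k in range(steps)) * h


def simulate_cdf(a, l, n=200_000, seed=0):
    # Fraction of simulated sticks whose second break point falls at or below a.
    rng = random.Random(seed)
    count = 0
    for _ in range(n):
        x = rng.uniform(0.0, l)  # first break point
        y = rng.uniform(0.0, x)  # second break point, on the piece [0, x]
        if y <= a:
            count += 1
    return count / n


l = 1.0
total_mass = integrate_pdf(l, l)       # should be close to 1
from_density = integrate_pdf(0.5, l)   # P(Y <= 0.5) from the density
from_simulation = simulate_cdf(0.5, l)
print(total_mass, from_density, from_simulation)
```

The density is unbounded near \(y = 0\) yet still integrates to 1, which is why the midpoint rule (which never evaluates at 0) is used here.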

Breaking a stick

E[Y] = \int yf_{Y}(y) dy
= \int_0^l yf_{Y}(y) dy
= \int_0^l y\frac{1}{l} \log \frac{l}{y} dy
= \frac{l}{4}
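The value \(E[Y] = \frac{l}{4}\) can be checked by simulating the two breaks directly. The sketch below also averages \(E[Y|X=x] = \frac{x}{2}\) over the simulated values of \(X\), which should give approximately the same number. The stick length, seed, sample size and function name are illustrative assumptions.

```python
import random


def simulate_stick(l, n=200_000, seed=0):
    # Returns (sample mean of Y, sample mean of X/2) over n simulated sticks.
    rng = random.Random(seed)
    sum_y = 0.0
    sum_half_x = 0.0
    for _ in range(n):
        x = rng.uniform(0.0, l)  # first break point
        y = rng.uniform(0.0, x)  # second break point, on the piece [0, x]
        sum_y += y
        sum_half_x += x / 2.0    # E[Y | X = x] = x/2 for this x
    return sum_y / n, sum_half_x / n


l = 2.0
mean_y, mean_cond = simulate_stick(l)
print(mean_y, mean_cond, l / 4.0)  # all three should be close
```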

Summary

Discrete random variables

p_X(x)
p_{X,Y}(x,y)
p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_{Y}(y)}
p_{X}(x) = \sum_y p_{X,Y}(x,y)

Continuous random variables

f_X(x)
f_{X,Y}(x,y)
f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_{Y}(y)}
f_{X}(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y) dy

F_X(x) = P(X\leq x)
E[X], Var(X)

Revisiting Bayes' Rule

Case 1: two discrete random variables

p_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)p_X(x)}{p_{Y}(y)}
(Diagram: \(X \rightarrow Y\), with \(X\) described by \(p_X(x)\) and \(Y\) by \(p_{Y|X}(y|x)\))

The problem of inference

Having observed \(Y\) make inferences about \(X\)

Answer: \(p_{X|Y}(x|y)\)

Example

X: signal transmitted (0,1)

Y: signal received (0,1)

Case 2: two continuous random variables

f_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)f_X(x)}{f_{Y}(y)}
(Diagram: \(X \rightarrow Y\), with \(X\) described by \(f_X(x)\) and \(Y\) by \(f_{Y|X}(y|x)\))

The problem of inference

Having observed \(Y\) make inferences about \(X\)

Answer: \(f_{X|Y}(x|y)\)

Example

X: rainfall received

Y: water collected in a reservoir

Case 3: X: discrete, Y: continuous

p_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)p_X(x)}{f_{Y}(y)}
(Diagram: \(X \rightarrow Y\), with \(X\) described by \(p_X(x)\) and \(Y\) by \(f_{Y|X}(y|x)\))

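Example

X: signal transmitted (0, 2)

Y: noisy signal received

(Figure: the conditional densities \(f_{Y|X=0}\) and \(f_{Y|X=2}\))

P(X=x, y\leq Y \leq y+\delta)
=P(X=x)P(y\leq Y \leq y+\delta|X=x)
\approx p_X(x)f_{Y|X}(y|x)\cdot\delta

P(X=x, y\leq Y \leq y+\delta)
=P(y\leq Y \leq y+\delta)P(X=x| y\leq Y \leq y+\delta)
\approx f_{Y}(y)\cdot\delta~p_{X|Y}(x|y)

Equating the two approximations and cancelling \(\delta\) gives the formula for \(p_{X|Y}(x|y)\) above.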
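A minimal sketch of Case 3 for the transmitted-signal example follows. The lecture does not specify the noise model or the prior, so the code assumes Gaussian noise, \(Y | X = x \sim Normal(x, \sigma^2)\) with \(\sigma = 1\), and equal prior probabilities for 0 and 2; both are assumptions, as are the function names. The denominator uses \(f_Y(y) = \sum_x p_X(x) f_{Y|X}(y|x)\).

```python
import math


def normal_pdf(y, mean, sigma):
    # Univariate normal density, used here as the assumed noise model.
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def posterior(y, prior, sigma=1.0):
    # prior: dict mapping each transmitted value x to p_X(x).
    # Assumed (not from the lecture): Y | X = x ~ Normal(x, sigma^2).
    weights = {x: p * normal_pdf(y, x, sigma) for x, p in prior.items()}
    f_y = sum(weights.values())  # f_Y(y) = sum over x of p_X(x) f_{Y|X}(y|x)
    return {x: w / f_y for x, w in weights.items()}


prior = {0: 0.5, 2: 0.5}      # assumed equal priors
print(posterior(1.0, prior))  # y halfway between the two signals
print(posterior(3.0, prior))  # y closer to the signal 2
```

Under these assumptions, an observation halfway between the two signal levels leaves the two hypotheses equally likely, while an observation beyond 2 shifts most of the probability to \(X = 2\).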

Case 4: X: continuous, Y: discrete

f_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)f_X(x)}{p_{Y}(y)}
(Diagram: \(X \rightarrow Y\), with \(X\) described by \(f_X(x)\) and \(Y\) by \(p_{Y|X}(y|x)\))

Example

X: rainfall received

Y: number of trees fallen

What would the joint density/distribution look like?
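One hypothetical way to make Case 4 concrete is sketched below. The lecture gives no model for rainfall or fallen trees, so everything specific here is an assumption for illustration only: \(X \sim Exponential(1)\) for rainfall and \(Y | X = x \sim Poisson(x)\) for the number of trees fallen. The denominator \(p_Y(y)\) is obtained by numerically integrating \(p_{Y|X}(y|x) f_X(x)\) over \(x\).

```python
import math


def prior_pdf(x):
    # Assumed prior: X ~ Exponential(1), f_X(x) = e^{-x} for x >= 0.
    return math.exp(-x) if x >= 0.0 else 0.0


def likelihood(k, x):
    # Assumed model: Y | X = x ~ Poisson(x), p_{Y|X}(k|x) = e^{-x} x^k / k!.
    return math.exp(-x) * x ** k / math.factorial(k)


def evidence(k, upper=40.0, steps=40_000):
    # p_Y(k) = integral of p_{Y|X}(k|x) f_X(x) dx, midpoint rule on [0, upper].
    h = upper / steps
    return sum(
        likelihood(k, (i + 0.5) * h) * prior_pdf((i + 0.5) * h)
        for i in range(steps)
    ) * h


def posterior_pdf(x, k, p_k):
    # f_{X|Y}(x|k) = p_{Y|X}(k|x) f_X(x) / p_Y(k)
    return likelihood(k, x) * prior_pdf(x) / p_k


k = 2               # observed number of fallen trees
p_k = evidence(k)
print(p_k)
print(posterior_pdf(1.0, k, p_k))
```

The product \(p_{Y|X}(y|x) f_X(x)\) computed inside `evidence` is a density in \(x\) and a probability mass in \(y\), which is one way to think about the mixed joint description.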
