CS6015: Linear Algebra and Random Processes

Lecture 38: Multiple continuous random variables, Bayes' theorem for continuous random variables

A significant part of this lecture is based on the online course offered by Prof. John Tsitsiklis

Learning Objectives

What is a joint pdf?

What is a conditional pdf?

What happens when two continuous random variables are independent?

What is the Bayes Rule for the continuous case?

Two continuous rand. variables

Recap

For one continuous random variable:

P(x \leq X \leq x + \delta) \approx f_X(x)\cdot\delta

By analogy, for two continuous random variables:

P(x \leq X \leq x + \delta, y \leq Y \leq y + \delta) \approx f_{X,Y}(x,y)\cdot\delta\cdot\delta

More generally, for any region \(S\),

P((X,Y) \in S) = \int \int_S f_{X,Y}(x,y)~dx~dy

(\(S\): an arbitrary region on the XY plane)

Joint density function

Recap

f_{X,Y}(x,y) \approx \frac{P(x \leq X \leq x + \delta, y \leq Y \leq y + \delta)}{\delta\cdot\delta}

\(f_{X,Y}(x,y)\): the joint density function, i.e. probability per unit area

Laws of probability:

f_{X,Y}(x,y) \geq 0
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x,y)~dx~dy = 1
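As a quick numerical illustration of these two laws, here is a minimal sketch (ours, not from the lecture; it assumes scipy and uses an illustrative density \(f(x,y) = x + y\) on the unit square):

```python
# Sanity-checking the two laws for an illustrative joint pdf
# f(x,y) = x + y on the unit square (zero elsewhere).
from scipy.integrate import dblquad

def f_xy(x, y):
    """Joint density: nonnegative, supported on [0,1] x [0,1]."""
    return x + y if (0 <= x <= 1 and 0 <= y <= 1) else 0.0

# Law 2: total probability is 1 (the support is [0,1]^2).
total, _ = dblquad(lambda y, x: f_xy(x, y), 0, 1, 0, 1)
print(total)  # ~1.0

# P((X,Y) in S) as a double integral, for S = [0, 0.5] x [0, 0.5].
p_s, _ = dblquad(lambda y, x: f_xy(x, y), 0, 0.5, 0, 0.5)
print(p_s)    # ~0.125
```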

From joint density to marg. density

Recap

P(x \leq X \leq x + \delta) \approx f_X(x)\cdot\delta

Integrating the joint density over all values of \(y\) recovers the marginal density of \(X\):

f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)~dy

(compare the discrete case: \(p_X(x) = \sum_{y} p_{X,Y}(x,y)\))
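A numerical version of this marginalization (our sketch, assuming scipy; the joint density \(f(x,y) = x + y\) on the unit square is again an illustrative choice, with known marginal \(f_X(x) = x + \frac{1}{2}\)):

```python
# Marginalizing numerically: f_X(x) = integral of f_{X,Y}(x,y) over y.
from scipy.integrate import quad

def f_xy(x, y):
    """Illustrative joint density on the unit square."""
    return x + y if (0 <= x <= 1 and 0 <= y <= 1) else 0.0

for x in [0.2, 0.5, 0.8]:
    fx, _ = quad(lambda y: f_xy(x, y), 0, 1)  # integrate y out
    print(x, fx, x + 0.5)                     # numeric vs analytic marginal
```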

Independence

Two continuous random variables are said to be independent if

f_{X,Y}(x,y) = f_X(x)f_Y(y) \quad \text{for all } x, y

Example:

f_X(x) = \frac{1}{\sqrt{2 \pi}\sigma_1}e^{\frac{-(x-\mu_1)^2}{2\sigma_1^2}}
f_Y(y) = \frac{1}{\sqrt{2 \pi}\sigma_2}e^{\frac{-(y-\mu_2)^2}{2\sigma_2^2}}
\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}
\mathbf{\mu} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}
\mathbf{\Sigma} = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}
f_{X,Y}(x,y) = \frac{1}{2 \pi |\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(\mathbf{x} - \mu)^T\Sigma^{-1}(\mathbf{x} - \mu)}

Independence

f_{X,Y}(x,y) = \frac{1}{2 \pi \begin{vmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{vmatrix}^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}^{-1}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}}

Suppose \(\mu_1 = \mu_2 = 0\) and \(\sigma_1 = \sigma_2 = 1\):

f_X(x) = \frac{1}{\sqrt{2 \pi}}e^{-\frac{x^2}{2}}
f_Y(y) = \frac{1}{\sqrt{2 \pi}}e^{-\frac{y^2}{2}}
f_{X,Y}(x,y) = \frac{1}{2 \pi \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x \\ y \end{bmatrix}^\top\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{2 \pi }e^{-\frac{1}{2}\begin{bmatrix} x \\ y \end{bmatrix}^\top\begin{bmatrix} x \\ y \end{bmatrix}}
= \frac{1}{2 \pi }e^{-\frac{1}{2} (x^2 + y^2)}
= f_X(x)f_Y(y)

Independence

f_{X,Y}(x,y) = \frac{1}{2 \pi \begin{vmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{vmatrix}^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}^{-1}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}}

Now suppose \(\mu_1,\mu_2 \neq 0\) and \(\sigma_1,\sigma_2 \neq 1\) (the general case):

f_{X,Y}(x,y) = \frac{1}{2 \pi (\sigma_1^2\cdot\sigma_2^2 - 0\cdot 0)^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \frac{1}{\sigma_1^2} & 0 \\ 0 & \frac{1}{\sigma_2^2} \end{bmatrix}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{2 \pi \sigma_1\cdot\sigma_2}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \frac{1}{\sigma_1^2}(x - \mu_1) \\ \frac{1}{\sigma_2^2}(y - \mu_2) \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{\sqrt{2 \pi} \sqrt{2 \pi} \sigma_1\cdot\sigma_2}e^{-\frac{1}{2} (\frac{(x - \mu_1)^2}{\sigma_1^2} + \frac{(y - \mu_2)^2}{\sigma_2^2}) }
=f_{X}(x)f_{Y}(y)
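This factorization is easy to check numerically. The sketch below (ours, assuming numpy and scipy; the particular \(\mu_1,\mu_2,\sigma_1,\sigma_2\) are arbitrary) compares the diagonal-covariance bivariate normal pdf against the product of its marginals at random points:

```python
# With a diagonal covariance, the bivariate normal density equals
# the product of its two univariate normal marginals.
import numpy as np
from scipy.stats import norm, multivariate_normal

mu1, mu2, s1, s2 = 1.0, -2.0, 0.7, 1.5
joint = multivariate_normal(mean=[mu1, mu2], cov=np.diag([s1**2, s2**2]))

rng = np.random.default_rng(0)
pts = rng.uniform(-5, 5, size=(1000, 2))
lhs = joint.pdf(pts)                                # f_{X,Y}(x,y)
rhs = norm.pdf(pts[:, 0], mu1, s1) * norm.pdf(pts[:, 1], mu2, s2)  # f_X f_Y
print(np.allclose(lhs, rhs))                        # True: the joint factorizes
```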

Independence

(Figure: the joint density surface \(f_{X,Y}(x,y)\) over the \(x\)-\(y\) plane, with the marginals \(f_X(x)\) and \(f_Y(y)\) along the two axes)

Revisiting the normal distribution

f_X(x) = \frac{1}{\sqrt{2 \pi}\sigma_1}e^{\frac{-(x-\mu_1)^2}{2\sigma_1^2}}
\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}
\mathbf{\mu} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}
\mathbf{\Sigma} = \text{covariance matrix}
f_{X,Y}(x,y) = \frac{1}{2 \pi |\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(\mathbf{x} - \mu)^T\Sigma^{-1}(\mathbf{x} - \mu)}

univariate normal distribution, \(n = 1\) random var.

bivariate normal distribution, \(n = 2\) random var.

f_{X_1,X_2, \dots, X_n}(x_1,x_2, \dots, x_n) = ?

multivariate normal distribution, \(n > 2\) rand. var.

when would the two random variables be independent?
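For general \(n\), the answer is \(f(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}}e^{-\frac{1}{2}(\mathbf{x} - \mu)^T\Sigma^{-1}(\mathbf{x} - \mu)}\). A minimal sketch (ours, not from the slides; the 3-dimensional \(\mu\) and \(\Sigma\) below are illustrative) implements this formula directly and cross-checks it against scipy:

```python
# The n-dimensional normal density, straight from the formula.
import numpy as np
from scipy.stats import multivariate_normal

def mvn_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) evaluated directly from the formula."""
    n = len(mu)
    d = x - mu
    quad_form = d @ np.linalg.solve(Sigma, d)   # (x-mu)^T Sigma^{-1} (x-mu)
    norm_const = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * quad_form) / norm_const

mu = np.array([0.0, 1.0, -1.0])
Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 0.5]])
x = np.array([0.5, 0.5, -0.5])
print(mvn_pdf(x, mu, Sigma))
print(multivariate_normal(mu, Sigma).pdf(x))  # should agree
```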

Conditional Probability Density Functions

P(A|B) = \frac{P(A,B)}{P(B)}

Recap

p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_Y(y)}
(P(B) > 0)
(p_{Y}(y) > 0)

Conditional PDFs

P(x \leq X \leq x + \delta) \approx f_X(x)\cdot\delta

By analogy, we would want

P(x \leq X \leq x + \delta | Y \approx y) \approx f_{X|Y}(x|y)\cdot\delta

We cannot condition on \(Y = y\) directly, as \(P(Y=y) = 0\); instead, condition on a small interval:

P(x \leq X \leq x + \delta | y \leq Y \leq y + \delta) = \frac{P(x \leq X \leq x + \delta , y \leq Y \leq y + \delta)}{P(y \leq Y \leq y + \delta)} \approx \frac{f_{X,Y}(x,y)\cdot\delta\cdot\delta}{f_Y(y)\cdot\delta}

This motivates the definition

f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}
(f_Y(y) > 0)

Conditional PDFs

f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}
(f_Y(y) > 0)

(Figure: for a fixed/given/constant \(y\), the conditional density is the slice of \(f_{X,Y}(x,y)\) along \(x\), rescaled by \(f_Y(y)\))
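To see the "slice and renormalize" picture concretely, here is a small sketch (ours; it reuses the illustrative density \(f(x,y) = x + y\) on the unit square, whose marginal is \(f_Y(y) = y + \frac{1}{2}\)) checking that the conditional density integrates to 1 over \(x\):

```python
# Conditional pdf as a renormalized slice of the joint density:
# f_{X|Y}(x|y) = (x + y) / (y + 1/2) for the illustrative density.
from scipy.integrate import quad

y = 0.3                           # the fixed/given value of Y
f_y = y + 0.5                     # marginal f_Y(y)
f_cond = lambda x: (x + y) / f_y  # f_{X|Y}(x|y)
total, _ = quad(f_cond, 0, 1)
print(total)  # ~1.0: a valid density in x
```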

Examples/Exercises

Buffon's needle problem

(Figure: parallel planks of width \(d\))

Question: A needle of length \(l\) is dropped on a wooden plank of width \(d\) (with \(l<d\)). What is the probability that it will intersect one of the edges? (assume it does not fall completely outside)

Buffon's needle problem

Three possibilities

Due to the symmetry of the problem, cases 2 and 3 are similar (you can just turn the plank around)

Buffon's needle problem

\(X\): distance of centre of needle to the nearest edge

\(\Theta\): angle between the needle and the plank edges

(Figure: the needle's centre is at distance \(X\) from the nearest edge, at angle \(\theta\) to the edges; the half-needle spans \(\frac{l}{2}\sin\theta\) perpendicular to the edges)

\(X \sim Uniform(0,\frac{d}{2})\)

\(\Theta \sim Uniform(0,\frac{\pi}{2})\)

\(f_X(x)=\frac{2}{d}\)

\(f_\Theta(\theta)=\frac{2}{\pi}\)

\(X, \Theta\) independent, so

\(f_{X,\Theta}(x,\theta)= \frac{2}{d} \cdot \frac{2}{\pi}\)

The needle will intersect if

X \leq \frac{l}{2}\sin\theta

Buffon's needle problem


The needle will intersect if

X \leq \frac{l}{2}\sin\theta

Recall \(f_{X,\Theta}(x,\theta)= \frac{2}{d} \cdot \frac{2}{\pi}\)

P(X \leq \frac{l}{2}\sin\Theta) = \int_0^{\frac{\pi}{2}}\int_0^{\frac{l}{2}\sin\theta} f_{X,\Theta}(x,\theta)~dx~d\theta
= \int_0^{\frac{\pi}{2}}\int_0^{\frac{l}{2}\sin\theta} \frac{2}{d} \cdot \frac{2}{\pi}~dx~d\theta
= \frac{4}{d\pi} \int_0^{\frac{\pi}{2}}\int_0^{\frac{l}{2}\sin\theta} dx~d\theta
= \frac{4}{d\pi} \int_0^{\frac{\pi}{2}}\frac{l}{2}\sin\theta~d\theta
=\frac{2l}{d\pi}
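A Monte Carlo check of this result (our sketch, assuming numpy; the values of \(l\) and \(d\) are illustrative):

```python
# Monte Carlo check of P(intersect) = 2l / (d*pi),
# simulating X ~ U(0, d/2) and Theta ~ U(0, pi/2) independently.
import numpy as np

rng = np.random.default_rng(42)
l, d, n = 1.0, 1.5, 1_000_000           # needle length l < plank width d
x = rng.uniform(0, d / 2, n)            # distance of centre to nearest edge
theta = rng.uniform(0, np.pi / 2, n)    # angle with the plank edges
hits = x <= (l / 2) * np.sin(theta)     # intersection condition
print(hits.mean(), 2 * l / (d * np.pi))  # simulated vs exact (~0.4244)
```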

Breaking a stick

(Figure: a stick of length \(l\))

Break it twice

\(X\): first break point, \(X \sim Uniform(0,l)\)

\(Y\): second break point, \(Y|X=x \sim Uniform(0,x)\)

\(f_X(x)=\frac{1}{l} \)

\(f_{Y|X}(y|x)=\frac{1}{x} \)

\(f_{X,Y}(x,y)= f_{X}(x)f_{Y|X}(y|x)=\frac{1}{l} \cdot \frac{1}{x}\)

(\(0 \leq y \leq x \leq l\))

Questions of interest?

\(E[Y|X=x]\)

\(f_Y(y)\)

\(E[Y]\)

Breaking a stick


E[Y|X=x] = \int y f_{Y|X}(y|X=x) dy
=\int_{0}^{x} y f_{Y|X}(y|X=x) dy
=\int_{0}^{x} y \frac{1}{x} dy
=\frac{x}{2}

Breaking a stick


f_Y(y) = \int f_{X,Y}(x,y) dx
= \int_y^l \frac{1}{l}\cdot\frac{1}{x} dx
=\frac{1}{l} \log \frac{l}{y}
(0\leq y \leq l)

Breaking a stick


E[Y] = \int yf_{Y}(y) dy
= \int_0^l yf_{Y}(y) dy
= \int_0^l y\frac{1}{l} \log \frac{l}{y} dy
= \frac{l}{4}
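A simulation checking both \(E[Y] = \frac{l}{4}\) and \(E[Y|X=x] = \frac{x}{2}\) (our sketch, assuming numpy):

```python
# Simulating the two breaks: X ~ U(0, l), then Y | X=x ~ U(0, x).
import numpy as np

rng = np.random.default_rng(0)
l, n = 1.0, 1_000_000
x = rng.uniform(0, l, n)   # first break point
y = rng.uniform(0, x)      # second break point, conditionally uniform on (0, x)
print(y.mean(), l / 4)                    # E[Y] vs l/4
print(y[(0.49 < x) & (x < 0.51)].mean())  # ~0.25, i.e. E[Y|X=x] = x/2 near x = 0.5
```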

Summary

Discrete:
p_X(x), \quad p_{X,Y}(x,y)
p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_{Y}(y)}
p_{X}(x) = \sum_y p_{X,Y}(x,y)

Continuous:
f_X(x), \quad f_{X,Y}(x,y)
f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_{Y}(y)}
f_{X}(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)~dy

In both cases:
F_X(x) = P(X\leq x)
E[X], Var(X)

Revisiting Bayes' Rule

Case 1: two discrete random variables

p_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)p_X(x)}{p_{Y}(y)}
(given: prior \(p_X(x)\) and model \(p_{Y|X}(y|x)\) relating \(X\) and \(Y\))

The problem of inference

Having observed \(Y\) make inferences about \(X\)

Answer: \(p_{X|Y}(x|y)\)

Example

X: signal transmitted (0,1)

Y: signal received (0,1)

Case 2: two continuous random variables

f_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)f_X(x)}{f_{Y}(y)}
(given: prior \(f_X(x)\) and model \(f_{Y|X}(y|x)\) relating \(X\) and \(Y\))

The problem of inference

Having observed \(Y\) make inferences about \(X\)

Answer: \(f_{X|Y}(x|y)\)

Example

X: rainfall received

Y: water collected in a reservoir

Case 3: X: discrete, Y: continuous

p_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)p_X(x)}{f_{Y}(y)}
(given: prior \(p_X(x)\) and model \(f_{Y|X}(y|x)\) relating \(X\) and \(Y\))

Example

X: signal transmitted (0, 2)

Y: noisy signal received

(Figure: the two conditional densities \(f_{Y|X=0}\) and \(f_{Y|X=2}\) of the received signal)
P(X=x, y\leq Y \leq y+\delta)
=P(X=x)P(y\leq Y \leq y+\delta|X=x) \approx p_X(x)f_{Y|X}(y|x)\cdot\delta
and also
=P(y\leq Y \leq y+\delta)P(X=x| y\leq Y \leq y+\delta) \approx f_{Y}(y)\cdot\delta\cdot p_{X|Y}(x|y)

Equating the two expressions and cancelling \(\delta\) gives the rule above.
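A sketch of Case 3 on the signal example (ours, not from the slides): we additionally assume \(Y = X + N\) with \(N \sim N(0,1)\), so that \(f_{Y|X}(y|x)\) is a normal density centred at the transmitted value, and a uniform prior on \(\{0, 2\}\):

```python
# Posterior p_{X|Y}(x|y) for a discrete signal observed through
# continuous Gaussian noise (Bayes' rule, Case 3).
from scipy.stats import norm

p_x = {0: 0.5, 2: 0.5}  # prior p_X(x) (assumed uniform)

def posterior(y):
    lik = {x: norm.pdf(y, loc=x) for x in p_x}      # f_{Y|X}(y|x)
    f_y = sum(p_x[x] * lik[x] for x in p_x)         # f_Y(y), total probability
    return {x: p_x[x] * lik[x] / f_y for x in p_x}  # p_{X|Y}(x|y)

print(posterior(1.8))  # y near 2 favours X = 2
print(posterior(0.1))  # y near 0 favours X = 0
```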

Case 4: X: continuous, Y: discrete

f_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)f_X(x)}{p_{Y}(y)}
(given: prior \(f_X(x)\) and model \(p_{Y|X}(y|x)\) relating \(X\) and \(Y\))

Example

X: rainfall received

Y: number of trees fallen

What would the joint density/distribution look like?

Learning Objectives

What is a joint pdf?

What is a conditional pdf?

What happens when two continuous random variables are independent?

What is the Bayes Rule for the continuous case?
