CS6015: Linear Algebra and Random Processes
Lecture 38: Multiple continuous random variables, Bayes' theorem for continuous random variables
A significant part of this lecture is based on the online course offered by Prof. John Tsitsiklis.
Learning Objectives
What is a joint pdf?
What is a conditional pdf?
What happens when two continuous random variables are independent?
What is Bayes' rule for the continuous case?
Two continuous random variables (Recap)
For a single continuous random variable, the pdf gives a probability per unit length:
P(x \leq X \leq x + \delta) \approx f_X(x)\cdot\delta
Analogously, for two continuous random variables,
P(x \leq X \leq x + \delta, y \leq Y \leq y + \delta) \approx f_{X,Y}(x,y)\cdot\delta\cdot\delta
More generally, for an arbitrary region \(S\) on the XY plane,
P((X,Y) \in S) = \int \int_S f_{X,Y}(x,y)~dx~dy
Joint density function (Recap)
f_{X,Y}(x,y) \approx \frac{P(x \leq X \leq x + \delta, y \leq Y \leq y + \delta)}{\delta\cdot\delta}
The joint density function is thus a probability per unit area. It satisfies the laws of probability:
f_{X,Y}(x,y) \geq 0
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x,y)~dx~dy = 1
From joint density to marginal density (Recap)
P(x \leq X \leq x + \delta) \approx f_X(x)\cdot\delta
To recover the marginal density of \(X\), integrate the joint density over all values of \(y\):
f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y) dy
Compare with the discrete case (Recap):
p_X(x) = \sum_{y} p_{X,Y}(x,y)
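As a quick numerical sanity check of this marginalization (a minimal Python sketch; the grid resolution and the example joint density are illustrative choices, not from the lecture):

```python
import numpy as np

# Discretize an example joint density on a grid and marginalize numerically,
# mimicking f_X(x) = \int f_{X,Y}(x,y) dy with a Riemann sum.
xs = np.linspace(-5, 5, 401)
ys = np.linspace(-5, 5, 401)
dx, dy = xs[1] - xs[0], ys[1] - ys[0]
X, Y = np.meshgrid(xs, ys, indexing="ij")

# Example joint pdf: standard bivariate normal with independent components
f_xy = np.exp(-0.5 * (X**2 + Y**2)) / (2 * np.pi)

# Marginal of X: sum out the y-axis
f_x = f_xy.sum(axis=1) * dy

print(np.allclose(f_xy.sum() * dx * dy, 1.0, atol=1e-3))  # total mass ~ 1
print(np.allclose(f_x, np.exp(-0.5 * xs**2) / np.sqrt(2 * np.pi), atol=1e-4))
```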
Independence
Two continuous random variables are said to be independent if, for all \(x\) and \(y\),
f_{X,Y}(x,y) = f_X(x)f_Y(y)
Example: two independent Gaussian random variables
f_X(x) = \frac{1}{\sqrt{2 \pi}\sigma_1}e^{\frac{-(x-\mu_1)^2}{2\sigma_1^2}}
f_Y(y) = \frac{1}{\sqrt{2 \pi}\sigma_2}e^{\frac{-(y-\mu_2)^2}{2\sigma_2^2}}
With the notation
\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix} \quad \mathbf{\mu} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} \quad \mathbf{\Sigma} = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}
the bivariate normal density is
f_{X,Y}(x,y) = \frac{1}{2 \pi |\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(\mathbf{x} - \mathbf{\mu})^\top\Sigma^{-1}(\mathbf{x} - \mathbf{\mu})}
Independence
Writing the bivariate normal density out in full:
f_{X,Y}(x,y) = \frac{1}{2 \pi \begin{vmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{vmatrix}^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}^{-1}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}}
Suppose \(\mu_1 = \mu_2 = 0\) and \(\sigma_1 = \sigma_2 = 1\). Then
f_X(x) = \frac{1}{\sqrt{2 \pi}}e^{-\frac{x^2}{2}}
f_Y(y) = \frac{1}{\sqrt{2 \pi}}e^{-\frac{y^2}{2}}
f_{X,Y}(x,y) = \frac{1}{2 \pi \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x \\ y \end{bmatrix}^\top\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{2 \pi }e^{-\frac{1}{2}\begin{bmatrix} x \\ y \end{bmatrix}^\top\begin{bmatrix} x \\ y \end{bmatrix}}
= \frac{1}{2 \pi }e^{-\frac{1}{2} (x^2 + y^2)}
= f_X(x)f_Y(y)
Independence
Now suppose \(\mu_1, \mu_2 \neq 0\) and \(\sigma_1, \sigma_2 \neq 1\) (the general case). The same factorization still holds:
f_{X,Y}(x,y) = \frac{1}{2 \pi (\sigma_1^2\cdot\sigma_2^2 - 0\cdot 0)^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \frac{1}{\sigma_1^2} & 0 \\ 0 & \frac{1}{\sigma_2^2} \end{bmatrix}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{2 \pi \sigma_1\cdot\sigma_2}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \frac{1}{\sigma_1^2}(x - \mu_1) \\ \frac{1}{\sigma_2^2}(y - \mu_2) \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{\sqrt{2 \pi} \sqrt{2 \pi} \sigma_1\cdot\sigma_2}e^{-\frac{1}{2} (\frac{(x - \mu_1)^2}{\sigma_1^2} + \frac{(y - \mu_2)^2}{\sigma_2^2}) }
=f_{X}(x)f_{Y}(y)
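A quick numerical check of this factorization (a minimal Python sketch; the particular values of \(\mu_1, \mu_2, \sigma_1, \sigma_2\) are arbitrary illustrative choices):

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Arbitrary illustrative parameters; the diagonal covariance encodes independence
mu1, mu2 = 1.0, -2.0
s1, s2 = 0.5, 3.0
joint = multivariate_normal(mean=[mu1, mu2], cov=np.diag([s1**2, s2**2]))

# Evaluate both sides of f_{X,Y}(x,y) = f_X(x) f_Y(y) at random test points
rng = np.random.default_rng(0)
pts = rng.normal(size=(1000, 2))
lhs = joint.pdf(pts)
rhs = norm.pdf(pts[:, 0], mu1, s1) * norm.pdf(pts[:, 1], mu2, s2)
print(np.allclose(lhs, rhs))  # True: a diagonal Sigma factorizes
```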
Independence
[Figure: surface plot of the joint density \(f_{X,Y}(x,y)\) over the \(x\)-\(y\) plane, with the marginals \(f_X(x)\) and \(f_Y(y)\) along the axes]
Revisiting the normal distribution
Univariate normal distribution (\(n = 1\) random variable):
f_X(x) = \frac{1}{\sqrt{2 \pi}\sigma_1}e^{\frac{-(x-\mu_1)^2}{2\sigma_1^2}}
Bivariate normal distribution (\(n = 2\) random variables), with
\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix} \quad \mathbf{\mu} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} \quad \mathbf{\Sigma} = \text{covariance matrix}
f_{X,Y}(x,y) = \frac{1}{2 \pi |\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(\mathbf{x} - \mathbf{\mu})^\top\Sigma^{-1}(\mathbf{x} - \mathbf{\mu})}
Multivariate normal distribution (\(n > 2\) random variables):
f_{X_1,X_2, \dots, X_n}(x_1,x_2, \dots, x_n) = ?
When would the two random variables be independent?
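For reference, the general \(n\)-dimensional form is the standard multivariate normal density (stated here for completeness, not derived in the lecture):
f_{X_1,\dots,X_n}(x_1,\dots,x_n) = \frac{1}{(2\pi)^{\frac{n}{2}}|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(\mathbf{x} - \mathbf{\mu})^\top\Sigma^{-1}(\mathbf{x} - \mathbf{\mu})}
For \(n = 2\) this reduces to the bivariate formula above (\((2\pi)^{\frac{2}{2}} = 2\pi\)), and the components are independent exactly when \(\Sigma\) is diagonal, by the same factorization argument as before.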
Conditional Probability Density Functions
Recap: for events,
P(A|B) = \frac{P(A,B)}{P(B)} \qquad (P(B) > 0)
and for discrete random variables,
p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_Y(y)} \qquad (p_{Y}(y) > 0)
Conditional PDFs
Recall that for a single random variable
P(x \leq X \leq x + \delta) \approx f_X(x)\cdot\delta
By analogy, we would want
P(x \leq X \leq x + \delta | Y \approx y) \approx f_{X|Y}(x|y)\cdot\delta
We cannot condition on \(Y = y\) directly, since \(P(Y=y) = 0\), so we condition on a small interval instead:
P(x \leq X \leq x + \delta | y \leq Y \leq y + \delta) = \frac{P(x \leq X \leq x + \delta , y \leq Y \leq y + \delta)}{P(y \leq Y \leq y + \delta)} \approx \frac{f_{X,Y}(x,y)\cdot\delta\cdot\delta}{f_Y(y)\cdot\delta}
This motivates the definition
f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} \qquad (f_Y(y) > 0)
Conditional PDFs
f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} \qquad (f_Y(y) > 0)
Here \(y\) is fixed/given/constant: as a function of \(x\), the conditional pdf \(f_{X|Y}(x|y)\) is the slice of the joint density \(f_{X,Y}(x,y)\) at that value of \(y\), rescaled by \(f_Y(y)\) so that it integrates to 1.
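A numerical illustration of this slicing (a minimal Python sketch; the correlated Gaussian joint density and the grid are illustrative choices):

```python
import numpy as np

# Example joint pdf: standard bivariate normal with correlation rho = 0.6
rho = 0.6
xs = np.linspace(-5, 5, 401)
ys = np.linspace(-5, 5, 401)
dx = xs[1] - xs[0]
X, Y = np.meshgrid(xs, ys, indexing="ij")
quad = (X**2 - 2 * rho * X * Y + Y**2) / (1 - rho**2)
f_xy = np.exp(-0.5 * quad) / (2 * np.pi * np.sqrt(1 - rho**2))

# Conditional pdf f_{X|Y}(x | y=1): slice the joint at y = 1 and renormalize
j = np.argmin(np.abs(ys - 1.0))      # grid index closest to y = 1
slice_xy = f_xy[:, j]                # f_{X,Y}(x, 1) as a function of x
f_y1 = slice_xy.sum() * dx           # f_Y(1) = \int f_{X,Y}(x, 1) dx
f_x_given_y = slice_xy / f_y1

print(np.isclose(f_x_given_y.sum() * dx, 1.0))   # integrates to 1
print(xs[np.argmax(f_x_given_y)])                # peak near rho * y = 0.6
```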
Examples/Exercises
Buffon's needle problem
Question: A needle of length \(l\) is dropped on a wooden plank of width \(d\) \((l < d)\). What is the probability that it will intersect one of the edges? (Assume it does not fall completely outside the plank.)
Buffon's needle problem
Three possibilities
Due to the symmetry of the problem, cases 2 and 3 are similar (you can just turn the plank around)
Buffon's needle problem
\(X\): distance of the centre of the needle to the nearest edge
\(\Theta\): angle between the needle and the plank edges
\(X \sim Uniform(0,\frac{d}{2})\), so \(f_X(x)=\frac{2}{d}\)
\(\Theta \sim Uniform(0,\frac{\pi}{2})\), so \(f_\Theta(\theta)=\frac{2}{\pi}\)
\(X\) and \(\Theta\) are independent, so
f_{X,\Theta}(x,\theta)= \frac{2}{d} \cdot \frac{2}{\pi}
The perpendicular reach of the needle from its centre towards the edge is \(\frac{l}{2}\sin\theta\), so the needle will intersect an edge if
X \leq \frac{l}{2}\sin \theta
Buffon's needle problem
P\left(X \leq \frac{l}{2}\sin \Theta\right) = \int_0^{\frac{\pi}{2}}\int_0^{\frac{l}{2}\sin\theta} f_{X,\Theta}(x,\theta)~dx~d\theta
= \int_0^{\frac{\pi}{2}}\int_0^{\frac{l}{2}\sin\theta} \frac{2}{d} \cdot \frac{2}{\pi}~dx~d\theta
= \frac{4}{d\pi} \int_0^{\frac{\pi}{2}}\frac{l}{2}\sin\theta~d\theta
= \frac{2l}{d\pi}
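A Monte Carlo check of this answer (a minimal Python sketch; the values \(l = 1\), \(d = 2\) and the sample size are arbitrary illustrative choices):

```python
import numpy as np

# Monte Carlo estimate of the intersection probability; l = 1, d = 2 arbitrary
rng = np.random.default_rng(42)
l, d, n = 1.0, 2.0, 1_000_000

x = rng.uniform(0, d / 2, n)           # X: distance of centre to nearest edge
theta = rng.uniform(0, np.pi / 2, n)   # Theta: angle with the plank edges

p_hat = np.mean(x <= (l / 2) * np.sin(theta))
print(p_hat, 2 * l / (d * np.pi))      # both ~ 1/pi ~ 0.3183
```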
Breaking a stick
Take a stick of length \(l\) and break it twice:
\(X\): first break point, \(X \sim Uniform(0,l)\), so \(f_X(x)=\frac{1}{l}\)
\(Y\): second break point, chosen on the left piece, so \(Y|X=x \sim Uniform(0,x)\) and \(f_{Y|X}(y|x)=\frac{1}{x}\)
f_{X,Y}(x,y)= f_{X}(x)f_{Y|X}(y|x)=\frac{1}{l} \cdot \frac{1}{x} \qquad (0 \leq y \leq x \leq l)
Questions of interest?
\(E[Y|X=x]\)
\(f_Y(y)\)
\(E[Y]\)
Breaking a stick
First, the conditional expectation \(E[Y|X=x]\):
E[Y|X=x] = \int y f_{Y|X}(y|X=x) dy
=\int_{0}^{x} y f_{Y|X}(y|X=x) dy
=\int_{0}^{x} y \frac{1}{x} dy
=\frac{x}{2}
Breaking a stick
Next, the marginal density \(f_Y(y)\). Since the joint density is nonzero only for \(x \geq y\), the integral runs from \(y\) to \(l\):
f_Y(y) = \int_y^l f_{X,Y}(x,y)~dx
= \int_y^l \frac{1}{l}\cdot\frac{1}{x}~dx
=\frac{1}{l} \log \frac{l}{y} \qquad (0\leq y \leq l)
Breaking a stick
Finally, the expectation \(E[Y]\):
E[Y] = \int yf_{Y}(y) dy
= \int_0^l yf_{Y}(y) dy
= \int_0^l y\frac{1}{l} \log \frac{l}{y} dy
= \frac{l}{4}
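A Monte Carlo check of both answers (a minimal Python sketch; \(l = 1\) and the sample size are arbitrary choices):

```python
import numpy as np

# Monte Carlo check for the stick-breaking answers; l = 1 is an arbitrary choice
rng = np.random.default_rng(0)
l, n = 1.0, 1_000_000

x = rng.uniform(0, l, n)   # first break point, X ~ Uniform(0, l)
y = rng.uniform(0, x)      # second break point, Y | X=x ~ Uniform(0, x)

print(y.mean(), l / 4)                   # E[Y] ~ l/4
print(y[np.abs(x - 0.5) < 0.01].mean())  # E[Y | X ~ 0.5] ~ x/2 = 0.25
```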
Summary
Discrete:
p_X(x), \quad p_{X,Y}(x,y), \quad p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_{Y}(y)}, \quad p_{X}(x) = \sum_y p_{X,Y}(x,y)
Continuous:
f_X(x), \quad f_{X,Y}(x,y), \quad f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_{Y}(y)}, \quad f_{X}(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)~dy
In both cases:
F_X(x) = P(X\leq x), \quad E[X],~Var(X)
Revisiting Bayes' Rule
Case 1: two discrete random variables
p_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)p_X(x)}{p_{Y}(y)}
Model: a prior \(p_X(x)\) on \(X\), and an observation \(Y\) generated according to \(p_{Y|X}(y|x)\).
The problem of inference: having observed \(Y\), make inferences about \(X\).
Answer: \(p_{X|Y}(x|y)\)
Example
X: signal transmitted (0 or 1)
Y: signal received (0 or 1)
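A concrete numeric sketch of this case (the prior and the channel flip probabilities below are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical binary channel: the prior and flip probabilities are made up
p_x = np.array([0.7, 0.3])            # prior p_X: P(X=0), P(X=1)
p_y_given_x = np.array([[0.9, 0.1],   # row x, column y: p_{Y|X}(y|x)
                        [0.2, 0.8]])

y = 1                                 # observed received signal
joint = p_x * p_y_given_x[:, y]       # p_X(x) * p_{Y|X}(y|x)
posterior = joint / joint.sum()       # normalize by p_Y(y)
print(posterior)                      # p_{X|Y}(x|y=1) for x = 0 and x = 1
```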
Case 2: two continuous random variables
f_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)f_X(x)}{f_{Y}(y)}
Model: a prior \(f_X(x)\) on \(X\), and an observation \(Y\) generated according to \(f_{Y|X}(y|x)\).
The problem of inference: having observed \(Y\), make inferences about \(X\).
Answer: \(f_{X|Y}(x|y)\)
Example
X: rainfall received
Y: water collected in a reservoir
Case 3: \(X\): discrete, \(Y\): continuous
p_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)p_X(x)}{f_{Y}(y)}
Model: a prior \(p_X(x)\) on \(X\), and an observation \(Y\) generated according to \(f_{Y|X}(y|x)\).
Example
X: signal transmitted (0 or 2)
Y: noisy signal received, with conditional densities \(f_{Y|X=0}\) and \(f_{Y|X=2}\)
To derive the formula, consider the event \(\{X=x, y\leq Y \leq y+\delta\}\) and expand it in two ways:
P(X=x, y\leq Y \leq y+\delta) = P(X=x)P(y\leq Y \leq y+\delta|X=x) \approx p_X(x)f_{Y|X}(y|x)\cdot\delta
P(X=x, y\leq Y \leq y+\delta) = P(y\leq Y \leq y+\delta)P(X=x| y\leq Y \leq y+\delta) \approx f_{Y}(y)\cdot\delta\cdot p_{X|Y}(x|y)
Equating the two right-hand sides and cancelling \(\delta\) gives the formula above.
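A numeric sketch of this mixed case (the prior, the signal levels, and the noise scale are hypothetical illustrations):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical setup: X in {0, 2} with a uniform prior, Y = X + Gaussian noise
levels = np.array([0.0, 2.0])
p_x = np.array([0.5, 0.5])
sigma = 1.0                            # assumed noise standard deviation

y_obs = 1.4                            # observed noisy signal
lik = norm.pdf(y_obs, loc=levels, scale=sigma)   # f_{Y|X}(y|x) for each x
posterior = p_x * lik / np.sum(p_x * lik)        # p_{X|Y}(x|y)
print(posterior)                       # more mass on X = 2 since y_obs > 1
```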
Case 4: \(X\): continuous, \(Y\): discrete
f_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)f_X(x)}{p_{Y}(y)}
Model: a prior \(f_X(x)\) on \(X\), and an observation \(Y\) generated according to \(p_{Y|X}(y|x)\).
Example
X: rainfall received
Y: number of trees fallen
What would the joint density/distribution look like?
Learning Objectives
What is a joint pdf?
What is a conditional pdf?
What happens when two continuous random variables are independent?
What is Bayes' rule for the continuous case?
CS6015: Lecture 38
By Mitesh Khapra