# CS6015: Linear Algebra and Random Processes

## Lecture 38: Multiple continuous random variables, Bayes' theorem for continuous random variables

### What is the Bayes Rule for the continuous case?

### Recap

(Figure: the $$\delta \times \delta$$ square $$[x, x+\delta] \times [y, y+\delta]$$ in the $$x$$-$$y$$ plane.)
P(x \leq X \leq x + \delta) \approx f_X(x)\cdot\delta
P(x \leq X \leq x + \delta, y \leq Y \leq y + \delta) \approx f_{X,Y}(x,y)\cdot\delta\cdot\delta
P((X,Y) \in S) = \int \int_S f_{X,Y}(x,y)~dx~dy

### Recap

f_{X,Y}(x,y) \approx \frac{P(x \leq X \leq x + \delta, y \leq Y \leq y + \delta)}{\delta\cdot\delta}

The joint PDF $$f_{X,Y}(x,y)$$ is a probability per unit area.

### Laws of probability

f_{X,Y}(x,y) \geq 0
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x,y) dx dy= 1
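
A minimal numerical sketch of the "probability per unit area" reading and of normalization. The density f(x, y) = x + y on the unit square is a hypothetical example chosen for this note (it is not from the lecture), and `prob_rectangle` is an illustrative helper name.

```python
# Hypothetical example density (an assumption, not from the lecture):
# f(x, y) = x + y on the unit square [0, 1] x [0, 1], and 0 elsewhere.
def f(x, y):
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0


def prob_rectangle(x0, x1, y0, y1, n=200):
    """Midpoint Riemann sum of f over the rectangle [x0, x1] x [y0, y1]."""
    hx = (x1 - x0) / n
    hy = (y1 - y0) / n
    total = 0.0
    for i in range(n):
        xm = x0 + (i + 0.5) * hx
        for j in range(n):
            ym = y0 + (j + 0.5) * hy
            total += f(xm, ym)
    return total * hx * hy


# Normalization: the density integrates to 1 over its support.
total_mass = prob_rectangle(0.0, 1.0, 0.0, 1.0)

# Probability of a small square divided by its area approaches f(x, y).
x, y = 0.3, 0.5
ratios = {}
for delta in (0.1, 0.01, 0.001):
    p = prob_rectangle(x, x + delta, y, y + delta, n=50)
    ratios[delta] = p / (delta * delta)

print(total_mass)  # close to 1
print(ratios)      # values approach f(0.3, 0.5) = 0.8 as delta shrinks
```

For this particular density the ratio works out to x + y + delta, so the approximation error is exactly delta.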

### Recap

Discrete case:
p_X(x) = \sum_{y} p_{X,Y}(x,y)

Continuous case (the sum becomes an integral):
f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y) dy
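
The marginalization integral can be checked numerically. This is a sketch under an assumed joint density, f(x, y) = x + y on the unit square (a hypothetical example, not from the lecture), for which the marginal has the closed form f_X(x) = x + 1/2. The function names are illustrative.

```python
# Hypothetical joint density (an assumption for illustration):
# f(x, y) = x + y on the unit square, 0 elsewhere.
def joint_pdf(x, y):
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0


def marginal_x(x, n=1000):
    """Approximate f_X(x) = integral of f_{X,Y}(x, y) dy by a midpoint sum over y in [0, 1]."""
    h = 1.0 / n
    return sum(joint_pdf(x, (j + 0.5) * h) for j in range(n)) * h


# Compare the numerical marginal with the closed form x + 1/2.
for x in (0.0, 0.25, 0.9):
    print(x, marginal_x(x), x + 0.5)
```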

### Two continuous random variables are said to be independent if

f_{X,Y}(x,y) = f_X(x)f_Y(y)
f_X(x) = \frac{1}{\sqrt{2 \pi}\sigma_1}e^{\frac{-(x-\mu_1)^2}{2\sigma_1^2}}
f_Y(y) = \frac{1}{\sqrt{2 \pi}\sigma_2}e^{\frac{-(y-\mu_2)^2}{2\sigma_2^2}}
\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}
\mathbf{\mu} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}
\mathbf{\Sigma} = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}
f_{X,Y}(x,y) = \frac{1}{2 \pi |\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(\mathbf{x} - \mu)^T\Sigma^{-1}(\mathbf{x} - \mu)}

f_{X,Y}(x,y) = \frac{1}{2 \pi \begin{vmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{vmatrix}^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}^{-1}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}}

### Suppose, $$\mu_1 = \mu_2 =0$$ and $$\sigma_1 = \sigma_2 = 1$$

f_X(x) = \frac{1}{\sqrt{2 \pi}}e^{-\frac{x^2}{2}}
f_Y(y) = \frac{1}{\sqrt{2 \pi}}e^{-\frac{y^2}{2}}
f_{X,Y}(x,y) = \frac{1}{2 \pi \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x \\ y \end{bmatrix}^\top\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{2 \pi }e^{-\frac{1}{2}\begin{bmatrix} x \\ y \end{bmatrix}^\top\begin{bmatrix} x \\ y \end{bmatrix}}
= \frac{1}{2 \pi }e^{-\frac{1}{2} (x^2 + y^2)}
= f_X(x)f_Y(y)

### In general, for any $$\mu_1,\mu_2$$ and $$\sigma_1,\sigma_2 > 0$$

f_{X,Y}(x,y) = \frac{1}{2 \pi (\sigma_1^2\cdot\sigma_2^2 - 0\cdot 0)^\frac{1}{2}}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \frac{1}{\sigma_1^2} & 0 \\ 0 & \frac{1}{\sigma_2^2} \end{bmatrix}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{2 \pi \sigma_1\cdot\sigma_2}e^{-\frac{1}{2}\begin{bmatrix} x - \mu_1 \\ y - \mu_2 \end{bmatrix}^\top\begin{bmatrix} \frac{1}{\sigma_1^2}(x - \mu_1) \\ \frac{1}{\sigma_2^2}(y - \mu_2) \end{bmatrix}}
f_{X,Y}(x,y) = \frac{1}{\sqrt{2 \pi} \sqrt{2 \pi} \sigma_1\cdot\sigma_2}e^{-\frac{1}{2} (\frac{(x - \mu_1)^2}{\sigma_1^2} + \frac{(y - \mu_2)^2}{\sigma_2^2}) }
=f_{X}(x)f_{Y}(y)
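
The factorization can also be checked numerically. A minimal sketch: it evaluates the matrix form of the bivariate normal density for a diagonal covariance matrix and compares it with the product of the two univariate densities. The parameter values and function names are arbitrary choices made for this illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def bivariate_normal_pdf_diag(x, y, mu1, mu2, sigma1, sigma2):
    """Bivariate normal density for Sigma = diag(sigma1^2, sigma2^2),
    following the matrix form: explicit determinant and inverse."""
    det = (sigma1 ** 2) * (sigma2 ** 2) - 0.0 * 0.0
    inv11, inv22 = 1.0 / sigma1 ** 2, 1.0 / sigma2 ** 2  # inverse of a diagonal matrix
    dx, dy = x - mu1, y - mu2
    quad = dx * inv11 * dx + dy * inv22 * dy             # (x - mu)^T Sigma^{-1} (x - mu)
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


# Arbitrary illustrative parameters and evaluation point.
mu1, mu2, sigma1, sigma2 = 1.0, -2.0, 0.5, 3.0
x, y = 0.7, -1.1
joint = bivariate_normal_pdf_diag(x, y, mu1, mu2, sigma1, sigma2)
product = normal_pdf(x, mu1, sigma1) * normal_pdf(y, mu2, sigma2)
print(joint, product)  # the two values agree up to floating-point rounding
```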

(Figure: the joint density $$f_{X,Y}(x,y)$$ over the $$x$$-$$y$$ plane, with the marginals $$f_X(x)$$ and $$f_Y(y)$$.)

### Revisiting the normal distribution

f_X(x) = \frac{1}{\sqrt{2 \pi}\sigma_1}e^{\frac{-(x-\mu_1)^2}{2\sigma_1^2}}
\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}
\mathbf{\mu} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}
\mathbf{\Sigma} = \text{covariance matrix}
f_{X,Y}(x,y) = \frac{1}{2 \pi |\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(\mathbf{x} - \mu)^T\Sigma^{-1}(\mathbf{x} - \mu)}

### Bivariate normal distribution: $$n = 2$$ random variables

f_{X_1,X_2, \dots, X_n}(x_1,x_2, \dots, x_n) = ?
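
For reference, the standard multivariate normal density is the usual answer to this question. It is stated here without derivation, assuming $$\mathbf{x}$$ and $$\mu$$ are now $$n$$-dimensional vectors and $$\Sigma$$ is an invertible $$n \times n$$ covariance matrix:

```latex
f_{X_1,X_2,\dots,X_n}(x_1,x_2,\dots,x_n)
  = \frac{1}{(2\pi)^{\frac{n}{2}} |\Sigma|^{\frac{1}{2}}}
    e^{-\frac{1}{2}(\mathbf{x} - \mu)^T \Sigma^{-1} (\mathbf{x} - \mu)}
```

Setting $$n = 2$$ gives $$(2\pi)^{\frac{n}{2}} = 2\pi$$, which recovers the bivariate formula.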

### Conditional Probability Density Functions

Recap of the discrete case:

P(A|B) = \frac{P(A,B)}{P(B)} \quad (P(B) > 0)
p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_Y(y)} \quad (p_{Y}(y) > 0)

### Conditional PDFs

P(x \leq X \leq x + \delta) \approx f_X(x)\cdot\delta

### By analogy, we would want

P(x \leq X \leq x + \delta | Y \approx y) \approx f_{X|Y}(x|y)\cdot\delta

### We cannot condition on $$Y = y$$ directly, as $$P(Y=y) = 0$$

P(x \leq X \leq x + \delta | y \leq Y \leq y + \delta)
= \frac{P(x \leq X \leq x + \delta , y \leq Y \leq y + \delta)}{P(y \leq Y \leq y + \delta)}
\approx \frac{f_{X,Y}(x,y)\cdot\delta\cdot\delta}{f_Y(y)\cdot\delta}
= \frac{f_{X,Y}(x,y)}{f_Y(y)}\cdot\delta
f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} \quad (f_Y(y) > 0)
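
A small numerical sketch of the conditional PDF. It assumes a hypothetical joint density f(x, y) = x + y on the unit square (not from the lecture), whose marginal is f_Y(y) = y + 1/2. It checks that the conditional PDF integrates to 1 in x, and that the ratio of small-interval probabilities approaches it. The function names are illustrative.

```python
# Hypothetical joint density (an assumption for illustration):
# f(x, y) = x + y on the unit square, whose marginal is f_Y(y) = y + 1/2.
def joint_pdf(x, y):
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0


def marginal_y(y):
    return y + 0.5 if 0.0 <= y <= 1.0 else 0.0


def conditional_pdf(x, y):
    """f_{X|Y}(x|y) = f_{X,Y}(x, y) / f_Y(y), defined only where f_Y(y) > 0."""
    fy = marginal_y(y)
    if fy <= 0.0:
        raise ValueError("f_Y(y) must be positive")
    return joint_pdf(x, y) / fy


y = 0.4
n = 1000
h = 1.0 / n
# For fixed y, the conditional PDF is itself a density in x: it integrates to 1.
area = sum(conditional_pdf((i + 0.5) * h, y) for i in range(n)) * h

# Small-interval check, using the exact probabilities for this density:
# P(box) = delta^2 (x + y + delta),  P(y <= Y <= y + delta) = delta (y + 1/2 + delta/2).
x, delta = 0.2, 1e-3
p_box = delta ** 2 * (x + y + delta)
p_strip = delta * (y + 0.5 + delta / 2)
approx = (p_box / p_strip) / delta

print(area)                           # close to 1
print(approx, conditional_pdf(x, y))  # both close to 0.6 / 0.9
```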

(Figure: $$f_{X,Y}(x, y)$$ and $$f_Y(y)$$ plotted over the $$x$$ and $$y$$ axes.)

### Buffon's needle problem

A needle of length $$l$$ is dropped on a floor of parallel planks of width $$d$$ (the derivation below needs $$l \leq d$$).

$$X$$: distance from the midpoint of the needle to the nearest plank edge, $$0 \leq X \leq \frac{d}{2}$$
$$\Theta$$: angle between the needle and the plank edges, $$0 \leq \Theta \leq \frac{\pi}{2}$$

With $$X$$ and $$\Theta$$ uniform and independent:

f_{X,\Theta}(x,\theta)= \frac{2}{d} \cdot \frac{2}{\pi}

### The needle will intersect if

X \leq \frac{l}{2}\sin \Theta

P(X \leq \frac{l}{2}\sin \Theta) = \int_0^{\frac{\pi}{2}}\int_0^{\frac{l}{2}\sin\theta} f_{X,\Theta}(x,\theta) dx d\theta
= \int_0^{\frac{\pi}{2}}\int_0^{\frac{l}{2}\sin\theta} \frac{2}{d} \cdot \frac{2}{\pi} dx d\theta
= \frac{4}{d\pi} \int_0^{\frac{\pi}{2}}\int_0^{\frac{l}{2}\sin\theta} dx d\theta
= \frac{4}{d\pi} \int_0^{\frac{\pi}{2}}\frac{l}{2}\sin\theta d\theta
=\frac{2l}{d\pi}
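
The result 2l/(d\pi) can be sanity-checked by simulation. A minimal Monte Carlo sketch, assuming X is uniform on [0, d/2] (taken here as the distance from the needle's midpoint to the nearest plank edge), \Theta is uniform on [0, \pi/2], the two are independent, and l \leq d. The function name, the seed, and the values of l and d are illustrative choices.

```python
import math
import random


def buffon_estimate(l, d, trials, seed=0):
    """Monte Carlo estimate of the crossing probability for a needle of length l
    dropped on planks of width d (assumes l <= d)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.uniform(0.0, d / 2.0)            # midpoint distance to the nearest edge
        theta = rng.uniform(0.0, math.pi / 2.0)  # angle between needle and edges
        if x <= (l / 2.0) * math.sin(theta):     # intersection condition
            hits += 1
    return hits / trials


l, d = 1.0, 2.0                  # illustrative values (assumption)
estimate = buffon_estimate(l, d, trials=200_000)
exact = 2.0 * l / (d * math.pi)  # closed form 2l / (d * pi)
print(estimate, exact)
```

Rearranging the closed form gives \pi \approx 2l / (d \cdot \text{estimate}), which is why this experiment is often used as a way to estimate \pi.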

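### $$E[Y|X=x]$$

Setup, as implied by the densities used below: $$X$$ is uniform on $$[0, l]$$ and, given $$X = x$$, $$Y$$ is uniform on $$[0, x]$$.

f_X(x) = \frac{1}{l}, \quad f_{Y|X}(y|x) = \frac{1}{x}, \quad f_{X,Y}(x,y) = \frac{1}{lx} \quad (0 \leq y \leq x \leq l)

E[Y|X=x] = \int y f_{Y|X}(y|x) dy
=\int_{0}^{x} y f_{Y|X}(y|x) dy
=\int_{0}^{x} y \frac{1}{x} dy
=\frac{x}{2}

### $$f_Y(y)$$

f_Y(y) = \int f_{X,Y}(x,y) dx
= \int_y^l f_{X,Y}(x,y) dx
= \int_y^l \frac{1}{lx} dx
=\frac{1}{l} \log \frac{l}{y} \quad (0 < y \leq l)

### $$E[Y]$$

E[Y] = \int yf_{Y}(y) dy
= \int_0^l yf_{Y}(y) dy
= \int_0^l y\frac{1}{l} \log \frac{l}{y} dy
= \frac{l}{4}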
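
A quick check of E[Y] = l/4, as a sketch. It assumes the setup implied by the densities above: X is uniform on [0, l] and, given X = x, Y is uniform on [0, x]. The value l = 2, the seed, and the variable names are illustrative. The expectation is estimated by simulation and by numerically integrating y f_Y(y).

```python
import math
import random

l = 2.0  # illustrative length (assumption)
rng = random.Random(0)
trials = 200_000

# Monte Carlo: draw X ~ Uniform(0, l), then Y ~ Uniform(0, X).
total = 0.0
for _ in range(trials):
    x = rng.uniform(0.0, l)
    total += rng.uniform(0.0, x)
mc_mean = total / trials


def f_Y(y):
    """Marginal density of Y: (1/l) log(l/y) for 0 < y <= l."""
    return math.log(l / y) / l


# Midpoint sum of y * f_Y(y) over (0, l]; midpoints avoid y = 0.
n = 100_000
h = l / n
numeric_mean = sum((i + 0.5) * h * f_Y((i + 0.5) * h) for i in range(n)) * h

print(mc_mean, numeric_mean, l / 4)  # all three should be close to 0.5
```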

### Summary

Discrete random variables:

p_X(x)
p_{X,Y}(x,y)
p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_{Y}(y)}
p_{X}(x) = \sum_y p_{X,Y}(x,y)

Continuous random variables:

f_X(x)
f_{X,Y}(x,y)
f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_{Y}(y)}
f_{X}(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y) dy

Common to both:

F_X(x) = P(X\leq x)
E[X], Var(X)

### Case 1: two discrete random variables

p_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)p_X(x)}{p_{Y}(y)}
(Diagram: $$X$$ and $$Y$$, labelled with $$p_X(x)$$ and $$p_{Y|X}(y|x)$$.)
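
A minimal worked instance of the discrete case. The numbers are hypothetical and chosen only for illustration: X is a transmitted bit and Y is the received bit, flipped with probability 1/10. The denominator is computed as p_Y(y) = \sum_x p_{Y|X}(y|x) p_X(x).

```python
from fractions import Fraction as F

# Hypothetical numbers (assumptions for illustration): X is a transmitted bit,
# Y is the received bit, flipped with probability 1/10.
p_X = {0: F(3, 4), 1: F(1, 4)}                 # prior p_X(x)
p_Y_given_X = {0: {0: F(9, 10), 1: F(1, 10)},  # p_{Y|X}(y|x), indexed [x][y]
               1: {0: F(1, 10), 1: F(9, 10)}}


def posterior(y):
    """p_{X|Y}(x|y) for every x, with p_Y(y) = sum_x p_{Y|X}(y|x) p_X(x)."""
    p_y = sum(p_Y_given_X[x][y] * p_X[x] for x in p_X)
    return {x: p_Y_given_X[x][y] * p_X[x] / p_y for x in p_X}


post = posterior(1)
print(post)  # posterior over x after observing Y = 1
```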

### Case 2: two continuous random variables

f_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)f_X(x)}{f_{Y}(y)}
(Diagram: $$X$$ and $$Y$$, labelled with $$f_X(x)$$ and $$f_{Y|X}(y|x)$$.)
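
A sketch of the continuous case under an assumed Gaussian model that is not from the lecture: the prior is X ~ N(0, 1) and Y | X = x ~ N(x, 1). The denominator f_Y(y) is obtained by numerically integrating f_{Y|X}(y|x) f_X(x) over x. For this model the posterior is known to be N(y/2, 1/2), which serves as a check. The function names are illustrative.

```python
import math


def normal_pdf(v, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    z = (v - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


# Hypothetical model (an assumption for illustration):
# prior X ~ N(0, 1), and Y = X + noise with noise ~ N(0, 1), so Y | X = x ~ N(x, 1).
def f_X(x):
    return normal_pdf(x, 0.0, 1.0)


def f_Y_given_X(y, x):
    return normal_pdf(y, x, 1.0)


def f_Y(y, lo=-10.0, hi=10.0, n=4000):
    """f_Y(y) = integral of f_{Y|X}(y|x) f_X(x) dx, approximated by a midpoint sum."""
    h = (hi - lo) / n
    return sum(f_Y_given_X(y, lo + (i + 0.5) * h) * f_X(lo + (i + 0.5) * h)
               for i in range(n)) * h


def posterior(x, y):
    """f_{X|Y}(x|y) via Bayes' rule."""
    return f_Y_given_X(y, x) * f_X(x) / f_Y(y)


y_obs, x = 1.2, 0.4
print(posterior(x, y_obs))
# Closed-form posterior for this Gaussian model: N(y/2, 1/2).
print(normal_pdf(x, y_obs / 2.0, math.sqrt(0.5)))
```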

### Case 3: X: discrete, Y: continuous

p_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)p_X(x)}{f_{Y}(y)}
(Diagram: $$X$$ and $$Y$$, labelled with $$p_X(x)$$ and $$f_{Y|X}(y|x)$$.)

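### Y: noisy signal received

(Figure: the conditional densities $$f_{Y|X=0}$$ and $$f_{Y|X=2}$$ of the received signal.)

P(X=x, y\leq Y \leq y+\delta)
=P(X=x)P(y\leq Y \leq y+\delta|X=x)
\approx p_X(x)f_{Y|X}(y|x)\cdot\delta

P(X=x, y\leq Y \leq y+\delta)
=P(y\leq Y \leq y+\delta)P(X=x| y\leq Y \leq y+\delta)
\approx f_{Y}(y)\cdot\delta~p_{X|Y}(x|y)

Equating the two expressions and cancelling $$\delta$$ gives the formula for $$p_{X|Y}(x|y)$$ above.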
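
A sketch of this mixed case for a noisy signal. The specifics are assumptions made for illustration: the sent signal X is 0 or 2 with equal prior probability, and the received signal is Y = X plus N(0, 1) noise. The lecture gives neither the prior nor the noise distribution. The denominator is f_Y(y) = \sum_x f_{Y|X}(y|x) p_X(x).

```python
import math


def normal_pdf(v, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    z = (v - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


# Assumptions for illustration: the sent signal X is 0 or 2 with equal prior
# probability, and the received signal is Y = X + N(0, 1) noise.
p_X = {0: 0.5, 2: 0.5}
NOISE_SIGMA = 1.0


def f_Y_given_X(y, x):
    return normal_pdf(y, x, NOISE_SIGMA)


def posterior(y):
    """p_{X|Y}(x|y) = f_{Y|X}(y|x) p_X(x) / f_Y(y),
    with f_Y(y) = sum_x f_{Y|X}(y|x) p_X(x)."""
    f_y = sum(f_Y_given_X(y, x) * p_X[x] for x in p_X)
    return {x: f_Y_given_X(y, x) * p_X[x] / f_y for x in p_X}


for y in (-0.5, 1.0, 2.5):
    print(y, posterior(y))
```

With equal priors and symmetric noise, an observation at the midpoint y = 1 leaves both signals equally likely, while observations closer to 2 shift the posterior toward X = 2.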

### Case 4: X: continuous, Y: discrete

f_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)f_X(x)}{p_{Y}(y)}
(Diagram: $$X$$ and $$Y$$, labelled with $$f_X(x)$$ and $$p_{Y|X}(y|x)$$.)
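
A sketch of this last case under an assumed model that is not from the lecture: X is the unknown bias of a coin with prior X ~ Uniform(0, 1), and Y = 1 (heads) with probability X. Here p_Y(y) = \int p_{Y|X}(y|x) f_X(x) dx, and for this model the posterior after one head is f_{X|Y}(x|1) = 2x. The function names are illustrative.

```python
# Assumptions for illustration: X is the unknown bias of a coin with prior
# X ~ Uniform(0, 1); Y = 1 (heads) with probability X and Y = 0 otherwise.
def f_X(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def p_Y_given_X(y, x):
    return x if y == 1 else 1.0 - x


def p_Y(y, n=1000):
    """p_Y(y) = integral of p_{Y|X}(y|x) f_X(x) dx, approximated by a midpoint sum."""
    h = 1.0 / n
    return sum(p_Y_given_X(y, (i + 0.5) * h) * f_X((i + 0.5) * h) for i in range(n)) * h


def posterior(x, y):
    """f_{X|Y}(x|y) = p_{Y|X}(y|x) f_X(x) / p_Y(y)."""
    return p_Y_given_X(y, x) * f_X(x) / p_Y(y)


print(p_Y(1))             # close to 0.5
print(posterior(0.8, 1))  # close to 2 * 0.8 = 1.6 for this model
```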