Recap: List of Topics

Descriptive Statistics

Probability Theory

Inferential Statistics

Different types of data

Different types of plots

Measures of centrality and spread

Counting, Sample spaces, events

Conditional Prob. (3 laws)

RVs, Expectation and Distributions

Sampling strategies

Distributions of Sampling Statistics

Hypothesis testing (z-test, t-test)

ANOVA, Chi-square test

Interval and Point Estimators

Learning Objectives

What are random variables?

What is a probability mass function?

What are Expectation and Variance (and some of their properties)?

What is a probability density function?

What is a normal distribution?

What are some standard probability mass functions?

Random Variable

Recap

Experiments, sample spaces, events

Axioms and Laws of Probability

This chapter: Focus on numerical quantities associated with the outcomes of experiments

Mapping outcomes to \mathbb{R}

\Omega:
(1,1)
(1,2)~(2,1)
(1,3)~(2,2)~(3,1)
(1,4)~(2,3)~(3,2)~(4,1)
(1,5)~(2,4)~(3,3)~(4,2)~(5,1)
(1,6)~(2,5)~(3,4)~(4,3)~(5,2)~(6,1)
(2,6)~(3,5)~(4,4)~(5,3)~(6,2)
(3,6)~(4,5)~(5,4)~(6,3)
(4,6)~(5,5)~(6,4)
(5,6)~(6,5)
(6,6)

Each row maps to a sum in \mathbb{R}: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

In board games we care about the sum and not the outcomes which lead to the sum

Q of Interest: What is the probability that the sum will be 10?


Mapping outcomes to \mathbb{R}

\Omega: students

Possible mappings to \mathbb{R}: CGPA, Height, Weight, Vit. D3, Age

(e.g., CGPA values such as 4.25, 4.5, 5)

Q of Interest: What is the probability that a student's CGPA is 4.5?

Qs of Interest:

What is the probability that an employee has 2 children?

What is the probability that an employee's monthly salary is greater than 50K?

Mapping outcomes to \mathbb{R}

Experiment: Randomly select an employee

\Omega: All employees of the organisation

Possible mappings to \mathbb{R}: Number of years of experience, number of projects, salary, income tax, num. children

Qs of Interest:

What is the probability that the size of the farm is less than 2 acres?

What is the probability that the total yield is greater than 1 ton?

Mapping outcomes to \mathbb{R}

Experiment: Randomly select a farm

\Omega: All farms in the state

Possible mappings to \mathbb{R}: size of the farm, total yield, soil moisture, water content

Mapping outcomes to \mathbb{R}

X: \Omega \rightarrow \mathbb{R}

A random variable is a function (the random variable X) from a set of possible outcomes (the domain \Omega) to the set of real numbers (the range, which could be a subset of \mathbb{R})

Mapping outcomes to \mathbb{R}

Multiple functions (random variables) are possible for the given domain (sample space)

\Omega: students (domain)

X_1: height \rightarrow \mathbb{R}
X_2: weight \rightarrow \mathbb{R}
X_3: CGPA \rightarrow \mathbb{R}

Notation

X(n_1, n_2): sum of the numbers on two dice (n_1 + n_2)

Y(student): height of the student

Unlike functions, we don't write brackets and arguments; we simply write X and Y

Questions

What are the values that the random variable can take?

What are the probabilities of the values that the random variable can take?

X

(discrete or continuous)

(we will return to this later)

Types of random variables

Discrete

(finite or countably infinite)

the sum of the numbers on two dice

the outcome of a single die

the number of tosses after which a heads appears (countably infinite)

the number of children that an employee has

the number of cars in an image

Types of random variables

Continuous

the amount of rainfall in Chennai

the temperature of a surface

the density of a liquid

the height of a student

the haemoglobin level of a patient

Probability Mass Function

What are the probabilities of the values that the random variable can take?

Assigning probabilities

Q of Interest: What is the probability that the value of the random variable will be x?

P(X = x)~?~~~\forall x \in \{2,3,4,5,6,7,8,9,10,11,12\}

X: \Omega \rightarrow \mathbb{R}

P(X=x): \{2,3,4,5,6,7,8,9,10,11,12\} \rightarrow [0,1]

An assignment of probabilities to all possible values that a discrete RV can take is called the distribution of the discrete random variable

Assigning probabilities

P(X = x)~~~\forall x \in \{1, 2,3,4,5,6\}

X: \Omega \rightarrow \mathbb{R},~~~P(X=x) \in [0,1]

For discrete RVs we can think of the distribution as a table

x:~~~~~~~~~1~~~~~2~~~~~3~~~~~4~~~~~5~~~~~6
p_X(x):~~~\frac{1}{6}~~~\frac{1}{6}~~~\frac{1}{6}~~~\frac{1}{6}~~~\frac{1}{6}~~~\frac{1}{6}

Assigning probabilities

X: \Omega \rightarrow \mathbb{R},~~~P(X=x): \{2,3,4,5,6,7,8,9,10,11,12\} \rightarrow [0,1]

x~~~~~Event: X = x~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~P(X = x)
2~~~~~(1,1)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\frac{1}{36}
3~~~~~(1,2)~(2,1)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\frac{2}{36}
4~~~~~(1,3)~(2,2)~(3,1)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\frac{3}{36}
5~~~~~(1,4)~(2,3)~(3,2)~(4,1)~~~~~~~~~~~~~~~~~~~~~~~~~~\frac{4}{36}
6~~~~~(1,5)~(2,4)~(3,3)~(4,2)~(5,1)~~~~~~~~~~~~~~~~~~~~\frac{5}{36}
7~~~~~(1,6)~(2,5)~(3,4)~(4,3)~(5,2)~(6,1)~~~~~~~~~~~~~~\frac{6}{36}
8~~~~~(2,6)~(3,5)~(4,4)~(5,3)~(6,2)~~~~~~~~~~~~~~~~~~~~\frac{5}{36}
9~~~~~(3,6)~(4,5)~(5,4)~(6,3)~~~~~~~~~~~~~~~~~~~~~~~~~~\frac{4}{36}
10~~~~(4,6)~(5,5)~(6,4)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\frac{3}{36}
11~~~~(5,6)~(6,5)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\frac{2}{36}
12~~~~(6,6)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\frac{1}{36}

Assigning probabilities

X: \Omega \rightarrow \mathbb{R},~~~P(X=x): \{2,3,4,5,6,7,8,9,10,11,12\} \rightarrow [0,1]

x:~~~~~~~~~2~~~3~~~4~~~5~~~6~~~7~~~8~~~9~~~10~~11~~12
P(X = x):~~\frac{1}{36}~~\frac{2}{36}~~\frac{3}{36}~~\frac{4}{36}~~\frac{5}{36}~~\frac{6}{36}~~\frac{5}{36}~~\frac{4}{36}~~\frac{3}{36}~~\frac{2}{36}~~\frac{1}{36}

Assigning probabilities

X: \Omega \rightarrow \mathbb{R}

p_X(x) = P(X=x) = P(\{\omega \in \Omega: X(\omega) = x\}) \in [0,1],~~~x \in \{2,3,4,5,6,7,8,9,10,11,12\}

Key idea:

Think of the event corresponding to 

X = x

Once we know this event (subset of sample space) we know how to compute P(X=x) 

(Probability distribution of the random variable X)

P(X=x)

Probability Mass Function

p_{X}(x)

Probability Mass Function (PMF)

Probability Distribution

Distribution

Properties of a PMF

p_X(x) \geq 0
p_X(x) = P(X = x) = P(\{\omega \in \Omega: X(\omega) = x\} ) \geq 0
\sum_{x \in \mathbb{R}_X} p_X(x) = 1
\mathbb{R}_X \subset \mathbb{R}

(the set of values that the RV can take)

(the support of the RV)

p_X(x) \geq 0
\sum_{x \in \mathbb{R}_X} p_X(x) = 1

Properties of a PMF

\sum_{x \in \mathbb{R}_X} p_X(x) = 1

Proof:

\sum_{x\in \mathbb{R}_X} p_X(x) = \sum_{x\in \mathbb{R}_X} P(X =x)

RHS is the sum of the probabilities of disjoint events which partition \Omega

\therefore RHS sums to 1

Discrete distributions

Probability Mass Functions for discrete random variables

Recap

Random variables

Distribution of a random variable

An assignment of probabilities to all possible values that a discrete RV can take

(can be tedious even in simple cases)

Can PMF be specified compactly?

p_X(x) = \begin{cases} \frac{1}{36} & if~x = 2 \\ \frac{2}{36} & if~x = 3 \\ \frac{3}{36} & if~x = 4 \\ \frac{4}{36} & if~x = 5 \\ \frac{5}{36} & if~x = 6 \\ \frac{6}{36} & if~x = 7 \\ \frac{5}{36} & if~x = 8 \\ \frac{4}{36} & if~x = 9 \\ \frac{3}{36} & if~x = 10 \\ \frac{2}{36} & if~x = 11 \\ \frac{1}{36} & if~x = 12 \\ \end{cases}

can be tedious to enumerate when the support of X is large

\mathbb{R}_X = \{1, 2, 3, 4, 5, 6, \dots, \infty\}

X: random variable indicating the number of tosses after which you observe the first heads

p_X(x) = \begin{cases} .. & if~x = 1 \\ .. & if~x = 2 \\ .. & if~x = 3 \\ .. & if~x = 4 \\ .. & if~x = 5 \\ .. & if~x = 6 \\ .. & .. \\ .. & .. \\ .. & if~x = \infty \\ \end{cases}
p_X(x) = (1-p)^{(x-1)}\cdot p

compact

easy to compute

no enumeration needed

but ... ....

Can PMF be specified compactly?

p: probability~of~heads
\mathbb{R}_X = \{1, 2, 3, 4, 5, 6, \dots, \infty\}

X: random variable indicating the number of tosses after which you observe the first heads

p_X(x) = (1-p)^{(x-1)}\cdot p

How did we arrive at the above formula?

Is it a valid PMF (satisfying the properties of a PMF)?

What is the intuition behind it?

(we will return to these questions later)

Can PMF be specified compactly?

\mathbb{R}_X = \{1, 2, 3, 4, 5, 6, \dots, \infty\}

X: random variable indicating the number of tosses after which you observe the first heads

p_X(x) = (1-p)^{(x-1)}\cdot p

For now, the key point is

it is desirable to have the entire distribution be specified by one or few parameters

Can PMF be specified compactly?

Why is this important?

the entire distribution can be specified by some parameters

P(label = cat | image) ?

cat? dog? owl? lion?

 
p_X(x) = f(x)

A very complex function whose parameters are learnt from data!

Bernoulli Distribution

Experiments with only two outcomes

Outcome: {positive, negative}

Outcome: {pass, fail}

Outcome: {hit, flop}

Outcome: {spam, not spam}

Outcome: {approved, denied}

X: \Omega \rightarrow \{0, 1\}

Bernoulli Random Variable

\Omega: \{failure, success\}

Experiments of this kind are called Bernoulli trials

Bernoulli Distribution

X: \Omega \rightarrow \{0, 1\}

Bernoulli Random Variable

\Omega: \{failure, success\}

A: event that the outcome is success

Let~P(A) = P(success) = p

p_X(1) = p

p_X(0) = 1 - p

p_X(x) = p^x(1 - p)^{(1- x)}
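A quick sanity check of the compact form p_X(x) = p^x (1-p)^{1-x}; a minimal sketch where the value p = 0.3 is just an assumed example:

```python
def bernoulli_pmf(x, p):
    # p^x * (1-p)^(1-x) evaluates to p for x = 1 and to 1 - p for x = 0
    return p**x * (1 - p)**(1 - x)

p = 0.3  # assumed probability of success
probs = [bernoulli_pmf(0, p), bernoulli_pmf(1, p)]
```

The two values are 1 - p and p, and they sum to 1, as a valid PMF requires.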

Bernoulli Distribution

X: \Omega \rightarrow \{0, 1\}

Bernoulli Random Variable

\Omega: \{failure, success\}

p_X(x) \geq 0

\sum_{x \in\{0, 1\}}p_X(x) = 1~?

\sum_{x \in\{0, 1\}}p_X(x) = p_X(0) + p_X(1) = (1-p) + p = 1

Is Bernoulli distribution a valid distribution?

Binomial Distribution

Repeat a Bernoulli trial n times

independent

identical

(success/failure in one trial does not affect the outcome of other trials)

(probability of success 'p' in each trial is the same)

... n times

What is the probability of k successes in n trials?

(k \in [0, n])

Binomial Distribution (Examples)

Each ball bearing produced in a factory is independently non-defective with probability p

... n times

If you select n ball bearings what is the probability that k of them will be defective?

Binomial Distribution (Examples)

The probability that a customer purchases something from your website is p

... n times

What is the probability that k out of the n customers will purchase something?

Assumption 1 : customers are identical (economic strata, interests, needs, etc)

Assumption 2 : customers are independent (one's decision does not influence another)

Binomial Distribution (Examples)

Marketing agency: The probability that a customer opens your email is p

... n times

If you send n emails what is the probability that the customer will open at least one of them?

Binomial Distribution

... n times

p_X(x) = ?

X: random variable indicating the number of successes in n trials

x \in \{0, 1, 2, 3, \dots, n\}

Challenge:

n and k can be very large

difficult to enumerate all probabilities

Binomial Distribution

... n times

p_X(x) = ?

X: random variable indicating the number of successes in n trials

x \in \{0, 1, 2, 3, \dots, n\}

Desired:

Fully specify p_X(x) in terms of n and p

(we will see how to do this)

Binomial Distribution

... n times

S, F

How many different outcomes can we have if we repeat a Bernoulli trial n times?

(sequence of length n from a given set of 2 objects)

2^n~outcomes

Binomial Distribution

... n times

TTT\\ TTH\\ THT\\ THH\\ HTT\\ HTH\\ HHT\\ HHH

Example: n = 3, k = 1

\Omega
0\\ 1\\ 2\\ 3\\
X
A = \{HTT, THT, TTH\}
p_X(1) = P(A)
P(A) = P(\{HTT\}) \\+ P(\{THT\}) \\+P(\{TTH\})
P(\{HTT\}) = p(1-p)(1-p)
P(\{THT\}) = (1-p)p(1-p)
P(\{TTH\}) = (1-p)(1-p)p

Binomial Distribution

... n times

Example: n = 3, k = 1

A = \{HTT, THT, TTH\}
= 3 (1-p)^{(3-1)}p^1
p_X(1) = P(A) = 3 (1-p)^2p
= {3 \choose 1} (1-p)^{(3-1)}p^1

Binomial Distribution

... n times

Example: n = 3, k = 2

B = \{HTH, HHT, THH\}
= 3 (1-p)^{(3-2)}p^2
p_X(2) = P(B) = 3 (1-p)p^2
= {3 \choose 2} (1-p)^{(3-2)}p^2

Binomial Distribution

... n times

Observations

{n \choose k} favorable outcomes, i.e., {n \choose k} terms in the summation

each of the k successes occurs independently with probability p, so each term will have the factor p^k

each of the n-k failures occurs independently with probability 1 - p, so each term will have the factor (1-p)^{(n-k)}

p_X(k) = {n \choose k} p^k (1-p)^{(n-k)}

Binomial Distribution

... n times

{n \choose k} terms in the summation

each term will have the factor p^k

each term will have the factor (1-p)^{(n-k)}

p_X(k) = {n \choose k} p^k (1-p)^{(n-k)}

Parameters: p, n

the entire distribution is fully specified once the values of p and n are known
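Since n and p fully specify the distribution, every PMF value follows from the formula; a minimal sketch (the example values n = 10, p = 0.3 and the helper name are our own):

```python
from math import comb

def binom_pmf(k, n, p):
    # C(n, k) * p^k * (1-p)^(n-k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
```

The n + 1 values sum to 1 (by the binomial theorem), and the most likely count sits near n*p.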

Example 1: Social distancing

... n times

Suppose 10% of your colleagues from workplace are infected with COVID-19 but are asymptomatic (hence come to office as usual)

Suppose you come in close proximity of 50 of your colleagues. What is the probability of you getting infected?

(Assume you will get infected if you come in close proximity of a person)

Trial: Come in close proximity of a person

p = 0.1  - probability of success/infection in a single trial

n = 50 trials

Example 1: Social distancing

... n times

Suppose 10% of your colleagues from workplace are infected with COVID-19 but are asymptomatic (hence come to office as usual)

Suppose you come in close proximity of 50 of your colleagues. What is the probability of you getting infected?

n = 50, p = 0.1

P(getting~infected) = P(at~least~one~success)
= 1 - P(0~successes)
= 1 - p_X(0)
= 1 - {50 \choose 0}p^0(1-p)^{50}
= 1 - 1*1*0.9^{50} = 0.9948

Stay at home!!
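The computation above can be scripted; a small sketch (the helper name is ours):

```python
def p_at_least_one(n, p):
    # P(at least one success) = 1 - P(0 successes) = 1 - (1-p)^n
    return 1 - (1 - p)**n

risk_50 = p_at_least_one(50, 0.1)  # 50 colleagues, 10% infected
risk_10 = p_at_least_one(10, 0.1)  # 10 colleagues, 10% infected
```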

Example 1: Social distancing

... n times

Suppose 10% of your colleagues from workplace are infected with COVID-19 but are asymptomatic (hence come to office as usual)

What if you interact with only 10 colleagues instead of 50

n = 10, p = 0.1

P(getting~infected) = P(at~least~one~success)
= 1 - P(0~successes)
= 1 - p_X(0)
= 1 - {10 \choose 0}p^0(1-p)^{10}
= 1 - 1*1*0.9^{10} = 0.6513

Still stay at home!!

Example 1: Social distancing

... n times

Suppose 10% of your colleagues from workplace are infected with COVID-19 but are asymptomatic (hence come to office as usual)

What if only 2% of your colleagues are infected instead of 10% ?

n = 10, p = 0.02

P(getting~infected) = P(at~least~one~success)
= 1 - P(0~successes)
= 1 - p_X(0)
= 1 - {10 \choose 0}p^0(1-p)^{10}
= 1 - 1*1*0.98^{10} = 0.1829

Perhaps, still not worth taking a chance!!

Example 2: Linux users

... n times

10% of students in your class use linux. If you select 25 students at random

(a) What is the probability that exactly 3 of them are using linux?

(b) What is the probability that between 2 to 6 of them are using linux?

(c) How would the above probabilities change if instead of 10%, 90% were using linux?

n = 25, p =0.1, k = 3

n = 25, p =0.1,           k = {2,3,4,5,6}

n = 25, p =0.9,          p = 0.5

Example 2: Linux users

... n times

10% of students in your class use linux. If you select 25 students at random

n = 25, p =0.1, k = 3

n = 25, p =0.1, k = {2,3,4,5,6}

n = 25, p =0.9, p = 0.5

import seaborn as sb
import numpy as np
from scipy.stats import binom

n = 25
p = 0.1
x = np.arange(0, n + 1)  # support: 0, 1, ..., n successes

dist = binom(n, p)
ax = sb.barplot(x=x, y=dist.pmf(x))
p_X(k) = {n \choose k} p^k (1-p)^{(n-k)}

2042975 * 0.1^9 * 0.9^{16} = 0.000378

Example 2: Linux users

... n times

10% of students in your class use linux. If you select 25 students at random

n = 25, p =0.1, k = 3

n = 25, p =0.1, k = {2,3,4,5,6}

n = 25, p =0.9, p = 0.5

import seaborn as sb
import numpy as np
from scipy.stats import binom

n = 25
p = 0.1
x = np.arange(0, n + 1)  # support: 0, 1, ..., n successes

dist = binom(n, p)
ax = sb.barplot(x=x, y=dist.pmf(x))
p_X(k) = {n \choose k} p^k (1-p)^{(n-k)}

2042975 * 0.1^{16} * 0.9^{9} = 7.1*10^{-11}

Example 2: Linux users

... n times

10% of students in your class use linux. If you select 25 students at random

n = 25, p =0.1, k = 3

n = 25, p =0.1, k = {2,3,4,5,6}

n = 25, p =0.9, p = 0.5

import seaborn as sb
import numpy as np
from scipy.stats import binom

n = 25
p = 0.1
x = np.arange(0, n + 1)  # support: 0, 1, ..., n successes

dist = binom(n, p)
ax = sb.barplot(x=x, y=dist.pmf(x))
p_X(k) = {n \choose k} p^k (1-p)^{(n-k)}

Binomial Distribution

p_X(x) \geq 0
\sum_{i=0}^n p_X(i) = 1 ?

Is Binomial distribution a valid distribution?

... n times

p_X(k) = {n \choose k} p^k (1-p)^{(n-k)}

Binomial Distribution

\sum_{i=0}^n p_X(i) = 1 ?

... n times

\sum_{i=0}^n p_X(i)
= p_X(0) + p_X(1) + p_X(2) + \cdots + p_X(n)
= {n \choose 0} p^0(1-p)^{n} + {n \choose 1} p^1(1-p)^{(n - 1)} + {n \choose 2} p^2(1-p)^{(n - 2)} + \dots {n \choose n} p^n(1-p)^{0}
(a+b)^n = {n \choose 0} a^0b^{n} + {n \choose 1} a^1b^{(n - 1)} + {n \choose 2} a^2b^{(n - 2)} + \dots {n \choose n} a^n(b)^{0}
a = p, b = 1- p

Bernoulli (a special case of Binomial)

... n times

p_X(k) = {n \choose k} p^k (1-p)^{(n-k)}

Binomial

Bernoulli

n = 1, k \in \{0, 1\}
p_X(0) = {1 \choose 0} p^0 (1-p)^1 = 1 - p
p_X(1) = {1 \choose 1} p^1 (1-p)^0 = p

Geometric Distribution

\dots \infty~times

The number of tosses until we see the first heads

X:
\mathbb{R}_X = \{1,2,3,4,5, \dots\}
p_X(x) =?

Why would we be interested in such a distribution ?

Geometric Distribution

\dots \infty~times

Hawker selling belts outside a subway station

Why would we be interested in such a distribution ?

Salesman handing pamphlets to passersby

(chance that the first belt will be sold after k trials)

(chance that the k-th person will be the first person to actually read the pamphlet)

A digital marketing agency sending emails

(chance that the k-th person will be the first person to actually read the email)

Useful in any situation involving "waiting times"

independent trials

identical distribution

P(success) = p

Geometric Distribution

\dots \infty~times

Example: k = 5

P(success) = p

p_X(5) = P(F~F~F~F~S) = \underbrace{(1-p)(1-p)(1-p)(1-p)}_{(5-1)}\cdot\underbrace{p}_{1} = (1-p)^{(5-1)}p

p_X(k)=(1-p)^{(k-1)}p

Geometric Distribution

\dots \infty~times
p=0.2
P(success) = p
import seaborn as sb
import numpy as np
from scipy.stats import geom

x = np.arange(1, 26)  # support starts at 1 (trial of the first success)

p = 0.2
dist = geom(p)
ax = sb.barplot(x=x, y=dist.pmf(x))

Geometric Distribution

\dots \infty~times
p=0.9
P(success) = p
import seaborn as sb
import numpy as np
from scipy.stats import geom

x = np.arange(1, 26)  # support starts at 1 (trial of the first success)

p = 0.9
dist = geom(p)
ax = sb.barplot(x=x, y=dist.pmf(x))

Geometric Distribution

\dots \infty~times
P(success) = p
p=0.5
import seaborn as sb
import numpy as np
from scipy.stats import geom

x = np.arange(1, 26)  # support starts at 1 (trial of the first success)

p = 0.5
dist = geom(p)
ax = sb.barplot(x=x, y=dist.pmf(x))
p_X(k)=(1-p)^{(k-1)}p
p_X(k)=(0.5)^{(k-1)}0.5
p_X(k)=(0.5)^{k}

Geometric Distribution

p_X(x) \geq 0
\sum_{k=1}^\infty p_X(k) = 1 ?

Is Geometric distribution a valid distribution?

p_X(k) = (1 - p)^{(k-1)}p
P(success) = p
= (1 - p)^{0}p + (1 - p)^{1}p + (1 - p)^{2}p + \dots
= \sum_{k=0}^\infty (1 - p)^{k}p
= \frac{p}{1 - (1 - p)} = 1
a, ar, ar^2, ar^3, ar^4, \dots
a=p~and~r=1-p < 1
\dots \infty~times

Example: Donor List

A patient needs a certain blood group which only 9% of the population has.

P(success) = p

What is the probability that the 7th volunteer that the doctor contacts will be the first one to have a matching blood group?

What is the probability that at least one of the first 10 volunteers will have a matching blood type ?

\dots \infty~times

Example: Donor List

A patient needs a certain blood group which only 9% of the population has.

p = 0.09
p_X(7) = ?

P(X \leq 10) = 1 - P(X > 10) = 1 - (1-p)^{10}
\dots \infty~times
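Both donor-list answers follow directly from the geometric PMF; a sketch with p = 0.09 (variable names are ours):

```python
p = 0.09

# P(X = 7): six non-matching volunteers followed by a match
p_seventh = (1 - p)**6 * p

# P(X <= 10) = 1 - P(no match among the first 10) = 1 - (1-p)^10
p_within_ten = 1 - (1 - p)**10
```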

Uniform Distribution

Experiments with equally likely outcomes

X:
p_X(x) = \frac{1}{6}~~~\forall x \in \{1,2,3,4,5,6\}

outcome of a die

Uniform Distribution

Experiments with equally likely outcomes

X:
p_X(x) = \begin{cases} \frac{1}{b - a + 1}~~~a \leq x \leq b \\~\\ 0~~~~~~~~~otherwise \end{cases}

outcome of a bingo/housie draw

p_X(x) = \frac{1}{100}~~~1 \leq x \leq 100
\mathbb{R}_X = \{x: a \leq x \leq b\}

Uniform Distribution

Special cases

p_X(x) = \begin{cases} \frac{1}{b - a + 1} = \frac{1}{n}~~~1 \leq x \leq n \\~\\ 0~~~~~~~~~otherwise \end{cases}
a = 1 ~~~~ b = n
p_X(x) = \begin{cases} \frac{1}{b - a + 1} = 1~~~x = c \\~\\ 0~~~~~~~~~otherwise \end{cases}
a = c ~~~~ b = c

Uniform Distribution

p_X(x) \geq 0
\sum_{x=a}^{b} p_X(x) = 1 ?

Is Uniform distribution a valid distribution?

p_X(x) = \frac{1}{b - a + 1}
=\sum_{i=a}^b \frac{1}{b-a+1}
=(b-a+1) * \frac{1}{b-a+1} = 1
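The same check in code, using exact fractions for the bingo example (a = 1, b = 100):

```python
from fractions import Fraction

a, b = 1, 100
# Discrete uniform: each of the b - a + 1 values gets probability 1/(b - a + 1)
pmf = {x: Fraction(1, b - a + 1) for x in range(a, b + 1)}
total = sum(pmf.values())
```

Every value has probability 1/100, and the (b - a + 1) terms sum exactly to 1.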

Expectation

00, 0, 1, 2,3, ..., 36

Does gambling pay off?

Standard pay off - 35:1 if the ball lands on your number

If you play this game 1000 times how much do you expect to win on average?


Does gambling pay off?

X: profit

\Omega
X

-1

35

p_X(35) = \frac{1}{38} = 0.026

p_X(-1) = 1 - \frac{1}{38} = 0.974

Does gambling pay off?

P(win) = \frac{\#wins}{\#games}

p_X(35) = \frac{1}{38} = 0.026

p_X(-1) = 1 - \frac{1}{38} = 0.974

If you play this game 1000 times how much do you expect to win on average?

0.026 = \frac{\#wins}{1000} \implies \#wins = 26

Avg.~gain = \frac{1}{1000}(26*35 + 974*(-1)) = -0.064

(Stop gambling!!)

Expectation: the formula

E[X]
E[X] = \frac{1}{1000}(26*35 + 974*(-1))

E[X] = \frac{26}{1000} * 35 + \frac{974}{1000} * (-1)

E[X] = \sum_{x\in\{-1, 35\}}x*p_X(x)

E[X] = p_X(35)*35 + p_X(-1)*(-1)

E[X] = 0.026*35 + 0.974*(-1) = -0.064

Expectation: the formula

E[X]
E[X] = \sum_{x\in\mathbb{R}_X}x*p_X(x)

The expected value or expectation of a discrete random variable X whose possible values are

x_1, x_2, \dots, x_n

is denoted by              and computed as

E[X]
E[X] = \sum_{i=1}^n x_i P(X = x_i) = \sum_{i=1}^n x_i*p_X(x_i)
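Applying the formula to a fair die, with exact fractions to avoid rounding (a sketch):

```python
from fractions import Fraction

# E[X] = sum over x of x * p_X(x), for a fair die
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
expectation = sum(x * p for x, p in pmf.items())
```

This gives 21/6 = 7/2, i.e. 3.5.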

Expectation: Insurance

A person buys a car theft insurance policy of INR 200000 at an annual premium of INR 6000. There is a 2% chance that the car may get stolen.

What is the expected gain of the insurance company at the end of 1 year?

X: profit

X: \{6000, -194000\}
p_X(6000) = 0.98
p_X(-194000) = 0.02
E[X] = \sum_{x \in \mathbb{R}_x} x*p_X(x)
\therefore E[X] = 0.98*6000 + 0.02 * (-194000)
= 2000

Expectation: Insurance

A person buys a car theft insurance policy of INR 200000. Suppose there is a 10% chance that the car may get stolen

What should the premium be so that the expected gain is still INR 2000?

X: profit

X: \{x, -(200000 - x) \}
p_X(x) = 0.90
p_X(x - 200000) = 0.10
E[X] = \sum_{x \in \mathbb{R}_x} x*p_X(x)
\therefore E[X] = 0.9*x + 0.1 * (x - 200000)
\therefore 2000 = 0.9*x + 0.1 * (x - 200000)
\therefore x = 22000
X: \{x, x - 200000 \}
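Both insurance scenarios can be checked numerically; a sketch (the helper name is ours):

```python
def expected_gain(premium, p_theft, cover):
    # With probability 1 - p_theft the company keeps the premium;
    # with probability p_theft it keeps the premium but pays out the cover
    return (1 - p_theft) * premium + p_theft * (premium - cover)

gain_first = expected_gain(6000, 0.02, 200000)    # first scenario
gain_solved = expected_gain(22000, 0.10, 200000)  # solved premium
```

Both premiums yield an expected gain of INR 2000, confirming x = 22000.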

Function of a Random Variable

[Figure: \Omega (pairs of dice) mapped by X to sums 2, 3, \dots, 12, then by g to Y \in \{1, 2, 3\}]

Y = g(X)

E[Y] = ?

Y = \begin{cases} 1~~if~~x < 5 \\ 2~~if~~5 \leq x \leq 8 \\ 3~~if~~x > 8 \\ \end{cases}

Y = g(X)

Y = \begin{cases} 1~~if~~x < 5 \\ 2~~if~~5 \leq x \leq 8 \\ 3~~if~~x > 8 \\ \end{cases}
E[Y] = 1*p_Y(1) + 2*p_Y(2) + 3*p_Y(3)
p_Y(1) = \frac{1}{36} + \frac{2}{36} + \frac{3}{36} = \frac{6}{36}
p_Y(2) = \frac{4}{36} + \frac{5}{36} + \frac{6}{36} + \frac{5}{36} = \frac{20}{36}
p_Y(3) = \frac{4}{36} + \frac{3}{36} + \frac{2}{36} + \frac{1}{36} = \frac{10}{36}
\therefore E[Y] = 1*(\frac{1}{36} + \frac{2}{36} + \frac{3}{36})
+ 2*(\frac{4}{36} + \frac{5}{36} + \frac{6}{36} + \frac{5}{36})
+3*(\frac{4}{36} + \frac{3}{36} + \frac{2}{36} + \frac{1}{36}) = \frac{76}{36}

Y = g(X)

y:~~~~~~~~1~~~~~~2~~~~~~3
p_Y(y):~~\frac{6}{36}~~\frac{20}{36}~~\frac{10}{36}

Y = g(X)

\therefore E[Y] = 1*(\frac{1}{36} + \frac{2}{36} + \frac{3}{36})
+ 2*(\frac{4}{36} + \frac{5}{36} + \frac{6}{36} + \frac{5}{36})
+3*(\frac{4}{36} + \frac{3}{36} + \frac{2}{36} + \frac{1}{36}) = \frac{76}{36}
\therefore E[Y] = 1*p_X(2) + 1 * p_X(3) + 1 * p_X(4)
+ 2*p_X(5) + 2 * p_X(6) + 2 * p_X(7) + 2 * p_X(8)
+ 3*p_X(9) + 3 * p_X(10) + 3 * p_X(11) + 3 * p_X(12)

Y = g(X)

\therefore E[Y] = 1*(\frac{1}{36} + \frac{2}{36} + \frac{3}{36})
+ 2*(\frac{4}{36} + \frac{5}{36} + \frac{6}{36} + \frac{5}{36})
+3*(\frac{4}{36} + \frac{3}{36} + \frac{2}{36} + \frac{1}{36}) = \frac{76}{36}
\therefore E[Y] = g(2)*p_X(2) + g(3) * p_X(3) + g(4) * p_X(4)
+ g(5)*p_X(5) + g(6) * p_X(6) + g(7) * p_X(7) + g(8) * p_X(8)
+ g(9)*p_X(9) + g(10) * p_X(10) + g(11) * p_X(11) + g(12) * p_X(12)
Y = \begin{cases} 1~~if~~x < 5 \\ 2~~if~~5 \leq x \leq 8 \\ 3~~if~~x > 8 \\ \end{cases}
\therefore E[Y] = \sum_x g(x)*p_X(x)
E[Y] = \sum_x g(x)*p_X(x)
E[Y] = \sum_y y*p_Y(y)
\equiv
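Both routes to E[Y] can be verified exactly; a sketch where the closed form for p_X encodes the two-dice table (variable names are ours):

```python
from fractions import Fraction

# PMF of the sum of two dice: p_X(s) = (6 - |s - 7|) / 36 for s = 2..12
pmf_x = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}

def g(x):
    # the piecewise mapping from the slide
    return 1 if x < 5 else (2 if x <= 8 else 3)

# E[Y] = sum over x of g(x) * p_X(x)
ey = sum(g(x) * p for x, p in pmf_x.items())
```

This reproduces E[Y] = 76/36 without ever constructing p_Y explicitly.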

Some properties of expectation

Linearity of expectation

Y = aX + b
E[Y] = \sum_{x \in \mathbb{R}_X} g(x)p_X(x)
= \sum_{x \in \mathbb{R}_X} a*x*p_X(x) +\sum_{x \in \mathbb{R}_X} b*p_X(x)
= a*E[X] + b*1~~(\because \sum_{x \in \mathbb{R}_X} p_X(x) = 1)
= a*\sum_{x \in \mathbb{R}_X} x*p_X(x) +b*\sum_{x \in \mathbb{R}_X} p_X(x)
= \sum_{x \in \mathbb{R}_X} (ax + b)p_X(x)
= aE[X] + b

Expectation of sum of RVs

Given a set of random variables

X_1, X_2, \dots, X_n
E[\sum_{i=1}^{n} X_i] = \sum_{i=1}^n E[X_i]

Expectation as mean of population

n students

\Omega

W: weights

30

40

50

60

E[W] = \sum_{i=1}^np_W(w_i)*w_i
p_W(w_i) = \frac{1}{n}
= \frac{1}{n}\sum_{i=1}^nw_i

(centre of gravity)

Expectation: Donor List

A patient needs a certain blood group which only 9% of the population has.

p = 0.09
E[X] = ?

= 1*0.09 + 2 * 0.91*0.09 + 3 * 0.91^2 * 0.09 + 4 * 0.91^3 * 0.09 + \dots

= 0.09(1 + 2 * 0.91 + 3*0.91^2 + 4*0.91^3 + \dots)

= 0.09(\frac{a}{1-r} + \frac{dr}{(1-r)^2})~~~(a = 1, d = 1, r = 0.91)

= \frac{1}{0.09} = \frac{1}{p} = 11.11
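The infinite series converges quickly, so a truncated numeric sum already matches 1/p; a sketch:

```python
p = 0.09

# E[X] = sum over k of k * (1-p)^(k-1) * p, truncated at a large k
expectation = sum(k * (1 - p)**(k - 1) * p for k in range(1, 3000))
```

The truncated sum agrees with 1/p = 11.11 to many decimal places, since the tail terms are negligible.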

Variance of a Random Variable

Variance of a RV

Expectation summarises a random variable  

(but does not capture the spread of the RV)

E[X] = 0,~~E[Y] = 0,~~E[Z] = 0

[Figure: X takes values -1, +1 with probability \frac{1}{2} each; Y takes values -100, -50, +50, +100 with probability \frac{1}{4} each; Z is a third RV; all three are centred at 0]

Recap

Variance

\sigma^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \mu) ^ 2

Variance

Var(X) = E[(X - E(X))^2]
0
-1
+1
-100
-50
+100
+50
0
0
\sigma^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \mu) ^ 2

Variance

Var(X) = E[(X - E(X))^2]
\sigma^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \mu) ^ 2
E[X] = \mu

\therefore Var(X) = E[(X - \mu)^2] = E[X^2 -2\mu X + \mu^2]

\therefore Var(X) = E[X^2] -E[2\mu X] + E[\mu^2]

\therefore Var(X) = E[X^2] -2\mu E[X] + \mu^2

\therefore Var(X) = E[X^2] -2\mu\cdot\mu + \mu^2

\therefore Var(X) = E[X^2] - \mu^2 = E[X^2] - (E[X])^2

Variance

Var(X) = E[X^2] - (E[X])^2
g(X) = X^2
\therefore E[X^2] = E[g(X)] = \sum_{x} p_X(x)g(x)

= 2^2 * \frac{1}{36} + 3^2 * \frac{2}{36} + 4^2 * \frac{3}{36} + 5^2 * \frac{4}{36} + 6^2 * \frac{5}{36}

+ 7^2 * \frac{6}{36}+ 8^2 * \frac{5}{36}+ 9^2 * \frac{4}{36}+ 10^2 * \frac{3}{36}

+ 11^2 * \frac{2}{36}+ 12^2 * \frac{1}{36}

= 54.83

Variance

Var(X) = E[X^2] - (E[X])^2
= 54.83 - 7^2
= 5.83
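Both moments can be computed exactly; the decimal 5.83 above is 35/6. A sketch with exact fractions:

```python
from fractions import Fraction

# PMF of the sum of two dice: p_X(s) = (6 - |s - 7|) / 36 for s = 2..12
pmf = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}

ex = sum(x * p for x, p in pmf.items())        # E[X] = 7
ex2 = sum(x * x * p for x, p in pmf.items())   # E[X^2] = 329/6
var = ex2 - ex**2                              # Var(X) = E[X^2] - (E[X])^2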

Variance: Mutual Funds

MF1 expected (average) returns: 8%

(12%, 2%, 25%, -9%, 10%)

MF2 expected (average) returns: 8%

(7%, 6%, 9%, 12%, 6%)

E[X] = \frac{1}{5}*12+\frac{1}{5}*2+\frac{1}{5}*25+\frac{1}{5}*(-9)+\frac{1}{5}*10 = 8
E[Y] = \frac{1}{5}*7+\frac{1}{5}*6+\frac{1}{5}*9+\frac{1}{5}*12+\frac{1}{5}*6 = 8
E[X^2] = \frac{1}{5}*12^2+\frac{1}{5}*2^2+\frac{1}{5}*25^2+\frac{1}{5}*(-9)^2+\frac{1}{5}*10^2
= 190.8

Variance: Mutual Funds

MF1 expected (average) returns: 8%

(12%, 2%, 25%, -9%, 10%)

MF2 expected (average) returns: 8%

(7%, 6%, 9%, 12%, 6%)

Var(X) = E[X^2] - (E[X])^2
Var(X) = 190.8 - 8^2 = 126.8
Var(Y) = 69.2 - 8^2 = 5.2
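The two variances, treating each year's return as equally likely; a numpy sketch (array names are ours):

```python
import numpy as np

mf1 = np.array([12.0, 2.0, 25.0, -9.0, 10.0])  # MF1 yearly returns (%)
mf2 = np.array([7.0, 6.0, 9.0, 12.0, 6.0])     # MF2 yearly returns (%)

# Var(X) = E[X^2] - (E[X])^2, with each year weighted 1/5
var1 = np.mean(mf1**2) - np.mean(mf1)**2
var2 = np.mean(mf2**2) - np.mean(mf2)**2
```

Equal expected returns (8%) but very different spreads: MF2 is far less risky.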

Properties of Variance

Var(aX + b) = E[(aX + b - E[aX + b])^2]
= E[(aX + b - aE[X] - b)^2]
= E[(a(X - E[X]))^2]
= E[a^2(X - E[X])^2]
= a^2E[(X - E[X])^2]
=a^2Var(X)

Properties of Variance

Var(X + X) = Var (2X) = 2^2Var(X)

Variance of sum of random variables

\neq Var(X) + Var(X)

In general, variance of the sum of random variables is not equal to the sum of the variances of the random variables

except......

Properties of Variance

P(X = x | Y = y) = P(X = x)~~\forall x \in \mathbb{R}_X, y \in \mathbb{R}_Y

Independent random variables

(X and Y are independent)

X: number~on~first~die
Y: sum~of~two~dice
P(Y = 8 ) = \frac{5}{36}
P(Y = 8 | X = 1) = 0 \neq P(Y = 8)

(X and Y are not independent)

Properties of Variance

We say that n random variables

X_1, X_2, \dots, X_n

are independent if

P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n)
= P(X_1=x_1)P (X_2=x_2) \dots P(X_n = x_n)
\forall x_1\in\mathbb{R}_{X_1}, x_2\in\mathbb{R}_{X_2}, \dots, x_n\in\mathbb{R}_{X_n}

Given such n random variables

Var(\sum_{i=1}^{n} X_i) = \sum_{i=1}^n Var(X_i)

Summary

Probability Mass Function

\dots n~times
\dots \infty~times
p_X(x) = p^x(1 - p)^{(1- x)}
p_X(k) = {n \choose k} p^k (1-p)^{(n-k)}
p_X(k) = (1 - p)^{(k-1)}p
p_X(x) = \begin{cases} \frac{1}{b - a + 1}~~~a \leq x \leq b \\~\\ 0~~~~~~~~~otherwise \end{cases}
E[X] = \sum_{x\in\mathbb{R}_X}x*p_X(x)
E[Y] = \sum_x g(x)*p_X(x)
E[Y] = \sum_y y*p_Y(y)
\equiv
Var(X) = E[X^2] - (E[X])^2
Y = aX + b
E[Y]= aE[X] + b

Given n RVs

Var(\sum_{i=1}^{n} X_i) \\= \sum_{i=1}^n Var(X_i)
Var(Y) =a^2Var(X)

Continuous Random Variables

Recap

Probability Mass Function

[Figure: bar chart of the PMF p_X(x) of a fair die; each x \in \{1,\dots,6\} has height \frac{1}{6}]

Cumulative Distribution Function

[Figures: the PMF p_X(x) of a fair die, and its CDF F_X(x) = P(X \leq x), which steps up by \frac{1}{6} at each x]

Cumulative Distribution Function

For the sum of two dice:

x:~~~~~~~~~~~~~~~~~~~~2~~~3~~~4~~~5~~~6~~~7~~~8~~~9~~~10~~11~~12
p_X(x):~~~~~~~~~~~~~~\frac{1}{36}~~\frac{2}{36}~~\frac{3}{36}~~\frac{4}{36}~~\frac{5}{36}~~\frac{6}{36}~~\frac{5}{36}~~\frac{4}{36}~~\frac{3}{36}~~\frac{2}{36}~~\frac{1}{36}
F_X(x) = P(X \leq x):~~\frac{1}{36}~~\frac{3}{36}~~\frac{6}{36}~~\frac{10}{36}~~\frac{15}{36}~~\frac{21}{36}~~\frac{26}{36}~~\frac{30}{36}~~\frac{33}{36}~~\frac{35}{36}~~\frac{36}{36}

For a single die, F_X(x) takes the values \frac{1}{6}, \frac{2}{6}, \frac{3}{6}, \frac{4}{6}, \frac{5}{6}, 1 = \frac{6}{6} for x = 1, \dots, 6
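The CDF column is just a running sum of the PMF column; a numpy sketch:

```python
import numpy as np

# PMF of the sum of two dice, for x = 2..12
pmf = np.array([1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]) / 36

# F_X(x) = P(X <= x): cumulative sums of the PMF
cdf = np.cumsum(pmf)
```

For example, the entry for x = 7 is 21/36, and the final entry is 1.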

Recap

For the sum of two dice:

x:~~~~~~~~2~~~3~~~4~~~5~~~6~~~7~~~8~~~9~~~10~~11~~12
p_X(x):~~\frac{1}{36}~~\frac{2}{36}~~\frac{3}{36}~~\frac{4}{36}~~\frac{5}{36}~~\frac{6}{36}~~\frac{5}{36}~~\frac{4}{36}~~\frac{3}{36}~~\frac{2}{36}~~\frac{1}{36}

For a single die, p_X(x) = \frac{1}{6} for each x, with CDF \frac{1}{6}, \frac{2}{6}, \frac{3}{6}, \frac{4}{6}, \frac{5}{6}, 1 = \frac{6}{6}

Probability Mass Function

Total Probability = 1 (unit mass)

PMF: What share of this unit mass does each value take?

What if the random variable can take infinite values?

(continuous random variables)

Continuous Random Variables

Total Probability = 1 (unit mass)

What share of the unit probability mass does each value take?

Rainfall in Chennai: what is the probability that it is exactly 2 cm?

Infinitely many values are possible (2.01, 2.001, 1.99, 1.999, ...), so the share of any single value is 0


Continuous Random Variables

Your water intake (in litres)

Does not make sense to ask about the number of days on which you drank exactly 2 litres

Instead it makes sense to ask about the number of days on which 1.9 < x < 2.1

1.0     1.5    2.0    2.5    3.0   3.5

\Omega


Continuous Random Variables

Your water intake (in litres)

1.0     1.5    2.0    2.5    3.0   3.5   

probability density function

FDS_Random_Variables

By One Fourth Labs


PadhAI One: FDS Week 3 (MK)
