Why Learn About a Population?

Example: When Learning is Costly

Maybe you are a hedge fund that wants to identify who might be a good trader.

One Idea

Hire everyone who applies

Allow everyone you hired to trade for one year

Evaluate everyone's trading performance

Fire everyone who is not good at trading

Why is this not a good idea?

Letting people trade who are not good at trading is expensive

Firing people is expensive

The average severance pay at Meta for the 11,000 employees laid off in 2022 was $88,000 (Bloomberg)

Some people may be good traders who simply had a bad year: they happened to buy a lot of First Republic, and then the stock went to zero

What's a Better Idea?

Initially, hire a small random subset of people who apply and evaluate them in a simulated trading environment

Allow everyone you hired to trade

Learn the relationship between individual characteristics (like their score in the simulated trading environment) and real-world trading success

In the future, hire only those who have a high estimated likelihood of real-world trading success

10 Immediate Data Questions

(1) What is the unit of observation?

(2) Am I working with the population? If not, how was the sample formed?

(3) What types of variables do I have? Continuous / discrete / categorical?

(A) How are they distributed?

(B) Should I be concerned about outliers?

(4) Is the task prediction or causal?

Data Analytics

Programming

Statistics

Probability Spaces

Probability Space

A probability space is a mathematical structure for representing uncertainty

One can think of it as a list with three elements

\Big(\Omega, \mathcal{F}, \mathbb{P}\Big)

Starting Point

What are you uncertain about?

Example of rolling a die

Uncertain about which of the 6 values will come up

Example of hiring a trader

How good a trader are they?

Sample Space

\Omega

Set of possible outcomes that could happen

Example of rolling a die

\{1, 2, 3, 4, 5, 6\}

Example of hiring a trader

[0,1]

Random Variables

A function that maps each outcome in the sample space to a value

X : \Omega \to \mathcal{X}

Example of rolling a die

X(\omega) = \begin{cases} 1 & \omega \in \{2, 4, 6\} \\ -1 & \text{otherwise} \end{cases}

Example of hiring a trader

X(\omega) = 2000000 \omega, \quad \omega \in [0, 1]

In Python

import jax
import jax.numpy as jnp

Rolling a Die

# Define the random variable: win 1 on an even roll, lose 1 otherwise
def f(x):
  return 1 if x % 2 == 0 else -1

# Sample space
sample_space = jnp.array([1, 2, 3, 4, 5, 6])

# Key: seeds the random number generator so the draw is reproducible
key = jax.random.PRNGKey(0)

# Experiment/simulation: draw one outcome from the sample space
outcome = jax.random.choice(key, sample_space)

# Winnings/losses
print(f(outcome))

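Hiring a Trader

# Define the random variable: X(omega) = 2,000,000 * omega
def f(x):
  return 2_000_000 * x

# Experiment/simulation: draw one outcome uniformly from [0, 1]
outcome = jax.random.uniform(jax.random.PRNGKey(0), minval=0.0, maxval=1.0)

# Winnings/losses
print(f(outcome))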

The Sample Space is Not Countable

Composition

\textrm{percentage\_grade} : \textrm{Float} \to \textrm{Float}

\textrm{stringify} : \textrm{Float} \to \textrm{String}

\textrm{stringify} \ \circ \ \textrm{percentage\_grade} : \textrm{Float} \to \textrm{String}
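The composition above can be sketched in a few lines of Python. The slides name `percentage_grade` and `stringify` but do not define them, so both bodies below are hypothetical, as is the helper `compose`; only the types (Float to Float, Float to String) come from the slide.

```python
from typing import Callable


def percentage_grade(score: float) -> float:
    """Hypothetical definition: turn a fraction in [0, 1] into a percentage."""
    return 100.0 * score


def stringify(value: float) -> str:
    """Hypothetical definition: format a float as a display string."""
    return f"{value:.1f}%"


def compose(g: Callable, f: Callable) -> Callable:
    """Return the composition g ∘ f: apply f first, then g."""
    return lambda x: g(f(x))


# stringify ∘ percentage_grade : Float -> String
grade_label = compose(stringify, percentage_grade)

print(grade_label(0.875))  # prints 87.5%
```

The composite works only because the output type of `percentage_grade` matches the input type of `stringify`.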

Set: \{1, 2, 3, 4, 5, 6\}

Subset: \{2, 4, 6\}

Set: \{\textrm{AI Companies}\}

Subset: \{\textrm{OpenAI}\}

Set: \{\textrm{BU Athletic Teams}\}

Subset: \{\textrm{Soccer, Basketball, Tennis}\}

Set: \{\textrm{Programming Languages}\}

Subset: \{\textrm{Julia, Python, Haskell}\}

\textrm{Expectation} \ \circ \ \textrm{Filter}

\textrm{Standard Deviation} \ \circ \ \textrm{Filter}
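As a rough illustration of composing a filter with a summary statistic, the sketch below restricts the die's outcomes to the even faces and then averages them. The names `keep_even` and `compose` are hypothetical, and the example assumes every remaining outcome is equally likely, so the expectation is a plain average.

```python
from statistics import mean, pstdev


def compose(g, f):
    """Return the composition g ∘ f: apply f first, then g."""
    return lambda x: g(f(x))


def keep_even(outcomes):
    """Filter: keep only the even outcomes (a subset of the sample space)."""
    return [w for w in outcomes if w % 2 == 0]


# Expectation ∘ Filter and Standard Deviation ∘ Filter
expectation_of_even = compose(mean, keep_even)
std_of_even = compose(pstdev, keep_even)

die = [1, 2, 3, 4, 5, 6]
print(expectation_of_even(die))  # prints 4
print(std_of_even(die))
```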

Probability Spaces (Continued)

Dart Board

[Figure: a dart board as the sample space \Omega, with a subset A of the board; the random variable X maps the board into \mathcal{X} = \{0, 1\}]

\Big( \Omega, \mathcal{F}_{\Omega}, \mathbb{P}\Big) \overset{X}{\longmapsto} \Big( \{0, 1\}, \mathcal{F}_{\{0, 1\}}, \mathbb{P} \circ X^{-1}\Big)

[Figure: a subset W of \{0, 1\} is pulled back by X^{-1} to the subset A of the dart board, and \mathbb{P} assigns A the probability 0.3]

X^{-1}(W) = \{ \omega \in \Omega \vert X(\omega) \in W\}

Read as: the elements \omega in \Omega such that X(\omega) gets mapped into W
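The preimage and the induced measure \mathbb{P} \circ X^{-1} can be checked on a small finite example. The sketch below replaces the dart board with a hypothetical sample space of 10 equally likely landing zones and a hypothetical X that equals 1 on three of them, chosen so the numbers match the 0.3 in the figure; none of these specifics come from the slides.

```python
from fractions import Fraction

# Hypothetical finite stand-in for the dart board: 10 equally likely zones
omega = set(range(10))


def prob(event):
    """Probability measure on omega, assuming every zone is equally likely."""
    return Fraction(len(event & omega), len(omega))


def X(w):
    """Hypothetical random variable: 1 if the dart lands in zones 0-2, else 0."""
    return 1 if w < 3 else 0


def preimage(W):
    """X^{-1}(W): the outcomes in omega that X maps into W."""
    return {w for w in omega if X(w) in W}


def pushforward(W):
    """(P ∘ X^{-1})(W): the probability that X takes a value in W."""
    return prob(preimage(W))


print(sorted(preimage({1})))  # prints [0, 1, 2]
print(pushforward({1}))       # prints 3/10
```

The new measure never looks at the dart board directly: it asks which outcomes land in W under X and then reuses the original \mathbb{P}.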

Big Picture

Motivation

Composition is a fundamental way in which we can build new ideas using our existing ideas

By learning about probability spaces, ideas in statistics become composable, which means we can build with them

Programming

Statistics

Data Manipulation

Insight

[Figure: dart board as the sample space, with regions scored 5, 4, 3, 2, 1]

X : \textrm{Set of all WNBA Players} \to \textrm{Points}

Y : \textrm{Set of all WNBA Players} \to \textrm{Assists}

X : \textrm{Set of all WNBA Players} \to \textrm{Teams}

["Let's", "go", "to", "the", "beach"]

X : \textrm{Set of all WNBA Players} \to \textrm{Points}

\mathbb{P} \circ X^{-1}

Assigns probability to subsets of Points

\tilde{X} : \textrm{Set of all WNBA Players} \to \textrm{Points Demeaned}

\tilde{X}^2 : \textrm{Set of all WNBA Players} \to \textrm{Points Demeaned Squared}
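Demeaning and then squaring is the chain of transformations behind the variance. The sketch below walks through it on a made-up list of point totals (the numbers are hypothetical, not WNBA data), assuming each player is equally likely, and compares the result with `statistics.pvariance`.

```python
from statistics import mean, pvariance

# Hypothetical point totals for four players (not real data)
points = [8, 12, 16, 20]


def demean(values):
    """Subtract the average from every value."""
    center = mean(values)
    return [v - center for v in values]


def square(values):
    """Square every value."""
    return [v * v for v in values]


demeaned = demean(points)             # Points Demeaned
demeaned_squared = square(demeaned)   # Points Demeaned Squared
variance = mean(demeaned_squared)     # expected value of the squared deviations

print(demeaned)   # prints [-6, -2, 2, 6]
print(variance)   # prints 20
```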

Probability Space

A probability space is a mathematical structure for representing uncertainty

One can think of it as a list with three elements

\Big(\Omega, \mathcal{F}, \mathbb{P}\Big)

Sample Space (\Omega): The set of all possible outcomes

Event Space (\mathcal{F}): The set of all events that we can assign probability to

Probability Measure (\mathbb{P}): A function that assigns a probability to each event

Probability Space Example

Suppose we are considering hiring some unknown individual to be a trader. How can we use a probability space to represent our uncertainty?

\Big(\Omega, \mathcal{F}, \mathbb{P}\Big)

Sample Space: The set of all people we could hire (a finite set of n people)

Event Space: The set of all subsets of people that we could hire

Probability Measure: A function that assigns probability 1/n to each \{i\}
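A minimal sketch of this probability space, assuming a pool of four applicants whose names are made up for illustration. The event space is built as the set of all subsets, and the measure gives each single applicant probability 1/n, so any event gets (number of people in it)/n.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical applicant pool (names are illustrative only)
applicants = ["Ana", "Ben", "Chloe", "Dev"]
n = len(applicants)

# Sample space: the set of all people we could hire
sample_space = frozenset(applicants)

# Event space: every subset of the sample space
event_space = {
    frozenset(group)
    for size in range(n + 1)
    for group in combinations(applicants, size)
}


def prob(event):
    """Probability measure: 1/n for each single applicant, added up over the event."""
    return Fraction(len(event), n)


print(len(event_space))           # prints 16
print(prob(frozenset({"Ana"})))   # prints 1/4
print(prob(sample_space))         # prints 1
```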

Big Picture

(Continued)

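X : \textrm{All U.S. Houses} \to \textrm{Size}

Y : \textrm{All U.S. Houses} \to \textrm{Price}

\mathbb{E}[Y \vert X]

Y_i = \mathbb{E}[Y \vert X=X_i] + \varepsilon_i

Y_i: Price of House

\mathbb{E}[Y \vert X=X_i]: Average Price Given Its Size

\varepsilon_i: Error Term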
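The decomposition of a price into a conditional average plus an error term can be illustrated on a tiny made-up dataset. The sizes and prices below are hypothetical, and the conditional expectation is approximated by the average price within each size group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: (size in square feet, price in thousands of dollars)
houses = [(1000, 200), (1000, 240), (2000, 380), (2000, 400), (2000, 420)]

# Group prices by size
prices_by_size = defaultdict(list)
for size, price in houses:
    prices_by_size[size].append(price)

# E[Y | X = x]: the average price among houses of size x
conditional_mean = {size: mean(prices) for size, prices in prices_by_size.items()}

# Error term: what is left of each price after removing its conditional mean
errors = [price - conditional_mean[size] for size, price in houses]

print(conditional_mean)  # prints {1000: 220, 2000: 400}
print(errors)            # prints [-20, 20, -20, 0, 20]
```

Within each size group the errors average to zero, which is what makes the conditional mean the "average price given its size".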

[Figure: X maps \Omega into \mathcal{R}; a subset A of \mathcal{R} is pulled back by X^{-1} to the subset X^{-1}(A) of \Omega]

Basics of Python

Variables

Functions

For-loops

Lists

Dictionaries

Data Manipulation

Frequency Tables

Scatter Plots

Correlations

Filtering

Map

Probability Theory

Sample Space

Random Variables

Probability Measures

Subsets

Expected Value

Variance

X : \textrm{All American Voters} \to \textrm{Candidate}

\Big( \Omega, \mathcal{F}, \mathbb{P}\Big) \overset{X}{\longmapsto} \Big(\{\textrm{Candidate Names}\}, \mathcal{F}_{\text{names}}, \mathbb{P} \circ X^{-1}\Big)

X_1, X_2, \ldots, X_{3385} : \textrm{All Surveys} \to \textrm{Candidate}

\Big( \Omega_n, \mathcal{F}_n, \mathbb{P}_n\Big) \overset{X_n}{\longmapsto} \Big(\{\textrm{Candidate Names}\}, \mathcal{F}_{\text{names}}, \mathbb{P}_n \circ X_n^{-1}\Big)

\bar{X} : \textrm{All Surveys} \to \textrm{Fraction Voting for Specific Candidate}

\Big( \Omega_n, \mathcal{F}_n, \mathbb{P}_n\Big) \overset{\bar{X}}{\longmapsto} \Big([0, 1], \mathcal{F}_{[0,1]}, \mathbb{P}_n \circ \bar{X}^{-1}\Big)

[Figure: simulation view relating the Set of all Keys and All Surveys to the Fraction Voting for Specific Candidate]

\frac{1}{N}\sum_{i=1}^{N} \tilde{X}_i
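One way to read the "set of all keys" picture is that each random key picks out one possible survey, and the sample mean is a function of that key. The sketch below simulates this with the Python standard library rather than the JAX calls used earlier in these slides. The population share `TRUE_SHARE`, the survey size, and the function names are all assumptions made for illustration.

```python
import random

TRUE_SHARE = 0.52    # hypothetical share of voters backing the candidate
SURVEY_SIZE = 1000   # hypothetical number of respondents per survey


def run_survey(key):
    """Map a key (seed) to one survey: a list of 0/1 responses."""
    rng = random.Random(key)
    return [1 if rng.random() < TRUE_SHARE else 0 for _ in range(SURVEY_SIZE)]


def sample_mean(key):
    """Fraction voting for the candidate in the survey selected by this key."""
    responses = run_survey(key)
    return sum(responses) / len(responses)


# Different keys give different surveys, hence different sample means
keys = range(200)
fractions = [sample_mean(key) for key in keys]

print(min(fractions), max(fractions))
```

Running `sample_mean` twice with the same key returns the same fraction, which is the sense in which the sample mean is an ordinary function on the set of keys.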

\textrm{Subset of Questions on the Midterm} \subseteq \textrm{Set of All Possible Midterm Questions}

\textrm{Specific Wednesday} \in \textrm{Set of All Possible Wednesdays}

X : \textrm{Set of All Possible Wednesdays} \times \textrm{Set of All Possible Midterm Questions} \to \{0, 1\}

X : \Omega_1 \times \Omega_2 \to \{0, 1\}

\Omega: Set of all Midterm Experiences

X_1, X_2, \ldots, X_{18} : \Omega \to \{0, 1\}

\bar{X} : \Omega \to [0,1]

Function Space

Set of all Linear Functions

Parameter Space

(\beta_0, \beta_1)

Objective Function

Over the function space: \frac{1}{n} \sum_{i=1}^{n} \big(y_i - f(x_i)\big)^2

Over the parameter space: \frac{1}{n} \sum_{i=1}^{n} \big(y_i - \beta_0 - \beta_1 x_i \big)^2
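Restricting to linear functions turns the search over functions into a search over two numbers. The sketch below evaluates the objective on a small made-up dataset and finds its minimizer with the standard closed-form least-squares formulas, which are a textbook result and not something stated in the slides. The data and variable names are hypothetical.

```python
# Hypothetical data, roughly following y = 1 + 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
n = len(xs)


def objective(beta0, beta1):
    """Mean squared error of the linear function f(x) = beta0 + beta1 * x."""
    return sum((y - beta0 - beta1 * x) ** 2 for x, y in zip(xs, ys)) / n


# Closed-form least-squares minimizer (standard result, assumed here)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
beta1_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
beta0_hat = y_bar - beta1_hat * x_bar

print(beta0_hat, beta1_hat)
print(objective(beta0_hat, beta1_hat))
```

Each point (\beta_0, \beta_1) in the parameter space corresponds to exactly one linear function, so minimizing over parameters and minimizing over the set of linear functions pick out the same line.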

How Does the Sample Mean Differ From the Population Mean?

How Do the Possible Sample Means Differ From Each Other?

Can We Estimate this "Variation"?
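A rough way to explore the last question is to compare two numbers: the spread of sample means across many simulated surveys, and an estimate of that spread computed from a single survey. The sketch below assumes a hypothetical population share and survey size, and uses the usual plug-in standard error for a proportion, \sqrt{\hat{p}(1-\hat{p})/n}, which is a standard formula rather than one given in the slides.

```python
import random
from statistics import pstdev

TRUE_SHARE = 0.52    # hypothetical population share
SURVEY_SIZE = 1000   # hypothetical respondents per survey


def sample_mean(key):
    """Fraction of 1s in the simulated survey selected by this key (seed)."""
    rng = random.Random(key)
    responses = [1 if rng.random() < TRUE_SHARE else 0 for _ in range(SURVEY_SIZE)]
    return sum(responses) / SURVEY_SIZE


# Estimate of the variation using only ONE survey
p_hat = sample_mean(0)
estimated_se = (p_hat * (1 - p_hat) / SURVEY_SIZE) ** 0.5

# Actual variation across many possible surveys (only available in a simulation)
simulated_sd = pstdev([sample_mean(key) for key in range(1, 501)])

print(estimated_se, simulated_sd)
```

In practice only one survey is observed, so the single-survey estimate is the one that can actually be computed; the simulation is there to check that it lands close to the spread it is meant to approximate.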

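X : \textrm{All US Voters} \to \textrm{Age Group}

Y : \textrm{All US Voters} \to \textrm{Vote}

X : \textrm{All Phone Calls to Credit Card Customer Services} \to \textrm{Day of the Month}

Y : \textrm{All Phone Calls to Credit Card Customer Services} \to \textrm{Number of Calls}

X : \textrm{All Listening Sessions of Podcasts} \to \textrm{Podcast Type}

Y : \textrm{All Listening Sessions of Podcasts} \to \textrm{Amount of Commercials Listened To}

X : \textrm{All US Adults} \to \textrm{Age}

Y : \textrm{All US Adults} \to \textrm{Income}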

\mathbb{E}[Y \vert X] : \textrm{All US Adults} \to \textrm{Income}

W_1, W_2, \ldots, W_{144,000} : \textrm{All Possible Surveys} \to \textrm{Age} \times \textrm{Income}

Statistical Inference

Business Analytics

Step-By-Step Walkthrough