Measure Theory in Machine Learning

Measure Theory Basics

Probability Space

(Ω, Σ, P)

Probability Space:
Set of all possible outcomes of probability experiments (or sample space)
The state of partial knowledge about the outcome. (sigma-algebra)
A measure on the space

Algebra

A collection "F" is called algebra if:

-Algebra

A collection "F" is called sigma-algebra if:
Example for {a, b, c}:

\sigma

\{\{Φ\},\{a\},\{b\},\{c\},\{a,b\},\{b,c\},\{a,c\},\{a,b,c\}\}

-Algebra

For a set "X", the intersection of all sigma-algebras "Ai" containing "X" is called the sigma-algebra generated by "X".

\sigma

\sigma(X) = \cap_{i=1}^n A_i \\ \{A_1, A_2, \cdots A_n\}: \text{All } \sigma-\text{algebras containing } X

Sample Space

Set of all possible outcomes of the experiment we have.
For instance: rolling a die with three sides of {a, b, c}.
The sigma-algebra of the sample space is the set of all possible events from the experiments. (As an example, null is the event associated with not tossing the die!)

Topological Space

A pair is called a topological space if:
Here, is called a topology.
Each element of a topology is called an open set.

(X, \tau)

\tau

Topological Space

Examples:

Borel sigma-algebra

The borel -algebra of a topological space
denoted as is the sigma-algebra generated by the family of the open sets.
The elements of are called the borel sets.
Lemma: let , then
is the Borel field generated by the family of all open intervals C.

\sigma

C = \{(a; b): a < b\}

\sigma(C) = B_R

Measurable Space

A pair is called a measurable space if X is a set and is Σ a non-empty sigma-algebra of X.
A measurable space allows us to define a function that assigns real-numbered values to the abstract elements of Σ.

(X, \Sigma)

Measure

Let (X, Σ) be a measurable space. A set function μ defined on Σ is called a measure iff:

Measure

A measure on a set, S, is a systematic way to assign a positive number to each suitable subset of that set, intuitively interpreted as its "size" (for the subset). In this sense, it generalizes the concepts of length, area, volume.

Measure space

A triplet (X, Σ, μ) is a measure space if (X, Σ) is a measurable space and the μ is a measure such that:
Note: if μ(X) = 1, then μ is a probability measure and the measurable space is a probability space.

\mu: \Sigma \rightarrow [0; \infty)

Lebesgue Measure

There is a unique measure λ defined on
which satisfies:
This is called the Lebesgue Measure. You can probably guess the Lebesgue measure for the set of real-numbers of higher dimensions!

\lambda([a, b]) = b-a

(R, B_R)

Measure Properties

A measure theory application in machine learning

Analytical generalization bounds for ML models

Paper:
Generalization in Machine Learning via Analytical Learning Theory
(Kawaguchi et al. 2019)

Screw Statistical Learning Theory!

In SLT:
- Training datasets are random variables
- Generalization bounds are based on the family of all models learned on the dataset, not according to the specific dataset
- Pessimist generalizations.
In this work:
- Analytical solutions for each problem!

Screw Statistical Learning Theory!

Why treating each problem separately?
- Once a dataset is actually specified, there is no randomness remaining over the dataset.
- Thus, test errors can be small despite the large capacity of the hypothesis space and possible instability of the learning algorithm.