MERCS

Multi-directional Ensembles of

Regression and Classification Trees

Outline

  1. Introduction
  2. MERCS
  3. Anomaly Detection
  4. Outlook


Introduction

What is the relationship of

Machine Learning to other sciences?

Standard problem of ML

Given:

D = \{x^1, \cdots, x^n \}
x^i = \{x^i_1, \cdots, x^i_m \}
A = \{A_1, \cdots, A_m \}
\forall i,j: x^i_j \in Dom(A_j)
X, Y \subset A
X \cap Y = \emptyset

Derive:

f: X \rightarrow Y
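
As a concrete illustration, here is a minimal sketch of this setting (the attribute names, the toy data and the choice of a decision tree as f are assumptions for the example, not part of the slides):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Dataset D: n examples, each described by m = 4 attributes A_1..A_4
# (hypothetical sensor-style attributes; any tabular data would do)
A = ["temperature", "pressure", "vibration", "status"]
D = np.array([
    [21.0, 1.0, 0.2, 0],
    [22.5, 1.1, 0.3, 0],
    [35.0, 2.4, 0.9, 1],
    [34.2, 2.2, 0.8, 1],
])

# Disjoint attribute sets X, Y subset of A, with X and Y not overlapping
X_idx, Y_idx = [0, 1, 2], [3]

# Derive f: X -> Y from the data (here: a single decision tree)
f = DecisionTreeClassifier().fit(D[:, X_idx], D[:, Y_idx].ravel())
print(f.predict(D[:, X_idx]))  # reproduces the 'status' column on this toy data
```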

Standard problem of ML

[Slide figure: the dataset D shown as a table of attribute columns, with f mapping the X columns to the Y columns]

N.b.:

Some of the entries in the Y columns (shown in red on the slide) have to be known in order to learn f!

Different flavours of ML

f: X \rightarrow Y

This encompasses many kinds of ML

  • Function approximation
    (cf. optimization)
  • Probabilistic learning
  • Explicit modeling

Correspond to various kinds of f

Different flavours of ML

f: X \rightarrow Y

This encompasses many kinds of ML

  • Predictive models
  • Generative models

Correspond to various X, Y

Different flavours of ML

f: X \rightarrow Y

This encompasses many kinds of ML

  • Supervised learning
  • Semi-supervised learning
  • Unsupervised learning

Correspond to the feedback available to the algorithm while learning f

How does this relate to the rest?

Machine Learning is just another kind of mathematical tool that can be used to address questions in science and engineering.

We start from data and rely on algorithms, rather than searching for a reasonable f ourselves.

4 ways of doing science

  1. By developing theories
  2. By performing experiments
  3. By performing simulations
  4. By looking at the data

Where does anomaly detection fit in?

Nowhere?

Anomaly detection is hard,

precisely because

f is not easily defined

Where does anomaly detection fit in?

Many approaches exist,

2 key ideas:

  1. Learn a model of the normal data (~ a generative approach) and flag anything that does not fit this model as an anomaly
  2. Convert the problem to the canonical f: X \rightarrow Y form and learn to detect known anomalies explicitly
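
To make key idea 1 concrete, here is a minimal sketch that models the normal data with a single multivariate Gaussian (a deliberately crude stand-in for any generative model; the toy data and the Mahalanobis-distance score are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
normal_data = rng.normal(0.0, 1.0, size=(500, 3))    # "normal" training data

# Model of the normal data: empirical mean and covariance
mu = normal_data.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(normal_data, rowvar=False))

def anomaly_score(x):
    """Mahalanobis distance to the normal model; large = does not fit."""
    d = x - mu
    return float(np.sqrt(d @ inv_cov @ d))

print(anomaly_score(np.array([0.1, -0.2, 0.3])))      # small: consistent with the model
print(anomaly_score(np.array([6.0, 6.0, 6.0])))       # large: flag as anomaly
```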

Outline

  1. Introduction
  2. MERCS
  3. Anomaly Detection
  4. Outlook

MERCS

Motivation

"Flexibility matters.

Any truly intelligent system must not only possess the ability to solve a task, but must also exhibit considerable flexibility with regard to the task itself."

Research Goal

f: X \rightarrow Y

But, now:

  • \(X\) is not given at training time
  • \(Y\) is not given at training time

This is what we call a versatile model
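
A minimal sketch of what such a versatile interface could look like, using one scikit-learn tree per attribute and naive mean imputation for unobserved inputs (a much simplified stand-in, not the actual MERCS implementation):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class VersatileModel:
    """One tree per attribute A_j, each predicting A_j from all other attributes."""

    def __init__(self, D):
        self.D = np.asarray(D, dtype=float)
        self.m = self.D.shape[1]
        self.means = self.D.mean(axis=0)
        self.trees = {}
        for j in range(self.m):
            rest = [k for k in range(self.m) if k != j]
            self.trees[j] = (rest, DecisionTreeRegressor(max_depth=5)
                                   .fit(self.D[:, rest], self.D[:, j]))

    def predict(self, x_obs, X_idx, Y_idx):
        """X_idx and Y_idx are chosen at prediction time, not at training time."""
        x = self.means.copy()              # naive imputation for unobserved inputs
        x[X_idx] = x_obs
        return [tree.predict(x[rest].reshape(1, -1))[0]
                for rest, tree in (self.trees[j] for j in Y_idx)]

D = np.random.default_rng(1).random((100, 4))          # toy data, 4 attributes
vm = VersatileModel(D)
print(vm.predict([0.3, 0.7], X_idx=[0, 1], Y_idx=[2, 3]))
```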

Motivation

Can we lift an ensemble of predictive models to reveal the general structure of a given dataset?

Motivation

Discovering general structure is typically the domain of probabilistic methods,

e.g. Bayesian Networks.

Bayesian Network

Motivation

Can it be done differently?

 

Can it be done by methods that are more suitable for big data?

The model


Some properties

+

  • Interpretable
  • Scalable
  • Numeric and nominal attributes

-

  • Exponential: including every possible f (~ tree) is impossible
  • Predictive: decision trees do not offer actual probabilistic reasoning (\(\leftrightarrow\) Bayesian Networks)

Main challenge

Including every possible f is infeasible

  1. We include a sample of possible trees in our ensemble
  2. When presented with a prediction task, we combine those trees appropriately

  1. Make a selection of Lego blocks
  2. Build what you need with those Lego blocks
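
A minimal sketch of these two steps, with plain scikit-learn trees as the "Lego blocks" (the sampling scheme and the averaging rule are illustrative simplifications, not the actual MERCS algorithm):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
D = rng.random((200, 6))                        # toy dataset, m = 6 attributes
m = D.shape[1]

# 1. Include a *sample* of possible trees: a few trees per target attribute,
#    each using a random subset of the remaining attributes as inputs.
blocks = []
for i in range(4 * m):
    target = i % m
    inputs = sorted(rng.choice([k for k in range(m) if k != target],
                               size=3, replace=False))
    tree = DecisionTreeRegressor(max_depth=4).fit(D[:, inputs], D[:, target])
    blocks.append((inputs, target, tree))

# 2. At prediction time, combine the trees that fit the task at hand:
#    keep those whose inputs are observed and whose target is requested.
def predict(x_obs, X_idx, y_idx):
    obs = dict(zip(X_idx, x_obs))
    usable = [(inp, tr) for inp, tgt, tr in blocks
              if tgt == y_idx and set(inp) <= set(X_idx)]
    preds = [tr.predict(np.array([[obs[k] for k in inp]]))[0] for inp, tr in usable]
    return float(np.mean(preds))

print(predict([0.1, 0.5, 0.9, 0.2, 0.4], X_idx=[0, 1, 2, 3, 4], y_idx=5))
```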

Predictions in MERCS

There are 2 ways of building:

  1. Build ensembles of trees
  2. Build chains of trees

Example:

MAFI prediction strategy

Select most appropriate trees, based on feature importance
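
Sketched in code, reusing the tree pool (`blocks`) from the previous example; the scoring rule below is a simplified reading of "prefer trees whose importance mass sits on the attributes we actually have", not the exact MAFI definition:

```python
def mafi_select(blocks, X_avail, y_idx, k=3):
    """Rank candidate trees for target y_idx by how much of their feature
    importance falls on the attributes that are actually observed."""
    scored = []
    for inputs, target, tree in blocks:             # blocks: (inputs, target, tree)
        if target != y_idx:
            continue
        importance = tree.feature_importances_      # aligned with `inputs`
        score = sum(w for attr, w in zip(inputs, importance) if attr in X_avail)
        scored.append((score, inputs, tree))
    scored.sort(key=lambda entry: entry[0], reverse=True)
    return scored[:k]                               # the k most appropriate trees

# e.g. only attributes 0, 1 and 2 are observed for this query
best = mafi_select(blocks, X_avail={0, 1, 2}, y_idx=5)
```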


Example:

RW prediction strategy

Do Random Walks in the MERCS model
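
The chaining mechanism behind this strategy can be illustrated in a few lines (toy data and a single fixed two-step walk; the real strategy samples such walks through the MERCS model):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
a0 = rng.random(200)
a1 = 2 * a0 + rng.normal(0, 0.05, 200)          # A_1 depends on A_0
a2 = 3 * a1 + rng.normal(0, 0.05, 200)          # A_2 depends on A_1

# Available "Lego blocks": A_0 -> A_1 and A_1 -> A_2, but no direct A_0 -> A_2
t01 = DecisionTreeRegressor().fit(a0.reshape(-1, 1), a1)
t12 = DecisionTreeRegressor().fit(a1.reshape(-1, 1), a2)

# Walk from the observed attribute to the requested one,
# feeding the intermediate prediction forward.
a1_hat = t01.predict([[0.4]])                   # observed A_0 = 0.4
a2_hat = t12.predict(a1_hat.reshape(-1, 1))     # requested A_2, reached via A_1
print(a1_hat[0], a2_hat[0])                     # roughly 0.8 and 2.4
```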


Conclusions

on prediction strategies

  • Appropriateness is a key criterion when composing an ensemble
  • Chaining is a powerful way to introduce more flexibility
  • There is a time vs. performance trade-off

Application areas (1)

Machine Learning in spreadsheets

The task is to predict entries in a particular column from data present in the rest of the spreadsheet.

Missing data (i.e., empty cells) requires some degree of flexibility from the prediction algorithm to cope with changes in the available information (i.e., a changing \(X\)).

Application areas (2)

Anomaly detection for industrial applications.

Here, complex machines are monitored by many sensors, and not every data source may function properly all the time.

Ideally, an algorithm designed to detect anomalies in this context needs to exhibit some degree of robustness against this kind of small but common malfunction.

Outline

  1. Introduction
  2. MERCS
  3. Anomaly Detection
  4. Outlook

MERCS

for

Anomaly Detection

MERCS for anomaly detection

  1. Residual analysis
  2. Clustering approach

MERCS for anomaly detection

Residual analysis

Use the flexible predictions that MERCS allows in order to detect anomalies:

  1. Predict every attribute from all the other available ones
  2. Verify whether observations deviate from predictions
  3. If the deviation (i.e. the residual) exceeds a certain threshold, flag as anomaly

MERCS for anomaly detection

Residual analysis

When we cannot predict something correctly,

this indicates an anomaly
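
A minimal sketch of the residual approach (toy data, scikit-learn trees and a simple standardized-residual threshold; in practice the predictions would come from the MERCS model and, ideally, from held-out data rather than the training data itself):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def residual_flags(D, threshold=3.0):
    """Predict every attribute from all the others and flag cells whose
    standardized residual exceeds the threshold."""
    D = np.asarray(D, dtype=float)
    flags = np.zeros_like(D, dtype=bool)
    m = D.shape[1]
    for j in range(m):
        rest = [k for k in range(m) if k != j]
        tree = DecisionTreeRegressor(min_samples_leaf=5).fit(D[:, rest], D[:, j])
        residual = D[:, j] - tree.predict(D[:, rest])      # observation - prediction
        z = (residual - residual.mean()) / (residual.std() + 1e-9)
        flags[:, j] = np.abs(z) > threshold                # deviation too large
    return flags

rng = np.random.default_rng(2)
D = rng.random((100, 4))
D[0, 2] = 25.0                                             # inject an obvious anomaly
print(np.argwhere(residual_flags(D)))                      # should include cell (0, 2)
```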

MERCS for anomaly detection

Clustering analysis

A decision tree implicitly defines a hierarchical clustering

  1. Every internal node of a decision tree splits its parent's data into two parts
  2. Those two parts can be regarded as two new clusters of data points
  3. If an observed value is very atypical within its cluster, flag it as an anomaly

MERCS for anomaly detection

Clustering analysis

Use the structure represented by these clusters to detect anomalies
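
A minimal sketch of this clustering view: the leaves of a single tree serve as the clusters, and the per-leaf z-score is an illustrative choice of what "atypical within its cluster" means (not the exact MERCS procedure):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def leaf_cluster_flags(D, j, threshold=3.0):
    """Use the leaves of a tree for attribute j as clusters and flag values
    that are atypical within their own leaf."""
    D = np.asarray(D, dtype=float)
    rest = [k for k in range(D.shape[1]) if k != j]
    tree = DecisionTreeRegressor(min_samples_leaf=25).fit(D[:, rest], D[:, j])
    leaves = tree.apply(D[:, rest])                 # leaf id = cluster id per row
    flags = np.zeros(len(D), dtype=bool)
    for leaf in np.unique(leaves):
        members = leaves == leaf                    # the data points in this cluster
        vals = D[members, j]
        z = (vals - vals.mean()) / (vals.std() + 1e-9)
        flags[members] = np.abs(z) > threshold      # atypical within its own cluster
    return flags

rng = np.random.default_rng(3)
D = rng.random((200, 4))
D[0, 2] = 25.0                                      # inject an anomalous reading
print(np.where(leaf_cluster_flags(D, j=2))[0])      # should include row 0
```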

Outline

  1. Introduction
  2. MERCS
  3. Anomaly Detection
  4. Outlook

Outlook

Low-hanging fruit

  • Root cause analysis
  • Try more off-the-shelf algorithms
  • Smarter feature selection
  • Why was one year
    so much easier than the other?

Interesting possibilities

  • Investigate whether MERCS can make a real contribution here
  • Pay more attention to the temporal component
  • Semi-supervised approaches (user feedback)

Questions

+

 Discussion