Detecting Anomalous Groups

Special Topics in Machine Learning

Rishav Chakravarti

Agenda

Background & Motivation

Three Categories of Anomaly Detection

Point-based

Aggregation-based

Distribution-based

Individual Anomalies

"Identify individual data points that are rare due to particular combinations
of features"

-- Wong et al., 2002

Anomalous Groups

The idea that some interesting patterns emerge only when data points are considered as a group.

Point-based

Find individual points which are already anomalous, then group them.

-- EPD Lecture 2
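A minimal sketch of the two-step idea, assuming one-dimensional data. The robust z-score detector, the gap-based grouping, and the names `flag_points`, `group_flagged`, `threshold` and `max_gap` are illustrative choices, not the method from the lecture.

```python
from statistics import median

def flag_points(values, threshold=3.5):
    """Step 1: return indices of points whose robust z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

def group_flagged(values, flagged, max_gap=1.0):
    """Step 2: group flagged points whose values lie within max_gap of a neighbour."""
    groups = []
    for i in sorted(flagged, key=lambda idx: values[idx]):
        if groups and values[i] - values[groups[-1][-1]] <= max_gap:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups

values = [10.0, 10.2, 9.9, 10.1, 9.8, 10.3, 25.0, 25.4, 10.0, -5.0]
flagged = flag_points(values)
print(flagged, group_flagged(values, flagged))
```

Both `threshold` and `max_gap` change which groups come out, which is the parameter sensitivity noted under Cons.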

Point-based

Pros

Leverages tried & tested methods for individual anomaly detection as well as clustering techniques.

Cons

This only works when the 'anomaly' presents itself at the individual level.

Group detection can be sensitive to detection algorithm parameters.

Ultimately continues to be useful in many domains.

Aggregation-based

Aggregates 'counts' of events into groups and flags groups where the aggregated count is anomalous.
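A minimal sketch of count aggregation, assuming each group has a known historical mean count and that counts are roughly Poisson. The Poisson model and the names `poisson_tail`, `flag_groups`, `baseline` and `alpha` are assumptions made for illustration.

```python
import math
from collections import Counter

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)

def flag_groups(events, baseline, alpha=0.01):
    """Count events per group and flag groups whose count is improbably high."""
    flagged = {}
    for group, count in Counter(events).items():
        lam = baseline.get(group)
        if lam is None:
            continue  # no history for this group, so nothing to compare against
        p = poisson_tail(count, lam)
        if p < alpha:
            flagged[group] = p
    return flagged

events = ["north"] * 4 + ["south"] * 15
print(flag_groups(events, {"north": 4.0, "south": 5.0}))
```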

Rule Based Anomaly Pattern Detection

Domain is detection of emerging disease outbreaks

Records emergency department cases (events)

Groups cases by learning one- and two-component rules

(gender=male) and (age_decile=9)

Scores each group by comparing against historical counts

Significance tests based on randomization

Results: Reduces time to detection with minimal rise in the false positive rate (FPR)
(caveat for small p-values)
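The scoring and randomization steps above can be sketched as follows. This is a simplified illustration rather than the algorithm of Wong et al.: it considers only one-component rules, it assumes a one-sided hypergeometric test as the score, and the names `upper_tail_p`, `best_rule` and `randomization_p_value` are hypothetical.

```python
import math
import random

def upper_tail_p(a, n_today, n_match, n_total):
    """P(at least `a` of today's cases match the rule) under a hypergeometric null."""
    top = min(n_today, n_match)
    ways = sum(math.comb(n_match, x) * math.comb(n_total - n_match, n_today - x)
               for x in range(a, top + 1))
    return ways / math.comb(n_total, n_today)

def best_rule(records, is_today):
    """Return (p_value, rule) for the one-component rule most over-represented today."""
    n_total, n_today = len(records), sum(is_today)
    rules = sorted({(f, v) for r in records for f, v in r.items()})
    best_p, best = 1.1, None
    for feature, value in rules:
        match = [r.get(feature) == value for r in records]
        a = sum(m and t for m, t in zip(match, is_today))
        p = upper_tail_p(a, n_today, sum(match), n_total)
        if p < best_p:
            best_p, best = p, (feature, value)
    return best_p, best

def randomization_p_value(records, is_today, n_shuffles=200, seed=0):
    """Shuffle the today/historical labels to correct for searching over many rules."""
    rng = random.Random(seed)
    observed_p, rule = best_rule(records, is_today)
    labels = list(is_today)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(labels)
        shuffled_p, _unused = best_rule(records, labels)
        hits += shuffled_p <= observed_p
    return rule, observed_p, (hits + 1) / (n_shuffles + 1)
```

Shuffling the labels matters because the best of many rules will look significant by chance alone; the randomized p-value measures how often that happens when there is no real change.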

Distribution-based

Needed to find anomalous groups whose individual points are relatively normal but which are unusual as a whole, and for which there is no obvious aggregation.

Distribution-based

Option 1

Define a set of features that make up a group
Learn distribution of these features
Use traditional anomaly detection techniques

E.g. customer segment analysis
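A minimal sketch of Option 1, assuming each group is summarised by the mean and standard deviation of one numeric attribute and that a robust z-score serves as the "traditional" detector. The segment names, the feature choice and the functions `group_features`, `robust_scores` and `anomalous_groups` are illustrative assumptions.

```python
from statistics import mean, median, pstdev

def group_features(groups):
    """Summarise each group by (mean, standard deviation) of its values."""
    return {name: (mean(vals), pstdev(vals)) for name, vals in groups.items()}

def robust_scores(column):
    """Robust z-scores based on the median and median absolute deviation."""
    med = median(column)
    mad = median(abs(v - med) for v in column)
    if mad == 0:
        return [0.0] * len(column)
    return [0.6745 * (v - med) / mad for v in column]

def anomalous_groups(groups, threshold=3.5):
    """Flag groups whose summary features are extreme relative to other groups."""
    feats = group_features(groups)
    names = list(feats)
    flagged = set()
    for j in range(2):  # feature 0 is the mean, feature 1 the standard deviation
        column = [feats[name][j] for name in names]
        for name, z in zip(names, robust_scores(column)):
            if abs(z) > threshold:
                flagged.add(name)
    return sorted(flagged)

segments = {
    "seg_a": [10, 40, 55, 70, 95],
    "seg_b": [15, 35, 50, 75, 90],
    "seg_c": [5, 45, 60, 65, 100],
    "seg_d": [20, 30, 50, 80, 85],
    "seg_e": [48, 49, 50, 51, 52],  # every value is ordinary, the spread is not
}
print(anomalous_groups(segments))
```

Here no single value in `seg_e` would stand out on its own; only the group-level spread does. The result depends entirely on which summary features are chosen.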

What are the pros/cons of such an approach?

Distribution-based

Option 2

Anomaly Detection for Astronomical Data (1)

Domain is detection of 'interesting' galaxy clusters

Records spectral data from sky observations (500 dimensions)

Define distributions:

Each observation, x, is grouped into one of M galaxy clusters.
Each observation, x, is also assigned to one galaxy type.
There is an overall distribution, M(Θ), over galaxy clusters.
For each galaxy cluster, there is an expected distribution, Dir(Χ), over galaxy types.
Each galaxy type has an expected distribution, N(β), over the spectral observations.

Dirichlet Genre Model

Learn the model using a variant of Expectation Maximization, then use it to score the anomalousness of each group.
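A heavily simplified sketch of scoring a group by its mix of types. It is not the Dirichlet genre model: the type parameters and the expected mix are hand-set rather than learned by EM, observations are one-dimensional, and each point is hard-assigned to its most likely type. The names `assign_type`, `group_score`, `types` and `expected_mix` are hypothetical.

```python
import math

def log_normal_pdf(x, mu, sigma):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def assign_type(x, types):
    """Index of the type (mu, sigma) under which x is most likely."""
    return max(range(len(types)), key=lambda k: log_normal_pdf(x, *types[k]))

def group_score(points, types, expected_mix, eps=1e-9):
    """KL divergence between the group's observed type mix and the expected mix."""
    counts = [0] * len(types)
    for x in points:
        counts[assign_type(x, types)] += 1
    score = 0.0
    for count, q in zip(counts, expected_mix):
        p = count / len(points)
        if p > 0:
            score += p * math.log(p / max(q, eps))
    return score

types = [(0.0, 1.0), (5.0, 1.0)]  # assumed (mean, std) for two types
expected_mix = [0.5, 0.5]         # assumed typical proportions of each type
typical = [-0.3, 0.4, 0.1, 4.8, 5.2, 5.5]
unusual = [4.7, 5.1, 5.3, 4.9, 5.6, 5.0]
print(group_score(typical, types, expected_mix), group_score(unusual, types, expected_mix))
```

Every point in `unusual` is a perfectly ordinary member of the second type; the group scores high only because its mix of types departs from the expected one.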

Anomaly Detection for Astronomical Data (3)

Results: Performed better than 'single point' anomaly detection schemes for simulated/labelled data. Corroborated 'interesting' findings with experts.

What are some of the pros/cons of this approach?

Some additional application domains?

Questions?

References


David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet allocation. JMLR, 3:993–1022, 2003

Philip K. Chan and Matthew V. Mahoney. Modeling multiple time series for anomaly detection. In IEEE International Conference on Data Mining, 2005

Kaustav Das, Jeff Schneider, and Daniel Neill. Anomaly pattern detection in categorical datasets. In Knowledge Discovery and Data Mining (KDD), 2008

Rupali Kandhari, Shilpa Dhange, Archana Bansod, and Dr. P.K. Deshmukh. Anamoly Detection. International Conference on Computer Science & Engineering (ICCSE), 17th March-2013

Weng-Keen Wong, Andrew Moore, Gregory Cooper, and Michael Wagner. Rule-based anomaly pattern detection for detecting disease outbreaks. From American Association for Artificial Intelligence-02 Proceedings, 2002

Liang Xiong, Barnabas Poczos, Andrew Connolly, and Jeff Schneider. Anomaly Detection for Astronomical Data. Data Analysis Project, Machine Learning Department, Carnegie Mellon University, 2011

Liang Xiong, Barnabas Poczos, and Jeff Schneider. Group anomaly detection using Flexible Genre Models. In Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011

Liang Xiong, Barnabas Poczos, and Jeff Schneider. Hierarchical probabilistic models for group anomaly detection. In International conference on Artificial Intelligence and Statistics (AISTATS), 2011

Images


http://www.anorak.co.uk/wp-content/uploads/2013/02/sheep-france-wolf1.jpg
http://www.hopesteadhillfarm.com/photos/sheep-on-hill.jpg
http://www.susanstevenson.com/Journal/2010/August/1749GrayWolfP.jpg
http://upload.wikimedia.org/wikipedia/en/f/f1/Down_Arrow_Icon.png
