Detecting Anomalous Groups


Special Topics in Machine Learning


Rishav Chakravarti

Agenda


  • Background & Motivation

  • Three Categories of Anomaly Detection

    • Point-based

    • Aggregation-based

    • Distribution-based

Individual Anomalies

"Identify individual data points that are rare due to particular combinations
of features"

-- Wong et al., 2002


Anomalous Groups

The idea that some interesting patterns emerge only when data points are considered as a group.

Point-based

Find individual points which are already anomalous, then group them.

-- EPD Lecture 2
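
A minimal sketch of the point-based idea, assuming one-dimensional readings, a z-score test as the individual detector, and adjacency in the sequence as the grouping rule. The names `anomalous_points` and `group_adjacent` and the threshold are illustrative assumptions, not taken from the lecture.

```python
from statistics import mean, stdev

def anomalous_points(values, z_threshold=2.0):
    """Return indices of values whose |z-score| exceeds the threshold."""
    mu, sigma = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > z_threshold]

def group_adjacent(indices, max_gap=1):
    """Group sorted indices whose neighbours are at most max_gap apart."""
    groups = []
    for i in sorted(indices):
        if groups and i - groups[-1][-1] <= max_gap:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups

# Hypothetical readings: three consecutive high values among normal ones.
readings = [10, 11, 9, 10, 30, 31, 29, 10, 11, 9, 10, 10, 11, 9, 10, 10]
flagged = anomalous_points(readings, z_threshold=1.5)
groups = group_adjacent(flagged)
print(flagged, groups)
```

The result depends heavily on `z_threshold` and `max_gap`, which illustrates the parameter sensitivity of this family of methods.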

Point-based


Pros

Leverages tried & tested methods for individual anomaly detection as well as clustering techniques.


Cons

This only works when the 'anomaly' presents itself at the individual level.

Group detection can be sensitive to detection algorithm parameters.


Ultimately continues to be useful in many domains.

Aggregation-based

Aggregates 'counts' of events into groups and flags groups where the aggregated count is anomalous.


Rule Based Anomaly Pattern Detection


Domain is detection of emerging disease outbreaks

Records emergency department cases (events)

Groups based on one- and two-component rule learning

(gender=male) and (age_decile=9)

Scores each group by comparing against historical counts

Significance tests based on randomization


Results: Reduces time to detection with minimal rise in false positive rate (FPR)
(caveat for small p-values)
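
A rough sketch of the rule-scoring and randomization steps, under stated assumptions: only one-component rules are searched, the score is a plain difference between the recent and baseline match rates (a stand-in, not the scoring used by Wong et al.), and the records are made up. The function names are hypothetical.

```python
import random

def rule_score(records, is_recent, attr, value):
    """Recent match rate minus baseline match rate for the rule attr == value."""
    recent = [r for r, f in zip(records, is_recent) if f]
    baseline = [r for r, f in zip(records, is_recent) if not f]
    recent_rate = sum(r[attr] == value for r in recent) / len(recent)
    baseline_rate = sum(r[attr] == value for r in baseline) / len(baseline)
    return recent_rate - baseline_rate

def best_rule(records, is_recent):
    """Return (score, attr, value) for the highest-scoring one-component rule."""
    candidates = sorted({(a, r[a]) for r in records for a in r})
    return max((rule_score(records, is_recent, a, v), a, v) for a, v in candidates)

def randomization_p_value(records, is_recent, n_shuffles=200, seed=0):
    """Shuffle the recent/baseline labels to see how often chance beats the best rule."""
    rng = random.Random(seed)
    observed = best_rule(records, is_recent)
    labels = list(is_recent)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(labels)
        if best_rule(records, labels)[0] >= observed[0]:
            hits += 1
    return observed, (hits + 1) / (n_shuffles + 1)

# Hypothetical data: a balanced 40-case baseline, then 10 recent cases
# dominated by (gender=male) and (age_decile=9).
baseline = [{"gender": g, "age_decile": d}
            for g in ("male", "female") for d in range(10)] * 2
recent = [{"gender": "male", "age_decile": 9}] * 8 + [{"gender": "female", "age_decile": 3}] * 2
records = baseline + recent
is_recent = [False] * len(baseline) + [True] * len(recent)

observed, p_value = randomization_p_value(records, is_recent)
print(observed, p_value)
```

Because the best rule is re-selected on every shuffle, the p-value accounts for the search over many candidate rules rather than testing a single rule in isolation.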

Distribution-based

Needed to find anomalous groups whose points are individually normal but which are unusual as a whole, and for which there is no obvious aggregation.

Distribution-based 


Option 1

  1. Define a set of features that make up a group 
  2. Learn distribution of these features
  3. Use traditional anomaly detection techniques


E.g. customer segment analysis
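
The three steps of Option 1 can be sketched as follows, assuming each group is summarized by just two features (mean and spread of a single value) and that a z-score across groups serves as the 'traditional' detector. The segment data and function names are made up for illustration.

```python
from statistics import mean, pstdev, stdev

def group_features(points):
    """Step 1: summarize a group by the mean and spread of its members."""
    return (mean(points), pstdev(points))

def anomalous_groups(groups, z_threshold=2.0):
    """Steps 2-3: estimate each feature's distribution across groups, flag outliers."""
    feats = {name: group_features(points) for name, points in groups.items()}
    flagged = set()
    for k in range(2):  # feature 0 = mean, feature 1 = spread
        column = [f[k] for f in feats.values()]
        mu, sigma = mean(column), stdev(column)
        if sigma == 0:
            continue  # feature is identical in every group, so it carries no signal
        flagged |= {n for n, f in feats.items() if abs(f[k] - mu) / sigma > z_threshold}
    return sorted(flagged)

# Hypothetical customer segments: every value in F is ordinary on its own,
# but F has no spread at all, unlike the other segments.
segments = {
    "A": [40, 50, 60],
    "B": [42, 50, 58],
    "C": [38, 50, 62],
    "D": [41, 50, 59],
    "E": [39, 50, 61],
    "F": [50, 50, 50],
}
print(anomalous_groups(segments, z_threshold=1.5))
```

The result is only as good as the hand-picked group features: an anomaly that shows up in neither the mean nor the spread would go unnoticed here.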


What are the pros/cons of such an approach?

Distribution-based

Option 2

Anomaly Detection for Astronomical Data (1)

Domain is detection of 'interesting' galaxy clusters

Records spectral data from sky observations (500 dimensions)

Define distributions:

    • Each observation, x, is grouped into one of M galaxy clusters.
    • Each observation, x, is also assigned to one galaxy type.
    • There is an overall distribution, M(Θ), over galaxy clusters.
    • For each galaxy cluster, there is an expected distribution, Dir(Χ), over galaxy types.
    • Each galaxy type has an expected distribution, N(β), over the spectral observations.

Dirichlet Genre Model

Learn the model using a variation of Expectation Maximization. Use it to calculate the anomalousness of each group.
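
A heavily simplified stand-in for the scoring step, not the genre model itself: it skips EM entirely, assumes each observation's galaxy type is already known, and scores a cluster by the KL divergence of its smoothed type proportions from the pooled proportions. The cluster data, smoothing constant, and function names are assumptions for illustration.

```python
from collections import Counter
from math import log

def type_proportions(types, all_types, smoothing=1.0):
    """Smoothed proportion of each galaxy type within a list of type labels."""
    counts = Counter(types)
    total = len(types) + smoothing * len(all_types)
    return {t: (counts[t] + smoothing) / total for t in all_types}

def anomaly_scores(groups, smoothing=1.0):
    """KL divergence of each group's type mix from the pooled type mix."""
    all_types = sorted({t for types in groups.values() for t in types})
    pooled_labels = [t for types in groups.values() for t in types]
    pooled = type_proportions(pooled_labels, all_types, smoothing)
    scores = {}
    for name, types in groups.items():
        p = type_proportions(types, all_types, smoothing)
        scores[name] = sum(p[t] * log(p[t] / pooled[t]) for t in all_types)
    return scores

# Hypothetical clusters: every galaxy type occurs in the normal clusters,
# but c4 has an unusual mixture of them.
clusters = {
    "c1": ["spiral"] * 5 + ["elliptical"] * 4 + ["irregular"] * 1,
    "c2": ["spiral"] * 4 + ["elliptical"] * 5 + ["irregular"] * 1,
    "c3": ["spiral"] * 5 + ["elliptical"] * 4 + ["irregular"] * 1,
    "c4": ["spiral"] * 1 + ["elliptical"] * 1 + ["irregular"] * 8,
}
scores = anomaly_scores(clusters)
most_anomalous = max(scores, key=scores.get)
print(most_anomalous, scores)
```

No single galaxy in c4 is unusual; only the cluster's mix of types stands out, which is the kind of pattern point-based detection misses.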

Anomaly Detection for Astronomical Data (3)


Results: Performed better than 'single point' anomaly detection schemes for simulated/labelled data. Corroborated 'interesting' findings with experts.


What are some of the pros/cons of this approach?


Some additional application domains?

Questions?

      References


David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet allocation. JMLR, 3:993–1022, 2003.

Philip K. Chan and Matthew V. Mahoney. Modeling multiple time series for anomaly detection. In IEEE International Conference on Data Mining, 2005.

Kaustav Das, Jeff Schneider, and Daniel Neill. Anomaly pattern detection in categorical datasets. In Knowledge Discovery and Data Mining (KDD), 2008.

Rupali Kandhari, Shilpa Dhange, Archana Bansod, and Dr. P.K. Deshmukh. Anamoly Detection. In International Conference on Computer Science & Engineering (ICCSE), 17 March 2013.

Weng-Keen Wong, Andrew Moore, Gregory Cooper, and Michael Wagner. Rule-based anomaly pattern detection for detecting disease outbreaks. In American Association for Artificial Intelligence (AAAI-02) Proceedings, 2002.

Liang Xiong, Barnabas Poczos, Andrew Connolly, and Jeff Schneider. Anomaly Detection for Astronomical Data. Data Analysis Project, Machine Learning Department, Carnegie Mellon University, 2011.

Liang Xiong, Barnabas Poczos, and Jeff Schneider. Group anomaly detection using Flexible Genre Models. In Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011 (NIPS 2011), 2011.

Liang Xiong, Barnabas Poczos, and Jeff Schneider. Hierarchical probabilistic models for group anomaly detection. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2011.

      Images


http://www.anorak.co.uk/wp-content/uploads/2013/02/sheep-france-wolf1.jpg
http://www.hopesteadhillfarm.com/photos/sheep-on-hill.jpg
http://www.susanstevenson.com/Journal/2010/August/1749GrayWolfP.jpg
http://upload.wikimedia.org/wikipedia/en/f/f1/Down_Arrow_Icon.png
