Malicious Profile Identification in Online Social Networks

Dima Kagan

Supervisors: MICHAEL FIRE, YUVAL ELOVICI

Complex Networks

Related Work

Reputation based filtering [Golbeck and Hendler].
Topology based identification [Fire et al.].
Graph centrality measure based spammer identification [DeBarr and Wechsler].
Spammer detection in social networks using "honey-profiles" [Stringhini et al.].
Clustering groups of accounts that act similarly at around the same time for a sustained period of time [Cao et al.].

Link Prediction + Crowd Wisdom

Supervised Fake Profile Identification in Online Social Networks

Fake profiles dataset - Recommended restricted links set + All unrestricted links set.
Friends restriction dataset - Alphabetically restricted links set + All unrestricted links set.
All links dataset - contains all the links.

Collected Datasets

Collected Data

Dataset	Users	Restricted links	Unrestricted links
Fake-Profiles	434	2,860	138,286
Friends Restrictions	355	6,145	138,286
All Links	527	9,005	138,286

Features

Labeling Data is Hard

Unsupervised Anomaly Detection in Graphs Utilizing a Link Prediction Algorithm

Malicious Users Tend to Connect to Other Profiles Randomly
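A toy illustration of this intuition, not an experiment from the thesis: two tightly knit groups of friends plus one hypothetical vertex `x` that picks its friends at random. The graph, the seed, and the helper names are all assumptions made for the example. A vertex that connects randomly shares few neighbors with the profiles it links to, so its average Jaccard similarity with its own neighbors is lower than that of the group members.

```python
import random
from itertools import combinations


def jaccard(adj, v, u):
    """Jaccard similarity of the neighbor sets of v and u."""
    union = adj[v] | adj[u]
    return len(adj[v] & adj[u]) / len(union) if union else 0.0


def mean_neighbor_jaccard(adj, v):
    """Average Jaccard similarity between v and each of its neighbors."""
    return sum(jaccard(adj, v, u) for u in adj[v]) / len(adj[v])


def build_toy_graph(seed=0):
    """Two 6-vertex cliques plus a vertex 'x' with 4 random friends."""
    adj = {i: set() for i in range(12)}
    for clique in (range(0, 6), range(6, 12)):
        for a, b in combinations(clique, 2):
            adj[a].add(b)
            adj[b].add(a)
    rng = random.Random(seed)
    adj["x"] = set(rng.sample(range(12), 4))  # random connections
    for u in adj["x"]:
        adj[u].add("x")
    return adj


adj = build_toy_graph()
scores = {v: mean_neighbor_jaccard(adj, v) for v in adj}
most_suspicious = min(scores, key=scores.get)
print(most_suspicious, round(scores[most_suspicious], 3))
```

In this construction the randomly connecting vertex always receives the lowest score, whichever four friends it draws.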

Topology Based

Feature Extraction

16 features for directed graphs
8 features for undirected graphs

◦ For undirected graphs:

Common Friends: |\Gamma(v) \cap \Gamma(u)|
Total Friends: |\Gamma(v) \cup \Gamma(u)|
Jaccard’s Coefficient: \frac{|\Gamma(v) \cap \Gamma(u)|}{|\Gamma(v) \cup \Gamma(u)|}

◦ For directed graphs:

Transitive Friends: |\Gamma_{in}(v) \cap \Gamma_{out}(u)|
Opposite Direction Friends: \begin{cases} 1, & \text{if}\ (u,v)\in E \\ 0, & \text{otherwise} \end{cases}
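The five features named above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the thesis repository: the graph is stored as a dictionary of neighbor sets, and the function names `undirected_features` and `directed_features` are hypothetical. Returning 0 for the Jaccard coefficient when both neighborhoods are empty is also an assumption.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def undirected_features(adj: Graph, v: Hashable, u: Hashable) -> dict:
    """Pair features for (v, u); adj[x] is Gamma(x), the neighbors of x."""
    common = adj[v] & adj[u]
    total = adj[v] | adj[u]
    return {
        "common_friends": len(common),
        "total_friends": len(total),
        # Assumption: define the coefficient as 0 for two isolated vertices.
        "jaccard": len(common) / len(total) if total else 0.0,
    }


def directed_features(out_adj: Graph, v: Hashable, u: Hashable) -> dict:
    """Pair features for (v, u); out_adj[x] is Gamma_out(x)."""
    # Gamma_in(v): every vertex that has an edge pointing to v.
    gamma_in_v = {w for w, targets in out_adj.items() if v in targets}
    gamma_out_u = out_adj.get(u, set())
    return {
        "transitive_friends": len(gamma_in_v & gamma_out_u),
        # 1 if the edge (u, v) exists, 0 otherwise.
        "opposite_direction_friends": 1 if v in gamma_out_u else 0,
    }


friends = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
follows = {"a": {"b"}, "b": {"c"}, "c": {"a"}}

undirected = undirected_features(friends, "a", "b")
directed = directed_features(follows, "a", "b")
print(undirected)
print(directed)
```

For the pair (a, b) in the undirected example, the only common friend is c and the union of the two neighborhoods has four members, so the Jaccard coefficient is 1/4.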

Link Classification

Aggregation of The Results


Meta Feature Extraction

We extracted 9 features.

AbnormalityVertexProbability(v) := \frac{1}{|\Gamma(v)|}\sum\nolimits_{u \in \Gamma(v)}p(v,u)

p(v,u) - the confidence that an edge is fake.
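The AbnormalityVertexProbability formula is the mean of the per-edge confidences over a vertex's neighbors. A minimal sketch, assuming the link classifier's output is already available as a dictionary keyed by `(v, u)` pairs; the confidence values below are made up for illustration, and returning 0 for a vertex with no neighbors is an assumption.

```python
def abnormality_vertex_probability(adj, p, v):
    """Mean confidence that the edges of v are fake.

    adj[v] is Gamma(v); p[(v, u)] is the classifier's confidence
    that the edge (v, u) is fake.
    """
    neighbors = adj[v]
    if not neighbors:
        return 0.0  # assumption: an isolated vertex gets score 0
    return sum(p[(v, u)] for u in neighbors) / len(neighbors)


# Hypothetical classifier output for the three edges of vertex "v".
adj = {"v": {"a", "b", "c"}}
p = {("v", "a"): 0.75, ("v", "b"): 0.5, ("v", "c"): 0.25}

score = abnormality_vertex_probability(adj, p, "v")
print(score)  # (0.75 + 0.5 + 0.25) / 3 = 0.5
```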

Datasets

Fully Simulated Networks

Semi Simulated Networks

Real World Networks

Kids Friendship Network

AUC - 0.93
TPR - 0.91
FPR - 0.15
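For reference, TPR and FPR are computed from the confusion counts of a binary labeling. The sketch below uses made-up labels, not the Kids Friendship Network data, and the function name `tpr_fpr` is hypothetical; 1 marks a malicious vertex and 0 a benign one.

```python
def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Illustrative labels only: 4 malicious and 8 benign vertices.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

tpr, fpr = tpr_fpr(y_true, y_pred)
print(tpr, fpr)  # 3 of 4 malicious found, 1 of 8 benign flagged
```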

Twitter

https://github.com/Kagandi/anomalous-vertices-detection


Questions?
