Brief Overview and Historical Background
(Freeman 2004)
Perspective
Chief Assumption
Relationships between interacting social units matter
Additional Assumptions
- Interdependence among actors and their actions
- Relationships between actors allow resource flows
- Network structure offers individuals opportunities & constraints
- Structure emerges from patterned relationships between actors
(Wasserman and Faust 1994:4)
Features of Contemporary Social Network Analysis
- Intuition of social structure as ties bonding social actors
- Informed by systematic empirical data
- Visualization plays a substantial role
- Requires mathematical and/or computational models
Fields that develop and apply social network analysis
Anthropology, business fields, communications, computer science, ecology, economics, epidemiology, ethology, history, informatics, mathematics, physics, political science, psychology, sociology, statistics
(Freeman 2004:3, 5)
Historical Overview
(Freeman 2004)
- Prehistory
- Birth
- Moreno & Sociometry (1930s)
- Harvard
- Dark Ages (1940s-1960s)
- Harvard Renaissance
- Organizational integration
Further Developments
- Network science
- Social media
- Big data
What is a social network?
"A finite set or sets of actors and the relation or relations defined on them" (Wasserman and Faust 1994:20)
What are actors?
Actors are social entities
Actors do not necessarily have the ability to act
Actors (typically) are all of the same type
Formal terms for actors
Examples?
Actors may also have attributes (e.g., age, sex, ethnicity)
What are relations?
Social ties link pairs of actors
Relations collect a specific set of ties among group members
Related formal terms
Conceptual considerations
- Directed or undirected?
- Weighted or unweighted?
- Nominal, ordinal, interval, or ratio scale?
- Signed or unsigned?
- Loops?
- Time sensitivity?
- Static
- Moving window
- Real-time
- Accumulation and decay
Relations may also have attributes
Two Basic Measurements
Degree
Number of edges incident upon a node
- Undirected
- Directed
- Indegree
- Outdegree
- Total (Freeman) Degree
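A minimal sketch of degree counting in plain Python. The edge lists and function names are hypothetical, chosen only for illustration; isolates would need to be added separately, since a node with no ties never appears in an edge list.

```python
from collections import Counter

def undirected_degree(edges):
    """Count the edges incident upon each node of an undirected graph."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)

def directed_degree(arcs):
    """Return (indegree, outdegree, total degree) for each node."""
    indeg, outdeg = Counter(), Counter()
    for sender, receiver in arcs:
        outdeg[sender] += 1
        indeg[receiver] += 1
    nodes = set(indeg) | set(outdeg)
    return {n: (indeg[n], outdeg[n], indeg[n] + outdeg[n]) for n in nodes}

# Hypothetical example data
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
arcs = [("a", "b"), ("b", "a"), ("a", "c"), ("d", "c")]

print(undirected_degree(edges))
print(directed_degree(arcs))
```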
Density
Proportion of observed edges, e, in a graph of n actors
- Undirected
- Without loops: e / ((n * (n - 1)) / 2)
- With loops: e / ((n * (n + 1)) / 2)
- Directed
- Without loops: e / (n * (n - 1))
- With loops: e / (n^2)
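The loopless formulas can be sketched as a small function. This is only an illustration: the function name is hypothetical, and the example counts (5 actors, 4 ties) are assumed for the sake of a worked number.

```python
def density(n, e, directed=False):
    """Proportion of possible edges observed in a graph without loops."""
    if directed:
        possible = n * (n - 1)       # every ordered pair of distinct actors
    else:
        possible = n * (n - 1) / 2   # every unordered pair of distinct actors
    return e / possible

# Assumed example: 5 actors and 4 observed ties
print(density(5, 4))                  # undirected
print(density(5, 4, directed=True))   # directed
```

With the same number of ties, the directed density is half the undirected density because twice as many ties are possible.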
What are some different types of networks?
- Simple graph
- Multigraph
- Hypergraph
- Directed acyclic graph
- Two-mode network
- Ego network
How can we express a social network?
- Matrix
- Edgelist
- Set Notation
ℕ = {n1,n2,n3,n4,n5}
𝕃 = {l1,l2,l3,l4}
l1 = (n1,n3)
l2 = (n1,n5)
l3 = (n2,n4)
l4 = (n3,n5)
𝔾 = (ℕ,𝕃)
- Sociogram
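As a rough sketch, the five-node, four-line graph written in set notation can also be stored as an edgelist and converted to an adjacency matrix. Treating the lines as undirected is an assumption here, and the variable names are illustrative.

```python
# Node set and edgelist for the five-node example graph
nodes = ["n1", "n2", "n3", "n4", "n5"]
edgelist = [("n1", "n3"), ("n1", "n5"), ("n2", "n4"), ("n3", "n5")]

# Adjacency matrix: rows and columns follow the order of `nodes`
index = {name: i for i, name in enumerate(nodes)}
matrix = [[0] * len(nodes) for _ in nodes]
for u, v in edgelist:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1  # assumed undirected, so mirror the tie

for name, row in zip(nodes, matrix):
    print(name, row)
```

An undirected graph yields a symmetric matrix, and each line is counted twice when the cells are summed.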
Walks
"A walk is a sequence of nodes and lines, starting and ending with nodes, in which each node is incident with the lines following and preceding it in the sequence." - Wasserman and Faust (1994:105)
Trail
A walk such that every edge traversed is unique
(yet not necessarily every node)
Path
A trail such that every vertex traversed is distinct
There could be zero, one, or multiple walks, trails, and paths between any two vertices!
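The nesting of these three definitions can be sketched as a small classifier for a sequence of nodes in an undirected graph. The function name and the example graph are hypothetical.

```python
def classify(sequence, edges):
    """Label a node sequence as 'path', 'trail', 'walk', or None if it is not a walk."""
    edge_set = {frozenset(edge) for edge in edges}
    steps = [frozenset(pair) for pair in zip(sequence, sequence[1:])]
    if any(step not in edge_set for step in steps):
        return None      # some step does not follow an existing edge
    if len(set(steps)) < len(steps):
        return "walk"    # an edge is traversed more than once
    if len(set(sequence)) < len(sequence):
        return "trail"   # edges are unique, but a node repeats
    return "path"        # edges and nodes are all distinct

# Hypothetical graph: a triangle a-b-c with a pendant node d attached to c
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]

print(classify(["a", "b", "c", "d"], edges))       # path
print(classify(["d", "c", "a", "b", "c"], edges))  # trail
print(classify(["a", "b", "a", "c"], edges))       # walk
print(classify(["a", "d"], edges))                 # None
```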
Seven Bridges of Königsberg
Problem: Find a walk that crosses every bridge exactly once
Euler (1735) proved that no such walk exists
- Land masses are nodes, bridges are edges
- A solution would require zero or two nodes of odd degree
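The degree argument can be checked with a short sketch. The land-mass labels are illustrative assumptions; the seven bridges are stored as a multigraph edge list, with parallel bridges listed twice.

```python
from collections import Counter

# Seven bridges between four land masses (labels are illustrative)
bridges = [
    ("north", "island"), ("north", "island"),
    ("south", "island"), ("south", "island"),
    ("north", "east"), ("south", "east"), ("island", "east"),
]

degree = Counter()
for a, b in bridges:
    degree[a] += 1
    degree[b] += 1

odd = [land for land, d in degree.items() if d % 2 == 1]
# In a connected graph, such a walk needs zero or two odd-degree nodes
has_euler_walk = len(odd) in (0, 2)

print(dict(degree))
print("odd-degree nodes:", len(odd), "| walk possible:", has_euler_walk)
```

All four land masses have odd degree, so the condition fails.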
Measurements of Distance
Pairwise
Path length
Number of edges traversed between two nodes
Geodesic
Shortest path between two nodes
Geodesic distance
Length of the shortest path between two nodes
Graph and Subgraph
Average path length
Mean geodesic distance
Diameter: Longest geodesic distance
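A minimal sketch of these distance measures using breadth-first search on an unweighted, undirected graph. The example graph is hypothetical and assumed to be connected; disconnected pairs would need separate handling.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Geodesic distance from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

# Hypothetical connected graph
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

# One geodesic distance per unordered pair of nodes
pairs = list(combinations(sorted(adj), 2))
geodesics = [bfs_distances(adj, u)[v] for u, v in pairs]

average_path_length = sum(geodesics) / len(geodesics)
diameter = max(geodesics)
print(round(average_path_length, 3), diameter)
```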
Application: Erdős Numbers
A measurement of collaborative distance
Application: 6 Degrees of Bacon
Measurement of geodesic distance
Bacon Number | # of Actors
0 | 1
1 | 1902
2 | 160463
3 | 457231
4 | 111310
5 | 8168
6 | 810
7 | 81
8 | 14
(van der Hofstad, 13 May 2013:8)
Cycles
A walk "that begins and ends at the same node" and has "at least three nodes in which all lines are distinct, and all nodes except the beginning and ending node are distinct." (Wasserman and Faust 1994:107-8)
Cycles have a length
Connectivity and Components
If a path exists between each pair of vertices in a graph, then the graph is connected
- Strong connectivity: preserves path directionality
- Weak connectivity: ignores path directionality
A component is a maximal connected subgraph
An isolate is the smallest possible component: a single vertex without any ties to other vertices in the graph
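Components can be found by repeatedly growing a set of mutually reachable nodes. The sketch below treats ties as undirected (weak connectivity); the node labels are illustrative, and n6 is a hypothetical isolate added to show the smallest case.

```python
def components(nodes, edges):
    """Return the components of an undirected graph as a list of node sets."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, found = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)  # visit unexplored neighbours
        seen |= comp
        found.append(comp)
    return found

nodes = ["n1", "n2", "n3", "n4", "n5", "n6"]  # n6 is a hypothetical isolate
edges = [("n1", "n3"), ("n1", "n5"), ("n2", "n4"), ("n3", "n5")]

found = components(nodes, edges)
isolates = [comp for comp in found if len(comp) == 1]
print(len(found), "components;", len(isolates), "isolate(s)")
```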
How many components?
A bridge is an edge that, if removed, creates more components
A cutpoint is a node that, if removed, creates more components
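Both definitions can be applied literally: remove each edge or node in turn and recount the components. This brute-force sketch is meant for small graphs only, and the example graph is hypothetical.

```python
def count_components(nodes, edges):
    """Count the components of an undirected graph."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, count = set(), 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(adj[node] - seen)
    return count

# Hypothetical graph: a triangle a-b-c with a tail c-d-e
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
base = count_components(nodes, edges)

# An edge is a bridge if deleting it increases the component count
bridges = [e for e in edges
           if count_components(nodes, [f for f in edges if f != e]) > base]

# A node is a cutpoint if deleting it (and its ties) increases the count
cutpoints = [n for n in nodes
             if count_components([m for m in nodes if m != n],
                                 [e for e in edges if n not in e]) > base]

print("bridges:", bridges)
print("cutpoints:", cutpoints)
```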
Centrality and Centralization
Centrality: Nodal measurement
Who are the most important actors in a network?
Centralization: Graph measurement
How much difference in "importance" is there between actors within a network?
Generally, compares the observed network's centralization against the theoretical maximum
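One concrete case is Freeman's (1979) degree centralization for an undirected graph without loops: the summed gaps between the most central node's degree and every other node's degree, divided by the largest such sum possible, which a star graph attains. The function name and example graphs below are hypothetical.

```python
def degree_centralization(nodes, edges):
    """Freeman degree centralization for an undirected graph with at least 3 nodes."""
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(nodes)
    top = max(degree.values())
    observed = sum(top - d for d in degree.values())
    maximum = (n - 1) * (n - 2)  # value of the same sum in a star graph
    return observed / maximum

nodes = ["a", "b", "c", "d", "e"]
star = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e")]
ring = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")]

print(degree_centralization(nodes, star))  # most centralized case
print(degree_centralization(nodes, ring))  # every node equally central
```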
The Big Lebowski
Character co-appearances
Centrality and Centralization
- Degree
- Betweenness
- Closeness
- Eigenvector
(Freeman 1979; Bonacich 1987)
Cumulative Degree Distribution
Preferential Attachment
- Cumulative Advantage
- Matthew Effect (Merton)
"For everyone who has will be given more, and he will have an abundance. Whoever does not have, even what he has will be taken from him." (Matthew 25:29)
- Friendship Paradox (Feld 1991)
P(X ≥ x) ~ x^(-alpha)
P(X ≥ x) is the probability of observing a node with degree x or greater
alpha is the scaling exponent
(Barabási and Albert 1999)
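An empirical cumulative degree distribution can be tabulated directly from a list of node degrees: for each observed degree x, take the share of nodes with degree x or greater. The degree sequence below is hypothetical, and the sketch does not attempt to estimate alpha.

```python
from collections import Counter

def cumulative_degree_distribution(degrees):
    """Map each observed degree x to the share of nodes with degree x or greater."""
    n = len(degrees)
    counts = Counter(degrees)
    ccdf, at_least = {}, n
    for x in sorted(counts):
        ccdf[x] = at_least / n
        at_least -= counts[x]  # nodes with degree exactly x drop out next
    return ccdf

degrees = [1, 1, 1, 1, 2, 2, 3, 5]  # hypothetical degree sequence
for x, share in cumulative_degree_distribution(degrees).items():
    print(x, share)
```

Plotting these shares against x on log-log axes is a common first check: a roughly straight line is consistent with, though not proof of, a power law.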
Betweenness
How many geodesics go through a node (or edge)?
Variations
Edge weighted
Edge betweenness
Proximity, Scale Long Paths, and Cutoff
Endpoints
Random walk
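A sketch of the basic, unweighted version for undirected graphs: each pair of nodes contributes to a third node the fraction of its geodesics that pass through that node. The example graph is hypothetical, and the all-pairs approach is written for clarity rather than speed.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Geodesic distances and number of geodesics from source to each node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                sigma[nbr] = 0
                queue.append(nbr)
            if dist[nbr] == dist[node] + 1:
                sigma[nbr] += sigma[node]  # extend every geodesic reaching node
    return dist, sigma

def betweenness(adj):
    """Unnormalized node betweenness for an undirected graph."""
    info = {n: bfs_counts(adj, n) for n in adj}
    score = {n: 0.0 for n in adj}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # no geodesic between disconnected nodes
        for v in adj:
            if v in (s, t) or v not in dist_s:
                continue
            dist_v, sigma_v = info[v]
            if dist_s[v] + dist_v[t] == dist_s[t]:  # v lies on a geodesic
                score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return score

# Hypothetical graph: a four-node ring a-b-c-d with a pendant node e on a
adj = {
    "a": {"b", "d", "e"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"a", "c"},
    "e": {"a"},
}
print(betweenness(adj))
```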
Closeness
Q: What is closeness?
A: The inverse of farness!
Q: What is farness?
A: In a connected graph, the sum of a node's geodesic distances to all other nodes
Variations:
Unconnected graphs
Edge weighted
Random walk
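Farness and its inverse can be sketched with a breadth-first search. This covers only the basic case of a connected, unweighted, undirected graph; the example graph and function names are hypothetical, and the raw inverse is used without normalization.

```python
from collections import deque

def farness(adj, source):
    """Sum of geodesic distances from source to all other nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    if len(dist) < len(adj):
        raise ValueError("graph is not connected; farness is undefined here")
    return sum(dist.values())

def closeness(adj, source):
    """Inverse of farness."""
    return 1 / farness(adj, source)

# Hypothetical graph: a four-node ring a-b-c-d with a pendant node e on a
adj = {
    "a": {"b", "d", "e"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"a", "c"},
    "e": {"a"},
}
for node in sorted(adj):
    print(node, farness(adj, node), closeness(adj, node))
```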
Ex. Kevin Bacon
1049th closest actor (of ~800k)
Sean Connery is closer!
(van der Hofstad 13 May 2013:8)
Eigenvector Centrality
Power comes from associating with the powerful
- Centrality accumulates from the centralities of associated alters
- Favors large, dense subgraphs (cliques)
- Given by the leading eigenvector of the network's adjacency matrix (the eigenvector associated with its largest eigenvalue)
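The accumulation idea can be sketched as a power iteration for a connected, undirected graph. As an assumption made for stable convergence, each node also keeps its own score at every step, which amounts to iterating on the adjacency matrix plus the identity and leaves the eigenvectors unchanged. The example graph and function name are hypothetical.

```python
def eigenvector_centrality(adj, iterations=200):
    """Approximate the eigenvector tied to the largest adjacency eigenvalue."""
    score = {n: 1.0 for n in adj}
    for _ in range(iterations):
        # Own score plus the scores of all alters (iteration on A + I)
        new = {n: score[n] + sum(score[m] for m in adj[n]) for n in adj}
        norm = max(new.values())
        score = {n: value / norm for n, value in new.items()}  # rescale: max is 1
    return score

# Hypothetical graph: a triangle a-b-c with a pendant node d attached to c
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
scores = eigenvector_centrality(adj)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

Here d scores lowest even though it is tied to the most central node, because it has only that one tie.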
Aren't all these usually getting at the same thing?
Often, but not necessarily (Krackhardt 1990)
Degree: (2 = 3 = 4), (1 = 5 = 6), 7
Betweenness: 4, 5, 6, (2 = 3), (7, 1)
Closeness: 4, 5, (2 = 3), 6, 1, 7
Eigenvector Centrality: (2 = 3), 4, 1, 5, 6, 7
Cohesive Subgroups
“the forces holding the individuals within the groupings in which they are” - Moreno and Jennings (1937:137)
Cohesive groups tend to
- Interact relatively frequently
- Have strong, direct ties within themselves
- Display high internal density
- Share attitudes and behaviors within themselves
- Exert pressure and social norms internally
Cliques
A maximal complete subgraph - Luce and Perry (1949)
~In other words~
Everyone has a tie to everyone else in the subgroup (complete)
No other actor can be added without breaking completeness, so no larger complete subgroup contains it (maximal)
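Subgroups that are complete and not contained in any larger complete subgroup can be enumerated with the Bron-Kerbosch algorithm, shown here in its basic form without pivoting. The example graph is hypothetical. The sketch also reports maximal pairs, although some definitions require a clique to have at least three members.

```python
def maximal_cliques(adj):
    """Enumerate maximal complete subgroups of an undirected graph."""
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            found.append(clique)  # nothing left to add: clique is maximal
            return
        for node in list(candidates):
            expand(clique | {node},
                   candidates & adj[node],
                   excluded & adj[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adj), set())
    return found

# Hypothetical graph: a triangle a-b-c with a pendant node d attached to c
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cliques = maximal_cliques(adj)
print(sorted(sorted(c) for c in cliques))
```

The pair {a, b} is complete but not reported, since it sits inside the larger complete subgroup {a, b, c}.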
Alternatives to Cliques
- Geodesic-based approaches
- n-cliques, n-clans, n-clubs
- Not robust to edge deletion
- No in-group/out-group distinction
- Degree-based approaches
- k-plexes, k-cores
- No in-group/out-group distinction
- Connectivity-based approaches
- Lambda sets, Moody & White's (2003) cohesive blocks
- Nodes not necessarily directly or closely connected
- In-group/out-group distinctions
- LS sets
- Modularity-based methods
k-cores
Cohesive "seedbeds" nested within a network
Minimum number of ties (k) each member of a subgroup has to other subgroup members
A node has "coreness" c if it belongs to a c-core but not a (c+1)-core
Directed graphs may measure k-cores through
- Ties going inward
- Ties going outward
- Total ties
Alvarez-Hamelin et al. (2006); Seidman (1983)
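Coreness can be computed by peeling: repeatedly remove the node with the fewest remaining ties, tracking the largest minimum degree seen so far. This sketch handles undirected graphs using total ties only; the example graph and function name are hypothetical.

```python
def coreness(adj):
    """Return the coreness of each node in an undirected graph."""
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        node = min(remaining, key=degree.get)  # weakest remaining member
        k = max(k, degree[node])               # the core level never decreases
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1               # its ties leave with it
    return core

# Hypothetical graph: a 4-clique a-b-c-d, e tied to a and b, f tied to e
adj = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c", "d", "e"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
    "e": {"a", "b", "f"},
    "f": {"e"},
}
print(coreness(adj))
```

Node e has three ties but a coreness of 2, because one of those ties goes to f, which is outside the 2-core.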
Community Detection
Goal: Find groups with more ties among members and fewer ties between groups than expected (conditional on degree)
Key Measurement: Modularity, Q, ranging from -0.5 to 1 (Newman 2006)
- Hierarchical Algorithms
- Top-Down
- Girvan-Newman (Newman & Girvan 2004)
- Leading Eigenvector* (Newman 2006)
- Bottom-Up
- Fast-Greedy* (Clauset et al. 2004)
- Walktrap (Pons & Latapy 2005)
- Louvain method*, ** (Blondel et al. 2008)
- Spin-Glass (Reichardt & Bornholdt 2006; Traag & Bruggeman 2008)
*Modularity optimized, **Semi-hierarchical
Choose an algorithm based upon theory, functionality, or highest modularity
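To make the key measurement concrete, here is a sketch that scores a given partition of an undirected, unweighted graph: for each community, the share of edges falling inside it minus the share expected given its members' degrees. It only evaluates a partition and does not search for one; the example graph, memberships, and function name are hypothetical.

```python
def modularity(edges, membership):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal, degree_sum = {}, {}
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    # Observed within-community share minus the degree-conditional expectation
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Hypothetical graph: two triangles joined by a single tie (c-d)
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
one_group = {node: 1 for node in two_groups}

print(round(modularity(edges, two_groups), 3))
print(modularity(edges, one_group))
```

Placing every node in a single community always gives Q = 0, which is why a positive Q signals more within-group ties than expected.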
Louvain Method, First Pass
Louvain Method, Second Pass
Louvain Method, Both Passes
Density Comparisons
Modularity: 0.36, 0.44
Graph Density: 0.14
Community Density
Community | A | B | C | D
A | 0.60 | 0.28 | 0.24 | 0.20
B | 0.28 | 0.42 | 0.24 | 0.20
C | 0.24 | 0.24 | 0.47 | 0.23
D | 0.20 | 0.20 | 0.23 | 0.32