Celebrating 50 years of ICALP

A data and graph mining comparative analysis

Pierluigi Crescenzi

Gran Sasso Science Institute

July 4-8, 2022

In collaboration with Oleksandr Bezrukov, Daniele Carnevale, and Filippos Christodoulou

Celebrating ICALP

Instructions

  • Bi-dimensional presentation
    • Down arrow: next slide in same section
    • Right arrow: next section
    • Up arrow: previous slide in same section
    • Left arrow: previous section
    • Slide number \(s.n\): \(n\)-th slide of section \(s\)
  • How to interact with plots
    • Values are shown when hovering over the points
    • Clicking on a legend item makes the corresponding plot appear or disappear
    • Clicking on the autoscale button rescales the axes (first thing to do)
    • Clicking on the download button saves the plot as a PNG file
      • Not working in the slide show, only in the frame opened in another tab

Data gathering

  • 16 conferences
    • 4 general (FOCS, ICALP, STACS, STOC)
    • 2 algorithms (ESA, SODA)
    • 2 automated verification (CAV, TACAS)
    • 2 cryptography (CRYPTO, EUROCRYPT)
    • 2 distributed computation (DISC, PODC)
    • 2 logic (CSL, LICS)
    • 2 programming languages (ESOP, POPL)

  • Source: DBLP
  • Years without an edition
    • CSL: 2019
    • DISC: 1988
    • ESOP: 1987, 1989, 1991, 1993, 1995, and 1997
    • EUROCRYPT: 1983
    • ICALP: 1973 and 1975
    • POPL: 1974, and none from 2018 on (papers published in the journal PACMPL)

  • First editions
    • CAV: 1990
    • CRYPTO: 1981
    • CSL: 1987
    • DISC: 1987
    • ESA: 1993
    • ESOP: 1986
    • EUROCRYPT: 1982
    • FOCS: 1960
    • ICALP: 1972
    • LICS: 1986
    • PODC: 1982
    • POPL: 1973
    • SODA: 1990
    • STACS: 1984
    • STOC: 1969
    • TACAS: 1995

Basic data mining

Number of papers per year

  • Only SODA publishes more papers than ICALP
  • CSL, ESA, ESOP, and STACS seem to be stable over the last 20 years
  • CSL and LICS were held jointly in 2014
  • Points corresponding to years in which a conference did not take place are not shown

Clicking on one conference acronym makes its plot appear/disappear

Number of authors per year

  • Similar trends to the number of papers per year

Percentage of new authors per year

  • Conferences behave more or less the same
    • Stabilizing between 40% and 50%
      • Every year, approximately half of the authors of a conference are new authors
    • Worth considering co-authorship between authors
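
The percentage of new authors per year can be computed along the following lines. This is a minimal sketch, not the code behind the plot: it assumes that an author counts as "new" in the first year they appear at the conference, and the function name and the `(year, authors)` input format are hypothetical.

```python
from collections import defaultdict


def new_author_percentage(papers):
    """papers: iterable of (year, list of author names) for one conference.

    Returns {year: percentage of that year's authors never seen in an earlier year}.
    Assumption: "new" means first appearance at this conference.
    """
    authors_by_year = defaultdict(set)
    for year, authors in papers:
        authors_by_year[year].update(authors)

    seen = set()      # authors of all previous years
    result = {}
    for year in sorted(authors_by_year):
        current = authors_by_year[year]
        new = current - seen
        result[year] = 100.0 * len(new) / len(current)
        seen |= current
    return result


# Toy data (hypothetical authors)
papers = [
    (2000, ["A", "B"]),
    (2001, ["A", "C"]),
    (2001, ["D"]),
    (2002, ["A", "B", "C", "E"]),
]
percentages = new_author_percentage(papers)
```

In the toy data, every author of the first year is new by construction, which is why the first point of such a plot is always 100%.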

Average percentage of new authors

Coauthorship size distribution

  • In some conferences, papers with two or three authors seem more common than single-author papers
  • Largest number of co-authors on a paper: 24 (TACAS)
  • Smallest per-conference maximum number of co-authors: 7 (CSL and LICS)
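
A coauthorship size distribution of this kind can be sketched as follows. The input format (one author list per paper) and the function name are assumptions made for illustration.

```python
from collections import Counter


def coauthorship_size_distribution(author_lists):
    """author_lists: one list of author names per paper.

    Returns {number of authors: fraction of papers with that many authors}.
    """
    sizes = Counter(len(set(authors)) for authors in author_lists)
    total = sum(sizes.values())
    return {size: sizes[size] / total for size in sorted(sizes)}


# Toy data (hypothetical authors)
toy_papers = [["A"], ["A", "B"], ["B", "C"], ["A", "B", "C"]]
distribution = coauthorship_size_distribution(toy_papers)
largest_team = max(distribution)
```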

ICALP coauthorship size distribution evolution

  • During the evolution of ICALP, papers with two, three, and even four authors have become more popular than single-author papers
  • Similar results for the other conferences

Clicking on one period makes its plot appear/disappear

Author set similarity

  • Similarity computed as Sørensen-Dice index
  • Most similar pairs: STOC and FOCS, CRYPTO and EUROCRYPT
  • Similarity seems to reflect disciplines
  • Conferences most similar to ICALP
    • STACS (0.37)
    • FOCS (0.36)
    • STOC (0.36)
    • SODA (0.35)
    • ESA (0.29)
    • LICS (0.22)
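
For two author sets \(A\) and \(B\), the Sørensen-Dice index is \(2|A \cap B| / (|A| + |B|)\). A minimal sketch follows; the function name is hypothetical, and returning 0.0 for two empty sets is an arbitrary convention chosen here.

```python
def sorensen_dice(authors_a, authors_b):
    """Sørensen-Dice index of two author sets: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(authors_a), set(authors_b)
    if not a and not b:
        return 0.0  # arbitrary convention for two empty sets (assumption)
    return 2 * len(a & b) / (len(a) + len(b))


# Toy data (hypothetical authors): two shared authors out of 3 + 3
similarity = sorensen_dice({"A", "B", "C"}, {"B", "C", "D"})
```

The index ranges from 0 (no author in common) to 1 (identical author sets).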

Gender analysis

Gender identification

  • Mostly by querying the genderize.io web service
    • Based on first names
  • Partly by manually searching on the web
  • In collaboration with Oleksandr Bezrukov

Female/male ratio evolution

  • Stable for most conferences
  • Slightly increasing for a few conferences
  • Still below 0.2 for almost all conferences in 2021

Topic analysis

ICALP Title Word Viewer

  • 30 most frequent words
  • Of all the words contained in the titles of ICALP papers in a certain interval, the plot shows what percentage of them are, for example, "algorithm" or "logic"
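
The word percentages described above can be sketched as follows. The tokenization, the stop-word list, and the function name are assumptions made for illustration; no stemming is applied, so "graph" and "graphs" count as different words here.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the list behind the actual plot is not given.
STOPWORDS = {"a", "an", "the", "of", "for", "and", "in", "on", "with", "to"}


def title_word_percentages(titles, top=30):
    """Returns {word: percentage of all (non stop-word) title words} for the
    `top` most frequent words in the given paper titles."""
    words = [
        word
        for title in titles
        for word in re.findall(r"[a-z]+", title.lower())
        if word not in STOPWORDS
    ]
    counts = Counter(words)
    return {word: 100.0 * count / len(words) for word, count in counts.most_common(top)}


# Toy data (hypothetical titles)
titles = ["Algorithms for graphs", "A logic of graphs", "Graph algorithms"]
shares = title_word_percentages(titles)
```

Restricting `titles` to the papers of a given time interval gives one point per word and per interval.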

Clicking on one word makes its plot appear/disappear

Word clouds of conference titles

  • Word cloud panels: Algorithms, Cryptography, ICALP, Formal methods, Distributed computing

Basic graph mining

Static and temporal graphs

  • Static (undirected) graph
    • Nodes: authors who presented at least one paper at conference
    • Edges: \((a_1,a_2)\) if \(a_1\) and \(a_2\) co-authored at least one paper (not necessarily presented at the conference)
    • In other words: subgraph of DBLP graph induced by set of conference authors
  • Temporal graph
    • Nodes: same as above
    • Edges: \((a_1,a_2,y)\) if \(a_1\) and \(a_2\) co-authored at least one paper (not necessarily presented at the conference) in year \(y\)
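
The two definitions above can be sketched as follows. The input format (a list of `(year, authors)` records for all DBLP papers) and the function name are assumptions made for illustration.

```python
from itertools import combinations


def build_graphs(dblp_papers, conference_authors):
    """dblp_papers: iterable of (year, list of authors) over ALL DBLP papers.
    conference_authors: authors with at least one paper at the conference.

    Returns (nodes, static_edges, temporal_edges), where static edges are
    pairs (a1, a2) and temporal edges are triples (a1, a2, year).
    """
    nodes = set(conference_authors)
    static_edges = set()
    temporal_edges = set()
    for year, authors in dblp_papers:
        # Keep only co-authors that are conference authors (induced subgraph).
        present = sorted(set(authors) & nodes)
        for a1, a2 in combinations(present, 2):
            static_edges.add((a1, a2))          # undirected, stored with a1 < a2
            temporal_edges.add((a1, a2, year))
    return nodes, static_edges, temporal_edges


# Toy data: "X" never published at the conference, so it is not a node.
dblp_papers = [(2001, ["A", "B", "X"]), (2003, ["A", "B"]), (2004, ["B", "C"])]
nodes, static_edges, temporal_edges = build_graphs(dblp_papers, {"A", "B", "C"})
```

The static graph is the projection of the temporal graph obtained by dropping the year.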

Static graph statistics

Conference   N. nodes   N. edges   Density   LCC
CAV              2615      16094    0.0047   0.98
CRYPTO           1896      13445    0.0075   0.95
CSL              1402       5281    0.0054   0.94
DISC             1552       7702    0.0064   0.97
ESA              2728      21108    0.0057   0.99
ESOP             1329       6026    0.0068   0.97
EUROCRYPT        1720      11599    0.0078   0.93
FOCS             3233      25628    0.0049   0.95
ICALP            4651      35426    0.0033   0.98
LICS             1901       9416    0.0052   0.97
PODC             2325      12380    0.0046   0.95
POPL             1980       9576    0.0049   0.96
SODA             4019      34121    0.0042   0.99
STACS            2636      15611    0.0045   0.97
STOC             3053      25647    0.0055   0.98
TACAS            2111      12853    0.0058   0.98

  • Graphs sparse, not connected, with giant connected component
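
The density column is consistent with the usual formula for simple undirected graphs, \(2m / (n(n-1))\), with \(n\) nodes and \(m\) edges. The sketch below checks this on two rows and also computes the fraction of nodes in the largest connected component; reading the LCC column as that fraction is an assumption, and the function names are hypothetical.

```python
from collections import deque


def density(n_nodes, n_edges):
    """Density of a simple undirected graph: 2m / (n (n - 1))."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def largest_component_fraction(nodes, edges):
    """Fraction of nodes in the largest connected component
    (assumed meaning of the LCC column)."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = set()
    largest = 0
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search over the component containing `start`.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        largest = max(largest, size)
    return largest / len(adj)


cav_density = round(density(2615, 16094), 4)     # CAV row
icalp_density = round(density(4651, 35426), 4)   # ICALP row
toy_lcc = largest_component_fraction("ABCDE", [("A", "B"), ("B", "C"), ("D", "E")])
```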

Densification

  • Number of edges vs number of nodes for each year
    • Log-log plot
    • First year of all conferences
  • Most conferences densify in a similar way
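
In the densification law of Leskovec et al. (see References), the number of edges grows as a power of the number of nodes, \(e(t) \propto n(t)^a\), so the points lie on a line of slope \(a\) in a log-log plot. A minimal sketch of estimating \(a\) by least squares follows; the function name and the toy series are hypothetical, and nothing here is taken from the actual plots.

```python
import math


def densification_exponent(nodes_per_year, edges_per_year):
    """Least-squares fit of log(edges) = a * log(nodes) + b.

    Returns (a, b); a is the slope of the log-log plot.
    """
    xs = [math.log(n) for n in nodes_per_year]
    ys = [math.log(e) for e in edges_per_year]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic series in which edges grow exactly as nodes ** 1.5
toy_nodes = [10, 100, 1000]
toy_edges = [n ** 1.5 for n in toy_nodes]
exponent, _ = densification_exponent(toy_nodes, toy_edges)
```

An exponent of 1 means a constant average degree, while an exponent above 1 means the graph gets denser as it grows.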

Diameter shrinking

 

  • CAV, DISC, and FOCS largest diameter in 2021
  • ESA smallest diameter in 2021
  • Several conferences stable for many years
    • Effective diameter should be analyzed
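
The diameter and an effective diameter can both be read off the distance histogram produced by one breadth-first search per node. The sketch below uses the integer variant of the effective diameter (smallest distance covering at least 90% of the connected pairs), which is an assumption; it is a brute-force illustration, not the faster diameter algorithm listed in the References.

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter_and_effective_diameter(edges, quantile=0.9):
    """Returns (diameter, effective diameter) over connected pairs of nodes.

    `edges` must be non-empty. Effective diameter here: smallest d such that
    at least `quantile` of the connected pairs are within distance d.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    histogram = Counter()
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                histogram[d] += 1  # every pair is counted twice, ratios unaffected
    total = sum(histogram.values())
    covered = 0
    for d in sorted(histogram):
        covered += histogram[d]
        if covered >= quantile * total:
            return max(histogram), d


# Toy path with six nodes: A - B - C - D - E - F
path = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
diam, eff_diam = diameter_and_effective_diameter(path)
```

The effective diameter is less sensitive than the diameter to a single long chain of authors.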

Degrees of separation

  • Degrees of separation: average distance
    • CSL largest value in 2021
    • STOC smallest value in 2021
    • ESOP reached a quite high value
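
The average distance can be sketched as follows. Averaging only over pairs of authors that are connected is an assumption (on a disconnected graph some convention is needed), and the function name is hypothetical.

```python
from collections import deque


def average_distance(edges):
    """Average hop distance over all connected ordered pairs of distinct nodes."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0
    pairs = 0
    for source in adj:
        # Breadth-first search from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs


# Toy path A - B - C: distances 1, 1, and 2
separation = average_distance([("A", "B"), ("B", "C")])
```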

Centrality analysis

Static centrality measures

  • On largest connected component of static graphs of general conferences
    • Degree: number of coauthors
    • Closeness: average distance from author to other authors
    • Betweenness: fraction of shortest paths passing through author
  • In collaboration with Filippos Christodoulou
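
The three measures can be sketched on a connected graph as follows. Closeness is computed here as the inverse of the average distance, so that larger means more central, and betweenness uses Brandes' accumulation scheme and is left unnormalized. Both choices, like the function names, are assumptions and not necessarily what produced the rankings.

```python
from collections import deque


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj):
    """Number of coauthors of each author."""
    return {v: len(neighbors) for v, neighbors in adj.items()}


def closeness(adj):
    """(n - 1) / sum of distances; assumes a connected graph with n >= 2."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[s] = (len(adj) - 1) / sum(dist.values())
    return result


def betweenness(adj):
    """For each node, the sum over unordered pairs (s, t) of the fraction of
    shortest s-t paths through it (Brandes' algorithm, unweighted graphs)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        order = []                         # nodes in non-decreasing distance
        preds = {v: [] for v in adj}       # predecessors on shortest paths
        sigma = dict.fromkeys(adj, 0)      # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):          # back-propagate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: value / 2 for v, value in bc.items()}  # undirected: halve


# Toy path A - B - C - D
toy = build_adjacency([("A", "B"), ("B", "C"), ("C", "D")])
toy_degree = degree(toy)
toy_closeness = closeness(toy)
toy_betweenness = betweenness(toy)
```

Dividing the betweenness values by the number of pairs not containing the node, \((n-1)(n-2)/2\), turns them into fractions between 0 and 1.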

Degree centrality

ICALP                 FOCS                  STACS             STOC
N. Alon               N. Alon               E.D. Demaine      N. Alon
E.D. Demaine          A. Wigderson          F.V. Fomin        A. Wigderson
M. Taghi Hajiaghayi   C.H. Papadimitriou    K. Mehlhorn       Y. Mansour
C.H. Papadimitriou    Y. Mansour            D. Lokshtanov     C.H. Papadimitriou
K. Mehlhorn           M. Taghi Hajiaghayi   S. Saurabh        M. Taghi Hajiaghayi
A. Wigderson          R.E. Tarjan           G.J. Woeginger    R.E. Tarjan
A. Gupta              A. Gupta              M. Pilipczuk      M. Naor
D. Peleg              D.P. Woodruff         H.L. Bodlaender   V.S. Mirrokni
F.V. Fomin            M. Naor               G. Rote           A. Gupta
D.P. Woodruff         M. Sudan              D. Peleg          M. Sudan

Closeness centrality

ICALP                 FOCS                  STACS                    STOC
N. Alon               N. Alon               K. Mehlhorn              N. Alon
C.H. Papadimitriou    A. Wigderson          F.V. Fomin               A. Wigderson
D. Peleg              C.H. Papadimitriou    E.D. Demaine             C.H. Papadimitriou
M. Taghi Hajiaghayi   R.M. Karp             C.H. Papadimitriou       R.M. Karp
M. Henzinger          M. Taghi Hajiaghayi   D. Peleg                 M. Taghi Hajiaghayi
G.F. Italiano         M. Sudan              H.L. Bodlaender          M. Sudan
E.D. Demaine          R.E. Tarjan           G.J. Woeginger           Y. Mansour
K. Mehlhorn           Y. Mansour            M. Karpinski             M. Naor
C. Mathieu            R.J. Lipton           F. Meyer auf der Heide   M. Charikar
F.V. Fomin            F.T. Leighton         A. Czumaj                R.J. Lipton

Betweenness centrality

ICALP                 FOCS                  STACS                STOC
N. Alon               N. Alon               K. Mehlhorn          N. Alon
C.H. Papadimitriou    C.H. Papadimitriou    E.D. Demaine         A. Wigderson
M.Y. Vardi            A. Wigderson          G.J. Woeginger       C.H. Papadimitriou
K. Mehlhorn           R.M. Karp             P.G. Spirakis        R.E. Tarjan
D. Peleg              R.E. Tarjan           F.V. Fomin           K. Mehlhorn
P.G. Spirakis         E.D. Demaine          C.H. Papadimitriou   Y. Mansour
E.D. Demaine          R.J. Lipton           J.O. Shallit         R.J. Lipton
G.F. Italiano         V. Guruswami          D. Peleg             V. Guruswami
M. Henzinger          M. Taghi Hajiaghayi   J. van Leeuwen       R.M. Karp
C. Mathieu            K. Mehlhorn           P. Rossmanith        F.T. Leighton

Temporal centrality measures

  • On temporal graph of ICALP
    • Temporal walk centrality
      • Ability to obtain and distribute information in temporal network
      • Parameter \(\alpha\): controls how much long walks are down-weighted compared to short walks
    • Temporal closeness
      • Area covered by temporal harmonic closeness plot
  • In collaboration with Filippos Christodoulou
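
One simplified reading of temporal closeness as the area under a harmonic closeness curve is sketched below. It is not necessarily the definition used in the cited paper: it assumes undirected edges labelled with a year, a traversal time of one year per edge, and a discrete sum over start years in place of an integral. Function names are hypothetical.

```python
import math


def temporal_closeness(temporal_edges, source, start):
    """Harmonic closeness of `source` for walks starting no earlier than `start`.

    Assumption: an edge (a, b, y) can be taken at year y by whoever has
    already arrived at a (or b) by year y, and arrival is at year y + 1.
    """
    arrival = {source: start}
    for a, b, y in sorted(temporal_edges, key=lambda edge: edge[2]):
        if y < start:
            continue
        ta = arrival.get(a, math.inf)
        tb = arrival.get(b, math.inf)
        if ta <= y and y + 1 < tb:
            arrival[b] = y + 1
        if tb <= y and y + 1 < ta:
            arrival[a] = y + 1
    # Sum of 1 / duration of the earliest-arrival walk to each reached node.
    return sum(1 / (t - start) for node, t in arrival.items() if node != source)


def global_temporal_closeness(temporal_edges, source):
    """Discrete 'area under the curve': sum of closeness over all start years."""
    years = [y for _, _, y in temporal_edges]
    return sum(
        temporal_closeness(temporal_edges, source, start)
        for start in range(min(years), max(years) + 1)
    )


# Toy temporal graph: A-B in 2000, then B-C in 2001
toy_edges = [("A", "B", 2000), ("B", "C", 2001)]
closeness_a_2000 = temporal_closeness(toy_edges, "A", 2000)   # reaches B and C
closeness_a_2001 = temporal_closeness(toy_edges, "A", 2001)   # A-B edge is gone
area_a = global_temporal_closeness(toy_edges, "A")
area_c = global_temporal_closeness(toy_edges, "C")
```

In the toy graph, A can reach C through B because the two edges appear in increasing order of year, whereas C can never reach A.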

Temporal walk centrality in ICALP graph

\(\alpha=0.001\)      \(\alpha=0.01\)       \(\alpha=0.1\)        \(\alpha=1\)
E.D. Demaine          E.D. Demaine          E.D. Demaine          H. Edelsbrunner
N. Alon               N. Alon               F.V. Fomin            L.J. Guibas
F.V. Fomin            F.V. Fomin            N. Alon               H.A. Maurer
L. Gasieniec          L. Gasieniec          L.J. Guibas           G. Rozenberg
P. Bose               P. Bose               S. Saurabh            B. Chazelle
J. Naor               J. Naor               H. Edelsbrunner       E. Welzl
S. Saurabh            S. Saurabh            M. Taghi Hajiaghayi   A. Salomaa
R. Ostrovsky          M. Taghi Hajiaghayi   D. Peleg              F.V. Fomin
A. Wigderson          A. Wigderson          D. Lokshtanov         J. Urrutia
M. Taghi Hajiaghayi   R. Ostrovsky          M.H. Overmars         S. Ginsburg

Temporal closeness in ICALP graph

Clicking on one author makes its plot appear/disappear

  • Authors listed in non-increasing order of global temporal closeness

Community analysis

Similarity based on paper titles

  • Degree of membership in a community based on similarity with conference anchors (same anchors as in the word clouds)
    • Similarity based on words appearing in titles of papers
  • In collaboration with Daniele Carnevale
  • You can check...

Three communities

Four communities

References

  • P. Crescenzi, R. Grossi, M. Habib, L. Lanzi, A. Marino: On computing the diameter of real-world undirected graphs. Theor. Comput. Sci. 514: 84-95, 2013.
  • P. Crescenzi, C. Magnien, A. Marino: Finding Top-k Nodes for Temporal Closeness in Large Temporal Graphs. Algorithms 13(9): 211, 2020.
  • The dblp team: dblp computer science bibliography (monthly snapshot release of March 2022).
  • L.R. Dice: Measures of the Amount of Ecologic Association Between Species. Ecology 26(3): 297-302, 1945.
  • L.C. Freeman: Centrality in social networks: Conceptual clarification. Social Networks 1(3): 215-239, 1979.
  • genderize.io
  • Google Books Ngram Viewer
  • J. Huang, A.J. Gates, R. Sinatra, A.-L. Barabási: Historical comparison of gender inequality in scientific careers across countries and disciplines. Proc. Natl. Acad. Sci. 117(9): 4609-4616, 2020.
  • J. Leskovec, J.M. Kleinberg, C. Faloutsos: Graph evolution: Densification and shrinking diameters. ACM Trans. Knowl. Discov. Data 1(1): 2, 2007.
  • S. Milgram: The Small World Problem. Psychology Today, 1967.
  • L. Oettershagen, P. Mutzel, N.M. Kriege: Temporal Walk Centrality: Ranking Nodes in Evolving Networks. WWW 2022: 1640-1650.
  • Plotly Julia Library.
  • T. Sørensen: A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons. Kongelige Danske Videnskabernes Selskab 5(4): 1-34, 1948.

pierluigi.crescenzi@gssi.it
