Thanks to Luca Aceto, Daniele Carnevale, Filippos Christodoulou, and Oleksandr Skorupskyy

Pierluigi Crescenzi

Gran Sasso Science Institute

SIROCCO 2023, Alcalá de Henares, June 9, 2023

Thirty Years of SIROCCO

A Data and Graph Mining Comparative Analysis of Its Temporal Evolution

  • Computer science is a "young" discipline with a long history
    • According to Wikipedia, the term appears in [Fein, 1960] but "algorithms for performing computations have existed since antiquity"
  • Many computer science conferences are now active
    • 6240 in the DBLP dataset (as of July 7, 2023)
      • Up from 6036 as of November 26, 2022
  • Enough data for data and graph mining analysis, not only on the entire DBLP dataset but also on specific conferences or sets of conferences
  • Many conferences are celebrating, or have celebrated, special anniversaries

Motivation

  • 16 conferences (besides SIROCCO)
    • 4 general (FOCS, ICALP, STACS, STOC)
    • 2 algorithms (ESA, SODA)
    • 2 automated verification (CAV, TACAS)
    • 2 cryptography (CRYPTO, EUROCRYPT)
    • 2 distributed computation (DISC, PODC)
    • 2 logic (CSL, LICS)
    • 2 programming languages (ESOP, POPL)

CSL: no edition in 2019

DISC: no edition in 1988

ESOP: no edition in 1987, 1989, 1991, 1993, 1995, and 1997

EUROCRYPT: no edition in 1983

ICALP: no edition in 1973 and 1975

POPL: no edition in 1974, and none from 2018 onward (papers published in the journal PACMPL)

  • First editions
    • CAV: 1990
    • CRYPTO: 1981
    • CSL: 1987
    • DISC: 1987
    • ESA: 1993
    • ESOP: 1986
    • EUROCRYPT: 1982
    • FOCS: 1960
    • ICALP: 1972
    • LICS: 1986
    • PODC: 1982
    • POPL: 1973
    • SODA: 1990
    • STACS: 1984
    • STOC: 1969
    • TACAS: 1995

Dataset

  • Similarity between sets of authors computed as the Sørensen-Dice index
  • Most similar pairs: STOC and FOCS, CRYPTO and EUROCRYPT (both over 0.5)
  • Similarity seems to respect disciplines
  • Conferences most similar to SIROCCO
    • DISC (0.26)
    • PODC (0.21)
    • ESA (0.13)
    • STACS (0.11)
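
As a minimal sketch of the similarity measure named above, the Sørensen-Dice index of two author sets A and B is 2|A ∩ B| / (|A| + |B|). The author names below are hypothetical and only illustrate the computation; the value returned for two empty sets is a convention chosen here.

```python
def sorensen_dice(a: set, b: set) -> float:
    """Sørensen-Dice index of two sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical author sets of two conferences, for illustration only.
authors_x = {"alice", "bob", "carol", "dave"}
authors_y = {"carol", "dave", "erin", "frank"}

print(sorensen_dice(authors_x, authors_y))  # 0.5
```

The index is 1 when the two conferences have exactly the same authors and 0 when they share none.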

Similarity among sets of authors

  • The German approach
    • Fixed bound on number of accepted papers
  • The EC approach
    • Growth is the only thing that matters
  • The Aristotelian approach
    • In medio stat virtus
  • SIROCCO is German

Evolution of number of papers

  • Almost all conferences (e.g. STOC) seem to prefer papers with two authors
    • SIROCCO too
  • One conference (CSL) seems to prefer single-author papers to multiple-author ones
  • One (TACAS) seems to prefer even five-author papers to single-author ones
  • Largest number of co-authors on a paper: 24 (TACAS)
  • Smallest per-conference maximum number of co-authors: 7 (CSL and LICS)
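
The figures above come from counting, for each conference, how many papers have a given number of authors. A minimal sketch of that count, assuming each paper is given simply as a list of author names (the sample data is hypothetical):

```python
from collections import Counter


def size_distribution(papers):
    """Map 'number of authors' to 'number of papers with that many authors'."""
    return Counter(len(set(authors)) for authors in papers)


# Hypothetical papers of one conference, each given as its author list.
papers = [["a"], ["a", "b"], ["b", "c"], ["a", "b", "c"]]

dist = size_distribution(papers)
preferred_size = dist.most_common(1)[0][0]  # most frequent number of authors
max_size = max(dist)                        # largest co-authorship set

print(dict(dist), preferred_size, max_size)
```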

Evolution of co-authorship set sizes

  • During the evolution of SIROCCO, papers with two, three, four, and even five authors have become more popular than single-author papers
  • Similar results for the other conferences
  • Paper with 13 authors
    • Bemmann, Biermeier, Bürmann, Kemper, Knollmann, Knorr, Kothe, Mäcker, Malatyali, Meyer auf der Heide, Riechers, Schaefer, Sundermeier: Monitoring of Domain-Related Problems in Distributed Data Streams

Evolution of co-authorship set sizes

  • Red: new authors
  • Blue: fully new authors
    • Authors whose first paper at the conference has no co-author who had already published at the conference
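
A minimal sketch of how the two sets could be computed per year. It assumes papers are given as (year, authors) pairs, that a new author is one who appears at the conference for the first time in that year, and that an author with several papers in the first year counts as fully new only if none of those papers has a co-author from an earlier year. That last tie-breaking rule is an assumption, not something the slides state.

```python
from collections import defaultdict


def new_and_fully_new(papers):
    """papers: iterable of (year, [authors]).

    Returns {year: (new_authors, fully_new_authors)} with both values as sets.
    """
    by_year = defaultdict(list)
    for year, authors in papers:
        by_year[year].append(set(authors))

    seen = set()  # authors who published at the conference in an earlier year
    result = {}
    for year in sorted(by_year):
        year_papers = by_year[year]
        this_year = set().union(*year_papers)
        new = this_year - seen
        # New authors introduced by a co-author who had already published.
        introduced = set()
        for authors in year_papers:
            if authors & seen:
                introduced |= authors - seen
        result[year] = (new, new - introduced)
        seen |= this_year
    return result


# Hypothetical data, for illustration only.
papers = [(1994, ["a", "b"]), (1995, ["a", "c"]), (1995, ["d", "e"])]
stats = new_and_fully_new(papers)
print(stats[1995])  # new: c, d, e; fully new: d, e
```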

New and fully new authors

  • Mostly by querying the genderize.io web service
    • Based on first names
  • Partly by manually searching on the web
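
A hedged sketch of the first step. It assumes the public genderize.io endpoint `https://api.genderize.io?name=...`, which returns JSON with `gender` and `probability` fields; the current API documentation should be checked before relying on this. The 0.9 threshold and the rule of returning `None` to trigger a manual search are assumptions made for illustration, not the procedure used for the talk.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://api.genderize.io"


def query_first_name(first_name):
    """Fetch the genderize.io prediction for one first name (needs network)."""
    with urlopen(f"{API}?{urlencode({'name': first_name})}") as resp:
        return json.load(resp)


def label_from_response(payload, min_probability=0.9):
    """Return 'female' or 'male', or None when a manual check is needed.

    min_probability is a hypothetical threshold, not taken from the slides.
    """
    gender = payload.get("gender")
    probability = payload.get("probability") or 0.0
    if gender in ("female", "male") and probability >= min_probability:
        return gender
    return None


# Offline example with hand-written payloads in the assumed response shape.
print(label_from_response({"name": "anna", "gender": "female", "probability": 0.98}))
print(label_from_response({"name": "xyz", "gender": None, "probability": 0.0}))
```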

Female/male label assignment

Evolution of female percentage

  • Static (undirected) graph
    • Nodes: authors who presented at least one paper at the conference
    • Edges: \((a_1,a_2)\) if \(a_1\) and \(a_2\) co-authored at least one paper (not necessarily presented at the conference)
    • In other words: the subgraph of the DBLP graph induced by the set of conference authors
  • Temporal graph
    • Nodes: same as above
    • Edges: \((a_1,a_2,y)\) if \(a_1\) and \(a_2\) co-authored at least one paper (not necessarily presented at the conference) in year \(y\)
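
The two definitions above can be sketched as follows. The input format (a list of (year, authors) pairs for all DBLP papers, plus the set of conference authors) is an assumption made for illustration; undirected edges are stored with their endpoints in sorted order.

```python
from itertools import combinations


def build_graphs(dblp_papers, conference_authors):
    """Return (static_edges, temporal_edges) restricted to conference authors.

    static_edges:   set of (a1, a2) pairs with a1 < a2
    temporal_edges: set of (a1, a2, year) triples with a1 < a2
    """
    static_edges = set()
    temporal_edges = set()
    for year, authors in dblp_papers:
        # Keep only authors who are nodes of the conference graph.
        inside = sorted(set(authors) & conference_authors)
        for a1, a2 in combinations(inside, 2):
            static_edges.add((a1, a2))
            temporal_edges.add((a1, a2, year))
    return static_edges, temporal_edges


# Hypothetical data: "x" never published at the conference, so it is dropped.
dblp_papers = [(2001, ["a", "b", "x"]), (2003, ["a", "b"]), (2003, ["b", "c"])]
static_edges, temporal_edges = build_graphs(dblp_papers, {"a", "b", "c"})
print(sorted(static_edges))
print(sorted(temporal_edges))
```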

Collaboration graphs

  • The graphs are sparse and not connected, but each has a giant connected component

\(|E| \propto |V|^\alpha\)

| Conference | N. nodes | N. edges | Density | LCC | \(\alpha\) |
|---|---|---|---|---|---|
| CAV | 2733 | 17077 | 0.0046 | 0.98 | 1.31 |
| CRYPTO | 1988 | 14323 | 0.0073 | 0.95 | 1.40 |
| CSL | 1455 | 5560 | 0.0053 | 0.95 | 1.27 |
| DISC | 1602 | 8165 | 0.0064 | 0.97 | 1.29 |
| ESA | 2868 | 22761 | 0.0055 | 0.99 | 1.28 |
| ESOP | 1367 | 6368 | 0.0068 | 0.97 | 1.35 |
| EUROCRYPT | 1821 | 12552 | 0.0076 | 0.94 | 1.44 |
| FOCS | 3346 | 27021 | 0.0048 | 0.96 | 1.63 |
| ICALP | 4837 | 37511 | 0.0032 | 0.98 | 1.47 |
| LICS | 1953 | 9886 | 0.0052 | 0.98 | 1.28 |
| PODC | 2393 | 13000 | 0.0045 | 0.96 | 1.30 |
| POPL | 1979 | 9691 | 0.0050 | 0.96 | 1.34 |
| SIROCCO | 923 | 4858 | 0.0114 | 0.98 | 1.28 |
| SODA | 4173 | 36097 | 0.0041 | 0.99 | 1.28 |
| STACS | 2740 | 16469 | 0.0044 | 0.97 | 1.32 |
| STOC | 3192 | 27199 | 0.0053 | 0.98 | 1.53 |
| TACAS | 2239 | 13901 | 0.0055 | 0.98 | 1.26 |
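
The Density values are consistent with the usual formula for undirected graphs, 2|E| / (|V|(|V| - 1)); the SIROCCO row is checked below. The exponent \(\alpha\) in \(|E| \propto |V|^\alpha\) can be estimated as the slope of a least-squares line through the points (log |V|, log |E|). The sketch assumes those points come from successive snapshots of a conference graph, which the slides do not spell out, and the snapshot data used here is synthetic.

```python
import math


def density(n_nodes, n_edges):
    """Density of a simple undirected graph: 2|E| / (|V| (|V| - 1))."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def fit_alpha(snapshots):
    """Least-squares slope of log|E| against log|V|.

    snapshots: list of (n_nodes, n_edges) pairs, e.g. one per year (assumed).
    """
    xs = [math.log(v) for v, _ in snapshots]
    ys = [math.log(e) for _, e in snapshots]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# SIROCCO row of the table: 923 nodes, 4858 edges.
print(round(density(923, 4858), 4))  # 0.0114

# Synthetic snapshots that follow |E| = |V|^1.3 exactly.
synthetic = [(v, v ** 1.3) for v in (100, 200, 400, 800)]
print(round(fit_alpha(synthetic), 2))  # 1.3
```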

Static graph statistics

  • Temporal closeness
    • Area under the temporal harmonic closeness plot

Temporal closeness

  • Similarity between two conferences: intersection between their sets of top-50 authors with respect to temporal closeness
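
A minimal sketch of this comparison, assuming each conference comes with a precomputed mapping from author to temporal closeness. The scores below are hypothetical, ties are broken alphabetically (an arbitrary choice), and k is reduced to 2 to keep the example small.

```python
def top_k(scores, k=50):
    """Set of the k authors with the highest score (ties broken by name)."""
    ranked = sorted(scores, key=lambda author: (-scores[author], author))
    return set(ranked[:k])


def top_k_overlap(scores_x, scores_y, k=50):
    """Number of authors shared by the two top-k sets."""
    return len(top_k(scores_x, k) & top_k(scores_y, k))


# Hypothetical temporal closeness values for two conferences.
scores_x = {"a": 0.9, "b": 0.8, "c": 0.1}
scores_y = {"b": 0.7, "c": 0.6, "a": 0.2}

print(top_k_overlap(scores_x, scores_y, k=2))  # 1
```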

Similarity based on temporal closeness

  • Degree of membership in a community, based on similarity with conference anchors (same anchors as in the word clouds)
    • Similarity based on words appearing in titles of papers
    • If you published a paper at ICALP, you can try...

TCS communities

  • SIROCCO represented by all its authors who published at least one paper at ICALP
    • Biased...
  • SIROCCO is the place where algorithm researchers like to play the role of distributed computing researchers

SIROCCO and TCS communities

  • P. Crescenzi, C. Magnien, A. Marino: Finding Top-k Nodes for Temporal Closeness in Large Temporal Graphs. Algorithms 13(9): 211, 2020.
  • The dblp team: dblp computer science bibliography (monthly snapshot release of March 2022).
  • genderize.io
  • Google Books Ngram Viewer
  • J. Huang, A.J. Gates, R. Sinatra, A.-L. Barabási: Historical comparison of gender inequality in scientific careers across countries and disciplines. Proc Natl Acad Sci, 117(9):4609-4616, 2020.
  • J. Leskovec, J.M. Kleinberg, C. Faloutsos: Graph evolution: Densification and shrinking diameters. ACM Trans. Knowl. Discov. Data 1(1): 2, 2007.
  • Plotly Julia Library.
  • T. Sørensen: A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons. Kongelige Danske Videnskabernes Selskab, 5 (4): 1–34, 1948.
  • L.R. Dice: Measures of the Amount of Ecologic Association Between Species. Ecology, 26 (3): 297–302, 1945.
pierluigi.crescenzi@gssi.it

References and web site