Social and Political Data Science: Introduction

Computational Social Science

Karl Ho

School of Economic, Political and Policy Sciences

University of Texas at Dallas


Computational social science ≠ computer science + social data 

  • What is Computational Social Science?

    • Emerging interdisciplinary field that combines social science theories and methods with computational techniques to study complex social phenomena.

    • Featuring large-scale data analysis and computer simulations to explore social dynamics, understand human behavior, and test social theories.

    • Employing data from social media, online networks, digital archives, and administrative records to uncover patterns, trends, and insights.

  • How does CSS work?

    • Introducing computational tools and algorithms that can handle the complexities of social data, such as network analysis, text mining, machine learning, and agent-based modeling.

    • Transforming social science research, offering new ways to address longstanding questions, discover novel patterns, and gain deeper insights into social phenomena.


  • Challenges

    • Ethical considerations related to data privacy, bias, and algorithmic transparency

    • Interdisciplinary collaboration and methodological rigor



  • CSS needs new training and education programs to equip researchers with the skills and knowledge to use computational techniques effectively.

  • New Challenges

    • Misalignment of universities
      • Integration of social science with computer science and data science is slow
      • Parochialism
        • Data Science in traditional STEM disciplines does not recognize the structure and generation of new big/social data
      • Multidisciplinary research may be less well recognized and rewarded.
    • Inadequate Data-Sharing Paradigm
      • Facebook, Twitter limit data access (API changes, blocks to data collection)
      • Found data with an unclear data-generating process (DGP)
      • Big tech gags research/academic freedom
  • Suggestions

    • Strengthen collaboration
    • New data structures
    • Factor ethical, legal, and social implications into data design
    • Reorganize higher-education institutions/universities

What is the difference between CSS and Data Science?

  • CSS uses Data Science methods and tools to study Social Science questions and solve social and political problems:

    • Machine Learning

    • Collection of Big/Social data

    • Analytics: visualization, data/information management

How about SDAR?

  • Social Data Analytics and Research is:

    • Data Science 

    • Also CSS on computational part

    • Interdisciplinary by design, not limited to Social Science

      • Causal Inference, Methods, Forecasting (Economics)

      • Social and Political studies (Sociology, Political Science)

      • Policy studies (PPPE, PNM)

      • Spatial analysis (GIS)

Some key terms

  • Algorithm

  • Artificial Intelligence (AI)

  • Computational thinking

  • Computational modeling

  • Parallel computing

  • Machine learning

  • Deep learning (NN, CNN, RNN)

  • NLP (Natural Language Processing)

  • Language Models or Large Language Models (LLM)

  • Agent-based Modeling

ML, DL, NLP and AI

CNN Model in Action


Parallel Computing: Amdahl's law

Amdahl's Law: 

$$S = \frac{1}{(1-P)+\frac{P}{N}}$$


- S is the speedup of the system.
- P is the proportion of the system that can be improved (in parallel computing, the share of the workload that can run in parallel).
- N is the number of processors used in the system. 

Amdahl's Law gives the theoretical speedup of a system when only a portion of it can be improved, given that proportion and the number of processors used. The remaining portion, 1 - P, still runs at its original speed, so it limits the overall gain no matter how many processors are added.
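
As a rough illustration of the formula, the minimal Python sketch below evaluates S for several processor counts. The function names `amdahl_speedup` and `max_speedup` and the value P = 0.9 are assumptions chosen for this example and do not come from the slides.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Theoretical speedup S = 1 / ((1 - P) + P / N)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


def max_speedup(p: float) -> float:
    """Upper bound on S as N grows without limit: 1 / (1 - P)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1) for a finite bound")
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    p = 0.9  # assumed example: 90% of the workload can run in parallel
    for n in (1, 2, 4, 8, 16, 64):
        print(f"N={n:>3}  S={amdahl_speedup(p, n):.2f}")
    print(f"Upper bound for P={p}: {max_speedup(p):.2f}")
```

With P = 0.9, the speedup can never exceed 1 / (1 - 0.9) = 10, however many processors are used, because the serial 10% of the work always runs at its original speed.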

Amdahl, Gene M. "Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities." Proceedings of the April 18-20, 1967, Spring Joint Computer Conference. 1967.

Parallel Computing: Amdahl's law

  • doSNOW has the advantage of working on both Windows and Mac OS X.

R packages: doSNOW