Social and Political Data Science: Introduction

Karl Ho

University of Texas at Dallas

Artificial Intelligence and Sustainability Governance: Integrating Public Voices for \(CO_2\) Management and Energy Policies using GPT and Generative Data Methods

Prepared for presentation at the 15th International Conference and Practical Forum on Public Governance – Collaborative Governance for Sustainability, Resilience, and Social Transformation, National Chung Hsing University, Taichung, Taiwan, December 8-10, 2023

Speaker bio.

Climate change and air quality take center stage in public discourse, particularly at election time. Taiwan voters, for instance, are eager to examine presidential candidates' energy and \(CO_2\) management policies when making their electoral decisions.

Introduction

 This project proposes a sophisticated AI-driven approach to evaluate and predict public support for \(CO_2\) management and energy policies. By integrating Generative Pre-trained Transformers (GPT) with generative data methods, we aim to process and interpret large-scale public opinion data.

Source: Taiwan News

Source: Executive Yuan

The design of this study is composed of three elements:

  • Big data

  • Generative Pre-trained Transformer (GPT) models

  • Predictive Scenario Analysis

It begins with the collection of diverse datasets to build text corpora, followed by extensive data preprocessing and feature engineering, and ends with predictive models and scenario analysis.

Design

The use of GPT models facilitates a deep contextual understanding of public sentiments and perspectives. In parallel, generative data methods, particularly Generative Adversarial Networks (GANs), are employed for scenario analysis.

GPT models

Generative data methods generate synthetic data representing various hypothetical scenarios, enabling the exploration of how public opinion might shift in response to different policy changes. Sentiment analysis and opinion mining are then applied to gauge public support or opposition and to identify specific policy aspects that drive public opinion.

Synthetic data

Recent studies demonstrate that GPT can be used to generate "virtual populations" through "silicon sampling" to simulate targeted human responses (Argyle et al. 2023).

Synthetic data

The study also proposes including predictive modeling and scenario forecasting, using the developed models to predict future trends in public opinion and to forecast how these might evolve under different policy scenarios. This approach not only offers a comprehensive understanding of current public sentiments but also equips policymakers with tools to align policies more closely with public preferences and concerns, thereby enhancing the effectiveness of environmental policy-making.

Predictive models and scenario forecasting

The following outlines the steps in building a GPT-based policy advising system to more accurately integrate public opinion in policy making:

  1. Data collection: Corpora building
  2. Data Preprocessing and Feature Engineering
  3. Application of GPT for Contextual Analysis
  4. Sentiment Analysis and Opinion Mining
  5. Generative Data Modeling for Scenario Analysis

Steps

Gather large datasets of public opinion data related to \(CO_2\) management and energy policies, including:

  • Social media posts
  • Survey responses
  • Forum discussions
    • PTT
    • DCard
  • News comments (e.g. editorials)

Step 1: Corpora building
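The corpus-building step can be sketched as follows. This is a minimal illustration, not the project's pipeline: the `Document` fields, the `build_corpus` helper, and the input record layout (dicts with `source`, `posted_at`, and `text` keys) are all assumptions made for the example. It normalizes whitespace, drops empty posts, and removes exact duplicates across sources.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """One corpus entry (hypothetical schema for illustration)."""
    source: str     # e.g. "PTT", "DCard", "survey", "news"
    posted_at: str  # ISO 8601 date string
    text: str


def normalize(text: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(text.split())


def build_corpus(raw_records):
    """Return deduplicated Document objects from raw dict records."""
    seen = set()
    corpus = []
    for rec in raw_records:
        text = normalize(rec.get("text", ""))
        if not text:
            continue  # skip empty posts
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # skip exact duplicates (e.g. cross-posts)
        seen.add(digest)
        corpus.append(Document(rec["source"], rec["posted_at"], text))
    return corpus
```

Hashing the normalized text keeps the first occurrence of each post, so a message cross-posted to two forums is counted once.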

  • Preprocess the data for NLP tasks.

  • Perform feature engineering to identify key variables influencing public opinion, such as:

    • sentiments

    • slow onset events and extreme weather events

    • frequency of policy mentions

    • other factors such as demographics

  • Utilize NLP techniques to extract and construct meaningful features from unstructured data.

Step 2: Data Preprocessing and Feature Engineering
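A minimal sketch of the feature-engineering step, assuming English text and two hypothetical keyword lists (`POLICY_TERMS`, `WEATHER_TERMS`) chosen only for illustration. Real PTT or DCard posts are mostly in Chinese and would need a word segmenter in place of the simple regular-expression tokenizer used here.

```python
import re

# Hypothetical keyword lists; a real study would curate these carefully.
POLICY_TERMS = ("carbon tax", "nuclear", "solar", "wind", "coal")
WEATHER_TERMS = ("typhoon", "drought", "flood", "heat wave")


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def count_terms(normalized, terms):
    """Count whole-word occurrences of each (possibly multi-word) term."""
    return {
        term: len(re.findall(r"\b" + re.escape(term) + r"\b", normalized))
        for term in terms
    }


def extract_features(text):
    """Return simple count-based features for one document."""
    tokens = tokenize(text)
    normalized = " ".join(tokens)
    policy = count_terms(normalized, POLICY_TERMS)
    weather = count_terms(normalized, WEATHER_TERMS)
    return {
        "n_tokens": len(tokens),
        "policy_mentions": sum(policy.values()),
        "weather_mentions": sum(weather.values()),
        "policy_counts": policy,
    }
```

Sentiment scores and demographic variables, where available, could be added to the same feature dictionary.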

  • Use GPT-based models for deep contextual analysis of public opinions, understanding underlying sentiments and perspectives.

  • Build context for conditioning in silicon sampling

Step 3: Application of GPT for Contextual Analysis
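Building context for conditioning can be illustrated with a small prompt constructor. This is a sketch in the spirit of silicon sampling, assuming a first-person backstory format; the `Profile` fields, the wording of the backstory, and the example question are hypothetical. No model is called: the resulting string would be passed to whichever GPT model the project adopts.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical respondent attributes used for conditioning."""
    age: int
    region: str
    education: str
    party_leaning: str


def build_backstory(p: Profile) -> str:
    """Render the profile as a short first-person backstory."""
    return (
        f"I am {p.age} years old. I live in {p.region}. "
        f"My highest level of education is {p.education}. "
        f"Politically, I consider myself {p.party_leaning}."
    )


def build_prompt(p: Profile, question: str) -> str:
    """Combine backstory and survey question into one conditioning prompt."""
    return f"{build_backstory(p)}\nQuestion: {question}\nMy answer:"


if __name__ == "__main__":
    respondent = Profile(42, "Taichung", "a university degree", "an independent")
    print(build_prompt(respondent, "Should coal power be phased out by 2030?"))
```

Sampling many profiles from known population margins and collecting one completion per prompt would yield a synthetic respondent pool that can be compared against real survey data.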

  • Conduct temporal sentiment analysis to gauge public support or opposition over time.

  • Use opinion mining to identify specific policy aspects that are most influential in public opinion.

Step 4: Sentiment Analysis and Opinion Mining
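Temporal sentiment analysis can be sketched as a per-month average of document scores. The tiny word lists below are placeholders that stand in for a real sentiment model (a GPT-based classifier, for example); only the aggregation over time is the point of the example.

```python
from collections import defaultdict

# Placeholder lexicons, for illustration only.
POSITIVE = {"support", "good", "clean", "benefit"}
NEGATIVE = {"oppose", "bad", "dirty", "risk"}


def score(text):
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def monthly_sentiment(posts):
    """posts: iterable of (iso_date, text). Returns {YYYY-MM: mean score}."""
    buckets = defaultdict(list)
    for date, text in posts:
        buckets[date[:7]].append(score(text))  # group by year and month
    return {month: sum(v) / len(v) for month, v in sorted(buckets.items())}
```

Plotting the monthly means against policy announcements or extreme weather events gives a first view of how support moves over time.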

  • Employ generative models such as GANs to create synthetic data representing hypothetical public opinion scenarios under different policy conditions, especially in contexts with limited data (few-shot learning).

  • Analyze how public opinion might shift in response to changes in \(CO_2\) management and energy policies.

Step 5: Generative Data Modeling for Scenario Analysis
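The scenario-analysis idea can be illustrated with a deliberately simple stand-in. The sketch below is not a GAN: it swaps in a parametric generator (a Beta distribution) so the example stays self-contained. The baseline support level, the per-scenario shifts, and the `concentration` parameter are hypothetical inputs, not estimates from any data.

```python
import random
import statistics


def generate_support(n, baseline, shift, concentration=10.0, seed=0):
    """Draw n synthetic support scores in [0, 1].

    Scores follow a Beta distribution whose mean is baseline + shift,
    clipped to [0.01, 0.99]. All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    mean = min(max(baseline + shift, 0.01), 0.99)
    alpha = mean * concentration
    beta = (1.0 - mean) * concentration
    return [rng.betavariate(alpha, beta) for _ in range(n)]


def compare_scenarios(baseline, scenarios, n=2000):
    """Return the mean synthetic support under each named scenario."""
    return {
        name: statistics.mean(generate_support(n, baseline, shift))
        for name, shift in scenarios.items()
    }


if __name__ == "__main__":
    # Hypothetical scenarios: shift in mean support relative to baseline.
    print(compare_scenarios(0.5, {"status_quo": 0.0, "subsidy": 0.1}))
```

In the proposed design, a trained generative model would replace `generate_support`, with the policy condition supplied as an input to the generator.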

  • This proposal lays out a roadmap toward a comprehensive understanding of public sentiment and opinion movements

  • Challenges:
    • Fine-tuning
      • High computational cost of model fine-tuning
      • Spurious features in training data (e.g. fake news, biased surveys)

Expected Outcomes and Challenges

  • Project Significance:  

    • New AI technology allows a better understanding of public opinion via GPT and predictive scenario analysis

    • It can significantly enhance environmental policy-making

    • It helps stakeholders align policies with real-time public preferences

Conclusion 

  • Q: Why not just use surveys?

  • A: Surveys are snapshots of public opinion, which can change over time and across scenarios (e.g. ECFA). GPT-based methods scale and improve over time, giving consistent metrics for evaluating biases and for distinguishing stable from precarious public opinion.

  • Q: Can A.I. methods be absolutely correct?

  • A: There is always a part of public opinion we can never fully measure or understand.  With AI assistance, we will learn about this sub-population better and better over time. 

Q & A

  1. AI-driven Data Collection

  2. Data Preprocessing and Feature Engineering

  3. Application of GPT for Contextual Analysis

  4. Generative Data Modeling for Scenario Analysis

  5. Sentiment Analysis and Opinion Mining

  6. Predictive Modeling and Scenario Forecasting

  7. Interactive Dashboard and Visualization

Directions

Thank you!

NCHU & UTD Dual degree in Data Science

Talk: Artificial Intelligence and Sustainability Governance: Integrating Public Voices for \(CO_2\) Management and Energy Policies using GPT and Generative Data Methods
