Social and Political Data Science: Introduction

Karl Ho

University of Texas at Dallas

Artificial Intelligence and Sustainability Governance: Integrating Public Voices for \(CO_2\) Management and Energy Policies using GPT and Generative Data Methods

Prepared for presentation at the 15th International Conference and Practical Forum on Public Governance – Collaborative Governance for Sustainability, Resilience, and Social Transformation, National Chung Hsing University, Taichung, Taiwan, December 8-10, 2023

Karl Ho is:
- Professor of Instruction at University of Texas at Dallas (UTD) School of Economic, Political and Policy Sciences (EPPS)
- Co-founder of the UTD Social Data Analytics and Research program (SDAR)
- Founder of DataGeneration.io
- Author of Data Programming
- Co-Principal Investigator of the Hong Kong Election Study project
- Website: karlho.com (talks, lectures, publications)

Speaker bio.

Climate change and air quality have become center-stage topics in public discourse, particularly during elections. Taiwan voters, for instance, are eager to examine presidential candidates' energy and \(CO_2\) management policies when making their electoral decisions.

Introduction

This project proposes a sophisticated AI-driven approach to evaluate and predict public support for \(CO_2\) management and energy policies. By integrating Generative Pre-trained Transformers (GPT) with generative data methods, we aim to process and interpret large-scale public opinion data.

Source: Taiwan News

Source: Executive Yuan

The design of this study is composed of three elements:

- Big data
- Generative Pre-trained Transformer (GPT) models
- Predictive scenario analysis

The study begins with the collection of diverse datasets to build text corpora, followed by extensive data preprocessing and feature engineering, and then predictive modeling with scenario analysis.

Design

The use of GPT models facilitates a deep contextual understanding of public sentiments and perspectives. In parallel, generative data methods, particularly Generative Adversarial Networks (GANs), are employed for scenario analysis.

GPT models

Generative data methods generate synthetic data representing various hypothetical scenarios, enabling the exploration of how public opinion might shift in response to different policy changes. Sentiment analysis and opinion mining are then applied to gauge public support or opposition and to identify specific policy aspects that drive public opinion.

Synthetic data

Recent studies show that GPT can be used to generate "virtual populations" through "silicon sampling", which simulates the responses of targeted human groups (Argyle et al. 2023).

Synthetic data


The study also proposes including predictive modeling and scenario forecasting, using the developed models to predict future trends in public opinion and to forecast how these might evolve under different policy scenarios. This approach not only offers a comprehensive understanding of current public sentiments but also equips policymakers with tools to align policies more closely with public preferences and concerns, thereby enhancing the effectiveness of environmental policy-making.

Predictive models and scenario forecasting
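As a rough illustration of the trend-forecasting idea, the sketch below fits an ordinary least-squares line to a short series of monthly support shares and extrapolates one month ahead. The function names and the numbers are hypothetical placeholders, not results from the study, and a real system would likely use a richer model than a straight line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_share(xs, ys, x_new):
    """Extrapolate the fitted line and clamp the result to a valid share."""
    slope, intercept = fit_line(xs, ys)
    return min(1.0, max(0.0, intercept + slope * x_new))


# Hypothetical monthly share of posts supporting a policy (illustration only).
months = [0, 1, 2, 3]
support = [0.40, 0.42, 0.44, 0.46]

slope, intercept = fit_line(months, support)
next_month = forecast_share(months, support, 4)
print(f"slope={slope:.3f}, forecast for month 4={next_month:.3f}")
```

Clamping keeps the extrapolated value inside the 0 to 1 range, which a plain linear fit does not guarantee on its own.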

The following outlines the steps in building a GPT-based policy advising system to more accurately integrate public opinion in policy making:

Data collection: Corpora building
Data Preprocessing and Feature Engineering
Application of GPT for Contextual Analysis
Sentiment Analysis and Opinion Mining
Generative Data Modeling for Scenario Analysis

Steps

Gather large datasets of public opinion data related to \(CO_2\) management and energy policies, including:

- Social media posts
- Survey responses
- Forum discussions
  - PTT
  - DCard
- News comments (e.g. editorials)

Step 1: Corpora building
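One way to picture the corpus-building step is as a small pipeline that tags each text with its source and date and drops empty or duplicated entries. The sketch below is a minimal illustration under assumptions: the `Document` schema, the source labels, and the example texts are all hypothetical, and real collection from PTT, DCard, or survey files would need its own ingestion code.

```python
from dataclasses import dataclass
from datetime import date
import hashlib


@dataclass(frozen=True)
class Document:
    """One public-opinion text with minimal provenance (hypothetical schema)."""
    source: str       # e.g. "ptt", "dcard", "survey", "news_comment"
    published: date
    text: str


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so near-identical reposts match."""
    return " ".join(text.split()).lower()


def build_corpus(docs):
    """Keep the first copy of each text; drop empty texts and exact repeats."""
    seen = set()
    corpus = []
    for doc in docs:
        cleaned = normalize(doc.text)
        if not cleaned:
            continue
        key = hashlib.sha256(cleaned.encode("utf-8")).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        corpus.append(doc)
    return corpus


# Made-up example records (illustration only).
raw = [
    Document("ptt", date(2023, 10, 1), "Carbon fees should start sooner."),
    Document("dcard", date(2023, 10, 2), "carbon fees   should start sooner."),
    Document("survey", date(2023, 10, 3), "   "),
    Document("news_comment", date(2023, 10, 4), "Nuclear power is still needed."),
]
corpus = build_corpus(raw)
print([doc.source for doc in corpus])
```

Keeping the source and date on every record matters later, because the temporal sentiment analysis and any source-level bias checks depend on them.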

Preprocess the data for NLP tasks:
- Perform feature engineering to identify key variables influencing public opinion, such as:
  - sentiments
  - slow-onset events and extreme weather events
  - frequency of policy mentions
  - other factors such as demographics
- Utilize NLP techniques to extract and construct meaningful features from unstructured data.

Step 2: Data Preprocessing and Feature Engineering
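The feature engineering described above could start from simple counts, for example how often a text mentions each policy or an extreme weather event. The following sketch assumes hypothetical English keyword lists; an actual study of Taiwanese sources would need curated Chinese-language lexicons and proper tokenization.

```python
import re

# Hypothetical keyword patterns, for illustration only.
POLICY_TERMS = {
    "carbon_fee": [r"carbon fee", r"carbon tax"],
    "nuclear": [r"nuclear"],
    "renewables": [r"solar", r"wind power", r"renewable"],
}
WEATHER_TERMS = [r"typhoon", r"heat ?wave", r"drought", r"flood"]


def count_matches(patterns, text):
    """Total number of case-insensitive matches over all patterns."""
    return sum(len(re.findall(p, text, flags=re.IGNORECASE)) for p in patterns)


def extract_features(text):
    """Turn one raw text into a flat dictionary of count features."""
    features = {}
    for policy, patterns in POLICY_TERMS.items():
        features[f"mentions_{policy}"] = count_matches(patterns, text)
    features["mentions_extreme_weather"] = count_matches(WEATHER_TERMS, text)
    features["n_tokens"] = len(text.split())
    return features


post = "After the typhoon and the drought, a carbon fee and more solar make sense."
features = extract_features(post)
print(features)
```

Dictionaries like this can be stacked into a table, one row per text, and joined with sentiment scores and demographic variables where those are available.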

Use GPT-based models for deep contextual analysis of public opinions, understanding underlying sentiments and perspectives.
Build the context used to condition the model in silicon sampling.

Step 3: Application of GPT for Contextual Analysis
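To make "context for conditioning" concrete, the sketch below assembles a first-person backstory prompt from a few respondent attributes, loosely in the spirit of the silicon sampling idea cited earlier. The `Persona` fields, the wording of the backstory, and the example values are assumptions made for illustration; the code only builds the prompt text and does not call any language model.

```python
from dataclasses import dataclass


@dataclass
class Persona:
    """Attributes used to condition a simulated respondent (hypothetical)."""
    age: int
    region: str
    occupation: str
    top_concern: str


def build_conditioning_prompt(persona, question):
    """Compose a first-person backstory followed by the survey question."""
    backstory = (
        f"I am {persona.age} years old and live in {persona.region}. "
        f"I work as a {persona.occupation}. "
        f"The issue I worry about most is {persona.top_concern}."
    )
    return f'{backstory}\nWhen asked "{question}", I answer:'


persona = Persona(age=34, region="Taichung", occupation="teacher",
                  top_concern="air quality")
prompt = build_conditioning_prompt(
    persona, "Do you support introducing a carbon fee?"
)
print(prompt)
```

Sampling many personas from a known population distribution, and sending each prompt to a model, is what would turn this into a "virtual population"; that part is left out here.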

Conduct temporal sentiment analysis to gauge public support or opposition over time.
Use opinion mining to identify specific policy aspects that are most influential in public opinion.

Step 4: Sentiment Analysis and Opinion Mining
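Temporal sentiment analysis can be sketched as two pieces: a scorer that maps each text to a number, and an aggregation over time. The example below uses a tiny hypothetical word list as a stand-in scorer (a GPT-based or trained classifier would replace it in practice) and averages the scores by month; the posts are invented for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Toy lexicon, for illustration only.
POSITIVE = {"support", "good", "needed", "agree"}
NEGATIVE = {"oppose", "bad", "expensive", "unfair"}


def sentiment_score(text):
    """Score in [-1, 1]: (positive - negative) / (positive + negative)."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def monthly_sentiment(posts):
    """Average sentiment per (year, month), in chronological order."""
    buckets = defaultdict(list)
    for published, text in posts:
        buckets[(published.year, published.month)].append(sentiment_score(text))
    return {month: mean(scores) for month, scores in sorted(buckets.items())}


posts = [
    (date(2023, 9, 5), "I support the carbon fee, it is needed."),
    (date(2023, 9, 20), "The carbon fee is unfair and expensive."),
    (date(2023, 10, 2), "Good plan, I agree."),
]
trend = monthly_sentiment(posts)
print(trend)
```

Separating the scorer from the aggregation means the same monthly series can be recomputed when a better sentiment model becomes available.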

Employ generative models such as GANs to create synthetic data representing hypothetical public opinion scenarios under different policy conditions, especially in contexts with limited data (few-shot learning).
Analyze how public opinion might shift in response to changes in \(CO_2\) management and energy policies.

Step 5: Generative Data Modeling for Scenario Analysis
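Once synthetic responses exist for each policy scenario, the comparison itself is straightforward. The sketch below is not a GAN and does not generate anything; it assumes the responses have already been produced by some generator and shows only how the shift in support relative to a baseline, with a bootstrap interval for uncertainty, might be computed. The scenario names and response counts are hypothetical.

```python
import random


def support_share(responses):
    """Fraction of responses labelled 'support'."""
    return sum(r == "support" for r in responses) / len(responses)


def bootstrap_interval(responses, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the support share."""
    rng = random.Random(seed)
    shares = sorted(
        support_share(rng.choices(responses, k=len(responses)))
        for _ in range(n_resamples)
    )
    low = shares[int((alpha / 2) * n_resamples)]
    high = shares[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


# Hypothetical synthetic responses per scenario (illustration only).
scenarios = {
    "baseline": ["support"] * 9 + ["oppose"] * 11,
    "higher_carbon_fee": ["support"] * 7 + ["oppose"] * 13,
    "fee_with_rebate": ["support"] * 12 + ["oppose"] * 8,
}

baseline = support_share(scenarios["baseline"])
shifts = {name: support_share(r) - baseline for name, r in scenarios.items()}
intervals = {name: bootstrap_interval(r) for name, r in scenarios.items()}

for name in scenarios:
    low, high = intervals[name]
    print(f"{name}: shift={shifts[name]:+.2f}, interval=({low:.2f}, {high:.2f})")
```

With few synthetic responses per scenario the intervals will be wide, which is a useful reminder that small simulated shifts should not be over-interpreted.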

This proposal lays out a future roadmap for achieving a comprehensive understanding of public sentiments and opinion movements.
Challenges:
- Fine-tuning
  - High computational cost of fine-tuning the model
  - Spurious features in the training data (e.g. fake news, biased surveys)

Expected Outcomes and Challenges

Project Significance:
- New AI technology allows a better understanding of public opinion via GPT and predictive scenario analysis.
- It has the potential to significantly enhance environmental policy-making.
- It helps stakeholders align policies with true, real-time public preferences.

Conclusion

Q: Why not just use surveys?
A: Surveys are snapshots of public opinion, which can change over time and across scenarios (e.g. the Economic Cooperation Framework Agreement, ECFA). GPT-based methods can scale and improve over time, giving consistent metrics for evaluating biases and for distinguishing stable from precarious public opinion.
Q: Can A.I. methods be absolutely correct?
A: There is always a part of public opinion we can never fully measure or understand. With AI assistance, we can learn more about this hard-to-measure part over time.

Q & A

AI-driven Data Collection
Data Preprocessing and Feature Engineering
Application of GPT for Contextual Analysis
Generative Data Modeling for Scenario Analysis
Sentiment Analysis and Opinion Mining
Predictive Modeling and Scenario Forecasting
Interactive Dashboard and Visualization

Directions

Thank you!

NCHU & UTD Dual degree in Data Science

Talk: Artificial Intelligence and Sustainability Governance: Integrating Public Voices for \(CO_2\) Management and Energy Policies using GPT and Generative Data Methods
