Social and Political Data Science: Introduction

Knowledge Mining

Karl Ho

School of Economic, Political and Policy Sciences

University of Texas at Dallas

Exploratory Data Analysis

Exploratory Data Analysis (EDA)

  • Data visualization basics

  • Hypothesis generation

  • Visualizing, transforming, and modeling data

  • Refine research questions and/or generate new questions

Data Visualization

  • Frequency table

  • Histogram 

    • A histogram divides the x-axis into equally spaced bins and then uses the height of a bar to display the number of observations that fall in each bin.

  • Improve the chart/table visualization
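The frequency table and histogram described above can be sketched in base R. This is a minimal illustration, not part of the original slides: the `age` vector is made-up data, and the bin width of 10 is an arbitrary choice.

```r
# Hypothetical data: ages of 12 survey respondents (invented for illustration)
age <- c(23, 25, 31, 34, 35, 38, 41, 42, 47, 52, 58, 63)

# Equally spaced bin edges: 20, 30, ..., 70
bins <- seq(20, 70, by = 10)

# Frequency table: cut() assigns each value to a bin, table() counts per bin.
# right = FALSE makes bins closed on the left, e.g. [20,30)
age_group <- cut(age, breaks = bins, right = FALSE)
freq <- table(age_group)
print(freq)

# Histogram over the same bins; plot = FALSE returns the bin counts
# (the bar heights) without drawing anything
h <- hist(age, breaks = bins, right = FALSE, plot = FALSE)
print(h$counts)

# Draw the histogram, with a title and axis label to improve readability
hist(age, breaks = bins, right = FALSE,
     main = "Respondent age (hypothetical data)", xlab = "Age")
```

Because both calls use the same bin edges, the bar heights in `h$counts` match the counts in the frequency table.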

Data Visualization

  • Data scales

    • Nominal (qualitative)

    • Ordinal (qualitative)

    • Interval (quantitative)

    • Ratio (quantitative)

Data Visualization

  • Categorical variables                             

    • Nominal

    • Ordinal

  • In R: as.factor() for nominal variables, as.ordered() for ordinal variables
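As a rough sketch of how categorical variables might be encoded in R, the example below uses invented survey responses. The variable names and categories are assumptions for illustration only.

```r
# Hypothetical nominal variable: party identification (no inherent order)
party <- c("Democrat", "Republican", "Independent", "Democrat")
party_f <- as.factor(party)
levels(party_f)      # levels are sorted alphabetically by default
is.ordered(party_f)  # FALSE: a plain factor carries no ordering

# Hypothetical ordinal variable: education level
educ <- c("High school", "College", "Graduate", "College")

# as.ordered() also sorts levels alphabetically, which is rarely the
# substantive order: here it gives College < Graduate < High school
educ_alpha <- as.ordered(educ)
levels(educ_alpha)

# To state the order explicitly, pass levels to factor() with ordered = TRUE
educ_o <- factor(educ,
                 levels = c("High school", "College", "Graduate"),
                 ordered = TRUE)

# Ordered factors support comparisons against a level
educ_o < "Graduate"
```

When the alphabetical order of the labels does not match the intended ranking, specifying `levels` explicitly is the safer choice.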

Data Visualization

  • Continuous variables                             

    • Interval

    • Ratio

  • In R: as.integer() for whole-number values, as.numeric() for real-valued data
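The conversions above can be sketched as follows. The values are invented, and the scenario (numbers arriving as text, for example from a CSV import) is an assumption for illustration.

```r
# Hypothetical incomes that were read in as character strings
income_chr <- c("42000.50", "38500", "61250.75")
income <- as.numeric(income_chr)   # double-precision numbers
mean(income)

# Hypothetical counts of children, stored as whole numbers
children_chr <- c("0", "2", "1")
children <- as.integer(children_chr)
is.integer(children)

# as.integer() truncates toward zero rather than rounding
truncated <- as.integer(3.9)

# Pitfall: converting a factor directly returns its internal level codes,
# not the numbers shown in its labels
year_f <- as.factor(c("2016", "2020", "2024"))
codes <- as.integer(year_f)                # level codes 1, 2, 3
years <- as.integer(as.character(year_f))  # the actual years
```

Going through `as.character()` first is a common way to recover the labelled values from a factor.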