Social and Political Data Science: Introduction

Knowledge Mining

Karl Ho

School of Economic, Political and Policy Sciences

University of Texas at Dallas

Non-linear Models 

Linear models:

  • Simple

  • Interpretable

  • Great for prediction

  • Easy to fit

Non-Linear models:

  • Simple?

  • Interpretable?

  • Great for prediction?

  • Easy to fit?

Not necessarily!!

  • The truth is never (or almost never) linear!

  • Yet often the linearity assumption is good.

  • What is linearity?

 

Non-Linear models

  • polynomials,

  • step functions,

  • splines,

  • local regression

  • generalized additive  models

  • Can be as simple and interpretable as linear models.

Polynomial Regression

$$y_{i}=\beta_0+\beta_1x_i+\beta_2x_i^2+\beta_3x_i^3 + ... +\beta_dx_i^d + \epsilon_i$$

Polynomial regression extends the linear model by adding extra predictors, obtained by raising each of the original predictors to a power. For example, a cubic regression uses three variables, \(X\), \(X^2\), and \(X^3\), as predictors. This approach is a simple way to obtain a nonlinear fit to data.
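
A minimal sketch of the idea in Python with NumPy, assuming ordinary least squares on the power columns. The helper names `poly_design` and `fit_poly` and the synthetic data are made up for illustration.

```python
import numpy as np

def poly_design(x, degree):
    """Design matrix with columns 1, x, x^2, ..., x^degree."""
    x = np.asarray(x, dtype=float)
    return np.vander(x, degree + 1, increasing=True)

def fit_poly(x, y, degree=3):
    """Least-squares estimate of (beta_0, ..., beta_d)."""
    X = poly_design(x, degree)
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return beta

# Synthetic data (illustrative only): a noisy cubic.
rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 200)
y = 1.0 - 2.0 * x + 0.5 * x**3 + rng.normal(scale=0.1, size=x.size)

beta = fit_poly(x, y, degree=3)      # estimates of beta_0 .. beta_3
y_hat = poly_design(x, 3) @ beta     # fitted values
```

The model is nonlinear in \(x\) but still linear in the coefficients, which is why the usual least-squares machinery applies unchanged.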

Step functions

Step functions cut the range of a variable into \(K\) distinct regions in order to produce a qualitative variable. This has the effect of fitting a piecewise constant function.

$$y_{i}=\beta_0+\beta_1C_1(x_i)+\beta_2C_2(x_i)+ ... +\beta_KC_K(x_i) + \epsilon_i$$

Step functions

\(C_1(X) = I(X < 35),\; C_2(X) = I(35 \le X < 50),\; ...,\; C_K(X) = I(X \ge 65)\), where \(I(\cdot)\) is the indicator function.
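
A small sketch of how such indicator variables could be built and fitted, assuming cutpoints at 35, 50 and 65. The function name `step_design` and the toy response values are hypothetical. The sketch omits a separate intercept so that the indicator columns are not collinear with it; equivalently, one could keep the intercept and drop one indicator.

```python
import numpy as np

def step_design(x, cutpoints):
    """One 0/1 column per region; K-1 cutpoints give K regions.

    Regions are x < c_1, c_1 <= x < c_2, ..., x >= c_{K-1}.
    """
    x = np.asarray(x, dtype=float)
    region = np.digitize(x, cutpoints)        # region index 0 .. len(cutpoints)
    return np.eye(len(cutpoints) + 1)[region]

ages = np.array([20, 34, 35, 49, 50, 64, 65, 80])
y = np.array([1.0, 3.0, 5.0, 7.0, 6.0, 8.0, 2.0, 4.0])   # toy response

C = step_design(ages, [35, 50, 65])
# Without an intercept, each coefficient is the mean response in its region.
beta, *_ = np.linalg.lstsq(C, y, rcond=None)
```

Because the fitted value depends only on the region an observation falls in, the fit is piecewise constant.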

Step functions

  • Easy to work with. Creates a series of dummy variables representing each group.

  • Useful way of creating interactions that are easy to interpret. For example, interaction effect of Year and Age:

  • $$ I(Year < 2005) \cdot Age, \quad I(Year \ge 2005) \cdot Age $$ would allow a different linear function of Age in each Year period.

  • Choice of cutpoints or knots can be problematic. 
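
The Year and Age interaction above can be sketched as a design matrix. This is only an illustration: the function name `year_age_design`, the toy data, and the separate intercept column for each period are assumptions not stated in the slides.

```python
import numpy as np

def year_age_design(year, age, cut=2005):
    """Columns: I(year < cut), I(year >= cut), and each indicator times age."""
    year = np.asarray(year)
    age = np.asarray(age, dtype=float)
    before = (year < cut).astype(float)
    after = 1.0 - before
    # Assumed here: a separate intercept and a separate Age slope per period.
    return np.column_stack([before, after, before * age, after * age])

year = np.array([2003, 2003, 2004, 2006, 2006, 2007])
age = np.array([20.0, 30.0, 40.0, 20.0, 30.0, 40.0])
# Toy response: a different line in Age before and after 2005.
y = np.where(year < 2005, 1.0 + 2.0 * age, 5.0 + 0.5 * age)

X = year_age_design(year, age)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The last two coefficients are the Age slopes for the two periods, which is what makes this kind of interaction easy to read.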

Regression splines

Regression splines are more flexible than polynomials and step functions, and in fact are an extension of the two. They involve dividing the range of \(X\) into \(K\) distinct regions.

  • Within each region, a polynomial function is fit to the data. However, these polynomials are constrained so that they join smoothly at the region boundaries, or knots. Provided that the interval is divided into enough regions, this can produce an extremely flexible fit.

Piecewise Polynomials

Piecewise polynomial regression fits separate low-degree polynomials over different regions of \(X\). For example, a piecewise cubic polynomial fits

$$y_{i}=\beta_0+\beta_1x_i+\beta_2x_i^2+\beta_3x_i^3 +  \epsilon_i$$

where the coefficients \(\beta_0\), \(\beta_1\), \(\beta_2\), and \(\beta_3\) differ in different parts of the range of \(X\). The points where the coefficients change are called knots.
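
As a rough illustration, the sketch below fits one cubic on each side of a single knot, with no constraint linking the two pieces. The function name `fit_piecewise_cubic`, the knot at 50 and the toy data are assumptions chosen for the example.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit_piecewise_cubic(x, y, knot):
    """Fit independent cubics to x < knot and to x >= knot."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    left = x < knot
    coef_left = P.polyfit(x[left], y[left], 3)      # low-to-high order
    coef_right = P.polyfit(x[~left], y[~left], 3)
    return coef_left, coef_right

# Toy data with a jump at x = 50 (illustrative only).
x = np.linspace(20, 80, 121)
y = np.where(x < 50, 1.0 + 0.1 * x, 10.0 - 0.1 * x)

coef_left, coef_right = fit_piecewise_cubic(x, y, knot=50)
# Difference between the two fitted pieces evaluated at the knot.
jump = P.polyval(50.0, coef_right) - P.polyval(50.0, coef_left)
```

Nothing forces `jump` to be zero, so the unconstrained fit can be discontinuous at the knot.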

Piecewise Polynomials

Better to add constraints to the polynomials, e.g. continuity.

Figure: piecewise polynomial fits with a knot at age = 50.

Top Left: The cubic polynomials unconstrained (smooth and local, but not continuous at the knot).

Top Right: The cubic polynomials constrained to be continuous at age = 50.

Bottom Left: Cubic polynomials constrained to be continuous, and to have continuous first and second derivatives.

Bottom Right: A linear spline, constrained to be continuous.

Linear Splines

A linear spline with knots at \(\xi_k, k = 1,...,K\) is a piecewise linear polynomial continuous at each knot.

$$y_{i}=\beta_0+\beta_1b_1(x_i)+\beta_2b_2(x_i)+ ... +\beta_{K+1}b_{K+1}(x_i) + \epsilon_i$$

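where the \(b_k\) are basis functions:

$$b_1(x_i) = x_i, \qquad b_{k+1}(x_i) = (x_i - \xi_k)_+, \quad k = 1,...,K$$

Here \((\cdot)_+\) means the positive part, i.e.

$$(x_i - \xi_k)_+ = \begin{cases} x_i - \xi_k & \text{if } x_i > \xi_k \\ 0 & \text{otherwise} \end{cases}$$

Each truncated function starts at 0 at its knot, which keeps the fitted function continuous there.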
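
A minimal sketch of a linear spline fit, assuming the basis consists of \(x\) plus one positive-part term \((x - \xi_k)_+\) per knot. The function name `linear_spline_basis` and the toy data are illustrative.

```python
import numpy as np

def linear_spline_basis(x, knots):
    """Columns: 1, x, (x - xi_1)_+, ..., (x - xi_K)_+."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x]
    # Each truncated term is 0 up to its knot and linear afterwards.
    cols += [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(cols)

# Toy data: y = |x - 5|, which equals 5 - x + 2 (x - 5)_+.
x = np.linspace(0, 10, 101)
y = np.abs(x - 5.0)

B = linear_spline_basis(x, knots=[5.0])
beta, *_ = np.linalg.lstsq(B, y, rcond=None)
```

The coefficient on a truncated term is the change in slope at that knot, so the fitted line can bend at the knot without breaking.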

Cubic Splines

A cubic spline with knots at  \(\xi_k, k = 1,...,K\) is a piecewise cubic polynomial with continuous derivatives up to order 2 at each knot.

$$y_{i}=\beta_0+\beta_1b_1(x_i)+\beta_2b_2(x_i)+ ... +\beta_{K+3}b_{K+3}(x_i) + \epsilon_i$$

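where the basis functions are \(b_1(x_i) = x_i\), \(b_2(x_i) = x_i^2\), \(b_3(x_i) = x_i^3\), and one truncated power function per knot:

$$b_{k+3}(x_i) = (x_i - \xi_k)^3_+ = \begin{cases} (x_i - \xi_k)^3 & \text{if } x_i > \xi_k \\ 0 & \text{otherwise} \end{cases} \qquad k = 1,...,K$$

Adding a truncated power term to the cubic polynomial leads to a discontinuity in only the third derivative at \(\xi_k\); the function remains continuous, with continuous first and second derivatives, at each of the knots.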
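
A sketch of the cubic spline design matrix under the truncated power basis, assuming \(x\), \(x^2\), \(x^3\) plus one term \((x - \xi_k)^3_+\) per knot. The function name `cubic_spline_basis` and the toy data are made up for the example.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Columns: 1, x, x^2, x^3, (x - xi_1)^3_+, ..., (x - xi_K)^3_+."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.maximum(x - k, 0.0) ** 3 for k in knots]
    return np.column_stack(cols)

# Toy data: exactly one truncated power function with knot 0.5.
x = np.linspace(0, 1, 101)
y = np.maximum(x - 0.5, 0.0) ** 3

B = cubic_spline_basis(x, knots=[0.5])
beta, *_ = np.linalg.lstsq(B, y, rcond=None)
```

With \(K\) knots the matrix has \(K + 4\) columns: the intercept plus the \(K + 3\) basis functions in the model above.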

Cubic Splines

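A natural cubic spline is often better: it adds the constraint that the function is linear beyond the boundary knots, which stabilizes the fit at the edges of the range.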

Smoothing splines

Smoothing splines are similar to regression splines, but arise in a slightly different situation. Smoothing splines result from minimizing a residual sum of squares criterion subject to a smoothness penalty.
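
One common way to write that criterion is sketched below. The symbols \(g\) (the fitted function) and \(\lambda\) (a non-negative tuning parameter) are introduced here for illustration and do not appear elsewhere in these slides.

$$\underset{g}{\text{minimize}} \; \sum_{i=1}^{n}\left(y_i - g(x_i)\right)^2 + \lambda \int g''(t)^2 \, dt$$

The first term is the residual sum of squares and the second penalizes roughness: with \(\lambda = 0\) the function \(g\) can interpolate the data, while a very large \(\lambda\) pushes \(g\) toward a straight line.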

Local regression

Local regression is similar to splines, but differs in an important way. The regions are allowed to overlap, and indeed they do so in a very smooth way.
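
A rough sketch of local linear regression at a single target point, assuming tricube weights over the nearest fraction `span` of the observations. The function name `local_linear_fit`, the default span and the toy data are illustrative choices, not taken from the slides.

```python
import numpy as np

def local_linear_fit(x, y, x0, span=0.3):
    """Weighted linear fit around x0 using the nearest span * n points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k = max(2, int(np.ceil(span * x.size)))
    dist = np.abs(x - x0)
    idx = np.argsort(dist)[:k]              # the k nearest neighbours of x0
    h = float(dist[idx].max()) or 1.0       # neighbourhood radius
    w = (1.0 - (dist[idx] / h) ** 3) ** 3   # tricube: weight decays to 0 at radius h
    X = np.column_stack([np.ones(k), x[idx] - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y[idx] * sw, rcond=None)
    return beta[0]                          # fitted value at x0

x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x)                   # toy data
grid = np.linspace(0.1, 0.9, 9)
fitted = np.array([local_linear_fit(x, y, x0) for x0 in grid])
```

Neighbouring target points share most of their nearest observations, which is the sense in which the regions overlap.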

Generalized additive models (GAM)

Generalized additive models extend these ideas to multiple predictors: each predictor can have its own nonlinear function, and the contributions are added together.
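
A simplified sketch of an additive fit with two predictors, assuming a cubic polynomial for the first and a linear spline for the second, estimated in one least-squares step. The function name `additive_design`, the knot and the toy data are hypothetical; dedicated GAM software typically uses other fitting procedures, such as backfitting or penalized smoothers.

```python
import numpy as np

def additive_design(x1, x2, knots2):
    """Intercept, cubic polynomial terms in x1, linear spline terms in x2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    f1_cols = [x1, x1**2, x1**3]
    f2_cols = [x2] + [np.maximum(x2 - k, 0.0) for k in knots2]
    return np.column_stack([np.ones_like(x1)] + f1_cols + f2_cols)

# Toy data (noiseless, illustrative only): y = 1 + f1(x1) + f2(x2).
rng = np.random.default_rng(1)
x1 = rng.uniform(0, 1, 200)
x2 = rng.uniform(0, 1, 200)
y = 1.0 + x1**2 + 2.0 * np.maximum(x2 - 0.5, 0.0)

X = additive_design(x1, x2, knots2=[0.5])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because the contributions add, the effect of each predictor can be examined separately while the others are held fixed.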