Social and Political Data Science: Introduction

Knowledge Mining

Karl Ho

School of Economic, Political and Policy Sciences

University of Texas at Dallas

Non-linear Models

Step functions

$$C_1(X) = I(X < 35),\; C_2(X) = I(35 \le X < 50),\; C_3(X) = I(50 \le X < 65),\; C_4(X) = I(X \ge 65)$$
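As a minimal sketch of how such indicator variables could be built and used, the snippet below bins a variable at the cutpoints 35, 50 and 65 and regresses a response on the resulting dummies. The helper name `step_basis` and the small `age`/`y` arrays are made up for illustration; they are not from the slides.

```python
import numpy as np

def step_basis(x, cuts):
    """Return an (n, len(cuts) + 1) matrix of 0/1 indicators, one column per bin."""
    x = np.asarray(x, dtype=float)
    # right=False assigns x to the bin with cuts[j-1] <= x < cuts[j],
    # which matches indicators of the form I(35 <= X < 50).
    bins = np.digitize(x, cuts, right=False)
    return np.eye(len(cuts) + 1)[bins]

# Hypothetical toy data, chosen only to exercise every bin.
age = np.array([20, 35, 49, 50, 64, 65, 80])
y = np.array([1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0])

C = step_basis(age, cuts=[35, 50, 65])

# Regressing y on the indicators (no separate intercept) gives one
# coefficient per bin, equal to the mean of y within that bin.
beta, *_ = np.linalg.lstsq(C, y, rcond=None)
```

Because every observation falls in exactly one bin, the columns of `C` sum to one across each row, so an intercept would be redundant if all indicators are kept.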

Piecewise Polynomials

Better to add constraints to the polynomials, e.g. continuity.

Unconstrained piecewise polynomials are smooth within each region and local, but not continuous at the knots.

Top Left: The cubic polynomials unconstrained.

Top Right: The cubic polynomials constrained to be continuous at age=50.

Bottom Left: Cubic polynomials constrained to be continuous, and to have continuous first and second derivatives.

Bottom Right: A linear spline is shown, constrained to be continuous.

Knot at 50

Linear Splines

A linear spline with knots at $$\xi_k, k = 1,...,K$$ is a piecewise linear polynomial continuous at each knot.

$$y_{i}=\beta_0+\beta_1b_1(x_i)+\beta_2b_2(x_i)+ ... +\beta_{K+1}b_{K+1}(x_i) + \epsilon_i$$

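where the $$b_k$$ are basis functions:

$$b_1(x_i) = x_i$$

$$b_{k+1}(x_i) = (x_i - \xi_k)_+, \quad k = 1,...,K$$

Here the $$()_+$$ means positive part; i.e.

$$(x_i - \xi_k)_+ = \begin{cases} x_i - \xi_k & \text{if } x_i > \xi_k \\ 0 & \text{otherwise} \end{cases}$$

Truncated function: each $$(x_i - \xi_k)_+$$ starts at 0 at its knot, which keeps the fit continuous.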
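The linear spline model can be sketched in a few lines of NumPy, assuming the basis is the variable itself plus one truncated function per knot. The function names and the noiseless toy data below are hypothetical, chosen so that the fitted coefficients can be checked by eye.

```python
import numpy as np

def linear_spline_basis(x, knots):
    """Columns: b_1(x) = x, then one truncated function (x - knot)_+ per knot."""
    x = np.asarray(x, dtype=float)
    truncated = [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack([x] + truncated)

def fit_linear_spline(x, y, knots):
    """Least-squares fit; returns [beta_0, beta_1, ..., beta_{K+1}]."""
    B = linear_spline_basis(x, knots)
    X = np.column_stack([np.ones(len(B)), B])  # prepend the intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def predict_linear_spline(x, beta, knots):
    """Evaluate the fitted spline at new points x."""
    return beta[0] + linear_spline_basis(x, knots) @ beta[1:]

# Made-up, noiseless data: slope 0.5 before the knot at 50, slope -0.3 after.
knots = [50]
x = np.linspace(20, 80, 61)
y = 1.0 + 0.5 * x - 0.8 * np.maximum(x - 50, 0.0)

beta = fit_linear_spline(x, y, knots)

# The fit is continuous at the knot: values just left and right of 50 agree.
gap = predict_linear_spline([50 + 1e-9], beta, knots) - predict_linear_spline([50 - 1e-9], beta, knots)
```

Each coefficient on a truncated term is the change in slope at that knot, which is why the fit can bend at 50 without jumping.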

Cubic Splines

A cubic spline with knots at  $$\xi_k, k = 1,...,K$$ is a piecewise cubic polynomial with continuous derivatives up to order 2 at each knot.

$$y_{i}=\beta_0+\beta_1b_1(x_i)+\beta_2b_2(x_i)+ ... +\beta_{K+3}b_{K+3}(x_i) + \epsilon_i$$

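$$b_1(x_i) = x_i, \quad b_2(x_i) = x_i^2, \quad b_3(x_i) = x_i^3$$

$$b_{k+3}(x_i) = (x_i - \xi_k)^3_+, \quad k = 1,...,K$$

Truncated power function:

$$(x_i - \xi_k)^3_+ = \begin{cases} (x_i - \xi_k)^3 & \text{if } x_i > \xi_k \\ 0 & \text{otherwise} \end{cases}$$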

Adding a truncated power term $$\beta_{k+3}(x_i - \xi_k)^3_+$$ to the cubic polynomial leads to a discontinuity in only the third derivative at $$\xi_k$$; the function remains continuous, with continuous first and second derivatives, at each of the knots.
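The continuity claim can be checked numerically. The sketch below assumes the truncated power basis (x, x², x³, plus one cubed positive part per knot) and compares the truncated cubic and its derivatives just left and right of a knot at 50. The function names are illustrative, and the derivatives are written out analytically rather than estimated.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated power basis: x, x^2, x^3, then (x - knot)^3_+ for each knot."""
    x = np.asarray(x, dtype=float)
    powers = [x, x**2, x**3]
    truncated = [np.maximum(x - k, 0.0) ** 3 for k in knots]
    return np.column_stack(powers + truncated)

def truncated_cubic_derivatives(x, knot):
    """Value and first three derivatives of (x - knot)^3_+, valid for x != knot."""
    d = np.maximum(np.asarray(x, dtype=float) - knot, 0.0)
    active = (d > 0).astype(float)  # 1 to the right of the knot, 0 to the left
    return d**3, 3 * d**2, 6 * d, 6 * active

eps = 1e-6
left = truncated_cubic_derivatives(50 - eps, 50)
right = truncated_cubic_derivatives(50 + eps, 50)

# Jumps across the knot in the value and in derivatives of order 1, 2, 3.
jumps = [float(r - l) for l, r in zip(left, right)]
```

The first three entries of `jumps` shrink to zero as `eps` shrinks, while the last stays at 6, so only the third derivative is discontinuous at the knot.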

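A natural cubic spline is usually preferable: it adds the constraint that the function is linear beyond the boundary knots, which stabilizes the fit at the edges of the data.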
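To illustrate the boundary behaviour, here is a hedged sketch of one standard natural cubic spline basis, built from differences of scaled truncated cubics. This construction is an assumption for illustration and is not taken from the slides; the helper name `natural_cubic_basis` is hypothetical, and the knots are assumed to be sorted with at least three of them.

```python
import numpy as np

def natural_cubic_basis(x, knots):
    """Natural cubic spline basis without the intercept column.

    With K sorted knots this returns K - 1 columns: x itself, plus K - 2
    columns that are cubic between the knots and linear beyond the last one.
    """
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    last = knots[-1]

    def d(k):
        # Scaled difference of two truncated cubics; the cubic terms cancel
        # to the right of the last knot, leaving a quadratic.
        num = np.maximum(x - knots[k], 0.0) ** 3 - np.maximum(x - last, 0.0) ** 3
        return num / (last - knots[k])

    d_penultimate = d(len(knots) - 2)
    # Subtracting d_penultimate removes the remaining quadratic term.
    cols = [x] + [d(k) - d_penultimate for k in range(len(knots) - 2)]
    return np.column_stack(cols)

knots = [35, 50, 65]
beyond = natural_cubic_basis(np.array([70.0, 75.0, 80.0]), knots)
below = natural_cubic_basis(np.array([20.0, 25.0, 30.0]), knots)

# On an equally spaced grid, a linear function has zero second differences.
second_diff_beyond = np.diff(beyond, n=2, axis=0)
second_diff_below = np.diff(below, n=2, axis=0)
```

Both second differences come out as zero up to rounding, which shows that every basis column, and hence any fitted combination of them, is linear outside the boundary knots.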