Karl Ho
School of Economic, Political and Policy Sciences
University of Texas at Dallas
$$ \hat{f}(x_0) = \hat{\beta_0}+\hat{\beta_1}x_0+\hat{\beta_2}x_0^2+\hat{\beta_3}x_0^3+\hat{\beta_4}x_0^4 $$
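A minimal sketch of the degree-4 polynomial fit above, on simulated stand-in data (the variables `x`, `y` and the evaluation point `x0 = 40` are assumptions, not the Wage data). It evaluates \(\hat{f}(x_0)\) once from the formula and once with `predict()`.

```r
# Simulated stand-in data (an assumption, not the Wage data)
set.seed(1)
x <- runif(200, 18, 80)
y <- 50 + 3 * x - 0.03 * x^2 + rnorm(200, sd = 5)

# Degree-4 polynomial; raw = TRUE gives coefficients on x, x^2, x^3, x^4
fit <- lm(y ~ poly(x, 4, raw = TRUE))
beta_hat <- coef(fit)

# Evaluate f_hat(x0) term by term: beta0 + beta1*x0 + ... + beta4*x0^4
x0 <- 40
f_hat_manual <- sum(beta_hat * x0^(0:4))

# The same value through predict()
f_hat_predict <- predict(fit, newdata = data.frame(x = x0))
```

With `raw = FALSE` (the default) `poly()` uses an orthogonal basis: the fitted curve is the same, but the coefficients no longer match the \(\hat{\beta}_j\) in the formula.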
\(C_1(X) = I(X < 35),\ C_2(X) = I(35 \le X < 50),\ \dots,\ C_4(X) = I(X \ge 65)\)
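A short sketch of a step-function fit, assuming cut points at 35, 50 and 65 and simulated `age` and `wage` vectors (both hypothetical). `cut()` builds the indicator variables, and the least-squares fit in each bin is simply the bin mean.

```r
# Simulated stand-in data (an assumption, not the Wage data)
set.seed(1)
age <- runif(300, 18, 80)
wage <- 80 + 20 * (age >= 35) - 15 * (age >= 65) + rnorm(300, sd = 10)

# right = FALSE makes the bins closed on the left: [35, 50), [50, 65), ...
bins <- cut(age, breaks = c(-Inf, 35, 50, 65, Inf), right = FALSE)

# Regressing on the bin factor fits one constant per bin
fit <- lm(wage ~ bins)

# The fitted value inside each bin equals that bin's sample mean
bin_means <- tapply(wage, bins, mean)
fitted_by_bin <- tapply(fitted(fit), bins, mean)
```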
Piecewise polynomials: smooth and local within each region, but not continuous at the knots
Top Left: The cubic polynomials unconstrained.
Top Right: The cubic polynomials constrained to be continuous at age=50.
Bottom Left: Cubic polynomials constrained to be continuous, and to have continuous first and second derivatives.
Bottom Right: A linear spline is shown, constrained to be continuous.
Knot at 50
$$y_{i}=\beta_0+\beta_1b_1(x_i)+\beta_2b_2(x_i)+ ... +\beta_{K+1}b_{K+1}(x_i) + \epsilon_i$$
Truncated function
Starting at 0 for continuity
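A sketch of the linear spline basis with one knot at 50, on simulated data (the names `pos_part`, `B` and the data are assumptions). The truncated function \((x - \xi)_+\) starts at 0 at the knot, which is what keeps the fit continuous. The result is checked against `bs(..., degree = 1)`, which spans the same space.

```r
library(splines)

# Simulated data with a change of slope at x = 50 (an assumption)
set.seed(1)
x <- sort(runif(200, 18, 80))
y <- ifelse(x < 50, 60 + 1.5 * x, 135 - 0.5 * (x - 50)) + rnorm(200, sd = 5)

knots <- c(50)
pos_part <- function(u) pmax(u, 0)   # (u)_+ : zero to the left of the knot

# K + 1 basis functions: x and one truncated function per knot
B <- cbind(x, sapply(knots, function(xi) pos_part(x - xi)))
colnames(B) <- c("x", paste0("tp", seq_along(knots)))
fit_manual <- lm(y ~ B)

# The B-spline basis of degree 1 gives the same fitted values
fit_bs <- lm(y ~ bs(x, knots = knots, degree = 1))
```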
$$y_{i}=\beta_0+\beta_1b_1(x_i)+\beta_2b_2(x_i)+ ... +\beta_{K+3}b_{K+3}(x_i) + \epsilon_i$$
Truncated power function
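A sketch of the cubic spline basis built from truncated power functions, assuming three knots on simulated data rescaled to [0, 1] (the rescaling, the knots and the helper `tp` are all assumptions made for numerical convenience). The basis has K + 3 columns: \(x, x^2, x^3\) plus one \((x - \xi_k)^3_+\) per knot.

```r
library(splines)

# Simulated data on [0, 1] (an assumption)
set.seed(1)
x <- sort(runif(200))
y <- sin(2 * pi * x) + rnorm(200, sd = 0.2)

knots <- c(0.25, 0.5, 0.75)
tp <- function(x, xi) pmax(x - xi, 0)^3   # truncated power function (x - xi)^3_+

# K + 3 basis functions: x, x^2, x^3 and one truncated cubic per knot
B <- cbind(x, x^2, x^3, sapply(knots, function(xi) tp(x, xi)))
fit_tp <- lm(y ~ B)

# bs() uses a B-spline basis for the same space, so the fits agree
fit_bs <- lm(y ~ bs(x, knots = knots))
```

The truncated power basis is easy to read but poorly conditioned with many knots, which is one reason `bs()` works with B-splines instead.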
A natural cubic spline extrapolates linearly beyond the boundary knots. This adds 4 = 2 × 2 extra constraints (two at each boundary), and allows us to place more internal knots for the same degrees of freedom as a regular cubic spline.
Natural cubic spline is better!
Adding a truncated power term \(\beta_4 h(x, \xi)\), with \(h(x, \xi) = (x - \xi)^3_+\), to a cubic polynomial leads to a discontinuity in only the third derivative at the knot \(\xi\); the function remains continuous, with continuous first and second derivatives, at each of the knots.
Fitting splines in R is easy: \(bs(x, ...)\) for splines of any degree, and \(ns(x, ...)\) for natural cubic splines, both in the package \(splines\).
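A brief sketch of both calls on simulated data, assuming knots at 35, 50 and 65 (the data and knot locations are assumptions). It also checks the linear extrapolation of the natural spline beyond the boundary knots.

```r
library(splines)

# Simulated stand-in data (an assumption, not the Wage data)
set.seed(1)
x <- sort(runif(200, 18, 80))
y <- 100 + 20 * sin(x / 10) + rnorm(200, sd = 5)

knots <- c(35, 50, 65)
fit_bs <- lm(y ~ bs(x, knots = knots))   # cubic spline: K + 3 = 6 basis functions
fit_ns <- lm(y ~ ns(x, knots = knots))   # natural cubic spline: K + 1 = 4

# Predictions on a grid inside the data range
grid <- data.frame(x = seq(20, 78, length.out = 50))
pred_bs <- predict(fit_bs, newdata = grid)
pred_ns <- predict(fit_ns, newdata = grid)

# Beyond the boundary the natural spline is linear, so at equally
# spaced points its second differences are numerically zero
outside <- predict(fit_ns, newdata = data.frame(x = c(85, 90, 95)))
second_diff <- diff(outside, differences = 2)
```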
$$ \underset{g \in S}{\text{minimize}} \sum_{i=1}^n(y_i-g(x_i))^2+\lambda\int g''(t)^2\,dt $$
The solution is a natural cubic spline, with a knot at every unique value of \(x_i\). The roughness penalty still controls the roughness via \(\lambda\).
Some details
We can specify the effective degrees of freedom \(df\) rather than \(\lambda\)!
In R: \(smooth.spline(age, wage, df = 10)\)
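A small sketch of the call above on simulated `age` and `wage` vectors (the data are an assumption). The first fit fixes the effective degrees of freedom; the second lets `smooth.spline()` choose \(\lambda\) by leave-one-out cross-validation.

```r
# Simulated stand-in data (an assumption, not the Wage data)
set.seed(1)
age <- sort(runif(300, 18, 80))
wage <- 100 + 20 * sin(age / 10) + rnorm(300, sd = 10)

# Fix the effective degrees of freedom; lambda is solved for internally
fit_df <- smooth.spline(age, wage, df = 10)

# Let leave-one-out cross-validation pick lambda instead
fit_cv <- smooth.spline(age, wage, cv = TRUE)

fit_df$df       # close to the requested 10
fit_df$lambda   # the lambda that corresponds to df = 10
fit_cv$df       # effective degrees of freedom chosen by LOOCV
```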
The leave-one-out (LOO) cross-validated error is given by:
$$ \text{RSS}_{cv}(\lambda) = \sum_{i=1}^{n}\left(y_i-\hat{g}_{\lambda}^{(-i)}(x_i)\right)^2 = \sum_{i=1}^{n}\left[\frac{y_i - \hat{g}_{\lambda}(x_i)}{1 - \{\mathbf{S}_{\lambda}\}_{ii}}\right]^2 $$
This is probably the most difficult equation to understand and type in \(LaTeX\)!
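The identity can be checked numerically. The sketch below does not use a smoothing spline itself: it swaps in a simpler penalized regression spline (a B-spline basis with a second-difference penalty), for which the smoother matrix \(\mathbf{S}_\lambda\) can be written out explicitly. The data, the basis size and \(\lambda = 1\) are all assumptions. The shortcut needs one fit; the brute-force version refits n times.

```r
library(splines)

# Simulated data (an assumption)
set.seed(1)
n <- 100
x <- sort(runif(n))
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

# Fixed B-spline basis, stored as a plain matrix
B <- matrix(as.vector(bs(x, df = 20, intercept = TRUE)), nrow = n)

# Second-difference penalty on the coefficients (a stand-in roughness penalty)
D <- diff(diag(ncol(B)), differences = 2)
Omega <- crossprod(D)
lambda <- 1

# Smoother matrix: y_hat = S %*% y
S <- B %*% solve(crossprod(B) + lambda * Omega, t(B))
y_hat <- drop(S %*% y)

# Shortcut: one fit, residuals inflated by 1 / (1 - S_ii)
rss_shortcut <- sum(((y - y_hat) / (1 - diag(S)))^2)

# Brute force: drop observation i, refit with the same penalty, predict at x_i
loo_resid <- sapply(seq_len(n), function(i) {
  Bi <- B[-i, , drop = FALSE]
  beta_i <- solve(crossprod(Bi) + lambda * Omega, crossprod(Bi, y[-i]))
  y[i] - drop(B[i, , drop = FALSE] %*% beta_i)
})
rss_brute <- sum(loo_resid^2)
```

The two sums agree up to rounding, which is why leave-one-out cross-validation costs about the same as a single fit for this kind of linear smoother.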
Can fit a GAM simply using, e.g. natural splines:
lm(wage ~ ns(year, df = 5) + ns(age, df = 5) + education)
Coefficients not that interesting; fitted functions are. The previous plot was produced using \(plot.gam\).
Can mix terms, some linear, some nonlinear, and use \(anova()\) to compare models.
Can use smoothing splines or local regression as well:
gam(wage ~ s(year, df = 5) + lo(age, span = .5) + education)
GAMs are additive, although low-order interactions can be included in a natural way using, e.g. bivariate smoothers or interactions of the form \(ns(age,df=5):ns(year,df=5)\).
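A sketch of the natural-spline GAM and the \(anova()\) comparison, using only `lm()` and `ns()` on a simulated stand-in for the Wage data (the data frame `dat` and its generating model are assumptions). The three nested models leave `year` out, include it linearly, and include it as a natural spline.

```r
library(splines)

# Simulated stand-in for the Wage data (an assumption)
set.seed(1)
n <- 500
year <- sample(2003:2009, n, replace = TRUE)
age <- runif(n, 18, 80)
education <- factor(sample(c("HS", "College", "Advanced"), n, replace = TRUE))
edu_effect <- ifelse(education == "HS", 0, ifelse(education == "College", 15, 35))
wage <- 60 + 1.2 * (year - 2003) + 30 * sin((age - 18) / 20) +
  edu_effect + rnorm(n, sd = 10)
dat <- data.frame(wage, year, age, education)

# Nested models: no year term, linear year, natural spline in year
m1 <- lm(wage ~ ns(age, df = 5) + education, data = dat)
m2 <- lm(wage ~ year + ns(age, df = 5) + education, data = dat)
m3 <- lm(wage ~ ns(year, df = 5) + ns(age, df = 5) + education, data = dat)

# Sequential F-tests between the nested fits
cmp <- anova(m1, m2, m3)
```

The `s()` and `lo()` terms in the `gam()` call need the `gam` package and are not covered by this sketch.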
$$ \log\left(\frac{p(X)}{1-p(X)}\right) = \beta_0+f_1(X_1)+f_2(X_2)+\cdots+f_p(X_p) $$
gam(I(wage > 250) ~ year + s(age, df = 5) + education, family = binomial)
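A sketch of a logistic additive model in base R, with `glm()` and a natural spline in `age` swapped in for the `gam()` call with `s()` (the simulated binary outcome `high` and its generating model are assumptions). The fitted log-odds are additive in `year` and \(f(\text{age})\); the inverse logit maps them back to probabilities.

```r
library(splines)

# Simulated binary outcome (an assumption, standing in for wage > 250)
set.seed(1)
n <- 1000
age <- runif(n, 18, 80)
year <- sample(2003:2009, n, replace = TRUE)
eta <- -1 + 0.05 * (year - 2003) - ((age - 50) / 15)^2   # true log-odds
high <- rbinom(n, 1, plogis(eta))
dat <- data.frame(high, year, age)

# Logit link: linear in year, natural spline in age
fit <- glm(high ~ year + ns(age, df = 5), family = binomial, data = dat)

# Fitted log-odds and probabilities across age, holding year at 2006
grid <- data.frame(year = 2006, age = seq(20, 78, by = 2))
logit_hat <- predict(fit, newdata = grid, type = "link")
p_hat <- predict(fit, newdata = grid, type = "response")
```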