PSY 716
Models are considered nested when the parameters of one model (the reduced model) are a subset of the parameters of the other (the full model); constraining the extra parameters of the full model to zero yields the reduced model.
Example of nested models: \( y = \beta_0 + \beta_1 x_1 + \epsilon \) (reduced) is nested within \( y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon \) (full), since setting \( \beta_2 = 0 \) recovers the reduced model.
The F test for change in R² compares two nested models to determine if adding terms significantly improves fit:
$$ F = \frac{(R^2_{full} - R^2_{reduced})/(df_{full} - df_{reduced})}{(1 - R^2_{full})/(N - df_{full} - 1)} $$
Where: \( R^2_{full} \) and \( R^2_{reduced} \) are the squared multiple correlations for the full and reduced models, \( df_{full} \) and \( df_{reduced} \) are the numbers of predictors in each model, and \( N \) is the total sample size.
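As a quick numeric check, here is a minimal sketch of the change-in-R² F test in Python; the R² values, predictor counts, and sample size are made-up illustration numbers, not from a real dataset:

```python
# F test for change in R^2 between nested models.
# All numbers below are made-up illustration values.
r2_full, r2_reduced = 0.40, 0.30   # squared multiple correlations
df_full, df_reduced = 5, 3         # number of predictors in each model
n = 100                            # total sample size

f_change = ((r2_full - r2_reduced) / (df_full - df_reduced)) / \
           ((1 - r2_full) / (n - df_full - 1))
print(round(f_change, 4))  # about 7.8333, on 2 and 94 degrees of freedom
```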
The F test for nested models is directly related to sum of squares:
$$ F = \frac{(SS_{reduced} - SS_{full})/(df_{full} - df_{reduced})}{SS_{full}/(N - df_{full} - 1)} $$
Where: \( SS_{full} \) and \( SS_{reduced} \) are the residual (error) sums of squares of the full and reduced models, and the degrees of freedom are defined as above.
This highlights that different SS types are just different model comparisons!
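Because \( R^2 = 1 - SS_{residual}/SS_{total} \), the SS-based and R²-based formulas are algebraically identical. A short sketch verifying this with made-up illustration numbers:

```python
# Equivalence of the SS-based and R^2-based F tests (illustration values).
ss_total = 100.0
ss_full, ss_reduced = 60.0, 70.0   # residual SS of full and reduced models
df_full, df_reduced, n = 5, 3, 100

f_from_ss = ((ss_reduced - ss_full) / (df_full - df_reduced)) / \
            (ss_full / (n - df_full - 1))

# Convert residual SS to R^2 and apply the R^2-based formula.
r2_full = 1 - ss_full / ss_total
r2_reduced = 1 - ss_reduced / ss_total
f_from_r2 = ((r2_full - r2_reduced) / (df_full - df_reduced)) / \
            ((1 - r2_full) / (n - df_full - 1))

assert abs(f_from_ss - f_from_r2) < 1e-9  # the two formulas agree
```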
For the F test comparing nested models, the numerator degrees of freedom are \( df_{full} - df_{reduced} \) and the denominator degrees of freedom are \( N - df_{full} - 1 \).
The F statistic follows an F distribution with these degrees of freedom under the null hypothesis that the additional parameters equal zero.
As we've discussed ad nauseam at this point, ANOVA models can be expressed as regression models using indicator variables:
$$y_{ij} = \mu + \alpha_i + \epsilon_{ij}$$
is equivalent to:
$$y_{ij} = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2ij} + ... + \epsilon_{ij}$$
Consider a one-way ANOVA with factor A having 3 levels:
ANOVA notation: $$y_{ij} = \mu + \alpha_i + \epsilon_{ij}$$
Regression formulation:
$$y_{ij} = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \epsilon_{ij}$$
If we use dummy coding (reference-cell coding) with level 3 as the reference, then:
$$\beta_0 = \mu + \alpha_3$$ (mean of the reference level)
$$\beta_1 = \alpha_1 - \alpha_3$$ (difference between level 1 and reference)
$$\beta_2 = \alpha_2 - \alpha_3$$ (difference between level 2 and reference)
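A small sketch of this recovery, using made-up data for the three groups (group means 3, 6, and 2, with level 3 as the reference):

```python
import numpy as np

# Made-up data: three groups with two observations each.
y = np.array([2., 4.,   # level 1 (mean 3)
              5., 7.,   # level 2 (mean 6)
              1., 3.])  # level 3 (mean 2) -- the reference level

# Dummy (reference) coding: x1 flags level 1, x2 flags level 2.
X = np.array([[1, 1, 0], [1, 1, 0],
              [1, 0, 1], [1, 0, 1],
              [1, 0, 0], [1, 0, 0]], dtype=float)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # approximately [2. 1. 4.]:
             # mean of level 3, then mean1 - mean3, mean2 - mean3
```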
This is why regression output tests individual coefficients (differences from the reference level), while ANOVA tests the overall effect of the factor.
The overall F-test in ANOVA (testing whether \( \alpha_i = 0 \) for all \( i \)) is equivalent to the test of \( R^2 \) in regression:
$$H_0: R^2 = 0$$
And, equivalently
$$H_0: \beta_1 = \beta_2 = ... = \beta_{k-1} = 0$$
Both represent the test of whether the factor has any effect on the response.
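A sketch confirming the equivalence numerically, using the same made-up three-group data as above: the classic between/within ANOVA F and the R²-based regression F come out identical.

```python
import numpy as np

# Made-up one-way data: three groups of two observations each.
groups = [np.array([2., 4.]), np.array([5., 7.]), np.array([1., 3.])]
y = np.concatenate(groups)
n, k = len(y), len(groups)

# ANOVA route: between- and within-group sums of squares.
grand = y.mean()
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
f_anova = (ss_between / (k - 1)) / (ss_within / (n - k))

# Regression route: R^2 of the model with k-1 indicator predictors.
r2 = ss_between / (ss_between + ss_within)
f_r2 = (r2 / (k - 1)) / ((1 - r2) / (n - k))

assert abs(f_anova - f_r2) < 1e-10  # same test, two notations
```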
Three different approaches to calculating sums of squares: Type I (sequential), Type II (hierarchical), and Type III (marginal).
Each leads to different hypothesis tests and interpretations!
Type I SS, also called sequential SS:
\(SS(A|Intercept)\)
\(SS(B|Intercept, A)\)
\(SS(A \times B|Intercept, A, B)\)
Type II SS, also called hierarchical SS:
\( SS(A|Intercept, B) \)
\( SS(B|Intercept, A) \)
\(SS(A \times B|Intercept, A, B)\)
Type III SS, also called marginal or orthogonal SS:
\( SS(A|Intercept, B, A \times B) \)
\( SS(B|Intercept, A, A \times B) \)
\( SS(A \times B|Intercept, A, B) \)
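The model-comparison reading of these SS types can be sketched directly: each SS is a difference in residual SS between two nested fits. The sketch below uses a made-up, unbalanced, noise-free two-factor dataset, so the sequential SS for A, \(SS(A|Intercept)\), and the adjusted SS, \(SS(A|Intercept, B)\), come out different.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from a least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(((y - X @ beta) ** 2).sum())

# Unbalanced, noise-free two-factor data (made up): y = 1 + 2*A + 3*B.
A = np.array([0, 0, 0, 0, 1, 1, 1, 1.])
B = np.array([0, 0, 0, 1, 0, 1, 1, 1.])
y = 1 + 2 * A + 3 * B
ones = np.ones_like(y)

# Type I (sequential), A entered first: SS(A | Intercept)
ss_a_seq = rss(np.column_stack([ones]), y) - rss(np.column_stack([ones, A]), y)

# Type II/III for A in a main-effects model: SS(A | Intercept, B)
ss_a_adj = rss(np.column_stack([ones, B]), y) - rss(np.column_stack([ones, A, B]), y)

print(ss_a_seq, ss_a_adj)  # they differ because A and B are correlated here
```

In a balanced design the two quantities would coincide; the imbalance is what makes A and B non-orthogonal.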
Each sum of squares corresponds to a specific model comparison: the SS for a term is the reduction in residual SS when that term is added to the model listed after the conditioning bar.
The coding scheme used for categorical variables affects interpretation:
Different coding schemes align with different SS types, but we aren't going to discuss this much here.
Marginal means (also called estimated marginal means or least-squares means) are means predicted from the model, averaged over the levels of the other factors with equal weight.
In unbalanced designs, raw means and marginal means can differ, because raw means weight each cell by its sample size.
Example: If treatment A is given mostly to severe cases and treatment B to mild cases, raw means will be misleading unless we account for severity.
For a two-way ANOVA with factors A and B:
Marginal mean for level i of factor A: \( \mu_{A_i} = \sum_{j=1}^{J} \frac{\mu_{ij}}{J} \)
Where \( \mu_{ij} \) is the predicted mean for the cell with level i of factor A and level j of factor B
This gives equal weight to each level of B, regardless of sample sizes
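A minimal sketch of the contrast between the two averages, with made-up cell means and deliberately unbalanced cell sizes:

```python
# Made-up unbalanced 2x2 design: cell_means[i][j] and cell_n[i][j]
# index level i of factor A and level j of factor B.
cell_means = [[2.0, 4.0],   # A level 1 across B levels 1..2
              [6.0, 8.0]]   # A level 2
cell_n = [[4, 1],
          [1, 4]]

# Marginal mean for A level 1: unweighted average over levels of B.
marginal_a1 = sum(cell_means[0]) / len(cell_means[0])

# Raw mean for A level 1: cells weighted by their sample sizes.
raw_a1 = (sum(m * n for m, n in zip(cell_means[0], cell_n[0]))
          / sum(cell_n[0]))

print(marginal_a1, raw_a1)  # 3.0 vs 2.4 -- they disagree under imbalance
```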
To test differences between marginal means:
Pairwise comparisons: Test if two specific marginal means differ
Contrast tests: Test specific combinations of marginal means
Different SS types correspond to different tests of marginal means:
This is why Type III SS is often recommended for unbalanced designs where marginal means are of interest.