Interactions, part 4

PSY 716

Nested Models and F Tests for Change in R²

  • Models are considered nested when:

    • One model contains all terms from a simpler model
    • Plus at least one additional term
  • Example of nested models:

    • Model 1: \( y = \beta_0 + \beta_1 x_1 + \epsilon \)
    • Model 2: \( y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon \)
    • Model 3: \( y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \epsilon \)
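As a concrete sketch (simulated data; all names and values made up), the three models can be fit and compared sequentially in R:

```r
set.seed(716)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 + 0.3 * x2 + 0.4 * x1 * x2 + rnorm(n)

m1 <- lm(y ~ x1)                  # Model 1
m2 <- lm(y ~ x1 + x2)             # Model 2: adds x2
m3 <- lm(y ~ x1 + x2 + x1:x2)     # Model 3: adds the interaction

anova(m1, m2, m3)                 # sequential F tests for the added terms
```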

F Test for Change in R²

The F test for change in R² compares two nested models to determine if adding terms significantly improves fit:

$$ F = \frac{(R^2_{full} - R^2_{reduced})/(df_{full} - df_{reduced})}{(1 - R^2_{full})/(N - df_{full} - 1)} $$

Where:

  • \( R^2_{full} \) = \( R^2 \) of the more complex model
  • \( R^2_{reduced} \) = \( R^2 \) of the simpler model
  • \( df_{full} \) = number of predictors in the full model
  • \( df_{reduced} \) = number of predictors in the reduced model
  • \( N \) = sample size
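A minimal sketch of the computation, continuing with the simulated models above:

```r
r2_full    <- summary(m3)$r.squared
r2_reduced <- summary(m2)$r.squared
df_full    <- 3                 # predictors in m3: x1, x2, x1:x2
df_reduced <- 2                 # predictors in m2: x1, x2
N          <- length(y)

F_change <- ((r2_full - r2_reduced) / (df_full - df_reduced)) /
            ((1 - r2_full) / (N - df_full - 1))
F_change   # matches the F reported by anova(m2, m3)
```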

Model Comparison and Sum of Squares

The F test for nested models is directly related to sum of squares:

$$ F = \frac{(SS_{reduced} - SS_{full})/(df_{full} - df_{reduced})}{SS_{full}/(N - df_{full} - 1)} $$

Where:

  • \( SS_{reduced} \) = residual sum of squares from the reduced model
  • \( SS_{full} \) = residual sum of squares from the full model

This highlights that different SS types are just different model comparisons!
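The same F can be computed from the residual sums of squares; for a Gaussian lm fit, deviance() returns the residual SS (continuing the sketch):

```r
ss_reduced <- deviance(m2)   # residual SS of the simpler model
ss_full    <- deviance(m3)   # residual SS of the fuller model

F_ss <- ((ss_reduced - ss_full) / (df_full - df_reduced)) /
        (ss_full / (N - df_full - 1))
F_ss   # identical to F_change above
```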

Degrees of Freedom for F Test

For the F test comparing nested models:

  • Numerator df = difference in number of predictors: \( df_{full} - df_{reduced} \)
  • Denominator df = residual df from the full model: \( N - df_{full} - 1\)

The F statistic follows an F distribution with these degrees of freedom under the null hypothesis that the additional parameters equal zero.
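Continuing the sketch, the p-value follows directly from these degrees of freedom:

```r
df1 <- df_full - df_reduced    # numerator df
df2 <- N - df_full - 1         # denominator df (residual df of the full model)
pf(F_change, df1, df2, lower.tail = FALSE)   # matches Pr(>F) from anova(m2, m3)
```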

The Regression-ANOVA Connection

As we've discussed ad nauseam at this point, ANOVA models can be expressed as regression models using indicator variables:

$$y_{ij} = \mu + \alpha_i + \epsilon_{ij}$$

is equivalent to:

$$y_{ij} = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \dots + \epsilon_{ij}$$

Consider a one-way ANOVA with factor A having 3 levels:

ANOVA notation: $$y_{ij} = \mu + \alpha_i + \epsilon_{ij}$$

Regression formulation:

$$y_{ij} = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \epsilon_{ij}$$

  • Where \( x_{1ij} = 1 \) if observation in level 1, 0 otherwise
  • Where \( x_{2ij} = 1 \) if observation in level 2, 0 otherwise
  • Level 3 is the reference level (when both \( x_1\) and \( x_2 \) are 0)

If we use treatment (dummy) coding, the default in regression, then:

$$\beta_0 = \mu + \alpha_3$$ (mean of the reference level)

$$\beta_1 = \alpha_1 - \alpha_3$$ (difference between level 1 and the reference)

$$\beta_2 = \alpha_2 - \alpha_3$$ (difference between level 2 and the reference)
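A minimal R sketch of this mapping, using simulated data for a three-level factor (all values made up). With level 3 set as the reference, the fitted coefficients reproduce the cell means and their differences:

```r
set.seed(1)
g <- factor(rep(1:3, each = 30))        # three-level factor
y <- c(10, 12, 15)[g] + rnorm(90)       # true means: 10, 12, 15

g <- relevel(g, ref = "3")              # make level 3 the reference
fit <- lm(y ~ g)

coef(fit)            # intercept ~ mean of level 3; slopes ~ differences from it
tapply(y, g, mean)   # compare against the raw group means
```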

This is why regression tests individual coefficients, while ANOVA tests the overall effect.

F-Tests Can Be Applied in Both Frameworks

The overall F-test in ANOVA (testing whether \( \alpha_i = 0 \) for all \( i \)) is equivalent to the test of \( R^2 \) in regression:

$$H_0: R^2 = 0$$

or, equivalently,

$$H_0: \beta_1 = \beta_2 = \dots = \beta_{k-1} = 0$$

Both represent the test of whether the factor has any effect on the response.
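Both equivalences are easy to verify in R (a quick sketch, reusing y, g, and fit from the previous block):

```r
summary(aov(y ~ g))         # ANOVA table: overall F test for the factor
summary(fit)$fstatistic     # regression overall F: same value and df
```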

Sums of Squares

Three different approaches to calculating sums of squares:

  1. Type I (Sequential): Calculate in the order terms are specified
  2. Type II (Hierarchical): Test each term after all others, except those containing it
  3. Type III (Marginal): Test each term as if it were entered last

Each leads to different hypothesis tests and interpretations!

Type I Sums of Squares

Also called sequential SS:

  • Calculated in the order specified in the model
  • Each term is adjusted only for terms that precede it
  • Changes if you reorder terms in the model
  • Equivalent to comparing nested models sequentially:
    1. \(SS(A|Intercept)\)

    2. \(SS(B|Intercept, A)\)

    3. \(SS(A \times B|Intercept, A, B)\)
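Base R's anova() computes Type I SS, so reordering terms changes the main-effect tests. A minimal sketch with simulated, unbalanced two-factor data (all names and values made up):

```r
set.seed(2)
n <- 120
A <- factor(sample(1:2, n, replace = TRUE, prob = c(0.7, 0.3)))  # unbalanced
B <- factor(sample(1:3, n, replace = TRUE))
y <- 2 + (A == "2") * 1 + (B == "2") * 0.5 + rnorm(n)

anova(lm(y ~ A * B))   # SS(A|1), SS(B|1,A), SS(A:B|1,A,B)
anova(lm(y ~ B * A))   # B first: the main-effect SS change with the ordering
```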

Type II Sums of Squares

Also called hierarchical SS:

  • Each term adjusted for all other terms except those containing it
  • Respects marginality principle
  • Doesn't change with reordering
  • Tests main effects adjusted for all other main effects:
    1. \( SS(A|Intercept, B) \)

    2. \( SS(B|Intercept, A) \)

    3. \(SS(A \times B|Intercept, A, B)\)
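Base anova() does not produce Type II tests; the car package's Anova() does (a sketch, assuming car is installed, reusing the two-factor data above):

```r
library(car)
fit2 <- lm(y ~ A * B)
Anova(fit2, type = 2)   # each main effect adjusted for the other;
                        # the interaction adjusted for both main effects
```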

Type III Sums of Squares

Also called marginal or orthogonal SS:

  • Each term adjusted for all other terms, including those containing it
  • Tests each effect as if it were entered last in the model
  • Most commonly reported in statistical software (other than R's anova() command, curiously)
  • Corresponding model comparisons:
    1. \( SS(A|Intercept, B, A \times B) \)

    2. \( SS(B|Intercept, A, A \times B) \)

    3. \( SS(A \times B|Intercept, A, B) \)
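car::Anova() also provides Type III tests; a commonly noted caveat is that the factors should use sum-to-zero contrasts for the usual Type III hypotheses (sketch, continuing from above):

```r
fit3 <- lm(y ~ A * B,
           contrasts = list(A = contr.sum, B = contr.sum))  # effect coding
Anova(fit3, type = 3)   # each term adjusted for everything else in the model
```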

Regression Model Comparisons

Sums of squares correspond to specific model comparisons:

Type I:

  • Model with the term and all terms preceding it vs. the model with only the preceding terms

Type II:

  • Model with the term plus all terms that do not contain it vs. that same model without the term

Type III:

  • Full model vs. the full model without the term (all other terms retained, including higher-order terms that contain it)

Coding Schemes

The coding scheme used for categorical variables affects interpretation:

  • Treatment/Dummy Coding: One level as reference (regression default)
  • Effect Coding: Sum of effects constrained to zero (ANOVA default)

Different coding schemes align with different SS types, but we aren't going to discuss this much here.
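Base R exposes both schemes as contrast matrices; a quick look for a three-level factor:

```r
contr.treatment(3)   # dummy coding: level 1 as the reference
contr.sum(3)         # effect coding: each column sums to zero across levels
```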

Common Pitfalls and Misconceptions

  1. Interpreting main effects in the presence of interactions
  2. Using Type III SS without understanding what's being tested
  3. Comparing regression coefficients across different coding schemes
  4. Using Type I SS without consideration of term ordering
  5. Assuming all software packages use the same defaults

Marginal Means in ANOVA

Marginal means (also called estimated marginal means or least-squares means):

  • Population means estimated from a model
  • Account for other factors and covariates in the model
  • Essential for interpreting effects in unbalanced designs
  • Provide adjusted means that control for confounding factors

Why Marginal Means Matter

In unbalanced designs:

  • Cell frequencies differ across factor combinations
  • Simple arithmetic means are weighted by cell frequencies and can be misleading
  • Marginal means provide unbiased estimates of factor effects

Example: If treatment A is given mostly to severe cases and treatment B to mild cases, raw means will be misleading unless we account for severity.

Computing Marginal Means

For a two-way ANOVA with factors A and B:

  • Marginal mean for level i of factor A: \( \mu_{A_i} = \sum_{j=1}^{J} \frac{\mu_{ij}}{J} \)

  • Where \( \mu_{ij} \) is the predicted mean for the cell with level i of factor A and level j of factor B

  • This gives equal weight to each level of B, regardless of sample sizes
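In R, these equally weighted averages are what the emmeans package reports (a sketch, assuming emmeans is installed, using the two-factor fit2 from earlier):

```r
library(emmeans)
emmeans(fit2, ~ A)   # marginal means for A, averaging over B with equal weights
emmeans(fit2, ~ B)   # marginal means for B, averaging over A
```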

Testing Differences in Marginal Means

To test differences between marginal means:

  1. Pairwise comparisons: Test if two specific marginal means differ

    • Typically uses t-tests with appropriate error term
    • Often requires multiple comparison adjustment (Tukey, Bonferroni, etc.)
  2. Contrast tests: Test specific combinations of marginal means

    • Uses linear combinations: \( \psi = \sum c_i \mu_i \) where \( \sum c_i = 0 \)
    • Allows testing complex hypotheses about relationships between means
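Both kinds of tests are available through emmeans (a sketch continuing from above; the custom contrast weights are purely illustrative):

```r
emm <- emmeans(fit2, ~ B)
pairs(emm, adjust = "tukey")   # all pairwise comparisons, Tukey-adjusted

# A custom contrast: level 1 vs. the average of levels 2 and 3 (weights sum to 0)
contrast(emm, list(lvl1_vs_rest = c(1, -0.5, -0.5)))
```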

Relationship to SS Types

Different SS types correspond to different tests of marginal means:

  • Type I: Tests dependent on model order, not clearly related to marginal means
  • Type II: Tests main effects averaged over other main effects
  • Type III: Tests main effects using unweighted marginal means, maintaining equal weight across cells

This is why Type III SS is often recommended for unbalanced designs where marginal means are of interest.

Putting it all together

  • ANOVA and regression are the same model viewed differently
  • Choice of sum of squares should be based on:
    • Experimental design
    • Balance of the data
    • Specific hypotheses of interest
  • Understanding the connection helps select appropriate analysis
  • Different software packages use different defaults
    • R: Type I by default
    • SPSS: Type III by default