Feature Transformations

Business Analytics

Motivation

Our Focus

f(\beta, x) = \sum _{j=1}^d \beta_j x_j

Linear in both parameters and inputs

Potentially Problematic

f(\beta, x) = \sum _{j=1}^d \beta_j x_j

Linear in both parameters and inputs

Misspecification Error

Types of Error

Feature Transformations

The Optimization Problem

\hat{\beta} = \underset{\beta}{\text{argmin}} \ \underbrace{\frac{1}{n}\sum_{i=1}^n \Big(Y_i - \sum_{p =1}^d \beta_pX_{ip}\Big)^2}_{\text{Objective Function}}
f(\beta, x) = \sum_{p=1}^d\sum_{k=1}^K \beta_{pk} \phi_k(x_p)
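
Substituting the transformed model into the objective gives the least-squares problem below. This is a sketch: it assumes the same K transformations \phi_1, \dots, \phi_K are applied to every feature, where K is notation introduced here for the number of transformations per feature. The problem remains linear in \beta, so ordinary least squares still applies to the expanded set of features.

\hat{\beta} = \underset{\beta}{\text{argmin}} \ \frac{1}{n}\sum_{i=1}^n \Big(Y_i - \sum_{p=1}^d\sum_{k=1}^K \beta_{pk}\phi_k(X_{ip})\Big)^2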

Feature Transformations

Model

\phi(x)

Polynomial Transformations in Python

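import statsmodels.formula.api as smf
# Assumes df is a pandas DataFrame and dep_var, rhs_var are names of columns in df
# Linear specification: dep_var regressed on rhs_var
linear_model1 = smf.ols(f'{dep_var} ~ {rhs_var}', data=df)
results1 = linear_model1.fit()
# Cubic specification: I() makes the formula treat ** as arithmetic exponentiation
linear_model3 = smf.ols(f'{dep_var} ~ {rhs_var} + I({rhs_var}**2) + I({rhs_var}**3)', data=df)
results3 = linear_model3.fit()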
\phi_k(x) = x^k
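
The polynomial transformation \phi_k(x) = x^k can also be built by hand, which makes it explicit that the model stays linear in \beta. The following is a minimal sketch using NumPy rather than the statsmodels formula interface. The helper names `polynomial_features` and `fit_least_squares` and the synthetic, noise-free data are assumptions made for illustration, not part of the original slides.

```python
import numpy as np

def polynomial_features(x, degree):
    """Design matrix with columns phi_k(x) = x^k for k = 0, ..., degree.

    The k = 0 column is all ones and plays the role of the intercept.
    """
    x = np.asarray(x, dtype=float)
    return np.column_stack([x**k for k in range(degree + 1)])

def fit_least_squares(X, y):
    """Return the beta that minimizes the mean squared error of X @ beta."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic, noise-free data generated from a cubic (illustrative only)
x = np.linspace(-2.0, 2.0, 50)
y = 1.0 + 2.0 * x - 0.5 * x**3

# Fit a linear and a cubic specification to the same data
X1 = polynomial_features(x, 1)
X3 = polynomial_features(x, 3)
beta1 = fit_least_squares(X1, y)
beta3 = fit_least_squares(X3, y)

# Compare in-sample mean squared errors
mse1 = np.mean((y - X1 @ beta1) ** 2)
mse3 = np.mean((y - X3 @ beta3) ** 2)
print("linear MSE:", mse1)
print("cubic MSE:", mse3)
print("cubic coefficients:", np.round(beta3, 3))
```

Because the data come from a cubic, the linear specification is misspecified and leaves a larger in-sample error, while the cubic specification can represent the relationship exactly.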