Shen Shen
March 1, 2024
(some slides adapted from Tamara Broderick and Phillip Isola)
Testing (predicting): the trained model maps a new input \(x\) to a new prediction \(y\).
Recap:
- OLS has an analytical (closed-form) solution and an "easy" prediction mechanism
- Regularization
- Cross-validation
- Gradient descent (vanilla, sign-based)
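As a refresher, the two gradient-descent variants mentioned above can be sketched as follows (a minimal sketch; the function names and the toy objective are illustrative, not from the lecture):

```python
import numpy as np

def gradient_descent(grad_f, theta0, eta=0.1, steps=100):
    """Vanilla gradient descent: step along the negative gradient."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - eta * grad_f(theta)
    return theta

def sign_gradient_descent(grad_f, theta0, eta=0.1, steps=100):
    """Sign-based variant: use only the sign of each gradient coordinate,
    so every update moves each coordinate by a fixed amount eta."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - eta * np.sign(grad_f(theta))
    return theta

# Toy objective: f(theta) = (theta - 3)^2, with gradient 2*(theta - 3)
grad = lambda th: 2 * (th - 3)
print(gradient_descent(grad, [0.0]))       # converges toward [3.]
print(sign_gradient_descent(grad, [0.0]))  # fixed-size steps toward 3
```

The sign-based variant is less sensitive to the gradient's magnitude, but cannot settle closer to the optimum than its step size allows.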
linear classifier → linear logistic regression (classifier)
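A minimal sketch of how linear logistic regression turns a linear combination into a probability and then a class label (the helper names and the toy data are illustrative):

```python
import numpy as np

def sigmoid(z):
    """Squash a real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, theta, theta0):
    """Logistic regression: sigmoid of the linear combination gives P(y=1 | x)."""
    return sigmoid(X @ theta + theta0)

def predict_label(X, theta, theta0):
    """Threshold the probability at 0.5 -- equivalently,
    take the sign of theta . x + theta0, as a linear classifier does."""
    return (predict_proba(X, theta, theta0) >= 0.5).astype(int)

X = np.array([[1.0, 2.0], [-1.0, -2.0]])
theta = np.array([1.0, 1.0])
print(predict_label(X, theta, 0.0))  # → [1 0]
```

The decision boundary is still linear; the sigmoid only changes how confident the outputs are, which is what makes the negative log likelihood a smooth, differentiable loss.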
An aside:
The idea of "distance" has already appeared.
- It will play a central role in later weeks (starting next week).
- It will play a central role in fundamental algorithms we won't discuss.
The (vanilla, sign-based) linear classifier using polynomial feature transformation,
e.g., polynomial features of order 3.
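A sketch of an order-3 polynomial feature transformation for a one-dimensional input (the function name is illustrative; libraries such as scikit-learn provide richer multivariate versions):

```python
import numpy as np

def poly_features_1d(x, order=3):
    """Map each scalar x to polynomial features [1, x, x^2, ..., x^order]."""
    x = np.asarray(x, dtype=float)
    return np.stack([x**k for k in range(order + 1)], axis=-1)

# A 1-D dataset becomes a 4-column feature matrix for order 3,
# letting a *linear* classifier draw nonlinear boundaries in the original x.
print(poly_features_1d([2.0], order=3))  # → [[1. 2. 4. 8.]]
```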
Underfitting | Appropriate model | Overfitting |
---|---|---|
high error on train set | low error on train set | very low error on train set |
high error on test set | low error on test set | very high error on test set |
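The three regimes in the table can be reproduced with a toy experiment: fit polynomials of increasing degree to noisy quadratic data and compare train and test error (the dataset, degrees, and noise level are illustrative choices, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(-1, 1, 15)
y_train = x_train**2 + 0.1 * rng.normal(size=x_train.size)  # noisy quadratic
x_test = np.linspace(-1, 1, 50)
y_test = x_test**2                                          # noise-free truth

def mse(deg):
    """Fit a degree-`deg` polynomial on the train set; report train/test MSE."""
    coefs = np.polyfit(x_train, y_train, deg)
    err = lambda x, y: np.mean((np.polyval(coefs, x) - y) ** 2)
    return err(x_train, y_train), err(x_test, y_test)

for deg in [0, 2, 9]:  # underfit, appropriate, overfit
    tr, te = mse(deg)
    print(f"degree {deg}: train MSE {tr:.4f}, test MSE {te:.4f}")
```

Because the model classes are nested, train error can only go down as the degree grows; test error is what exposes the overfit model.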
(Example goal: diagnose if people have heart disease based on their available info.)
(Example: logistic regression. Loss: negative log likelihood. Regularizer: ridge penalty)
(Example: analytical/closed-form optimization, stochastic gradient descent (SGD))
Identify relevant info and encode it as real numbers.
Encode it in a way that's sensible for the task.
What about jobs?
What about medicine?
Recall: if we used one-hot encoding, we would need the exact combination in the data to learn the corresponding parameter.
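A minimal one-hot encoding sketch illustrating that point (the category list is a made-up example):

```python
def one_hot(value, categories):
    """Encode a categorical value as an indicator vector over `categories`."""
    return [1 if value == c else 0 for c in categories]

jobs = ["doctor", "engineer", "teacher"]
print(one_hot("engineer", jobs))  # → [0, 1, 0]

# Each category gets its own parameter, so a category that never
# appears in the training data leaves its parameter unlearned.
print(one_hot("nurse", jobs))     # → [0, 0, 0]
```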
Encode the Likert response as a single ordinal number:

Strongly disagree | Disagree | Neutral | Agree | Strongly agree |
---|---|---|---|---|
1 | 2 | 3 | 4 | 5 |
Or use a thermometer (cumulative) encoding:

Strongly disagree | Disagree | Neutral | Agree | Strongly agree |
---|---|---|---|---|
1,0,0,0,0 | 1,1,0,0,0 | 1,1,1,0,0 | 1,1,1,1,0 | 1,1,1,1,1 |
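The thermometer encoding in the table above can be sketched as follows (the function name is illustrative):

```python
def thermometer(level, num_levels=5):
    """Thermometer encoding of an ordinal level in 1..num_levels:
    set the first `level` entries to 1, the rest to 0."""
    return [1 if k < level else 0 for k in range(num_levels)]

# Likert scale: 1 = strongly disagree, ..., 5 = strongly agree
print(thermometer(3))  # → [1, 1, 1, 0, 0]
```

Unlike one-hot, this encoding preserves the ordering of the levels: adjacent responses differ in exactly one coordinate.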
We'd love it for you to share some lecture feedback.