(DRAFT)
Shen Shen
Feb 7, 2025
6.390-personal@mit.edu
Logistical issues? Personal concerns? We’d love to help out!
course assistant
lecturer
recitation/lab instructors
haley
priya
chris
vince
mardavij
manolis
paul
shen
TAs
LAs
Optimization + first-principles physics
Recall lab1 intro
(images adapted from Tamara Broderick)
Recall lab1 Q1
import numpy as np

def random_regress(X, Y, k):
    # X is d x n (one data point per column); lin_reg_err is assumed to be the lab1 helper
    d, n = X.shape
    # generate k random hypotheses
    ths = np.random.randn(d, k)
    th0s = np.random.randn(1, k)
    # compute the mean squared error of each hypothesis on the data set
    errors = lin_reg_err(X, Y, ths, th0s.T)
    # find the index of the hypothesis with the lowest error
    i = np.argmin(errors)
    # return the theta and theta0 parameters that define that hypothesis
    theta, theta0 = ths[:, i:i+1], th0s[:, i:i+1]
    return (theta, theta0), errors[i]
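The function above calls `lin_reg_err`, which is not shown here. The following is only a sketch of what such a helper might look like, assuming `X` is `(d, n)`, `Y` is `(1, n)`, `ths` is `(d, k)` and `th0s` is `(k, 1)`; the real lab helper may differ. The toy data and the two hand-picked hypotheses are made up for illustration.

```python
import numpy as np

def lin_reg_err(X, Y, ths, th0s):
    """Hypothetical stand-in for the lab helper.

    Returns a (k, 1) array holding the mean squared error of each of the
    k hypotheses (columns of ths, rows of th0s) on the data set (X, Y).
    """
    preds = ths.T @ X + th0s  # (k, n): row j holds the predictions of hypothesis j
    return np.mean((preds - Y) ** 2, axis=1, keepdims=True)

# Toy data: y = 2x + 1, no noise
X = np.array([[0.0, 1.0, 2.0, 3.0]])  # (d=1, n=4)
Y = 2 * X + 1                         # (1, 4)

# Two hand-picked hypotheses: the true line, and the all-zero hypothesis
ths = np.array([[2.0, 0.0]])   # (d=1, k=2)
th0s = np.array([[1.0, 0.0]])  # (1, k=2)

errors = lin_reg_err(X, Y, ths, th0s.T)
best = np.argmin(errors)       # index of the lowest-error hypothesis
print(errors.ravel(), best)
```

Picking the `argmin` over many random columns, instead of two hand-picked ones, is what `random_regress` does; nothing in that procedure uses the structure of the error function, which is what the closed-form approach later exploits.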
Now training error: \[ J(\theta) = \frac{1}{n} \sum_{i=1}^n\left(\theta^{\top} x^{(i)}-y^{(i)}\right)^2\]
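Define \(\tilde{X} \in \mathbb{R}^{n \times d}\) as the matrix whose \(i\)-th row is \(x^{(i)\top}\), and \(\tilde{Y} \in \mathbb{R}^{n \times 1}\) as the column vector of labels \(y^{(i)}\).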
Set the gradient to zero: \(\nabla_\theta J(\theta) \stackrel{\text { set }}{=} 0\)
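A sketch of the step from the training error to the closed-form solution, assuming the usual convention that \(\tilde{X}\) is the \(n \times d\) matrix whose rows are the \(x^{(i)\top}\) and \(\tilde{Y}\) is the \(n \times 1\) vector of the \(y^{(i)}\):
\[ J(\theta) = \frac{1}{n}\left(\tilde{X}\theta - \tilde{Y}\right)^{\top}\left(\tilde{X}\theta - \tilde{Y}\right) \]
\[ \nabla_\theta J(\theta) = \frac{2}{n}\tilde{X}^{\top}\left(\tilde{X}\theta - \tilde{Y}\right) = 0 \quad\Longrightarrow\quad \tilde{X}^{\top}\tilde{X}\,\theta = \tilde{X}^{\top}\tilde{Y} \]
When \(\tilde{X}^{\top}\tilde{X}\) is invertible, multiplying both sides by its inverse gives \(\theta^*\).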
The beauty of \( \theta^*=\left(\tilde{X}^{\top} \tilde{X}\right)^{-1} \tilde{X}^{\top} \tilde{Y}\): a simple, general closed form, and the unique minimizer whenever \(\tilde{X}^{\top} \tilde{X}\) is invertible
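Caveat: if \(\tilde{X}\) is not full column rank, then \(\tilde{X}^{\top} \tilde{X}\) is not invertible and the closed-form formula does not apply.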
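A small numerical sketch of both cases, assuming \(\tilde{X}\) is stored as an `(n, d)` NumPy array and \(\tilde{Y}\) as `(n, 1)`. The function name `ols_closed_form` and the toy matrices are made up for illustration. In the rank-deficient case, `np.linalg.lstsq` is used here as one possible fallback; it returns a minimum-norm minimizer, which is only one of infinitely many.

```python
import numpy as np

def ols_closed_form(X_tilde, Y_tilde):
    """Solve the normal equations; assumes X_tilde has full column rank."""
    A = X_tilde.T @ X_tilde
    b = X_tilde.T @ Y_tilde
    # Solving the linear system avoids forming the inverse explicitly
    return np.linalg.solve(A, b)

# Case 1: full column rank, so the minimizer is unique
X_tilde = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 1.0]])
Y_tilde = np.array([[1.0], [2.0], [3.0]])
theta_star = ols_closed_form(X_tilde, Y_tilde)

# Case 2: second column is twice the first, so X_bad has rank 1 < d = 2
X_bad = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 6.0]])
rank_bad = np.linalg.matrix_rank(X_bad)

# lstsq still returns a minimizer (the minimum-norm one) ...
theta_min, _, _, _ = np.linalg.lstsq(X_bad, Y_tilde, rcond=None)
# ... but adding any multiple of a null-space direction predicts identically
null_dir = np.array([[2.0], [-1.0]])  # X_bad @ null_dir is the zero vector
theta_other = theta_min + 5.0 * null_dir

print(theta_star.ravel(), rank_bad)
print(np.allclose(X_bad @ theta_min, X_bad @ theta_other))
```

The second case shows why uniqueness is lost: two different parameter vectors give identical predictions, hence identical training error.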
Quick Summary:
Typically
Cross-validation:
it's a good idea to shuffle the data first
it's a way to "reuse" data
it's not meant to evaluate a single hypothesis
rather, it's meant to evaluate a learning algorithm (e.g. hypothesis class choice, hyperparameters)
one could, e.g., have an outer loop that picks a good hyperparameter or hypothesis class
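A minimal sketch of these points, assuming they describe k-fold cross-validation. Everything specific here is an illustrative assumption rather than the course's implementation: the helper names, the toy quadratic data, and the use of polynomial order as the "hypothesis class choice" picked by the outer loop.

```python
import numpy as np

def poly_features(x, order):
    """Map (n,) inputs to an (n, order + 1) matrix with columns 1, x, x^2, ..."""
    return np.vander(x, order + 1, increasing=True)

def fit(X_tilde, y):
    """Least-squares fit of a linear hypothesis on the given features."""
    theta, *_ = np.linalg.lstsq(X_tilde, y, rcond=None)
    return theta

def mse(X_tilde, y, theta):
    return float(np.mean((X_tilde @ theta - y) ** 2))

def make_folds(n, k, rng):
    """Shuffle the indices first, then split them into k roughly equal folds."""
    return np.array_split(rng.permutation(n), k)

def cross_validate(x, y, order, folds):
    """Score the learning algorithm 'fit a polynomial of this order'.

    Each fold is held out once; a fresh hypothesis is trained on the rest.
    The average validation error rates the algorithm, not any one hypothesis.
    """
    scores = []
    for j, val in enumerate(folds):
        train = np.concatenate([f for m, f in enumerate(folds) if m != j])
        theta = fit(poly_features(x[train], order), y[train])
        scores.append(mse(poly_features(x[val], order), y[val], theta))
    return float(np.mean(scores))

# Toy data (made up): a noisy quadratic
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 40)
y = 1 + 2 * x - 3 * x**2 + 0.1 * rng.standard_normal(40)

# Outer loop: compare candidate hypothesis classes on the same folds
folds = make_folds(len(x), k=5, rng=rng)
scores = {order: cross_validate(x, y, order, folds) for order in range(6)}
best_order = min(scores, key=scores.get)
print(scores, best_order)
```

Once the outer loop has picked an order, a natural final step is to refit on all of the data with that choice, since the k hypotheses trained inside the loop were only a means of scoring the algorithm.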
We'd love to hear your thoughts.