Goal: $\min_{x} f(x)$
Let $f$ be a differentiable and convex function from $\mathbb{R}^d \to \mathbb{R}$. Then $x^* \in \mathbb{R}^d$ is a global minimum of $f$ if and only if $\nabla f(x^*) = 0$.
If $f: \mathbb{R}^d \to \mathbb{R}$ and $g: \mathbb{R}^d \to \mathbb{R}$ are both convex functions, then $f(x) + g(x)$ is a convex function.
Let $f: \mathbb{R} \to \mathbb{R}$ be a convex and non-decreasing function and $g: \mathbb{R}^d \to \mathbb{R}$ be a convex function; then their composition $h = f(g(x))$ is also a convex function.
Let $f: \mathbb{R} \to \mathbb{R}$ be a convex function and $g: \mathbb{R}^d \to \mathbb{R}$ be a linear function; then their composition $h = f(g(x))$ is also a convex function.
In general, if $f$ and $g$ are both convex functions, then $h = f \circ g$ may not be a convex function.
Note: $g$ is concave if and only if $f = -g$ is convex.
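For instance, $h(x) = e^{w^\top x}$ is convex by the second composition rule above: $g(x) = w^\top x$ is linear and $f(t) = e^t$ is convex. Convexity of both pieces alone is not enough: $f(t) = -t$ is convex and $g(x) = x^2$ is convex, yet $f(g(x)) = -x^2$ is concave, because $f$ is not non-decreasing.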
Linear Regression:
Training data: $X_1, X_2, \ldots, X_n$ with corresponding outputs $y_1, y_2, \ldots, y_n$, where $X_i \in \mathbb{R}^d$ and $y_i \in \mathbb{R}$, $\forall i$.
Gradient of the sum-of-squares error:
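The following is the standard form of this error and its gradient, written under the assumption that the model is $\hat{y}_i = w^\top X_i$ and that $X \in \mathbb{R}^{n \times d}$ is the design matrix whose rows are the $X_i$:

$$
E(w) = \sum_{i=1}^{n} \big(y_i - w^\top X_i\big)^2 = \|y - Xw\|_2^2,
\qquad
\nabla_w E(w) = -2\sum_{i=1}^{n} \big(y_i - w^\top X_i\big)\, X_i = 2X^\top (Xw - y).
$$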
Analytical or closed-form solution for the coefficients $w^*$ of a linear regression model:
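Setting the gradient above to zero gives the normal equations; assuming $X^\top X$ is invertible, the standard closed-form solution is

$$
X^\top X\, w^* = X^\top y
\quad \Longrightarrow \quad
w^* = (X^\top X)^{-1} X^\top y.
$$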
In linear regression, the gradient descent approach avoids computing $(X^\top X)^{-1}$ by iteratively updating the weights in the direction of the negative gradient.
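A minimal NumPy sketch of this idea; the step size, iteration count, and synthetic data below are illustrative assumptions, not values from the notes:

```python
import numpy as np

def linreg_gradient_descent(X, y, lr=0.1, n_iters=500):
    """Minimize ||y - Xw||^2 by gradient descent instead of solving the normal equations."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        grad = (2.0 / n) * X.T @ (X @ w - y)  # (scaled) gradient of the sum-of-squares error
        w -= lr * grad                        # gradient descent step
    return w

# Illustrative usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)

w_gd = linreg_gradient_descent(X, y)
w_closed = np.linalg.solve(X.T @ X, X.T @ y)  # closed-form solution for comparison
print(w_gd, w_closed)
```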
Stochastic gradient descent:
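A standard form of the update, sketched here for the sum-of-squares objective above: at each step a single index $i_t$ is sampled uniformly from $\{1, \ldots, n\}$ and only that example's gradient is used,

$$
w_{t+1} = w_t - \eta_t \,\nabla_w \big(y_{i_t} - w_t^\top X_{i_t}\big)^2
        = w_t + 2\eta_t \big(y_{i_t} - w_t^\top X_{i_t}\big) X_{i_t},
$$

where $\eta_t > 0$ is the step size. This makes each iteration cost independent of $n$, at the price of noisier updates.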
Consider the constrained optimization problem as follows:
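A sketch of the single-inequality-constraint form that the discussion below assumes:

$$
\min_{x \in \mathbb{R}^d} \; f(x) \quad \text{subject to} \quad g(x) \le 0.
$$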
Lagrangian function:
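Assuming the constraint is written as $g(x) \le 0$ with dual variable $\lambda \ge 0$, the Lagrangian is

$$
L(x, \lambda) = f(x) + \lambda\, g(x).
$$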
Note: depending on whether $x$ is inside or outside the constraint set, $\sup_{\lambda \ge 0} L(x, \lambda)$ equals the objective value $f(x)$ or $+\infty$.
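To make the comparison in the table below precise, here is a sketch of the standard definitions assumed in what follows (the notation $p^*$ and $d^*$ is introduced here for illustration):

$$
p^* = \min_{x:\, g(x) \le 0} f(x), \qquad
d(\lambda) = \inf_{x} L(x, \lambda), \qquad
d^* = \sup_{\lambda \ge 0} d(\lambda).
$$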
| Weak Duality | Strong Duality |
|---|---|
| $d^* \le p^*$ always holds, for any optimization problem. | $d^* = p^*$ holds if $f$ and $g$ are convex functions (under a mild constraint qualification such as Slater's condition). |
Consider the optimization problem with multiple equality and inequality constraints as follows:
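A sketch of the general form assumed here, with $m$ inequality and $p$ equality constraints:

$$
\min_{x \in \mathbb{R}^d} \; f(x)
\quad \text{subject to} \quad
g_i(x) \le 0, \; i = 1, \ldots, m,
\qquad
h_j(x) = 0, \; j = 1, \ldots, p.
$$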
The Lagrangian function is expressed as follows:
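With multipliers $\lambda_i \ge 0$ for the inequality constraints and $\nu_j \in \mathbb{R}$ for the equality constraints,

$$
L(x, \lambda, \nu) = f(x) + \sum_{i=1}^{m} \lambda_i\, g_i(x) + \sum_{j=1}^{p} \nu_j\, h_j(x).
$$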
Karush-Kuhn-Tucker Conditions:
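The standard statement of these conditions for a candidate primal-dual point $(x^*, \lambda^*, \nu^*)$:

$$
\begin{aligned}
&\text{Stationarity:} && \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) + \sum_{j=1}^{p} \nu_j^* \nabla h_j(x^*) = 0 \\
&\text{Primal feasibility:} && g_i(x^*) \le 0, \quad h_j(x^*) = 0 \quad \forall i, j \\
&\text{Dual feasibility:} && \lambda_i^* \ge 0 \quad \forall i \\
&\text{Complementary slackness:} && \lambda_i^*\, g_i(x^*) = 0 \quad \forall i
\end{aligned}
$$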
Example:
$$
\begin{aligned}
\text{minimize} \quad & f(x) = 2(x_1 + 1)^2 + 2(x_2 - 4)^2 \\
\text{subject to} \quad & x_1^2 + x_2^2 \le 9 \\
& x_1 + x_2 \ge 2
\end{aligned}
$$
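A minimal numerical check of this example using SciPy's SLSQP solver; the solver choice and starting point are assumptions made for illustration, not part of the notes:

```python
import numpy as np
from scipy.optimize import minimize

# Objective: f(x) = 2(x1 + 1)^2 + 2(x2 - 4)^2
def f(x):
    return 2 * (x[0] + 1) ** 2 + 2 * (x[1] - 4) ** 2

# SLSQP expects inequality constraints in the form c(x) >= 0
constraints = [
    {"type": "ineq", "fun": lambda x: 9 - x[0] ** 2 - x[1] ** 2},  # x1^2 + x2^2 <= 9
    {"type": "ineq", "fun": lambda x: x[0] + x[1] - 2},            # x1 + x2 >= 2
]

# The unconstrained minimizer (-1, 4) violates the circle constraint (1 + 16 > 9),
# so that constraint is expected to be active at the optimum.
res = minimize(f, x0=np.array([1.0, 1.0]), method="SLSQP", constraints=constraints)
print(res.x, res.fun)
```

The reported minimizer can then be checked against the KKT conditions above: the multiplier of the active circle constraint should be positive, while the inactive constraint $x_1 + x_2 \ge 2$ should have a zero multiplier by complementary slackness.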