Grab the slides: slides.com/cheukting_ho/legend-data-log-reg

Every Monday 5pm UK time

by Cheuk Ting Ho

Logistic regression is for when what you want to predict has only 2 outcomes. For example:

- Whether an email is spam (1) or not (0)

- Whether the tumor is malignant (1) or not (0)

- Whether the customer will leave (1) or not (0)

Linear regression does not work well for this kind of data:

- The data do not form a line

- The relation between x and y is not close to linear

- We need a different curve to "fit" the data
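To see why a straight line is a poor fit, here is a minimal sketch with made-up data (ten points, the first five labelled 0 and the rest 1) and an ordinary least-squares line computed by hand. The data and the helper name `fit_line` are illustrative assumptions, not taken from the slides.

```python
# Toy data (made up for illustration): y jumps from 0 to 1 half-way along x.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = b1 * x + b0 (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b1, b0


b1, b0 = fit_line(xs, ys)

# Evaluate the fitted line at the smallest and largest x in the data.
lowest = b1 * xs[0] + b0
highest = b1 * xs[-1] + b0
print(lowest, highest)
```

On this toy data the line predicts a value below 0 at the left end and above 1 at the right end, even though the only possible outcomes are 0 and 1.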

If we find the right t-axis, the data will look like a Sigmoid function, and then we can distinguish 0 from 1.
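The Sigmoid function itself is short to write down. The sketch below uses only the Python standard library; the function name `sigmoid` and the sample values of t are chosen here for illustration.

```python
import math


def sigmoid(t):
    """Map t to a value between 0 and 1.

    Note: for very large negative t, math.exp(-t) would overflow;
    this simple version does not guard against that.
    """
    return 1.0 / (1.0 + math.exp(-t))


# Far to the left the curve is close to 0, far to the right it is close to 1,
# and it crosses 0.5 at t = 0.
for t in (-6, -2, 0, 2, 6):
    print(t, round(sigmoid(t), 3))
```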

We find the right (set of) b by minimising the error (the slider game).

Remember how we measured the error of the linear regression last time?

Similar to the sum of squared errors, the standard way to measure how far the model is from the actual training data (the cost function) is the root mean square error (RMSE).

Therefore, the **cost function** of linear regression is:
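Written out, assuming the standard definition of RMSE, with $Y_i$ the actual value and $Y'_i$ the model's prediction for the $i$-th of $n$ training points (the indices $i$ and $n$ are notation introduced here):

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - Y'_i\right)^2}$$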

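For logistic regression, the model passes the linear expression through the Sigmoid function:

$$Y'' = \frac{1}{1 + e^{-(b_1 X_1 + b_0)}}$$

where if $Y'' > 0.5$, $Y' = 1$; otherwise, $Y' = 0$.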
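The rule above translates almost directly into code. In this sketch the parameter values b1 = 2 and b0 = -9 are arbitrary numbers picked for illustration (not fitted values), and the function names are hypothetical.

```python
import math


def predict_score(x1, b1, b0):
    """Y'': the Sigmoid of the linear expression b1 * x1 + b0."""
    return 1.0 / (1.0 + math.exp(-(b1 * x1 + b0)))


def predict_class(x1, b1, b0):
    """Y': 1 if Y'' is above 0.5, otherwise 0."""
    return 1 if predict_score(x1, b1, b0) > 0.5 else 0


# Arbitrary illustrative parameters: the curve crosses 0.5 at x1 = 4.5.
b1, b0 = 2.0, -9.0
for x1 in (3, 4, 5, 6):
    print(x1, round(predict_score(x1, b1, b0), 3), predict_class(x1, b1, b0))
```

With these parameters, inputs below 4.5 are classified as 0 and inputs above 4.5 as 1.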

🤔

We want this minimized!!!
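Minimising can be sketched as a brute-force version of the slider game: try many values of b1 and b0 and keep the pair with the smallest error. This is only a sketch under assumptions. The toy data and the grid of candidate values are made up, and the error measure is RMSE, as for linear regression; other cost functions are common for this model, and real libraries use smarter optimisers than trying every slider position.

```python
import math

# Toy data (made up): small x values are class 0, large x values are class 1.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def rmse(b1, b0):
    """Root mean square error between sigmoid(b1 * x + b0) and the labels."""
    squared = [(y - sigmoid(b1 * x + b0)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


# Move both "sliders" over a coarse grid and remember the best position.
candidates_b1 = [i * 0.5 for i in range(0, 11)]   # 0.0, 0.5, ..., 5.0
candidates_b0 = [i * 0.5 for i in range(-50, 1)]  # -25.0, -24.5, ..., 0.0
best_error, best_b1, best_b0 = min(
    ((rmse(b1, b0), b1, b0) for b1 in candidates_b1 for b0 in candidates_b0),
    key=lambda item: item[0],
)
print(best_error, best_b1, best_b0)
```

The point where the fitted curve crosses 0.5 is at x = -b0 / b1, which for this data should land between the last 0 and the first 1.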

Get the notebooks: https://github.com/Cheukting/legend_data