1.9 Multilayered Network of Neurons
Your first Deep Neural Network
Recap: Complex Functions
What did we see in the previous chapter?
(c) One Fourth Labs
Repeat slide 5.1 from the previous lecture
The Road Ahead
What's going to change now?
Data: Real inputs
Task: Real outputs
Model: Non-linear
Loss: Task specific loss functions
Learning: Back-propagation
Evaluation
Data and Task
What kind of data and tasks have DNNs been used for?
28x28 Images
Raw pixel values (excerpt of the 28x28 matrix):
255 | 183 | 95 | 8 | 93 | 196 | 253 |
254 | 154 | 37 | 7 | 28 | 172 | 254 |
252 | 221 | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | 198 | 253 |
252 | 250 | 187 | 178 | 195 | 253 | 253 |
How can we represent MNIST images as a vector?
- Use the pixel value of each cell
- The matrix of pixel values is of size 28x28 (as MNIST images are of size 28x28)
- Each pixel value ranges from 0 to 255. Normalise the pixel values to the range 0 to 1 by dividing by 255
- Now flatten the matrix into a vector of size 784 (28x28 = 784)
Pixel values after dividing by 255:
1 | 0.72 | 0.37 | 0.03 | 0.36 | 0.77 | 0.99 |
1 | 0.60 | 0.14 | 0.03 | 0.11 | 0.67 | 1 |
0.99 | 0.87 | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | 0.78 | 0.99 |
0.99 | 0.98 | 0.73 | 0.69 | 0.76 | 0.99 | 0.99 |
\( [1.00, 0.72, 0.37, \dots, 0.76, 0.99, 0.99] \)
\( [1.00, 0.85, 0.73, \dots, 0.68, 1.00, 1.00] \)
\( [1.00, 0.76, 0.64, \dots, 0.86, 0.99, 1.00] \)
\( [0.99, 0.82, 0.26, \dots, 0.53, 0.87, 1.00] \)
\( [0.73, 0.81, 0.87, \dots, 0.76, 0.79, 0.67] \)
\( [0.84, 0.72, 0.31, \dots, 0.26, 0.51, 0.99] \)
\( [1.00, 1.00, 0.96, \dots, 0.88, 0.79, 0.99] \)
\( [0.33, 0.52, 0.47, \dots, 0.76, 0.95, 1.00] \)
\( [0.85, 0.72, 0.97, \dots, 0.86, 0.94, 0.99] \)
\( [0.84, 0.92, 0.28, \dots, 0.76, 1.00, 0.99] \)
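The normalise-and-flatten steps can be sketched in a few lines of NumPy. This is only an illustration: the image below is filled with random pixel values standing in for a real MNIST digit, and the helper name `image_to_vector` is made up for this example.

```python
import numpy as np

# Hypothetical 28x28 grayscale image with integer pixel values in [0, 255].
# Random values stand in for a real MNIST digit.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)


def image_to_vector(img):
    """Scale pixel values to [0, 1] and flatten the matrix row by row."""
    scaled = img.astype(np.float64) / 255.0
    return scaled.reshape(-1)


x = image_to_vector(image)
print(x.shape)  # a 28x28 image becomes a vector with 784 entries
```

Flattening row by row is one possible convention; any fixed ordering works as long as the same ordering is used for every image.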
Class labels can be represented as one-hot vectors
Class Label | One-hot Representation |
0 | \( [1, 0, 0, 0, 0, 0, 0, 0, 0, 0] \) |
1 | \( [0, 1, 0, 0, 0, 0, 0, 0, 0, 0] \) |
2 | \( [0, 0, 1, 0, 0, 0, 0, 0, 0, 0] \) |
3 | \( [0, 0, 0, 1, 0, 0, 0, 0, 0, 0] \) |
4 | \( [0, 0, 0, 0, 1, 0, 0, 0, 0, 0] \) |
5 | \( [0, 0, 0, 0, 0, 1, 0, 0, 0, 0] \) |
6 | \( [0, 0, 0, 0, 0, 0, 1, 0, 0, 0] \) |
7 | \( [0, 0, 0, 0, 0, 0, 0, 1, 0, 0] \) |
8 | \( [0, 0, 0, 0, 0, 0, 0, 0, 1, 0] \) |
9 | \( [0, 0, 0, 0, 0, 0, 0, 0, 0, 1] \) |
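A one-hot vector has a 1 at the index of the class and 0 everywhere else. The sketch below shows one way to build such vectors with NumPy; the function name `one_hot` and the default of 10 classes (the digits 0 to 9) are choices made for this example.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Return one row per label, with a 1 at the label's index and 0 elsewhere."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=int)
    encoded[np.arange(labels.size), labels] = 1
    return encoded


print(one_hot([0, 3, 9]))
```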
- Now have two more slides on other Kaggle tasks for which DNNs have been tried (preferably, some non-image tasks and at least one regression task. You could also repeat the churn prediction task from before)
- Finally have 1 slide on our task which is multi character classification
- Same layout and animations repeated from the previous slide only data changes
- Show MNIST dataset sample on LHS
- Show by animation how you will flatten each image and convert it to a vector (of course you cannot show that
Indian Liver Patient Records \(^{*}\)
- Predict whether a person needs to be diagnosed or not
Age | Albumin | T_Bilirubin | ... | D |
65 | 3.3 | 0.7 | ... | 0 |
62 | 3.2 | 10.9 | ... | 0 |
20 | 4 | 1.1 | ... | 1 |
84 | 3.2 | 0.7 | ... | 1 |
... | ... | ... | ... | ... |
\( \hat{y} = \hat{f}(x_1, x_2, \dots, x_{N}) \)
\( \hat{D} = \hat{f}(Age, Albumin, T\_Bilirubin, \dots) \)
Boston Housing\(^{*}\)
- Predict Housing Values in Suburbs of Boston
Crime | Avg No of rooms | Age | ... | House Value |
0.00632 | 6.575 | 65.2 | ... | 24 |
0.02731 | 6.421 | 78.9 | ... | 21.6 |
0.3237 | 6.998 | 45.8 | ... | 33.4 |
0.6905 | 7.147 | 54.2 | ... | 36.2 |
... | ... | ... | ... | ... |
\( \hat{y} = \hat{f}(x_1, x_2, \dots, x_{N}) \)
\( \widehat{House \ Value} = \hat{f}(Crime, Avg \ no \ of \ rooms, Age, \dots) \)
Model
How to build complex functions using Deep Neural Networks?
[Figure: plot of Cost (\(x_2\), ticks 8k and 12k) against Screen size (\(x_1\), ticks 3.5 and 4.5), next to a single sigmoid neuron with inputs \(x_1\), \(x_2\), weights \(w_1\), \(w_2\), bias \(b\) and output \(\hat{y}\)]
\( \hat{y} = f(x_1,x_2) \)
\( \hat{y} = \frac{1}{1+e^{-(w_1*x_1 + w_2*x_2+b)}} \)
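As a minimal sketch, the single sigmoid neuron can be written as one Python function. The input values and the parameters `w1`, `w2`, `b` used below are made up for illustration and are not taken from the slides.

```python
import math


def sigmoid_neuron(x1, x2, w1, w2, b):
    """y_hat = 1 / (1 + exp(-(w1*x1 + w2*x2 + b)))"""
    return 1.0 / (1.0 + math.exp(-(w1 * x1 + w2 * x2 + b)))


# Hypothetical inputs and parameters, chosen only for illustration.
screen_size, cost = 4.5, 8.0  # cost expressed in thousands
y_hat = sigmoid_neuron(screen_size, cost, w1=1.0, w2=-0.5, b=0.0)
print(y_hat)
```

Whatever the inputs and parameters are, the output always lies between 0 and 1.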
[Figure: the same plot, now with two sigmoid neurons in a chain: \(x_1\) and \(x_2\) feed a hidden neuron \(h\) (weights \(w_{11}\), \(w_{12}\), bias \(b_1\)), which feeds the output \(\hat{y}\) (weight \(w_{21}\), bias \(b_2\))]
\( h = f(x_1,x_2) \)
\( h = \frac{1}{1+e^{-(w_{11}*x_1 + w_{12}*x_2+b_1)}} \)
\( \hat{y} = g(h) \)
\( = g(f(x_{1},x_{2})) \)
\(\hat{y} = \frac{1}{1+e^{-(w_{21}*h + b_2)}}\)
[Figure: the same plot, now with two hidden sigmoid neurons \(h_1\) and \(h_2\), both fed by \(x_1\) and \(x_2\), and one output neuron \(\hat{y}\)]
\( h_1 = f_1(x_1,x_2) \)
\( h_1 = \frac{1}{1+e^{-(w_{11}*x_1 + w_{12}*x_2+b_1)}} \)
\( h_2 = f_2(x_1,x_2) \)
\( h_2 = \frac{1}{1+e^{-(w_{13}*x_1 + w_{14}*x_2+b_1)}} \)
\( \hat{y} = g(h_1,h_2) \)
\(\hat{y} = \frac{1}{1+e^{-(w_{21}*h_1 + w_{22}*h_2 + b_2)}}\)
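The small network with two hidden neurons and one output neuron can be sketched as follows. The parameter names follow the equations on the slide, including the same \(b_1\) in both hidden neurons as written there; all numeric values are hypothetical.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def two_hidden_neuron_network(x1, x2, p):
    """Forward pass through two hidden sigmoid neurons and one output neuron."""
    h1 = sigmoid(p["w11"] * x1 + p["w12"] * x2 + p["b1"])
    h2 = sigmoid(p["w13"] * x1 + p["w14"] * x2 + p["b1"])
    return sigmoid(p["w21"] * h1 + p["w22"] * h2 + p["b2"])


# Hypothetical parameter values, not taken from the source.
params = {
    "w11": 0.5, "w12": -0.3, "w13": -0.8, "w14": 0.2,
    "b1": 0.1, "w21": 1.5, "w22": -2.0, "b2": 0.0,
}
y_hat = two_hidden_neuron_network(4.5, 8.0, params)
print(y_hat)
```

Each hidden neuron is itself a sigmoid of the inputs, so the output is a sigmoid of sigmoids. This composition is what lets the network represent more complex functions than a single neuron.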
Model
Can we clarify the terminology a bit?
- The pre-activation at layer 'i' is given by
\( a_i(x) = W_ih_{i-1}(x) + b_i \)
with \( h_0(x) = x \)
- The activation at layer 'i' is given by
\( h_i(x) = g(a_i(x)) \)
where 'g' is called the activation function
- The activation at output layer 'L' is given by
\( f(x) = h_L = O(a_L) \)
where 'O' is called the output activation function
\(h_{L} = \hat{y} = f(x) \)
For a network with three layers (L = 3), \(h_{3} = \hat{y} = f(x) \):
\(\hat{y} = f(x) = O(W_3g(W_2g(W_1x + b_1) + b_2) + b_3)\)
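To check that the nested three-layer formula agrees with the layer-by-layer definitions, both can be computed and compared. In this sketch the sigmoid for `g`, the identity for `O`, the layer sizes and the random parameters are all assumptions made for the example.

```python
import numpy as np


def g(a):
    """Hidden activation function; the sigmoid is assumed here."""
    return 1.0 / (1.0 + np.exp(-a))


def O(a):
    """Output activation function; the identity is assumed here."""
    return a


def forward(x, W, b):
    """Apply a_i = W_i h_{i-1} + b_i and h_i = g(a_i), with h_0 = x and h_L = O(a_L)."""
    L = len(W)
    h = x
    for i in range(1, L + 1):
        a = W[i] @ h + b[i]
        h = O(a) if i == L else g(a)
    return h


# Hypothetical layer sizes (5 -> 4 -> 3 -> 2) and random parameters.
rng = np.random.default_rng(0)
sizes = [5, 4, 3, 2]
W = {i: rng.normal(size=(sizes[i], sizes[i - 1])) for i in range(1, 4)}
b = {i: rng.normal(size=sizes[i]) for i in range(1, 4)}
x = rng.normal(size=5)

y_recursive = forward(x, W, b)
y_nested = O(W[3] @ g(W[2] @ g(W[1] @ x + b[1]) + b[2]) + b[3])
print(np.allclose(y_recursive, y_nested))
```

The shape of each \(W_i\) is (units in layer i, units in layer i-1), which is what makes the product \(W_ih_{i-1}\) well defined.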
Model
How do we decide the output layer?
- On LHS show the imdb example from my lectures
- On RHS show the apple example from my lectures
- Below LHS example, pictorially show other examples of regression from Kaggle
- Below RHS example, pictorially show other examples of classification from Kaggle
- Finally show that in our contest also we need to do regression (bounding box: predict x, y, w, h) and classification (character recognition)
Regression example (movie):
Inputs \(x_i\): isActor Damon, . . . , isDirector Nolan, . . . .
Outputs: imdb Rating, critics Rating, RT Rating, Box Office Collection
\(y_i\) = { 8.8 7.3 8.1 846,320 }
Classification example (classes Apple, Banana, Orange, Grape):
\(y_i\) = { 1 0 0 0 }
Model
What is the output layer for regression problems?
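import numpy as np

def sigmoid(a):
    # Logistic activation, applied element-wise
    return 1.0 / (1.0 + np.exp(-a))

def output_layer(a):
    # Linear (identity) output: the pre-activation is returned unchanged
    return a

def forward_propagation(x, W, b):
    # x = [x1, x2, x3, x4, x5]
    # W and b map the layer number (1..L) to that layer's weights and biases,
    # which are assumed to be learnt already
    L = 3  # Total number of layers
    a, h = {}, {}
    a[1] = W[1] @ x + b[1]
    for i in range(1, L):
        h[i] = sigmoid(a[i])
        a[i + 1] = W[i + 1] @ h[i] + b[i + 1]
    Y = output_layer(a[L])
    return Y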
Model
What is the output layer for classification problems?
Classes: Apple, Banana, Orange, Grape
True Output: \(y\) = { 1, 0, 0, 0 }
Predicted Output: \(\hat{y}\) = { 0.64, 0.03, 0.26, 0.07 }
What kind of output activation function should we use?
Layer 1 (10 neurons):
\(a_1 = W_1*x\)
\(h_{11} = g(a_{11})\), \(h_{12} = g(a_{12})\), . . . . , \(h_{1\ 10} = g(a_{1\ 10})\)
\(h_1 = g(a_1)\)
Layer 2 (10 neurons):
\(a_2 = W_2*h_1\)
\(h_{21} = g(a_{21})\), \(h_{22} = g(a_{22})\), . . . . , \(h_{2\ 10} = g(a_{2\ 10})\)
\(h_2 = g(a_2)\)
Output layer (Apple, Banana, Orange, Grape):
\(a_3 = W_3*h_2\)
\(\hat{y}_{1} = O(a_{31})\), \(\hat{y}_{2} = O(a_{32})\), \(\hat{y}_{3} = O(a_{33})\), \(\hat{y}_{4} = O(a_{34})\)
Take each entry and divide by the sum of all entries
We will now try using the softmax function
\(h = [ h_{1}, h_{2}, h_{3}, h_{4} ]\)
\(softmax(h) = [softmax(h_{1}), softmax(h_{2}), softmax(h_{3}), softmax(h_{4})] \)
where \(softmax(h_{i}) = \frac{e^{h_{i}}}{\sum_{j=1}^{4} e^{h_{j}}}\)
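One likely reason to prefer softmax over plain division by the sum is that pre-activations can be negative, in which case dividing by the sum can give negative "probabilities". The sketch below compares the two on made-up values. The vector `h` is hypothetical, and subtracting the maximum inside `softmax` is a common numerical-stability trick rather than something stated on the slides.

```python
import numpy as np


def normalize_by_sum(h):
    """Naive idea: divide each entry by the sum of all entries."""
    return h / np.sum(h)


def softmax(h):
    """softmax(h)_i = exp(h_i) / sum_j exp(h_j), computed after subtracting max(h)."""
    e = np.exp(h - np.max(h))
    return e / np.sum(e)


# Hypothetical values for Apple, Banana, Orange, Grape.
h = np.array([2.0, -1.0, 1.0, -0.5])

print(normalize_by_sum(h))  # contains negative entries
print(softmax(h))           # all entries positive, summing to 1
```

Because the exponential is always positive and increasing, softmax keeps the ordering of the entries while turning them into a valid probability distribution.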
\(a_1 = W_1*x\)
\(h_1 = g(a_1)\)
\(a_2 = W_2*h_1\)
\(h_2 = g(a_2)\)
\(a_3 = W_3*h_2\)
\(\hat{y} = softmax(a_3)\)
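These six equations translate almost line by line into code. The sketch below assumes a sigmoid for `g`, random weights in place of learnt ones, and an input of 6 features; only the layer widths (10, 10 and 4) and the class names come from the slides.

```python
import numpy as np

CLASSES = ["Apple", "Banana", "Orange", "Grape"]


def g(a):
    """Hidden activation; the sigmoid is assumed here."""
    return 1.0 / (1.0 + np.exp(-a))


def softmax(a):
    e = np.exp(a - np.max(a))
    return e / e.sum()


def classify(x, W1, W2, W3):
    """Return the predicted distribution over classes and the most likely class."""
    h1 = g(W1 @ x)            # a_1 = W_1 x,   h_1 = g(a_1)
    h2 = g(W2 @ h1)           # a_2 = W_2 h_1, h_2 = g(a_2)
    y_hat = softmax(W3 @ h2)  # a_3 = W_3 h_2, y_hat = softmax(a_3)
    return y_hat, CLASSES[int(np.argmax(y_hat))]


rng = np.random.default_rng(0)
n_inputs = 6  # hypothetical number of input features
W1 = rng.normal(size=(10, n_inputs))
W2 = rng.normal(size=(10, 10))
W3 = rng.normal(size=(4, 10))
x = rng.normal(size=n_inputs)

y_hat, label = classify(x, W1, W2, W3)
print(y_hat, label)
```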
Model
What is the output layer for regression problems?
Inputs \(x_i\): isActor Damon, . . . , isDirector Nolan, . . . .
Output: Box Office Collection
True Output: \(y\) = $ 15,032,493.29
Predicted Output: \(\hat{y}\) = $ 10,517,330.07
What kind of output function should we use?
Model
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(a_1 = W_1*x\)
What is the output layer for regression problems ?
Box Office
Collection
isActor
Damon
. . .
isDirector
Nolan
. . .
\(x_i\)
Model
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(a_1 = W_1*x\)
\(h_{11} = g(a_{11})\)
\(h_{12} = g(a_{12})\)
\(h_{1\ 5} = g(a_{1\ 5})\)
. . . .
\(h_1 = g(a_1)\)
Box Office
Collection
isActor
Damon
. . .
isDirector
Nolan
. . .
\(x_i\)
What is the output layer for regression problems ?
Model
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(a_2 = W_2*h_1\)
Box Office
Collection
isActor
Damon
. . .
isDirector
Nolan
. . .
\(x_i\)
What is the output layer for regression problems ?
Model
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(a_2 = W_2*h_1\)
\(h_{21} = g(a_{21})\)
\(h_{22} = g(a_{22})\)
\(h_{2\ 5} = g(a_{2\ 5})\)
. . . .
\(h_2 = g(a_2)\)
Box Office
Collection
isActor
Damon
. . .
isDirector
Nolan
. . .
\(x_i\)
What is the output layer for regression problems ?
Model
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(a_3 = W_3*h_2\)
Box Office
Collection
isActor
Damon
. . .
isDirector
Nolan
. . .
\(x_i\)
\(\hat{y} = O(a_{3})\)
What is the output layer for regression problems ?
Model
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
Can we use sigmoid function ?
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
NO
What is the output layer for regression problems ?
data:image/s3,"s3://crabby-images/10845/108457f31c2347f5441451e65da32bfcc580d7fd" alt=""
data:image/s3,"s3://crabby-images/95e26/95e26bbbac0e2b7e8efedfc0957473cdc9362544" alt=""
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
Can we use softmax function ?
NO
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
Can we use real numbered pre-activation as it is ?
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
Yes, it is a real number after all
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
What happens if we get a negative output ?
Should we not normalize it ?
Box Office
Collection
isActor
Damon
. . .
isDirector
Nolan
. . .
\(x_i\)
Model
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(a_1 = W_1*x\)
\(h_1 = g(a_1)\)
\(a_2 = W_2*h_1\)
\(h_2 = g(a_2)\)
\(a_3 = W_3*h_2\)
\(\hat{y} = a_3\)
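A small sketch of why the regression output is left as the raw pre-activation: sigmoid squashes every value into the range 0 to 1, so it cannot represent a box office collection in dollars, whereas the identity output can. The pre-activation value below is borrowed from the predicted-output figure on the earlier slide purely for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def regression_output(a3):
    """Identity output layer: y_hat = a3."""
    return a3

a3 = np.array([10_517_330.07])   # illustrative pre-activation, in dollars
squashed = sigmoid(a3)           # bounded above by 1, useless as a dollar amount
y_hat = regression_output(a3)    # unchanged real number
print(squashed, y_hat)
```

Note that the identity output can also go negative, which is the concern the slide raises about unnormalized outputs.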
What is the output layer for regression problems ?
Box Office
Collection
isActor
Damon
. . .
isDirector
Nolan
. . .
\(x_i\)
Model
Can we see the model in action?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
1) We will show the demo which Ganga is preparing
Model
In practice how would you deal with extreme non-linearity ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
data:image/s3,"s3://crabby-images/dac82/dac825b37dea62fee9931aa094e3518cd1f471a8" alt=""
data:image/s3,"s3://crabby-images/ceb66/ceb66c5aa3c1231ec0f6db40973f8147fb469216" alt=""
data:image/s3,"s3://crabby-images/1a711/1a711b7b24bdbe66285d4a45ff878087a4b38b41" alt=""
data:image/s3,"s3://crabby-images/b000a/b000a1b48a6845014e117666ed48d21985ba5fc7" alt=""
Model
In practice how would you deal with extreme non-linearity ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
data:image/s3,"s3://crabby-images/7bb88/7bb88a809815898c5b6110fd5dff928a8b993edc" alt=""
data:image/s3,"s3://crabby-images/79d31/79d311be18840b80421253804772cb17e85a076e" alt=""
data:image/s3,"s3://crabby-images/bc3a4/bc3a486ee713286a0d039ba1a0ccbb85ef326f59" alt=""
data:image/s3,"s3://crabby-images/e72be/e72bebc9e9ccf50a69df056b0d8420c94b726ca2" alt=""
\(Model\)
\(Loss\)
Model
Why is Deep Learning also called Deep Representation Learning ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
Apple
Banana
Orange
Grape
data:image/s3,"s3://crabby-images/d1566/d156656f2d09aa12f90835cbe64e7b6f078d42e3" alt=""
Loss Function
What is the loss function that you use for a regression problem ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
Size (sq. ft.) | No. of bedrooms | House rent (in 1000s of rupees) |
---|---|---|
850 | 2 | 12 |
1100 | 2 | 20 |
1000 | 3 | 19 |
.... | .... | .... |
\(h_{2} = \hat{y} = f(x) \)
\(a_1 = W_1*x + b_1 = [ 0.67 -0.415 ]\)
\(h_1 = sigmoid(a_1) = [ 0.66 0.40 ]\)
\(a_2 = W_2*h_1 + b_2 = 11.5 \)
\(h_2 = a_2 = 11.5\)
Output :
Squared Error Loss :
\(L(\Theta) = (11.5 - 12)^2\)
\(= (0.25)\)
Loss Function
What is the loss function that you use for a regression problem ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
Size (sq. ft.) | No. of bedrooms | House rent (in 1000s of rupees) |
---|---|---|
850 | 2 | 12 |
1100 | 2 | 14 |
1000 | 3 | 15 |
.... | .... | .... |
\(h_{2} = \hat{y} = f(x) \)
\(a_1 = W_1*x + b_1 = [ 0.72 -0.39 ]\)
\(h_1 = sigmoid(a_1) = [ 0.67 0.40 ]\)
\(a_2 = W_2*h_1 + b_2 = 11.6 \)
\(h_2 = a_2 = 11.6\)
Output :
Squared Error Loss :
\(L(\Theta) = (11.6 - 14)^2\)
\(= (5.76)\)
Loss Function
What is the loss function that you use for a regression problem ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
Size (sq. ft.) | No. of bedrooms | House rent (in 1000s of rupees) |
---|---|---|
850 | 2 | 12 |
1100 | 2 | 14 |
1000 | 3 | 15 |
.... | .... | .... |
\(h_{2} = \hat{y} = f(x) \)
\(a_1 = W_1*x + b_1 = [ 0.95 -0.65 ]\)
\(h_1 = sigmoid(a_1) = [ 0.72 0.34 ]\)
\(a_2 = W_2*h_1 + b_2 = 11.5 \)
\(h_2 = a_2 = 11.5\)
Output :
Squared Error Loss :
\(L(\Theta) = (11.5 - 15)^2\)
\(= (12.25)\)
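The three worked examples can be checked in a few lines. This sketch takes the model outputs \(h_2\) and the true rents used in the three loss calculations above and recomputes each squared error, then averages them with the same 1/N weighting that `compute_loss` uses. The array names are illustrative.

```python
import numpy as np

y_true = np.array([12.0, 14.0, 15.0])   # true rents used in the three calculations
y_pred = np.array([11.5, 11.6, 11.5])   # model outputs h_2 from the three slides

per_example = (y_pred - y_true) ** 2     # squared error for each house
mean_loss = per_example.mean()           # average over the N = 3 examples
print(per_example, mean_loss)
```

The per-example values match the slides (0.25, 5.76, 12.25), and their mean is roughly 6.09.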
Loss Function
What is the loss function that you use for a regression problem ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
Size (sq. ft.) | No. of bedrooms | House rent (in 1000s of rupees) |
---|---|---|
850 | 2 | 12 |
1100 | 2 | 14 |
1000 | 3 | 15 |
.... | .... | .... |
\(h_{2} = \hat{y} = f(x) \)
import numpy as np

X = [X1, X2, X3, X4, ..., XN]  # N 'd'-dimensional data points
Y = [y1, y2, y3, y4, ..., yN]  # N true (real-valued) outputs

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def output_layer(a):
    return a  # regression: use the real-valued pre-activation as it is

def forward_propagation(x):
    L = 3      # Total number of layers
    W = {...}  # Assume weights are learnt
    b = {...}  # Assume biases are learnt
    a, h = {}, {}
    a[1] = W[1] @ x + b[1]
    for i in range(1, L):
        h[i] = sigmoid(a[i])
        a[i+1] = W[i+1] @ h[i] + b[i+1]
    return output_layer(a[L])

def compute_loss(X, Y):
    N = len(X)  # Number of data points
    loss = 0
    for x, y in zip(X, Y):
        fx = forward_propagation(x)  # predict for one data point at a time
        loss += (1/N) * (fx - y)**2
    return loss
Loss Function
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_i\)
\(a_1 = W_1*x + b_1 = [ 0.8 0.52 0.68 0.7 ]\)
\(h_1 = sigmoid(a_1) = [ 0.69 0.63 0.66 0.67 ]\)
\(a_2 = W_2*h_1 + b_2 = 0.948\)
\(\hat{y} = sigmoid(a_2) = 0.7207\)
Output :
Cross Entropy Loss:
\(L(\Theta) = -1*\log({0.7207})\)
\(= 0.327\)
What is the loss function that you use for a binary classification problem ?
Loss Function
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_i\)
\(a_1 = W_1*x + b_1 = [ 0.01 0.71 0.42 0.63 ]\)
\(h_1 = sigmoid(a_1) = [ 0.50 0.67 0.60 0.65 ]\)
\(a_2 = W_2*h_1 + b_2 = 0.921\)
\(\hat{y} = sigmoid(a_2) = 0.7152\)
Output :
Cross Entropy Loss:
\(L(\Theta) = -1*\log({1- 0.7152})\)
\(= 1.2560\)
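The two binary cross-entropy calculations can be reproduced with one function. In this sketch the true labels (1 for the first example, 0 for the second) are inferred from which term the slides use, \(-\log(\hat{y})\) or \(-\log(1-\hat{y})\), and the logarithm is the natural log, which is what the slide values correspond to.

```python
import numpy as np

def binary_cross_entropy(y, y_hat):
    """-log(y_hat) when y == 1, -log(1 - y_hat) when y == 0 (natural log)."""
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

loss_pos = binary_cross_entropy(1, 0.7207)  # first example, inferred label 1
loss_neg = binary_cross_entropy(0, 0.7152)  # second example, inferred label 0
print(round(loss_pos, 4), round(loss_neg, 4))
```

The same predicted probability of about 0.72 gives a small loss when the true label is 1 and a much larger loss when it is 0.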
What is the loss function that you use for a binary classification problem ?
Loss Function
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_i\)
What is the loss function that you use for a binary classification problem ?
import numpy as np

X = [X1, X2, X3, X4, ..., XN]  # N 'd'-dimensional data points
Y = [y1, y2, y3, y4, ..., yN]  # N binary labels (0 or 1)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def output_layer(a):
    return sigmoid(a)  # binary classification: probability in (0, 1)

def forward_propagation(x):
    L = 3      # Total number of layers
    W = {...}  # Assume weights are learnt
    b = {...}  # Assume biases are learnt
    a, h = {}, {}
    a[1] = W[1] @ x + b[1]
    for i in range(1, L):
        h[i] = sigmoid(a[i])
        a[i+1] = W[i+1] @ h[i] + b[i+1]
    return output_layer(a[L])

def compute_loss(X, Y):
    N = len(X)  # Number of data points
    loss = 0
    for x, y in zip(X, Y):
        fx = forward_propagation(x)  # predict for one data point at a time
        if y == 0:
            loss += -(1/N) * np.log(1 - fx)
        else:
            loss += -(1/N) * np.log(fx)
    return loss
Loss Function
What is the loss function that you use for a multi-class classification problem ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_i\)
\(a_1 = W_1*x + b_1 = [ -0.19 -0.16 -0.09 0.77 ]\)
\(h_1 = sigmoid(a_1) = [ 0.45 0.46 0.48 0.68 ]\)
\(a_2 = W_2*h_1 + b_2 = [ 0.13 0.33 0.89 ]\)
\(\hat{y} = softmax(a_2) = [ 0.23 0.28 0.49 ]\)
Output :
Cross Entropy Loss:
\(L(\Theta) = -1*\log({0.28})\)
\(= 1.2729\)
Loss Function
What is the loss function that you use for a multi-class classification problem ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_i\)
\(a_1 = W_1*x + b_1 = [ 0.62 0.09 0.2 -0.15 ]\)
\(h_1 = sigmoid(a_1) = [ 0.65 0.52 0.55 0.46 ]\)
\(a_2 = W_2*h_1 + b_2 = [ 0.32 0.29 0.85 ]\)
\(\hat{y} = softmax(a_2) = [ 0.2718 0.2634 0.4648 ]\)
Output :
Cross Entropy Loss:
\(L(\Theta) = -1*\log({0.4648})\)
\(= 0.7661\)
Loss Function
What is the loss function that you use for a multi-class classification problem ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_i\)
\(a_1 = W_1*x + b_1 = [ 0.31 0.39 0.25 -0.54 ]\)
\(h_1 = sigmoid(a_1) = [ 0.58 0.60 0.56 0.37 ]\)
\(a_2 = W_2*h_1 + b_2 = [ 0.39 0.18 0.79 ]\)
\(\hat{y} = softmax(a_2) = [ 0.3024 0.2462 0.4514 ]\)
Output :
Cross Entropy Loss:
\(L(\Theta) = -1*\log({0.4514})\)
\(= 0.7954\)
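A short sketch of the multi-class case: apply softmax to the pre-activations \(a_2\) and take the negative natural log of the probability assigned to the true class. The one-hot label below (third class) is inferred from the slide picking out the third probability, and because \(a_2\) is rounded to two decimals on the slide, the recomputed numbers agree with the slide only to about two decimal places.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - np.max(a))
    return e / e.sum()

def cross_entropy(y, y_hat):
    """y is a one-hot vector, so only the true class contributes."""
    return -np.sum(y * np.log(y_hat))

a2 = np.array([0.39, 0.18, 0.79])   # pre-activations from the slide (rounded)
y = np.array([0.0, 0.0, 1.0])       # inferred one-hot label: third class
y_hat = softmax(a2)
loss = cross_entropy(y, y_hat)
print(y_hat, loss)
```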
Loss Function
What is the loss function that you use for a multi-class classification problem ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_i\)
import numpy as np

X = [X1, X2, X3, X4, ..., XN]  # N 'd'-dimensional data points
Y = [y1, y2, y3, y4, ..., yN]  # N one-hot label vectors

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def output_layer(a):
    e = np.exp(a)
    return e / np.sum(e)  # multi-class classification: softmax over classes

def forward_propagation(x):
    L = 3      # Total number of layers
    W = {...}  # Assume weights are learnt
    b = {...}  # Assume biases are learnt
    a, h = {}, {}
    a[1] = W[1] @ x + b[1]
    for i in range(1, L):
        h[i] = sigmoid(a[i])
        a[i+1] = W[i+1] @ h[i] + b[i+1]
    return output_layer(a[L])

def compute_loss(X, Y):
    N = len(X)  # Number of data points
    loss = 0
    for x, y in zip(X, Y):
        fx = forward_propagation(x)  # predict for one data point at a time
        for i in range(len(y)):
            loss += -(1/N) * y[i] * np.log(fx[i])  # only the true class contributes
    return loss
Loss Function
What have we learned so far?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_i\)
data:image/s3,"s3://crabby-images/10845/108457f31c2347f5441451e65da32bfcc580d7fd" alt=""
data:image/s3,"s3://crabby-images/95e26/95e26bbbac0e2b7e8efedfc0957473cdc9362544" alt=""
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
But, who will give us the weights ?
Learning Algorithm
(Partial )Derivatives, Gradients
Can we do a quick recap of some basic calculus ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(?\)
\(?\)
\(?\)
(Partial )Derivatives, Gradients
Can we do a quick recap of some basic calculus ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(?\)
\(Say \ \ f(x) =(1/x)\)
\(, \ \ g(x) = e^{-x^{2}}\)
\(Say \ \ p(x) =e^{x}\)
\(, \ \ q(x) = -x^{2}\)
\(Say \ \ m(x) =-x\)
\(, \ \ n(x) = x^{2}\)
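One way to read the function pairs above is as successive layers of the composite \(1/e^{-x^{2}}\): \(n(x) = x^{2}\), \(m(n) = -n\), \(p(q) = e^{q}\), \(f(g) = 1/g\). That reading is an interpretation, since the equations on the slide did not survive extraction. Under it, the chain rule multiplies the derivative of each layer, and a finite-difference check confirms the result.

```python
import math

def composite(x):
    return 1.0 / math.exp(-x ** 2)

def chain_rule_derivative(x):
    n = x ** 2            # n(x) = x^2
    q = -n                # m(n) = -n
    g = math.exp(q)       # p(q) = e^q
    df_dg = -1.0 / g ** 2 # f(g) = 1/g
    dg_dq = math.exp(q)
    dq_dn = -1.0
    dn_dx = 2.0 * x
    return df_dg * dg_dq * dq_dn * dn_dx  # product of the local derivatives

def numerical_derivative(fn, x, eps=1e-6):
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)

x0 = 0.7
analytic = chain_rule_derivative(x0)
numeric = numerical_derivative(composite, x0)
print(analytic, numeric)
```

Both values agree with the closed form \(2x\,e^{x^{2}}\), which follows because \(1/e^{-x^{2}} = e^{x^{2}}\).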
(Partial )Derivatives, Gradients
Can we do a quick recap of some basic calculus ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(?\)
\(Say \ \ f(x) =sin(x)\)
\(, \ \ g(x) = 1/e^{-x^{2}}\)
\(?\)
\(Say \ \ f(x) =cos(x)\)
\(, \ \ g(x) = sin(1/e^{-x^{2}})\)
\(?\)
\(Say \ \ f(x) =log(x)\)
\(, \ \ g(x) = cos(sin(1/e^{-x^{2}}))\)
(Partial )Derivatives, Gradients
Can we do a quick recap of some basic calculus ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x\)
\(x^{2}\)
\(e^{-x}\)
\( sin(1/x)\)
\( cos(x)\)
\( log(x)\)
\(w_{1}\)
\(w_{2}\)
\(w_{3}\)
\(w_{4}\)
\(w_{5}\)
data:image/s3,"s3://crabby-images/10845/108457f31c2347f5441451e65da32bfcc580d7fd" alt=""
data:image/s3,"s3://crabby-images/95e26/95e26bbbac0e2b7e8efedfc0957473cdc9362544" alt=""
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
How do we compute partial derivative ?
Assume that all other variables are constant
(Partial )Derivatives, Gradients
Can we do a quick recap of some basic calculus ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x\)
\(w_{1}\)
\(w_{2}\)
\(w_{3}\)
\(w_{4}\)
\(w_{5}\)
\(h_{1}\)
\(h_{2}\)
\(y\)
\(w_{7}\)
\(w_{6}\)
\(y\)
\(h_{3}\)
(Partial )Derivatives, Gradients
Can we do a quick recap of some basic calculus ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x\)
\(w_{1}\)
\(w_{2}\)
\(w_{3}\)
\(w_{4}\)
\(w_{5}\)
\(h_{1}\)
\(h_{2}\)
\(w_{7}\)
\(w_{6}\)
\(y\)
\(h_{3}\)
data:image/s3,"s3://crabby-images/10845/108457f31c2347f5441451e65da32bfcc580d7fd" alt=""
data:image/s3,"s3://crabby-images/95e26/95e26bbbac0e2b7e8efedfc0957473cdc9362544" alt=""
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
data:image/s3,"s3://crabby-images/799c6/799c64280fd48dfc21ca3e34f156ed0af2a36242" alt=""
Wouldn't it be tedious to compute such a partial derivative w.r.t all variables ?
Well, not really. We can reuse some of the work.
(Partial )Derivatives, Gradients
Can we do a quick recap of some basic calculus ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
(Partial )Derivatives, Gradients
What are the key takeaways ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
No matter how complex the function, we can always compute the derivative w.r.t. any variable using the chain rule.
We can reuse a lot of work by starting backwards and computing simpler elements in the chain.
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
(Partial )Derivatives, Gradients
What is a gradient ?
A gradient is simply a collection of partial derivatives.
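To make this concrete, the sketch below builds a gradient numerically: it perturbs one variable at a time while holding the others constant, which is exactly how a partial derivative is defined, and stacks the results into a vector. The two-variable function `f` is a hypothetical example, not one from the slides.

```python
import numpy as np

def numerical_gradient(fn, theta, eps=1e-6):
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps  # perturb one variable, hold the others constant
        grad[i] = (fn(theta + step) - fn(theta - step)) / (2 * eps)
    return grad

# Hypothetical function of two variables: f(w, b) = w^2 + 3*w*b
f = lambda t: t[0] ** 2 + 3 * t[0] * t[1]
grad = numerical_gradient(f, [1.0, 2.0])
print(grad)
```

Analytically, the partial derivatives are \(2w + 3b\) and \(3w\), so at \((w, b) = (1, 2)\) the gradient is \([8, 3]\).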
Learning Algorithm
Can we use the same Gradient Descent algorithm as before ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x\)
\( Earlier: w, b\)
\(Now: w_{11}, w_{12}, ... \)
\( Earlier: L(w, b)\)
\(Now: L(w_{11}, w_{12}, ...) \)
\(x_i\)
import numpy as np

X = [0.5, 2.5]
Y = [0.2, 0.9]

def f(x, w, b):  # sigmoid with parameters w, b
    return 1.0 / (1.0 + np.exp(-(w*x + b)))

def error(w, b):
    err = 0.0
    for x, y in zip(X, Y):
        fx = f(x, w, b)
        err += 0.5 * (fx - y)**2
    return err

def grad_w(x, y, w, b):
    fx = f(x, w, b)
    return (fx - y) * fx * (1 - fx) * x

def grad_b(x, y, w, b):
    fx = f(x, w, b)
    return (fx - y) * fx * (1 - fx)

def do_gradient_descent():
    w, b, eta, max_epochs = -2, -2, 1.0, 1000
    for i in range(max_epochs):
        dw, db = 0, 0
        for x, y in zip(X, Y):  # accumulate gradients over all data points
            dw += grad_w(x, y, w, b)
            db += grad_b(x, y, w, b)
        w = w - eta * dw  # update once per epoch
        b = b - eta * db
    return w, b
Learning Algorithm
Can we use the same Gradient Descent algorithm as before ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
data:image/s3,"s3://crabby-images/c47da/c47da5382e08bde064259bb6ffe57d03e4aa81e6" alt=""
data:image/s3,"s3://crabby-images/9d20c/9d20cdadd38396025f835192c83f7e6ff06ea7fa" alt=""
data:image/s3,"s3://crabby-images/14c34/14c34ea6bdde67bbef251820fcef5efcbb9b6d72" alt=""
data:image/s3,"s3://crabby-images/e9090/e90904e8d2a19e7908af2ce410ec5cbefac7fd8c" alt=""
data:image/s3,"s3://crabby-images/59fa6/59fa63c245fd35a7f1ca09b238e215fc68fe132a" alt=""
data:image/s3,"s3://crabby-images/632fe/632fe07cdc45d385262b4b66e41ff973c999827c" alt=""
data:image/s3,"s3://crabby-images/223e9/223e90fc0a56bc163202f6ffe3f7367d40ce285d" alt=""
data:image/s3,"s3://crabby-images/216a8/216a8da9a8159739e868c712eb69d170c6ebafc2" alt=""
data:image/s3,"s3://crabby-images/1ce6f/1ce6f5baee767d8c8c6ba1250f2fea9b06320fa6" alt=""
data:image/s3,"s3://crabby-images/c54a6/c54a632f22a9049ef47a6f8bfdd70dc25fc7ad40" alt=""
data:image/s3,"s3://crabby-images/adc5f/adc5f90654b0df8324deef4f26550192a61699f8" alt=""
data:image/s3,"s3://crabby-images/fddeb/fddeb09309e95bc13c17cbb820de6a8de2a9fe36" alt=""
data:image/s3,"s3://crabby-images/01d20/01d207de5c8934597cffeb8e4da803882e6181de" alt=""
data:image/s3,"s3://crabby-images/ec8c2/ec8c2cbe5c4f36feb201eb27376bc049ec9213ca" alt=""
data:image/s3,"s3://crabby-images/cec08/cec083fc73a6c3e5feaf572130d68e232cdb4969" alt=""
data:image/s3,"s3://crabby-images/a4154/a4154421cbbb8ac2edd8077f9198b052123250bc" alt=""
data:image/s3,"s3://crabby-images/36cf7/36cf75bab96aea42945e75dbe612ec7e474f6d35" alt=""
data:image/s3,"s3://crabby-images/6b9e2/6b9e2a4fdccbfd008bda3e6658be912641651a70" alt=""
data:image/s3,"s3://crabby-images/4b52e/4b52e7d7a2dc86ff259afb39590bd46e46e30080" alt=""
data:image/s3,"s3://crabby-images/2cf3f/2cf3f93742c35de9f07e830d9e373c7187788a81" alt=""
data:image/s3,"s3://crabby-images/b6c0e/b6c0e40edb004415fa55eaf99e9c533740c76792" alt=""
data:image/s3,"s3://crabby-images/cedc8/cedc8fc6290998acc22405d567393b50f5d75705" alt=""
data:image/s3,"s3://crabby-images/a25da/a25daa7cd555ca0ae13c8e47d1e7004a17a035a0" alt=""
data:image/s3,"s3://crabby-images/9ca4c/9ca4c3cb4081a2bcf1892b5b8b67e7b29ab27659" alt=""
data:image/s3,"s3://crabby-images/a06c0/a06c01cab78bb461e5a6fadcfe07aaf1dcaa20c0" alt=""
data:image/s3,"s3://crabby-images/aca48/aca487f22e017e2a7230dcf0035726f2b4fae037" alt=""
data:image/s3,"s3://crabby-images/1a8da/1a8da2248c965ec4b6ed9a0b366524d630cf8e0e" alt=""
data:image/s3,"s3://crabby-images/37b68/37b688ed7903a895c702b091cd0b20bb2175405a" alt=""
data:image/s3,"s3://crabby-images/053ab/053ab14fb0e0daa6e416468f0442761c7b993806" alt=""
data:image/s3,"s3://crabby-images/984db/984db7cc44284f1926c31624fe609bcfca91a501" alt=""
data:image/s3,"s3://crabby-images/e8de8/e8de8d24edbcca73f5d94e7fb874b9ed849e1b06" alt=""
data:image/s3,"s3://crabby-images/15055/15055b87b446cd23c0221ffcdf6fe8d4f406839b" alt=""
data:image/s3,"s3://crabby-images/082a7/082a760d53fffad0181841e0c9e8f756562992a8" alt=""
data:image/s3,"s3://crabby-images/9362b/9362ba8234a1dc1248c8cfe67ebcb4f37ddb4a35" alt=""
data:image/s3,"s3://crabby-images/30bb3/30bb3cd4c44043d66e22c37eb11939d1de14a63c" alt=""
data:image/s3,"s3://crabby-images/a03ce/a03cee293c4a44ff804e701213b02dc7c98630d7" alt=""
data:image/s3,"s3://crabby-images/bfe1b/bfe1b5f723d41b4328d4b435cae67adfcdce1f91" alt=""
data:image/s3,"s3://crabby-images/b85c7/b85c7b6d1caccee9f37129a17c6fea03b1a53ad7" alt=""
data:image/s3,"s3://crabby-images/4d67f/4d67f8c7a5dece9c1752981c0fdd4fb07a88db89" alt=""
data:image/s3,"s3://crabby-images/5a2df/5a2dfb0b0d806d2d2d12ada638d94789eef5b79f" alt=""
data:image/s3,"s3://crabby-images/b87d6/b87d6b618efe9dd3235a7510cab47ea8c0197e8f" alt=""
data:image/s3,"s3://crabby-images/1799c/1799c0a1834aa2bacccebe811f852c21e12894c5" alt=""
data:image/s3,"s3://crabby-images/a782e/a782ea65278dfd576b842d5571208c1fb02dbbfc" alt=""
data:image/s3,"s3://crabby-images/741f2/741f2ea4bb4f274b5b4efd77d335d33eb6348ca4" alt=""
data:image/s3,"s3://crabby-images/d2948/d2948f133efd9d290d23fb7663437389b78598ba" alt=""
data:image/s3,"s3://crabby-images/b8058/b80588b73c67fb236c18d37f10c40ab50a99fb9b" alt=""
data:image/s3,"s3://crabby-images/5e286/5e286b0a35c9c711a840863937e1d1048b365a97" alt=""
data:image/s3,"s3://crabby-images/47114/47114aed55b2a601b95d6c2bc08d4f28d6056e54" alt=""
Learning Algorithm
How many derivatives do we need to compute and how do we compute them?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
data:image/s3,"s3://crabby-images/57385/573850841a674ae9116194ce72d06ffe7c734289" alt=""
data:image/s3,"s3://crabby-images/b7bb9/b7bb9ea2e81b2fe2dea8a944c6ec81636da56c48" alt=""
data:image/s3,"s3://crabby-images/0a9ca/0a9caeb3a587c987db28c81f770c6dae75bd89d6" alt=""
data:image/s3,"s3://crabby-images/e4021/e40213c5bdd7058644ed1d7c83cfb13580a2950a" alt=""
data:image/s3,"s3://crabby-images/f4ba8/f4ba81998460dfe15fb8d3ede4c78a5e8bf59fe2" alt=""
data:image/s3,"s3://crabby-images/1845c/1845cffc51931183e8f17080aa676aafcf900720" alt=""
data:image/s3,"s3://crabby-images/a653b/a653b43541d164576f2b4e27167bd51014f2feeb" alt=""
data:image/s3,"s3://crabby-images/a9eff/a9eff0bd0460543e92644cf993f26895c77d7428" alt=""
\(x_i\)
Learning Algorithm
How many derivatives do we need to compute and how do we compute them?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_i\)
- Let us focus on the highlighted weight (\(w_{222}\))
- To learn this weight, we have to compute the partial derivative of the loss function w.r.t. this weight
Learning Algorithm
How do we compute the partial derivatives ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_2\)
\(x_1\)
\(x_3\)
\(x_4\)
\(a_1 = W_1*x + b_1 = [ 2.9 1.4 2.1 2.3 ]\)
\(h_1 = sigmoid(a_1) = [ 0.95 0.80 0.89 0.91 ]\)
\(a_2 = W_2*h_1 + b_2 = [ 1.66 0.45 ]\)
\(\hat{y} = softmax(a_2) = [ 0.77 0.23 ]\)
Output :
Squared Error Loss :
\(L(\Theta) = (1 - 0.77)^2 + (0.23)^2\)
\(= 0.1058\)
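The output and loss in this example can be recomputed from \(a_2\) alone. In this sketch the true label \([1, 0]\) is inferred from the form of the loss, \((1 - 0.77)^2 + (0.23)^2\). The result differs from the slide in the fourth decimal place because the slide rounds \(\hat{y}\) to two decimals before squaring.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - np.max(a))
    return e / e.sum()

a2 = np.array([1.66, 0.45])   # pre-activations from the slide
y = np.array([1.0, 0.0])      # inferred true label (first class)
y_hat = softmax(a2)
loss = np.sum((y - y_hat) ** 2)  # squared error summed over both outputs
print(np.round(y_hat, 2), round(loss, 4))
```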
Learning Algorithm
How do we compute the partial derivatives ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_2\)
\(x_1\)
\(x_3\)
\(x_4\)
Learning Algorithm
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_2\)
\(x_1\)
\(x_3\)
\(x_4\)
Can we see one more example ?
Learning Algorithm
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_2\)
\(x_1\)
\(x_3\)
\(x_4\)
\(a_1 = W_1*x + b_1 = [ 2.9 1.4 2.1 2.3 ]\)
\(h_1 = sigmoid(a_1) = [ 0.95 0.80 0.89 0.91 ]\)
\(a_2 = W_2*h_1 + b_2 = [ 1.66 0.45 ]\)
\(\hat{y} = softmax(a_2) = [ 0.77 0.23 ]\)
Output :
Cross Entropy Loss :
\(L(\Theta) = -1*\log(0.77) \)
\(= 0.2614\)
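Changing the loss changes the derivatives that flow backwards. For softmax followed by cross-entropy (natural log, as in the other cross-entropy examples in these slides), the derivative of the loss w.r.t. the pre-activations has the well-known closed form \(\hat{y} - y\). The sketch below checks that form against finite differences, assuming the true label is the first class.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - np.max(a))
    return e / e.sum()

def cross_entropy_from_logits(a, y):
    return -np.sum(y * np.log(softmax(a)))

a2 = np.array([1.66, 0.45])   # pre-activations from the slide
y = np.array([1.0, 0.0])      # assumed true label (first class)

loss = cross_entropy_from_logits(a2, y)
analytic = softmax(a2) - y    # closed form: dL/da2 = y_hat - y

eps = 1e-6
numeric = np.zeros_like(a2)
for i in range(a2.size):
    step = np.zeros_like(a2)
    step[i] = eps
    numeric[i] = (cross_entropy_from_logits(a2 + step, y)
                  - cross_entropy_from_logits(a2 - step, y)) / (2 * eps)
print(loss, analytic, numeric)
```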
What happens if we change the loss function ?
Learning Algorithm
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\(x_2\)
\(x_1\)
\(x_3\)
\(x_4\)
What happens if we change the loss function ?
Learning Algorithm
Isn't this too tedious ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
Show a small DNN on LHS
ON RHS now show a pytorch logo
Now show the compute graph for one of the weights
loss.backward() is all you need to write in PyTorch
Evaluation
How do you check the performance of a deep neural network?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
Test Data
Indian Liver Patient Records \(^{*}\)
- Does the person need to be diagnosed or not?
Age | Albumin | T_Bilirubin | y | Predicted |
---|---|---|---|---|
65 | 3.3 | 0.7 | 0 | 0 |
62 | 3.2 | 10.9 | 0 | 1 |
20 | 4 | 1.1 | 1 | 1 |
84 | 3.2 | 0.7 | 1 | 0 |
.... | .... | .... | .... | .... |
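Assuming accuracy (the fraction of test points whose prediction matches the true label) is the evaluation metric, it can be computed from the `y` and `Predicted` columns of the four test rows shown.

```python
y_true = [0, 0, 1, 1]   # column y from the test data
y_pred = [0, 1, 1, 0]   # column Predicted from the test data

# Count the rows where the prediction matches the true label
correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)
print(correct, accuracy)
```

On these four rows the model gets the first and third right, so the accuracy is 2 out of 4.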
Take-aways
What are the new things that we learned in this module ?
data:image/s3,"s3://crabby-images/9f383/9f383d9272ae598a104f63771d0eface8e0ecc46" alt=""
\( x_i \in \mathbb{R} \)
data:image/s3,"s3://crabby-images/dcc9b/dcc9bdf68f50f1666ba1be21faaba16ea3647b3b" alt=""
data:image/s3,"s3://crabby-images/f129b/f129bf51ce0caccc43130d1ee91ce6a4ad28f150" alt=""
data:image/s3,"s3://crabby-images/f81fb/f81fbe2e264ed21958c5512bfb4df5732b4d7a69" alt=""
data:image/s3,"s3://crabby-images/2ccda/2ccdafa2918d0ae8ea24278fe96d3a03e0539ebd" alt=""
data:image/s3,"s3://crabby-images/7b701/7b701489622d211457ca4fac061b5b8a5b1653db" alt=""
data:image/s3,"s3://crabby-images/2c4eb/2c4eb75a7b16a5777ef80ee5b4b915982f426973" alt=""
data:image/s3,"s3://crabby-images/f0d82/f0d822ff7ce63037c9f2a7e9665b547bd30b958b" alt=""
Loss
Model
Data
Task
Evaluation
Learning
Real inputs
Tasks with Real Inputs and Real Outputs
Back-propagation
Squared Error Loss :
Cross Entropy Loss:
Copy of Copy of Copy of Multilayered Network of Neurons
By preksha nema