Introduction to Neural Networks

What is a neural network?

Examples of neural networks

  1. Spam classification in Gmail
  2. Image classification in Tesla cars
  3. Face recognition

XOR
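
XOR is a classic case that a single linear unit cannot represent but a network with one hidden layer can. A minimal sketch in plain Python follows; the weights and thresholds are hand-picked for illustration (an assumption, not trained values), and the function names are hypothetical.

```python
def step(z):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def xor_network(a, b):
    # Hidden unit 1 fires when at least one input is 1 (logical OR).
    h_or = step(a + b - 0.5)
    # Hidden unit 2 fires only when both inputs are 1 (logical AND).
    h_and = step(a + b - 1.5)
    # Output unit: OR but not AND, which is XOR.
    return step(h_or - h_and - 0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_network(a, b))
```

In practice the weights would be learned by training rather than set by hand.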

Visualization

Machine learning Flow

  1. Analyze data
  2. Transform the data
  3. Build and train a model
  4. Improve
  5. Predict

1. Analyze data

Baseline test

  1. Random
  2. A heuristic algorithm based on context knowledge
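
A baseline gives a score that the trained model has to beat. The sketch below compares the two baselines listed above on made-up labels. The data is hypothetical, and "always predict the most common label" is only a stand-in for a real heuristic based on context knowledge.

```python
import random
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


# Hypothetical labels, invented for this example.
labels = [1, 2, 3, 3, 5, 3, 2, 3, 8, 3]
classes = sorted(set(labels))

# Baseline 1: pick a class uniformly at random for every sample.
rng = random.Random(0)
random_preds = [rng.choice(classes) for _ in labels]

# Baseline 2: stand-in heuristic that always predicts the most common label.
most_common = Counter(labels).most_common(1)[0][0]
heuristic_preds = [most_common] * len(labels)

print("random baseline:   ", accuracy(random_preds, labels))
print("heuristic baseline:", accuracy(heuristic_preds, labels))
```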

2. Prepare the data

  1. Remove/replace nulls
  2. Remove outliers
  3. Normalize
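
The three preparation steps can be sketched on a single feature column. The values are hypothetical, and the specific choices (median fill, the 1.5 x IQR rule, min-max scaling) are assumptions picked for illustration; other strategies are equally valid.

```python
from statistics import median, quantiles

# Hypothetical raw feature column; None marks a missing value.
raw = [4.0, 5.0, None, 6.0, 5.5, 250.0, 4.5]

# 1. Replace nulls with the median of the known values.
known = [v for v in raw if v is not None]
fill = median(known)
filled = [fill if v is None else v for v in raw]

# 2. Remove outliers: keep values within 1.5 * IQR of the quartiles.
q1, _, q3 = quantiles(filled, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = [v for v in filled if low <= v <= high]

# 3. Normalize: min-max scale the remaining values to the range [0, 1].
lo, hi = min(cleaned), max(cleaned)
scaled = [(v - lo) / (hi - lo) for v in cleaned]

print(cleaned)
print(scaled)
```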

2. Split data into three subsets
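
A minimal sketch of splitting into training, validation, and test subsets. The 70/15/15 ratio, the fixed seed, and the helper name `split_three` are assumptions for illustration.

```python
import random


def split_three(samples, train=0.7, val=0.15, seed=42):
    """Shuffle and split samples into train, validation, and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # the remainder becomes the test set
    )


train_set, val_set, test_set = split_three(range(100))
print(len(train_set), len(val_set), len(test_set))
```

The model is fitted on the training subset, tuned against the validation subset, and scored once on the test subset at the end.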

3. Build a model

https://www.tensorflow.org/api_docs

Model structure

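import tensorflow as tf

# Two Dense layers: 8 hidden units, then a single output unit
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(8))
model.add(tf.keras.layers.Dense(1))

# Stochastic gradient descent with mean squared error as the loss
model.compile(optimizer='sgd', loss='mse')

# x and y are the training inputs and targets
model.fit(x, y, batch_size=32, epochs=10)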

model.compile

Model type

tf.keras.Sequential() vs Functional API

Model Layers

  1. tf.keras.layers.Dense

  2. tf.keras.layers.Bidirectional

  3. tf.keras.layers.Dropout

  4. tf.keras.layers.LayerNormalization

  5. tf.keras.layers.GaussianNoise

Loss functions

  • tf.keras.losses.MeanSquaredError
  • tf.keras.losses.SparseCategoricalCrossentropy
  • tf.keras.losses.MeanAbsolutePercentageError

 

https://www.tensorflow.org/api_docs/python/tf/keras/losses
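
To make the three losses concrete, here is a plain-Python restatement of the formulas for small inputs. It is a sketch of the math, not the Keras implementation: it assumes non-zero targets for the percentage error and already-normalized probabilities for the cross-entropy.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_percentage_error(y_true, y_pred):
    """Average absolute error relative to the target, in percent."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def sparse_categorical_crossentropy(label, probabilities):
    """Negative log of the probability assigned to the true class index."""
    return -math.log(probabilities[label])


print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))
print(mean_absolute_percentage_error([4.0, 5.0], [3.0, 6.0]))
print(sparse_categorical_crossentropy(1, [0.1, 0.7, 0.2]))
```

"Sparse" means the label is a single integer class index rather than a one-hot vector.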

Optimizers

  • tf.keras.optimizers.Adam
  • tf.keras.optimizers.SGD
  • tf.keras.optimizers.Adadelta

 

https://www.tensorflow.org/api_docs/python/tf/keras/optimizers

Model Metrics

  • tf.keras.metrics.Precision
  • tf.keras.metrics.MeanAbsoluteError
  • tf.keras.metrics.SparseCategoricalAccuracy


https://www.tensorflow.org/api_docs/python/tf/keras/metrics
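
A metric reports how well the model does but is not used to update the weights. The sketch below restates two of the listed metrics in plain Python on hypothetical inputs; it illustrates the definitions and is not the Keras code.

```python
def precision(y_true, y_pred):
    """True positives divided by all predicted positives (binary labels)."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    predicted_pos = sum(1 for p in y_pred if p == 1)
    return true_pos / predicted_pos if predicted_pos else 0.0


def sparse_categorical_accuracy(labels, scores):
    """Share of samples whose highest-scoring class equals the integer label."""
    hits = sum(
        1 for label, row in zip(labels, scores) if row.index(max(row)) == label
    )
    return hits / len(labels)


# Hypothetical predictions, invented for this example.
print(precision([1, 0, 1, 1], [1, 1, 0, 1]))
print(sparse_categorical_accuracy([0, 2], [[0.9, 0.05, 0.05], [0.2, 0.5, 0.3]]))
```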

4. model.fit

Train data and validation data

Epoch, accuracy, loss

Epoch 1/100
8/8 [==============================] - 19s 2s/step - loss: 7.9671 - sparse_categorical_accuracy: 0.0478 - val_loss: 5.8485 - val_sparse_categorical_accuracy: 0.1860
Epoch 2/100
8/8 [==============================] - 13s 2s/step - loss: 5.8785 - sparse_categorical_accuracy: 0.1783 - val_loss: 5.1466 - val_sparse_categorical_accuracy: 0.2558
Epoch 3/100
8/8 [==============================] - 9s 1s/step - loss: 4.9277 - sparse_categorical_accuracy: 0.2348 - val_loss: 4.6174 - val_sparse_categorical_accuracy: 0.2907
Epoch 4/100
8/8 [==============================] - 12s 2s/step - loss: 4.0811 - sparse_categorical_accuracy: 0.3609 - val_loss: 3.6062 - val_sparse_categorical_accuracy: 0.3256
Epoch 5/100
8/8 [==============================] - 12s 2s/step - loss: 3.7264 - sparse_categorical_accuracy: 0.3522 - val_loss: 3.1789 - val_sparse_categorical_accuracy: 0.3721
Epoch 6/100
8/8 [==============================] - 10s 1s/step - loss: 2.5395 - sparse_categorical_accuracy: 0.4000 - val_loss: 3.0180 - val_sparse_categorical_accuracy: 0.3953

Callbacks

  • tf.keras.callbacks.EarlyStopping
  • tf.keras.callbacks.ModelCheckpoint
  • tf.keras.callbacks.ReduceLROnPlateau

 

https://www.tensorflow.org/api_docs/python/tf/keras/callbacks
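
The idea behind early stopping can be shown without a framework: stop once the validation loss has not improved for a set number of epochs. This is a simplified sketch of that rule, not the Keras callback itself; the function name, the `patience` value, and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # Improvement: remember it and reset the counter.
            best = loss
            waited = 0
        else:
            # No improvement: count the epoch and stop once patience runs out.
            waited += 1
            if waited >= patience:
                return epoch
    return None


# Hypothetical validation losses, one per epoch.
history = [5.8, 5.1, 4.6, 4.7, 4.9, 4.8, 4.5]
print(early_stopping_epoch(history, patience=3))
print(early_stopping_epoch(history, patience=5))
```

A larger patience tolerates temporary plateaus at the cost of more training time.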

Underfitting vs overfitting

  • Change model architecture
  • Generate new data / Data Augmentation
  • Regularization
  • Introduce noise/dropouts
  • Remove features
  • Early stopping
  • Cross-validation

The L1 penalty adds the sum of the absolute values of the weights to the loss, which can push some weights to exactly zero.

The L2 penalty adds the sum of the squared weights to the loss, which shrinks large weights toward zero.
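
Both penalties are simple sums over the weights, scaled by a strength factor and added to the training loss. A minimal sketch, with hypothetical weights and a hypothetical strength of 0.01:

```python
def l1_penalty(weights, strength):
    """Strength times the sum of absolute weight values."""
    return strength * sum(abs(w) for w in weights)


def l2_penalty(weights, strength):
    """Strength times the sum of squared weight values."""
    return strength * sum(w * w for w in weights)


weights = [0.5, -1.5, 2.0]  # hypothetical layer weights
print(l1_penalty(weights, 0.01))  # 0.01 * (0.5 + 1.5 + 2.0)
print(l2_penalty(weights, 0.01))  # 0.01 * (0.25 + 2.25 + 4.0)
```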

Regularization

Dropout
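
During training, dropout zeroes a random subset of activations and scales the survivors so the expected sum stays the same. The sketch below shows that mechanism in plain Python; the function name and inputs are hypothetical, and `rate` is assumed to be below 1.

```python
import random


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; scale the rest by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)
print(dropout([1.0] * 8, rate=0.5, rng=rng))
```

At prediction time dropout is switched off and the activations pass through unchanged.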

Cross-validation
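
In k-fold cross-validation every sample is used for validation exactly once. A minimal sketch of generating the index splits; the helper name `k_fold_indices` is hypothetical, and shuffling is left out for brevity.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


for train_idx, val_idx in k_fold_indices(10, 3):
    print("train:", train_idx, "validation:", val_idx)
```

The model is trained once per fold and the scores are averaged.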

model.evaluate

test data

8/8 [==============================] - 13s 2s/step - loss: 1.3734 - sparse_categorical_accuracy: 0.6565 - val_loss: 1.9678 - val_sparse_categorical_accuracy: 0.6163
Epoch 00022: early stopping
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
text_vectorization (TextVect (None, None)              0
_________________________________________________________________
embedding (Embedding)        (None, None, 512)         262144
_________________________________________________________________
bidirectional (Bidirectional (None, 256)               656384
_________________________________________________________________
dense (Dense)                (None, 64)                16448
_________________________________________________________________
dropout (Dropout)            (None, 64)                0
_________________________________________________________________
dense_1 (Dense)              (None, 14)                910
=================================================================
Total params: 935,886
Trainable params: 935,886
Non-trainable params: 0
_________________________________________________________________
None
1/1 [==============================] - 0s 1ms/step - loss: 2.0168 - sparse_categorical_accuracy: 0.6429
Test Loss: 2.0167980194091797
Test Accuracy: 0.6428571343421936
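
The parameter counts in the summary can be reproduced by hand. The sketch below does so under two assumptions that the summary itself does not state: the embedding has a vocabulary of 512 tokens, and the Bidirectional layer wraps an LSTM with 128 units per direction. With those values the numbers match the summary.

```python
def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units


def lstm_params(inputs, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (inputs + units) + units)


embedding = 512 * 512                      # assumed vocabulary size x embedding size
bidirectional = 2 * lstm_params(512, 128)  # assumed LSTM, forward and backward copies
dense = dense_params(256, 64)
dense_1 = dense_params(64, 14)

total = embedding + bidirectional + dense + dense_1
print(embedding, bidirectional, dense, dense_1, total)
```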

5. model.predict

[[-0.02160052 -0.00404637  0.02314329  0.02344155  0.01547053  0.02792127
  -0.04589589 -0.05501809  0.0039196  -0.00675361 -0.00584819 -0.0633457
  -0.01288266 -0.04705909]]
Predicted points : 5 ; Orginal points:  [13]
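
The model returns one score per class, so the prediction is the position of the largest score. The sketch below applies that to the output vector above. It assumes the class index is used directly as the predicted value, which the log does not confirm beyond reporting 5.

```python
# Scores copied from the model.predict output above, one per class.
outputs = [
    -0.02160052, -0.00404637, 0.02314329, 0.02344155, 0.01547053, 0.02792127,
    -0.04589589, -0.05501809, 0.0039196, -0.00675361, -0.00584819, -0.0633457,
    -0.01288266, -0.04705909,
]

# Index of the highest score (argmax).
predicted_class = max(range(len(outputs)), key=outputs.__getitem__)
print(predicted_class)
```

All 14 scores are close to zero and close to each other, so the model is not confident in this prediction.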

Questions?

User Story points estimation

Understanding the algorithm

 

Heuristic algorithm vs machine learning

[Extra topic]

Hyperparameter Tuning with the HParams Dashboard

[Extra topic 2]

Pretrained models

Curiosity

Curiosity #2

Data Scientist Salaries

Senior Software Engineer Salaries