Computer Vision

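About the Instructor

Jianguo High School (建中) 22724, 陳泓宇 (Howie Chen)
Vice president / academics lead / pizza chief of the Jianguo High School Computer Science Club (建資)
Discord: chenhowie#1138
Instagram: chenhowie0421
Twitter: @chenhowie0421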

Introduction to Computer Vision

How It Works

Convolutional neural networks (CNNs) let a computer recognize what is in an image

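In this lesson you will learn to...

Build an image classifier

Understand how visual feature extraction works

Use transfer learning to improve a model

Use data augmentation to expand a dataset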

Convolutional Classifier

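How a Convolutional Classifier Works

Extract the features

Use a convolutional neural network to extract the features from the image

Classify

Use the extracted features to decide which class the image belongs to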

What Is a Feature?

You probably remember this from the first lesson

Training a Convolutional Classifier

Training Goals

Learn which features to extract
Learn which class goes with which features

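Transfer Learning

Few people train a convolutional network from scratch today. Instead, we reuse a model that has already learned to extract features and attach different layers of our own to classify those features.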

1. Load the data

2. Load a pretrained model

3. Attach a classifier to the pretrained model

4. Train the model
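The idea behind these steps (a frozen feature extractor plus a small trainable classifier) can be sketched without any deep learning library. This is only a toy illustration under stated assumptions: `W_base` is a hypothetical stand-in for a pretrained base, the data is random, and the classifier is a single sigmoid unit trained by plain gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pretrained base: a fixed projection plus ReLU.
# A real base (for example VGG16) would be a deep stack of convolutional layers.
W_base = rng.normal(size=(8, 4))
W_base_before = W_base.copy()

def extract_features(x):
    """Frozen base: its weights are used but never updated."""
    return np.maximum(0.0, x @ W_base)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(p, t):
    p = np.clip(p, 1e-7, 1.0 - 1e-7)
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))

# Toy dataset: 200 samples, 8 inputs each, binary labels.
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(np.float64)

# Trainable head: a single sigmoid unit on top of the frozen features.
w_head = np.zeros(4)
b_head = 0.0

features = extract_features(X)  # the base is frozen, so compute these once
loss_before = binary_crossentropy(sigmoid(features @ w_head + b_head), y)

learning_rate = 0.05
for _ in range(300):
    error = sigmoid(features @ w_head + b_head) - y
    w_head -= learning_rate * (features.T @ error) / len(y)  # update the head only
    b_head -= learning_rate * error.mean()

loss_after = binary_crossentropy(sigmoid(features @ w_head + b_head), y)
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

Only `w_head` and `b_head` change during training; `W_base` stays exactly as it was, which is what setting `trainable = False` on a Keras base is meant to achieve.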

Transfer Learning in Practice

Load the Data

Dataset: Car or Truck

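# In newer TensorFlow/Keras releases this function is also available in tf.keras.utils.
from tensorflow.keras.preprocessing import image_dataset_from_directory

# Training set: class labels are inferred from the sub-directory names.
ds_train_ = image_dataset_from_directory(
    'car-or-truck/train',
    labels='inferred',
    label_mode='binary',      # two classes, so each label is 0 or 1
    image_size=[128, 128],    # resize every image to 128x128
    interpolation='nearest',
    batch_size=64,
    shuffle=True,             # shuffle the training data
)

# Validation set: same settings, but keep a fixed order.
ds_valid_ = image_dataset_from_directory(
    'car-or-truck/valid',
    labels='inferred',
    label_mode='binary',
    image_size=[128, 128],
    interpolation='nearest',
    batch_size=64,
    shuffle=False,
)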

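Preprocess the Data

import tensorflow as tf

# Convert the pixels to float32 so the model receives floating-point input.
def convert_to_float(image, label):
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    return image, label

# tf.data.AUTOTUNE (TF 2.4+) replaces the older tf.data.experimental.AUTOTUNE.
AUTOTUNE = tf.data.AUTOTUNE

# Convert, cache the converted images, and prefetch batches while the model trains.
ds_train = (
    ds_train_
    .map(convert_to_float)
    .cache()
    .prefetch(buffer_size=AUTOTUNE)
)
ds_valid = (
    ds_valid_
    .map(convert_to_float)
    .cache()
    .prefetch(buffer_size=AUTOTUNE)
)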
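What `convert_image_dtype` does to integer pixels can be sketched in NumPy. This assumes the images arrive as `uint8` values in [0, 255], in which case the conversion rescales them to [0, 1]; an input that is already `float32` is returned unchanged. The helper name `to_float01` is made up for this illustration.

```python
import numpy as np

def to_float01(image_uint8):
    """Scale uint8 pixels in [0, 255] to float32 values in [0.0, 1.0]."""
    return image_uint8.astype(np.float32) / 255.0

pixels = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = to_float01(pixels)
print(scaled.dtype, scaled)
```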

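Transfer Learning

# Load the pretrained VGG16 base (the feature extractor).
pretrained_base = tf.keras.models.load_model(
    'cv-course-models/vgg16-pretrained-base',
)
# Freeze the base so its weights are not updated during training.
pretrained_base.trainable = False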

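Train the Model

from tensorflow import keras
from tensorflow.keras import layers

# Frozen base for feature extraction, followed by a small classifier.
model = keras.Sequential([
    pretrained_base,
    layers.Flatten(),                       # 2D feature maps -> 1D vector
    layers.Dense(6, activation='relu'),
    layers.Dense(1, activation='sigmoid'),  # probability of the positive class
])

# Binary classification: binary cross-entropy loss and binary accuracy.
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['binary_accuracy'],
)

history = model.fit(
    ds_train,
    validation_data=ds_valid,
    epochs=30,
)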

Training Results

Feature Extraction

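The Three Basic Operations of Feature Extraction

1. Filter an image for a particular feature (convolution)

2. Detect that feature within the filtered image (ReLU)

3. Condense the image to enhance the features (max pooling)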

Convolutional Neural Network

Kernel

Feature Maps
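To make the link between a kernel and a feature map concrete, here is a minimal NumPy sketch that slides a 3x3 kernel over a tiny image. Like `tf.nn.conv2d`, it multiplies each window by the kernel and sums the result without flipping the kernel. The function name `convolve2d_valid` and the 5x5 test image are made up for this illustration; no padding is applied, so the output is smaller than the input.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum each weighted window."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * kernel)
    return out

# Edge-detection kernel: its weights sum to zero.
kernel = np.array([
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
], dtype=np.float32)

# A single bright pixel in the middle of a dark 5x5 image.
image = np.zeros((5, 5), dtype=np.float32)
image[2, 2] = 1.0

feature_map = convolve2d_valid(image, kernel)
print(feature_map)

# A perfectly flat image contains no edges, so the response is zero everywhere.
flat_response = convolve2d_valid(np.ones((5, 5), dtype=np.float32), kernel)
```

The feature map is large where the image matches the pattern the kernel looks for (here, a pixel that differs from its neighbors) and zero where the image is uniform.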

Adding an Activation Function

ReLU
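ReLU keeps positive values and replaces negative ones with zero, so only the places where the feature was detected remain. A minimal NumPy sketch, using a made-up feature map:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max(0, x), applied element by element."""
    return np.maximum(0.0, x)

feature_map = np.array([
    [-1.0, 8.0, -1.0],
    [2.0, -3.0, 0.0],
])
activated = relu(feature_map)
print(activated)
```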

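Example

# A single convolutional layer with a ReLU activation.
model = keras.Sequential([
    layers.Conv2D(
        filters=64,          # number of feature maps to output
        kernel_size=3,       # each kernel is 3x3
        activation='relu',   # applied to every feature map
    )
])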

filters: the number of feature maps the layer outputs

kernel_size: the size of each kernel

activation: the activation function
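These parameters also determine how many weights the layer has to learn. The sketch below counts them, assuming square kernels and one bias per filter (the Keras default); the choice of 3 input channels (an RGB image) is an assumption made for the example.

```python
def conv2d_param_count(in_channels, filters, kernel_size):
    """Trainable parameters of a Conv2D layer with square kernels and biases."""
    weights = kernel_size * kernel_size * in_channels * filters
    biases = filters
    return weights + biases

# filters=64, kernel_size=3, applied to an RGB image (3 input channels).
params = conv2d_param_count(in_channels=3, filters=64, kernel_size=3)
print(params)  # 3*3*3*64 weights + 64 biases = 1792
```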

Convolution in Practice

Load the Image

import tensorflow as tf
import matplotlib.pyplot as plt

# Plot styling
plt.rc('figure', autolayout=True)
plt.rc('axes', labelweight='bold', labelsize='large',
       titleweight='bold', titlesize=18, titlepad=10)
plt.rc('image', cmap='magma')

image_path = 'computer-vision-resources/car_feature.jpg'
image = tf.io.read_file(image_path)
# Decode as a single grayscale channel so the image matches the 1-channel kernel.
image = tf.io.decode_jpeg(image, channels=1)

# Convert to float32 and add a batch dimension: [batch, height, width, channels].
image = tf.image.convert_image_dtype(image, dtype=tf.float32)
image = tf.expand_dims(image, axis=0)

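Create the Kernel

# 3x3 edge-detection kernel
kernel = tf.constant([
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
])

# tf.nn.conv2d expects filters shaped [height, width, in_channels, out_channels].
kernel = tf.reshape(kernel, [*kernel.shape, 1, 1])
kernel = tf.cast(kernel, dtype=tf.float32)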

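Convolve

# Slide the kernel over the image one pixel at a time;
# 'SAME' padding keeps the output the same size as the input.
image_filter = tf.nn.conv2d(
    input=image,
    filters=kernel,
    strides=1,
    padding='SAME',
)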

Adding Image Output

import tensorflow as tf
import matplotlib.pyplot as plt

# Plot styling
plt.rc('figure', autolayout=True)
plt.rc('axes', labelweight='bold', labelsize='large',
       titleweight='bold', titlesize=18, titlepad=10)
plt.rc('image', cmap='magma')

image_path = 'computer-vision-resources/car_feature.jpg'
image = tf.io.read_file(image_path)
# Decode as a single grayscale channel so the image matches the 1-channel kernel.
image = tf.io.decode_jpeg(image, channels=1)

# Show the original image.
plt.figure(figsize=(6, 6))
plt.imshow(tf.squeeze(image), cmap='gray')
plt.axis('off')
plt.show()

# 3x3 edge-detection kernel
kernel = tf.constant([
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
])

# Reshape both tensors into the layouts tf.nn.conv2d expects:
# image [batch, height, width, channels], kernel [height, width, in, out].
image = tf.image.convert_image_dtype(image, dtype=tf.float32)
image = tf.expand_dims(image, axis=0)
kernel = tf.reshape(kernel, [*kernel.shape, 1, 1])
kernel = tf.cast(kernel, dtype=tf.float32)

image_filter = tf.nn.conv2d(
    input=image,
    filters=kernel,
    strides=1,
    padding='SAME',
)

# Show the filtered image (the feature map).
plt.figure(figsize=(6, 6))
plt.imshow(tf.squeeze(image_filter))
plt.axis('off')
plt.show()

Pooling Layer

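How It Works

Common Types of Pooling

Max pooling
Average pooling

What Pooling Offers

It shrinks the image while keeping the important features

Why Shrink the Image?

Fewer grid cells (pixels) per image means training needs fewer resources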
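Both pooling types can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the Keras implementation: the helper `pool2d` is made up, it uses non-overlapping windows (stride equal to the pool size), and it drops any leftover rows or columns.

```python
import numpy as np

def pool2d(feature_map, pool_size=2, mode="max"):
    """Pool non-overlapping pool_size x pool_size blocks of a 2D array."""
    h, w = feature_map.shape
    out_h, out_w = h // pool_size, w // pool_size
    trimmed = feature_map[:out_h * pool_size, :out_w * pool_size]
    # Split into blocks: axes 1 and 3 index the position inside each block.
    blocks = trimmed.reshape(out_h, pool_size, out_w, pool_size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    raise ValueError(f"unknown mode: {mode}")

feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
], dtype=np.float32)

max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="average")
print(max_pooled)
print(avg_pooled)
```

The 4x4 map becomes 2x2 in both cases. Max pooling keeps the strongest response in each block, while average pooling smooths the block into its mean.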

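How to Create a Pooling Layer

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(filters=64, kernel_size=3),
    # Keep the largest value in each 2x2 window, halving height and width.
    layers.MaxPool2D(pool_size=2),
])

Putting It Together with the Earlier Steps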

Sliding Window

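What Is a Sliding Window?

Review

Convolution and pooling, covered earlier, are both performed over a sliding window. (ReLU is applied element by element, so it needs no window.)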

Additional Sliding Window Parameters

Stride
Padding

Stride

How far the window moves at each step

strides=(2, 2)

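Padding

padding = "same"
padding = "valid"

same: the input is padded at its borders, so at stride 1 the output keeps the same grid size as the input

valid: no padding is added, so the output grid is smaller than the input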
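How stride and padding together set the output size can be written as a small formula. The sketch below follows the rules TensorFlow documents for `"same"` and `"valid"` padding along one dimension; the helper name `output_size` is made up for this illustration.

```python
import math

def output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one dimension for a sliding window."""
    if padding == "same":
        # Padding is added so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    if padding == "valid":
        # The window must fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    raise ValueError(f"unknown padding: {padding}")

print(output_size(128, 3, stride=1, padding="same"))   # 128
print(output_size(128, 3, stride=1, padding="valid"))  # 126
print(output_size(128, 3, stride=2, padding="same"))   # 64
print(output_size(128, 3, stride=2, padding="valid"))  # 63
```

With stride 1, `"same"` leaves the grid unchanged and `"valid"` trims `kernel_size - 1` cells; a stride of 2 roughly halves the output in either mode.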

Custom Convnet

How to Build One

model = keras.Sequential([
    # Block 1: 32 filters with 5x5 kernels; the first layer declares the input shape.
    layers.Conv2D(filters=32, kernel_size=5, activation="relu", padding='same',
                  input_shape=[128, 128, 3]),
    layers.MaxPool2D(),

    # Block 2: 64 filters
    layers.Conv2D(filters=64, kernel_size=3, activation="relu", padding='same'),
    layers.MaxPool2D(),

    # Block 3: 128 filters
    layers.Conv2D(filters=128, kernel_size=3, activation="relu", padding='same'),
    layers.MaxPool2D(),

    # Classifier: flatten the feature maps, then two dense layers.
    layers.Flatten(),
    layers.Dense(units=6, activation="relu"),
    layers.Dense(units=1, activation="sigmoid"),
])
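It can help to trace how the tensor shape changes through this network. The sketch below does the arithmetic by hand, assuming each convolution uses `padding='same'` with stride 1 (size unchanged) and each `MaxPool2D()` uses its default 2x2 window (size halved). The helper `trace_shapes` is made up for this illustration.

```python
def trace_shapes(input_size=128, conv_filters=(32, 64, 128), pool_size=2):
    """Return (layer, height, width, channels) per layer and the flattened length."""
    shapes = []
    size = input_size
    for filters in conv_filters:
        # 'same' padding with stride 1: height and width are unchanged.
        shapes.append(("conv", size, size, filters))
        # Default max pooling: height and width are halved.
        size = size // pool_size
        shapes.append(("pool", size, size, filters))
    flattened = size * size * conv_filters[-1]
    return shapes, flattened

shapes, flattened = trace_shapes()
for layer, height, width, channels in shapes:
    print(f"{layer}: {height} x {width} x {channels}")
print("flattened length:", flattened)  # 16 * 16 * 128 = 32768
```

The image shrinks from 128x128 to 16x16 while the number of feature maps grows from 32 to 128, a common pattern: less spatial detail, more features.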

Data Augmentation

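Improving the Model

When training a deep learning model, more training data generally helps

Adding Training Data

Apply a variety of transformations to the existing images

e.g. brightness/contrast, rotation angle, position, mirroring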
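Two of these transformations, mirroring and a contrast change, can be sketched in NumPy. This only illustrates the idea on a random stand-in image; the helper names are made up, and real preprocessing layers pick the flip and the contrast factor at random for every image.

```python
import numpy as np

def flip_horizontal(image):
    """Mirror an image of shape [height, width, channels] left to right."""
    return image[:, ::-1, :]

def adjust_contrast(image, factor):
    """Move every pixel away from (or toward) its channel mean by `factor`."""
    mean = image.mean(axis=(0, 1), keepdims=True)
    return (image - mean) * factor + mean

def augment(image, rng, max_contrast_change=0.5):
    """Randomly mirror the image and randomly change its contrast."""
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    factor = rng.uniform(1.0 - max_contrast_change, 1.0 + max_contrast_change)
    return adjust_contrast(image, factor)

rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))  # random stand-in for a small RGB image

augmented = augment(image, rng)
print(augmented.shape)
```

Each call to `augment` returns a slightly different version of the same picture, so the model sees more variety without any new photos being collected.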

When Can't You Use It?

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load the pretrained base and freeze its weights.
pretrained_base = tf.keras.models.load_model(
    '../input/cv-course-models/cv-course-models/vgg16-pretrained-base',
)
pretrained_base.trainable = False

model = keras.Sequential([
    # Augmentation layers (TF 2.6+; older releases keep them under
    # tensorflow.keras.layers.experimental.preprocessing).
    layers.RandomFlip('horizontal'),   # random left-right mirroring
    layers.RandomContrast(0.5),        # random contrast change of up to 50%
    pretrained_base,
    layers.Flatten(),
    layers.Dense(6, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['binary_accuracy'],
)

# ds_train and ds_valid are the datasets prepared earlier.
history = model.fit(
    ds_train,
    validation_data=ds_valid,
    epochs=30,
    verbose=0,
)



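import pandas as pd

# One row per epoch, one column per metric.
history_frame = pd.DataFrame(history.history)

# Plot training vs. validation loss, then training vs. validation accuracy.
history_frame.loc[:, ['loss', 'val_loss']].plot()
history_frame.loc[:, ['binary_accuracy', 'val_binary_accuracy']].plot()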

Results

By Howie Chen
