Graduation Project Presentation
Abhishek Kumar 2013ecs07
Department of Computer Science and Engineering
Shri Mata Vaishno Devi University
I have covered each section of the article in detail in the submitted file.
December–January: Study and understand the details of CNNs and deep learning
February: Start applying the All-CNN and Fractional Max Pooling models
March: Train the models with different sets of hyperparameters
April: Apply the stored weights to the transfer learning application
April: Prepare the report and finalize the results
Oh Yes, Deep Learning is here
Yes, A big Yes
NN as a Brain analogy
Image as an Array
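To make this concrete, a tiny NumPy sketch: an image is nothing more than a height × width × channels array of pixel values (the 32×32 RGB size here is just an assumed example).

```python
import numpy as np

# An RGB image is just a height x width x 3 array of pixel intensities.
image = np.zeros((32, 32, 3), dtype=np.uint8)  # assumed 32x32 RGB image
image[:, :, 0] = 255                           # fill the red channel

print(image.shape)   # (32, 32, 3)
print(image[0, 0])   # top-left pixel -> [255   0   0]
```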
A brief idea of how a CNN works
Here are the components of a CNN model
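As a rough illustration (not the exact architecture used in the project), a minimal PyTorch sketch wiring those components together: convolution, a non-linearity, max pooling, and a fully connected classifier. The filter counts and the 10 output classes are assumptions.

```python
import torch
import torch.nn as nn

# A minimal CNN: convolution -> ReLU -> max pooling -> fully connected classifier.
# The layer sizes and the 10-class output are illustrative assumptions.
class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),   # convolutional layer
            nn.ReLU(),                                     # non-linearity
            nn.MaxPool2d(kernel_size=2),                   # 2x2 max pooling (MP2)
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)  # fully connected layer

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

out = TinyCNN()(torch.randn(1, 3, 32, 32))   # one 32x32 RGB image
print(out.shape)                              # torch.Size([1, 10])
```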
The advantages of MP2 (2×2 max pooling) are that it is fast, it quickly reduces the size of the hidden layers, and it encodes a degree of invariance with respect to translations and elastic distortions.
The issues with MP2 are the disjoint nature of its pooling regions and the fact that, because the spatial size decreases so rapidly, stacks of back-to-back convolutional layers are needed to build deep networks.
The advantage of Fractional Max Pooling is that it reduces the spatial size of the image by a factor of α, where α ∈ (1, 2), and it also introduces randomness in the choice of pooling regions.
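A minimal sketch of the operation, here using PyTorch's nn.FractionalMaxPool2d as a stand-in (the project's actual implementation may differ): with an assumed α = √2, the output_ratio is 1/α, so each spatial dimension shrinks by roughly √2 and the pooling regions are chosen pseudo-randomly.

```python
import torch
import torch.nn as nn

# Fractional max pooling: shrink the spatial size by a factor alpha in (1, 2)
# instead of the fixed factor of 2 that MP2 gives.
alpha = 2 ** 0.5                      # assumed downscaling factor, alpha = sqrt(2)
fmp = nn.FractionalMaxPool2d(kernel_size=2, output_ratio=1 / alpha)

x = torch.randn(1, 16, 32, 32)        # a 32x32 feature map with 16 channels
y = fmp(x)
print(y.shape)                        # torch.Size([1, 16, 22, 22]) -- 32 / sqrt(2) ~ 22
```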
Pooling layers reduce the spatial dimensionality of the intermediate layers.
A visual demo of max pooling, where we reduce the dimensions while keeping the important features intact.
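The same demo in numbers, as a small NumPy sketch: 2×2 max pooling keeps the strongest activation in each region while halving both spatial dimensions.

```python
import numpy as np

# 2x2 max pooling on a 4x4 feature map: each non-overlapping 2x2 block
# is replaced by its maximum, halving both dimensions but keeping the peaks.
x = np.array([[1, 3, 2, 1],
              [4, 6, 5, 2],
              [7, 2, 9, 4],
              [1, 8, 3, 6]])

pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[6 5]
#  [8 9]]
```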
Conclusion
Let's check this out.
Convolution instead of Pooling?
Global average pooling instead of FC?
Almost state of the art
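Both ideas in one minimal PyTorch sketch (the filter counts are illustrative assumptions, not the exact All-CNN configuration): downsampling is done by stride-2 convolutions instead of pooling layers, and a 1×1 convolution followed by global average pooling replaces the fully connected classifier.

```python
import torch
import torch.nn as nn

# All-CNN style stack: "pooling" is a stride-2 convolution, and the classifier
# is a 1x1 convolution plus global average pooling (no fully connected layer).
num_classes = 10   # assumed, e.g. a CIFAR-10-style task
model = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(96, 96, kernel_size=3, stride=2, padding=1), nn.ReLU(),   # downsample by strided conv
    nn.Conv2d(96, 192, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(192, 192, kernel_size=3, stride=2, padding=1), nn.ReLU(), # strided conv again
    nn.Conv2d(192, num_classes, kernel_size=1),                         # 1x1 conv gives per-class maps
    nn.AdaptiveAvgPool2d(1),                                            # global average pooling
    nn.Flatten(),
)

print(model(torch.randn(1, 3, 32, 32)).shape)   # torch.Size([1, 10])
```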
Layers 1 and 2 (the initial ones)
Layer 3 (the intermediate ones)
Layers 4 and 5 (the last ones)
Inception: The lifeline of Google (model A, the general features)
A model with specific features (model B, the intermediate)
Model C, very specific to our dataset
B3B – the first three layers are copied from baseB and frozen; the remaining five higher layers are initialized randomly.
A3B – the first three layers are copied from baseA and frozen; the remaining five higher layers are initialized randomly.
B3B+ – like B3B, but the first three layers are subsequently fine-tuned during training.
A3B+ – like A3B, but the first three layers are subsequently fine-tuned during training (a sketch of this scheme follows below).
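A minimal PyTorch sketch of the freezing scheme above, e.g. A3B: copy the first three convolutional layers from a trained base network, freeze them, and leave the higher layers randomly initialized. The eight-layer stack and channel sizes are illustrative assumptions; for the '+' variants the copied layers would simply stay trainable.

```python
import torch
import torch.nn as nn

def make_net():
    # An assumed 8-conv-layer stack standing in for baseA / baseB.
    layers = []
    channels = [3, 32, 32, 64, 64, 128, 128, 256, 256]
    for c_in, c_out in zip(channels[:-1], channels[1:]):
        layers += [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1), nn.ReLU()]
    return nn.Sequential(*layers)

base_a = make_net()          # pretend this has already been trained on task A
a3b = make_net()             # fresh, randomly initialized network for task B

# A3B: copy the first 3 conv layers (indices 0, 2, 4 in the Sequential,
# since ReLUs sit between them) from baseA and freeze them.
for idx in (0, 2, 4):
    a3b[idx].load_state_dict(base_a[idx].state_dict())
    for p in a3b[idx].parameters():
        p.requires_grad = False    # frozen; for A3B+ this line would be skipped

# Only the higher, randomly initialized layers will be updated by the optimizer.
trainable = [p for p in a3b.parameters() if p.requires_grad]
print(len(trainable), "trainable tensors")
```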
Architecture
Still working to get some substantial results...
Yes, this is a revolution, maybe the next big thing.