How Do Machines See?
ARCHIVE
NOTES
REFERENCES/RESOURCES
IMAGES
Teasers
https://innoeduvation.org/313/vision/tm/
https://innoeduvation.org/313/vision/tmAnimals/index.html
Teachable Machine References
Outline
What is Vision?
The Sciences of Making Machines That See
The Phenomenology of Networks
Machine Vision by Stepwise Refinement
Neurons, Real and Artificial
Activation Functions
Automation
Activation Functions Redux
Logic Gates
Going Deep
Teachable Machine
Networks of Neurons
What's Going on Here?
What is vision?
STOP+THINK:
Define the verb "to see"
to receive light stimuli that are processed by the brain into a representation of the position, shape, brightness, color of objects in space
1) light
2) detection
3) multiple stimuli are processed
4) assembled
5) properties
6) external objects
Studying Vision in Fruit Flies
Questions?
"How Do Machines See?" is a big field of study
Visualizing the Fields of Seeing Machines
Machine Vision by Stepwise Refinement
- Existence Detection
  - e.g., is there a wall ahead?
- Classification
  - e.g., is this a dog or a cat?
- Recognition
  - e.g., is this the sheep named Alois?
- Sequence Detection
  - e.g., was that the ASL sign for "peace"?
Do you have any questions?
But what's in the black box?
Neurons, Real and Artificial
Neuron Schematic
Neuron Preserved
Neuron as Shocked and Shocking!
dendrites
axon
nucleus
Neuron Anatomy
A Network of Trusted Friends
[Figure: YOU and friends F01, F02, F03, F04, F05, F06, each next to a "bit of the world"]
+ + = OK, then:
Every neuron is a function mapping an arbitrary set of binary inputs to a single binary output
7 of 10 friends recommend going to the party instead of studying!
All Contacts are Equal!
Some Contacts are More Equal than Others
Go!
Go!
Go!
Hmm...
Hmm...
We "weight" our inputs based on how important they are ... or how dependable they are.
Compute the Weighted Sum of the Inputs
Sum_weighted = input_1 x weight_1 + input_2 x weight_2 + ... + input_n x weight_n
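The weighted sum above can be sketched in a few lines of Python. This is only an illustration: the function name `weighted_sum`, the list of ten friends, and the particular weights are made up for the example and do not come from the slides.

```python
def weighted_sum(inputs, weights):
    """Return input_1 * weight_1 + input_2 * weight_2 + ... + input_n * weight_n."""
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input")
    return sum(x * w for x, w in zip(inputs, weights))

# Ten friends: 1 means "go to the party", 0 means "stay in and study".
advice = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

# All contacts are equal: every weight is 1, so the sum is a simple head count.
equal_weights = [1] * 10
equal_total = weighted_sum(advice, equal_weights)

# Some contacts are more equal than others (hypothetical weights):
# the seven party-goers are trusted much less than the three who said "study".
unequal_weights = [0.2] * 7 + [1.0] * 3
unequal_total = weighted_sum(advice, unequal_weights)

print(equal_total, unequal_total)
```

With equal weights the seven "go" votes count in full; with the hypothetical unequal weights the same seven votes add up to far less, which is the point of weighting.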
Any questions?
So Far...
An artificial neuron
- takes 1 or more binary inputs
- weights each one
- sums them up
But how does it decide whether or not to "fire"?
Activation Functions
Everyone's Got a Threshold!
if (weightedSum < threshold) do nothing else FIRE!
[Plot: STEP FUNCTION. Horizontal axis: weighted sum of inputs (0 to 9). Vertical axis: output (0 or 1). The output switches from 0 to 1 at the threshold.]
Every neuron has an ACTIVATION FUNCTION
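As a rough sketch, the step function from the threshold slide might look like this in Python. The function name `step` and the threshold value of 5 are assumptions chosen for illustration; the slides do not say where the threshold sits.

```python
def step(total, threshold):
    """Step activation: output 0 below the threshold, 1 at or above it."""
    if total < threshold:
        return 0  # do nothing
    return 1      # FIRE!

threshold = 5  # hypothetical value, picked only for this example

# Try every weighted sum shown on the plot's horizontal axis (0 through 9).
outputs = [step(total, threshold) for total in range(10)]
print(outputs)
```

Moving the threshold up or down shifts the point where the outputs flip from 0 to 1, but the output is never anything in between.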
Activation Functions
STOP+THINK
Here we are on Zoom. Inputs are others turning on their video. Weights are 1 and your threshold is 4. Do you turn your video on?
Activation Functions in Everyday Life
Any questions so far?
STOP+THINK
Why do we sometimes take advice from someone "with a grain of salt"?
The Secret: You can change the weights
Think about it: if you get bad advice from one of your friends, how does that affect how you weight their advice next time?
What if you have a friend who ALWAYS seems to give good advice?
Feedback Again!
Putting it all together
- A three by three grid of "pixels"
- Think of the grid in one dimension
STOP+THINK
What does this pattern look like in one dimension?
- Have three neurons that take the pixels as "inputs"
- Redraw to take advantage of the slide orientation
- Add a neuron that takes the outputs of the first three neurons as its inputs
How many weights?
STOP+THINK
[Figure: pixels P1 to P9 feed neurons Na, Nb and Nc, which feed neuron Nd]
w1a w2a w3a w4a w5a w6a w7a w8a w9a
w1b w2b w3b w4b w5b w6b w7b w8b w9b
w1c w2c w3c w4c w5c w6c w7c w8c w9c
wad wbd wcd
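One way to read the diagram as code is sketched below: nine pixels feed Na, Nb and Nc, and their three outputs feed Nd. Everything concrete here is an assumption made for the example: the pixel pattern, the random starting weights, the shared threshold of 0.5, and the helper name `neuron`.

```python
import random

def neuron(inputs, weights, threshold):
    """Weighted sum followed by a step activation (0 or 1)."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

random.seed(0)  # fixed seed so the sketch gives the same weights on every run

# Hypothetical 3x3 pattern (a vertical bar), flattened row by row into P1..P9.
pixels = [0, 1, 0,
          0, 1, 0,
          0, 1, 0]

# One list of nine weights per first-layer neuron: w1a..w9a, w1b..w9b, w1c..w9c.
hidden_weights = {
    name: [random.uniform(-1, 1) for _ in pixels]
    for name in ("Na", "Nb", "Nc")
}
# Three weights into the final neuron: wad, wbd, wcd.
output_weights = [random.uniform(-1, 1) for _ in hidden_weights]

threshold = 0.5  # hypothetical, shared by all four neurons

hidden_outputs = [neuron(pixels, w, threshold) for w in hidden_weights.values()]
nd_output = neuron(hidden_outputs, output_weights, threshold)

# Count every weight in the model.
weight_count = sum(len(w) for w in hidden_weights.values()) + len(output_weights)
print(hidden_outputs, nd_output, weight_count)
```

With random weights the output of Nd means nothing yet; the weights have to be tuned before the model detects any pattern.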
"THE MODEL"
Any questions?
Can We Automate This?
Could the network teach itself?
- give it random weights
- show it an example
- ask it to guess
- reward or punish it depending on result
WDTM?
reward or punish it depending on result
inputs
outputs
truth
feedback
Basic Idea: FEEDBACK
Feedback
Adjust each weight by some amount, a "delta" or difference, depending on whether the sending node gave good advice (that is, got it right).
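A minimal sketch of that feedback step, assuming a fixed-size delta and binary advice: senders whose advice matched the truth gain weight, and the others lose it. The function name `adjust_weights`, the starting weights, and the delta of 0.1 are all hypothetical.

```python
def adjust_weights(weights, advice, truth, delta):
    """Raise the weight of each sender that got it right; lower the rest."""
    return [
        w + delta if a == truth else w - delta
        for w, a in zip(weights, advice)
    ]

weights = [1.0, 1.0, 1.0]  # three senders, all trusted equally at first
advice = [1, 0, 1]         # what each sender said
truth = 1                  # what actually turned out to be right
delta = 0.1                # hypothetical step size

new_weights = adjust_weights(weights, advice, truth, delta)
print(new_weights)
```

Repeating this over many examples gradually shifts trust toward the senders that are right most often.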
So Far
An artificial neuron converts a weighted sum of multiple inputs into a single output
A perceptron is a network of artificial neurons that can detect visual patterns
A perceptron is tuned by adjusting weights so it yields expected output for each input pattern.
"Learning" happens by increasing weights for neurons that "get it right" and decreasing weights for neurons that "get it wrong."
A system can be set to react more or less strongly to each learning experience.
"Learning" happens via feedback. The error or loss is the difference between the output and the "ground truth."
Do you have questions?
Activation Functions Redux
SIGMOID ACTIVATION FUNCTION
RECTIFIED LINEAR UNIT (ReLU) ACTIVATION FUNCTION
ReLU is a very common activation function
- Sum the inputs
- If negative return 0
- Otherwise return sum
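The two activation functions named above can be sketched as follows. The function names and sample values are illustrative assumptions; the sigmoid is written in its usual textbook form, 1 / (1 + e^(-x)), which the slides show only as a plot.

```python
import math

def relu_neuron(weighted_inputs):
    """ReLU as described in the list: sum the inputs, clip negatives to 0."""
    total = sum(weighted_inputs)
    if total < 0:
        return 0
    return total

def sigmoid(total):
    """Squash any number into the range 0..1 with a smooth S-shaped curve."""
    # Note: math.exp overflows for extremely negative totals; fine for a sketch.
    return 1 / (1 + math.exp(-total))

print(relu_neuron([1, -3]))     # negative sum is clipped
print(relu_neuron([2, 1.5]))    # positive sum passes through unchanged
print(sigmoid(-5), sigmoid(0), sigmoid(5))
```

Unlike the step function, both of these can report "how much" rather than only "yes" or "no".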
Questions?
Going Deep
inputs
hidden layer
output
Outputs as Percentages
Explain "confidence"
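The slides do not say how raw outputs become percentages. One common technique is softmax, assumed here purely for illustration; the class names and scores are also invented.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into positive fractions that add up to 1."""
    biggest = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs from the last layer, one per class.
scores = {"dog": 2.0, "cat": 1.0, "sheep": 0.1}

confidences = dict(zip(scores, softmax(list(scores.values()))))
for label, confidence in confidences.items():
    print(f"{label}: {confidence:.0%}")
```

The largest share goes to the class with the highest raw score. That share is what gets reported as the model's "confidence", which describes the model's output, not a guarantee that the answer is right.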
So far...
classes: in a classification model, the "bins" into which we put items we have classified
labels: the "correct" answers that are used to train a model
training data: samples that we "show" the model so it can adjust its weights based on whether it gets it right or wrong
error/loss: how much we are getting wrong
weights: how much attention does a neuron pay to each of its inputs?
predictions: the outputs when a model is run on new data
Google's Teachable Machine
Let's Try It
Teachable Machine Workflow
Teachable Machine FILES
END
<iframe height="300" style="width: 100%;" scrolling="no" title="" src="https://codepen.io/team/DanR/embed/wvdrORW?default-tab=js%2Cresult&editable=true&theme-id=light" frameborder="no" loading="lazy" allowtransparency="true" allowfullscreen="true" allow="geolocation *; microphone *; camera *; midi *; encrypted-media *">
See the Pen <a href="https://codepen.io/team/DanR/pen/wvdrORW">
</a> by innoeduvation(OLD) (<a href="https://codepen.io/team/DanR">@DanR</a>)
on <a href="https://codepen.io">CodePen</a>.
</iframe>
Extra
Here's a JavaScript implementation of the code he mentions in the video:
I also have to doctor the CodePen embed a little. Here's what CodePen provides:
<iframe style="width: 100%;" title="Teachable Machine Example 01" src="https://codepen.io/team/DanR/embed/oNxLLyE?height=415&theme-id=light&default-tab=js,result&editable=true" height="415" allowfullscreen="allowfullscreen">
See the Pen <a href="https://codepen.io/team/DanR/pen/oNxLLyE">Teachable Machine Example 01</a> by innoeduvation(OLD)
(<a href="https://codepen.io/DanR">@DanR</a>) on <a href="https://codepen.io">CodePen</a>.</iframe>
And here's what I have to add to the <iframe> element after "allowfullscreen":
allow="geolocation *; microphone *; camera *; midi *; encrypted-media *"
Could a perceptron replace logic gates?
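For the basic gates, a single threshold neuron is enough. The sketch below hand-picks weights and thresholds (all of them assumptions chosen for the example) for AND, OR and NOT, then wires those neurons together to get XOR, which no single threshold neuron can compute on its own.

```python
def neuron(inputs, weights, threshold):
    """Weighted sum followed by a step activation (0 or 1)."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def and_gate(a, b):
    return neuron([a, b], [1, 1], threshold=2)  # fires only if both inputs are on

def or_gate(a, b):
    return neuron([a, b], [1, 1], threshold=1)  # fires if at least one input is on

def not_gate(a):
    return neuron([a], [-1], threshold=0)       # fires only if the input is off

def xor_gate(a, b):
    # Needs more than one neuron: "a or b, but not both".
    return and_gate(or_gate(a, b), not_gate(and_gate(a, b)))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b), xor_gate(a, b))
```

XOR is the classic reason for "going deep": it needs neurons feeding other neurons.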
Subideas
- The "delta" should depend on the error
- The "delta" should depend on how responsive we want to be to new information
w_new = w_old + learnRate * error
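Here is a hedged sketch of that update rule in a small training loop that learns the OR pattern. Several things are assumptions beyond the formula above: the error is taken as truth minus prediction, the update is multiplied by the input so that only inputs that were "on" get adjusted (as in the classic perceptron rule), and the threshold, learning rate and number of passes are arbitrary choices.

```python
def predict(inputs, weights, threshold):
    """Weighted sum followed by a step activation (0 or 1)."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def train(examples, learn_rate=0.1, threshold=0.5, passes=20):
    """Nudge the weights after every example using the update rule."""
    weights = [0.0, 0.0]
    for _ in range(passes):
        for inputs, truth in examples:
            error = truth - predict(inputs, weights, threshold)
            # w_new = w_old + learnRate * error, applied to inputs that were on
            weights = [w + learn_rate * error * x
                       for w, x in zip(weights, inputs)]
    return weights

# Training data: each sample is (inputs, label) for the OR pattern.
or_examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights = train(or_examples)
predictions = [predict(inputs, weights, 0.5) for inputs, _ in or_examples]
print(weights, predictions)
```

A larger `learn_rate` makes the model react more strongly to each mistake; a smaller one makes it learn more cautiously.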
Linear Algebra
Number (scalar)
Array (vector)
Matrix
Tensor
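These four terms can be told apart by how deeply the numbers are nested. The sketch below uses plain Python lists and a hypothetical helper named `rank`; it assumes every list is non-empty and regularly shaped. In machine-learning libraries, "tensor" is the general word for such arrays with any number of dimensions.

```python
def rank(value):
    """Count nesting depth: 0 for a number, 1 for a flat list, and so on."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        value = value[0]  # assumes non-empty, regularly shaped lists
    return depth

scalar = 0.5                      # one brightness value
vector = [0, 1, 0]                # one row of pixels
matrix = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]]              # a 3x3 grid of pixels
tensor = [matrix, matrix, matrix] # e.g. three stacked grids, one per color channel

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor", tensor)]:
    print(name, rank(value))
```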
More on Tensors (optional)
Tutorial
YOUR MODEL
- Teachable Machine
- CodePen "Projects"
- Classification Model
CodePen "Projects"
A workspace where all the files of a web development project can be kept in an arrangement that mimics what we would do when creating a website.
FORKING a project means copying all the files to your own account so you can continue developing (on a different fork in the road).
original
your fork
Languages are Forks
Outtakes
Archive copy of How do machines see?
By Dan Ryan