Trịnh Tấn Đạt
Faculty of Information Technology (Khoa CNTT), Saigon University
Email: trinhtandat@sgu.edu.vn
Website: https://sites.google.com/site/ttdat88/
What are artificial neural networks?
A neuron receives a signal, processes it, and propagates the signal (or not)
The brain comprises around 100 billion neurons, each connected to ~10,000 other neurons: about 10^15 synaptic connections
ANNs are a simplified imitation of the brain: a dense network of simple processing units
Origins: Algorithms that try to mimic the brain
Very widely used in the 80s and early 90s; popularity diminished in the late 90s.
Recent resurgence: state-of-the-art technique for many applications
Comparison of computing power
Neural networks are designed to be massively parallel
The brain is effectively a billion times faster
Applications of neural networks
Medical Imaging
Fake Videos
Conceptual mathematical model
Receives input from sources
Computes a weighted sum of the inputs
Passes the sum through an activation function
Sends the signal to m succeeding neurons (a minimal sketch follows this list)
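To make the model concrete, here is a minimal sketch of a single neuron in Python/NumPy. The input values, weights, and the choice of a sigmoid activation are illustrative assumptions, not values from the slides:

```python
import numpy as np

def neuron(inputs, weights):
    """Conceptual neuron: weighted sum of inputs passed through an activation."""
    z = np.dot(weights, inputs)       # weighted sum of the input signals
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid activation (one common choice)

# Example with 3 input sources (made-up values)
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.6, -0.2])
print(neuron(x, w))                   # the signal sent to succeeding neurons
```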
Artificial Neural Network
Organized into layers of neurons
Typically 3 or more: input, hidden and output
Neural networks are made up of nodes or units, connected by links
Each link has an associated weight; each unit applies an activation function
Simplified (binary) artificial neuron
[Figures: the simplified binary neuron, shown first without weights, then with weights added]
Introducing Bias
The perceptron needs to take the bias into account:
o Bias is just like the intercept added in a linear equation.
o It is an additional parameter in the neural network, used to adjust the output along with the weighted sum of the neuron's inputs.
o Bias acts like a constant that helps the model fit the given data (a minimal sketch follows this list).
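A minimal sketch of a perceptron with a bias term; the inputs, weights, and bias below are made-up illustrative values:

```python
import numpy as np

def perceptron(x, w, b):
    """Binary perceptron: fire (1) if the weighted sum plus bias is positive."""
    return 1 if np.dot(w, x) + b > 0 else 0

# Without a bias, the decision boundary must pass through the origin;
# the bias b shifts it.
x = np.array([1.0, 2.0])
w = np.array([0.5, -0.3])
print(perceptron(x, w, b=-0.2))
```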
Sigmoid Neuron
The more common artificial neuron
In effect, a bias value allows you to shift the activation function to the left or right, which may be critical for successful learning. Consider a network that has no bias: here is the function that this network computes, for various values of w0 [figure omitted].
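A small numeric sketch of this point (the w0 and b values are arbitrary illustrative choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.linspace(-3, 3, 7)

# Without a bias, varying w0 only changes the steepness; every curve
# still crosses 0.5 at x = 0.
for w0 in (0.5, 1.0, 2.0):
    print(f"w0={w0}:", np.round(sigmoid(w0 * x), 2))

# A bias term shifts the curve left or right.
for b in (-2.0, 0.0, 2.0):
    print(f"b={b:+}:", np.round(sigmoid(x + b), 2))
```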
Simplified Two-Layer ANN
One hidden layer
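A minimal forward pass for such a two-layer network (one hidden layer) in NumPy; the layer sizes and random weights are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x  = rng.normal(size=3)          # 3 input features
W1 = rng.normal(size=(4, 3))     # input -> hidden (4 units)
b1 = np.zeros(4)
W2 = rng.normal(size=(1, 4))     # hidden -> output (1 unit)
b2 = np.zeros(1)

h = sigmoid(W1 @ x + b1)         # hidden activations
y = sigmoid(W2 @ h + b2)         # network output
print(y)
```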
Optimization Primer
Cost function
Calculate its derivative
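As a concrete example (an assumption on my part; the slides do not show which cost is used), take the squared-error cost C(w) = (w·x − y)² for a single example under a linear model, and its derivative via the chain rule:

```python
# Squared-error cost for one example and its derivative w.r.t. the weight w,
# for an illustrative linear model y_hat = w * x.
def cost(w, x, y):
    return (w * x - y) ** 2

def dcost_dw(w, x, y):
    return 2 * (w * x - y) * x   # chain rule

print(cost(0.5, x=2.0, y=3.0), dcost_dw(0.5, x=2.0, y=3.0))
```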
Gradient Descent
Gradient Descent Optimization
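A minimal gradient-descent loop on the squared-error cost above; the learning rate, starting point, and data are arbitrary illustrative choices:

```python
# Minimize C(w) = (w*x - y)^2 by gradient descent (illustrative values).
x, y = 2.0, 3.0
w, lr = 0.0, 0.1                  # initial weight, learning rate

for step in range(50):
    grad = 2 * (w * x - y) * x    # dC/dw via the chain rule
    w -= lr * grad                # step against the gradient
print(w)                          # approaches y/x = 1.5
```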
Backpropagation
Activation functions
The threshold (step) activation function was proposed first
Sigmoid and tanh introduce non-linearity with different codomains
ReLU is one of the more popular ones because it's simple to compute and very robust to noisy inputs (see the sketch after this list)
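The three activations side by side, as a minimal NumPy sketch (the sample inputs are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes into (0, 1)

def tanh(z):
    return np.tanh(z)                  # squashes into (-1, 1), zero-centered

def relu(z):
    return np.maximum(0.0, z)          # thresholds at zero

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (sigmoid, tanh, relu):
    print(f.__name__, np.round(f(z), 3))
```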
Sigmoid function
The sigmoid non-linearity squashes real numbers into the range [0, 1]
Historically it had a nice interpretation as a neuron's firing rate (i.e., from not firing at all to fully saturated firing)
Currently used less often: when activations saturate too close to 0 or 1, the gradient becomes nearly 0 and learning stalls
Tanh function
The tanh function squashes real numbers into the range [-1, 1]
Same problem as the sigmoid: its activations saturate, killing gradients
But it is zero-centered, minimizing the zig-zagging dynamics during gradient descent
Currently preferred over the sigmoid nonlinearity
ReLU: Rectified Linear Unit
ReLU thresholds its activation at zero: f(x) = max(0, x)
Quite popular over the last few years
Speeds up Stochastic Gradient Descent (SGD) convergence
Easier to implement due to simpler mathematical operations
Sensitive to high learning rates during training, which can result in "dead" neurons (i.e., neurons that never activate across the entire dataset)
Neuron Modeling: Logistic Unit
1 hidden layer
Modeling
Other Network Architectures
Image Recognition: 4 classes (one-hot encoding)
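A small sketch of one-hot encoding for a 4-class problem; the class names here are made-up placeholders:

```python
import numpy as np

classes = ["car", "cat", "dog", "ship"]   # hypothetical 4 classes

def one_hot(label, classes):
    """Encode a class label as a vector with a single 1."""
    v = np.zeros(len(classes))
    v[classes.index(label)] = 1.0
    return v

print(one_hot("dog", classes))   # [0. 0. 1. 0.]
```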
Example
Neural Network Classification
Example: Perceptron - Representing Boolean Functions
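A perceptron can represent linearly separable Boolean functions such as AND and OR. A minimal sketch; these particular weights and biases are one standard illustrative choice, not values from the slides:

```python
def perceptron(x1, x2, w1, w2, b):
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

# AND: fires only when both inputs are 1
AND = lambda x1, x2: perceptron(x1, x2, w1=1.0, w2=1.0, b=-1.5)
# OR: fires when at least one input is 1
OR  = lambda x1, x2: perceptron(x1, x2, w1=1.0, w2=1.0, b=-0.5)

for a in (0, 1):
    for c in (0, 1):
        print(a, c, "AND:", AND(a, c), "OR:", OR(a, c))
```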
Combining Representations to Create Non-Linear Functions
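XOR is not linearly separable, so a single perceptron cannot represent it, but combining units in two layers can. A sketch in the same style as the AND/OR example (the hidden units' weights are again an illustrative choice):

```python
def unit(x1, x2, w1, w2, b):
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

def XOR(x1, x2):
    # Hidden layer: OR and NAND; output layer: AND of the two
    h1 = unit(x1, x2,  1.0,  1.0, -0.5)   # OR
    h2 = unit(x1, x2, -1.0, -1.0,  1.5)   # NAND
    return unit(h1, h2, 1.0, 1.0, -1.5)   # AND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", XOR(a, b))
```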
Example: MNIST data
Neural Network Learning
Perceptron Learning Rule
Batch Perceptron
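A sketch of the perceptron learning rule, w ← w + η(y − ŷ)x, applied one example at a time; the toy data and learning rate are illustrative assumptions. A batch variant would accumulate the updates over all examples before applying them:

```python
import numpy as np

# Toy linearly separable data (the AND function)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

w, b, lr = np.zeros(2), 0.0, 0.1

for epoch in range(20):
    for xi, yi in zip(X, y):
        pred = 1 if w @ xi + b > 0 else 0
        w += lr * (yi - pred) * xi      # perceptron update rule
        b += lr * (yi - pred)

print(w, b)   # a separating weight vector and bias for the toy data
```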
Learning in NN: Backpropagation
Cost Function
Optimizing the Neural Network
Forward Propagation
Backpropagation Intuition
Backpropagation: Gradient Computation
Backpropagation
Training a Neural Network via Gradient Descent with Backpropagation
Training a Neural Network
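To tie these last slides together, here is a minimal sketch of training a one-hidden-layer network with forward propagation, backpropagation, and gradient descent. The architecture, XOR data, squared-error cost, and hyperparameters are all illustrative assumptions, and convergence depends on the random initialization:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)    # input -> hidden
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)    # hidden -> output
lr = 1.0

for epoch in range(5000):
    # Forward propagation
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backpropagation of the squared-error cost C = 0.5 * sum((out - y)^2)
    d_out = (out - y) * out * (1 - out)       # output-layer delta
    d_h = (d_out @ W2.T) * h * (1 - h)        # hidden-layer delta

    # Gradient-descent parameter updates
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(np.round(out, 2))   # should approach [[0], [1], [1], [0]] for most seeds
```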