MO434 - Deep Learning
Fundamentals of (Deep) Neural Networks II
Alexandre Xavier Falcão
Institute of Computing - UNICAMP
afalcao@ic.unicamp.br
Alexandre Xavier Falcão MO434 - Deep Learning
Agenda
A neural network with dense layers only – a Multi-Layer Perceptron (MLP).
Activation and loss functions.
Stochastic Gradient Descent (SGD) optimizer.
The backpropagation algorithm.
Neural network with dense layers only
Consider a neural network with $L$ dense layers and $N_r$ neurons at layer $r$, $1 \le r \le L$.
Each neuron $j \in [1, N_r]$ of a layer $r$ has a weight vector
$\mathbf{w}_j^r = (w_{j0}^r, w_{j1}^r, \ldots, w_{jN_{r-1}}^r)$ with bias $w_{j0}^r$,
the input of layer $r$ is the vector
$\mathbf{y}^{r-1} = (1, y_1^{r-1}, y_2^{r-1}, \ldots, y_{N_{r-1}}^{r-1})$, and
each perceptron $j$ computes $v_j^r = \langle \mathbf{y}^{r-1}, \mathbf{w}_j^r \rangle$ followed by
$f(v_j^r)$, where $f$ is a differentiable activation function.
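The layer computation above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `dense_forward`, the example weights, and the choice of ReLU as $f$ are hypothetical and do not come from the course material. Each row of `weights` plays the role of $\mathbf{w}_j^r$, with the bias $w_{j0}^r$ stored at index 0.

```python
def relu(v):
    """ReLU activation, used here only as an example choice of f."""
    return v if v > 0 else 0.0


def dense_forward(y_prev, weights, f):
    """Forward pass of one dense layer r.

    y_prev  : outputs of layer r-1, without the leading constant 1
    weights : one weight vector per neuron j; weights[j][0] is the bias w_j0
    f       : activation function applied to each v_j
    """
    # Prepend the constant 1 that multiplies the bias, giving y^{r-1}.
    y_aug = [1.0] + list(y_prev)
    outputs = []
    for w_j in weights:
        if len(w_j) != len(y_aug):
            raise ValueError("each weight vector needs N_{r-1} + 1 entries")
        # v_j = <y^{r-1}, w_j> (inner product), followed by f(v_j).
        v_j = sum(w * y for w, y in zip(w_j, y_aug))
        outputs.append(f(v_j))
    return outputs


# Hypothetical layer with N_{r-1} = 2 inputs and N_r = 3 neurons.
weights = [
    [0.5, 1.0, -1.0],
    [-1.0, 2.0, 0.0],
    [0.0, -0.5, -0.5],
]
y = dense_forward([1.0, 2.0], weights, relu)
print(y)
```

Stacking $L$ such calls, feeding the output of one layer as `y_prev` of the next, gives the forward pass of the whole MLP.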
Examples of activation functions
Rectified Linear Unit (ReLU): $f(v) = \begin{cases} v, & v > 0, \\ 0, & v \le 0. \end{cases}$
ReLU derivative: $f'(v) = \begin{cases} 1, & v > 0, \\ 0, & v \le 0. \end{cases}$
Logistic ($a > 0$): $f(v) = \dfrac{1}{1 + e^{-av}}$.
Logistic derivative: $f'(v) = a f(v)\,(1 - f(v))$.
Hyperbolic tangent: $f(v) = \tanh(v) = \dfrac{2}{1 + e^{-2v}} - 1$.
Hyperbolic tangent derivative: $f'(v) = 1 - f^2(v)$.
SoftPlus: $f(v) = \log_e(1 + e^{v})$.
SoftPlus derivative: $f'(v) = \dfrac{1}{1 + e^{-v}}$.
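As a sanity check on the formulas above, the following sketch implements each activation and its derivative in plain Python and compares the analytic derivative with a central finite difference. The helper `numeric_derivative`, the step size `h`, and the test points are assumptions made for illustration. The implementations are deliberately naive: `math.exp` overflows for inputs of very large magnitude, so this is not a numerically robust version.

```python
import math


def relu(v):
    return v if v > 0 else 0.0


def relu_prime(v):
    return 1.0 if v > 0 else 0.0


def logistic(v, a=1.0):
    return 1.0 / (1.0 + math.exp(-a * v))


def logistic_prime(v, a=1.0):
    f = logistic(v, a)
    return a * f * (1.0 - f)


def tanh(v):
    # Same form as on the slide: 2 / (1 + e^{-2v}) - 1.
    return 2.0 / (1.0 + math.exp(-2.0 * v)) - 1.0


def tanh_prime(v):
    return 1.0 - tanh(v) ** 2


def softplus(v):
    return math.log(1.0 + math.exp(v))


def softplus_prime(v):
    return 1.0 / (1.0 + math.exp(-v))


def numeric_derivative(f, v, h=1e-6):
    """Central finite-difference approximation of f'(v)."""
    return (f(v + h) - f(v - h)) / (2.0 * h)


activations = {
    "relu": (relu, relu_prime),
    "logistic": (logistic, logistic_prime),
    "tanh": (tanh, tanh_prime),
    "softplus": (softplus, softplus_prime),
}

# v = 0 is left out because ReLU has a kink there.
points = [-2.0, -0.5, 0.5, 2.0]

max_errors = {}
for name, (f, f_prime) in activations.items():
    max_errors[name] = max(
        abs(f_prime(v) - numeric_derivative(f, v)) for v in points
    )
    print(f"{name}: max |analytic - numeric| = {max_errors[name]:.2e}")
```

Note that the SoftPlus derivative is exactly the logistic function with $a = 1$, which the two implementations above make easy to confirm.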