neural network
convolutional neural network
history:
perceptron implementation => Mark I Perceptron machine (its update rule differed from backprop)
Adaline and Madaline (first multilayer perceptron network) (no backprop yet)
Rumelhart et al. propose backpropagation => chain rule & update rule
Yann LeCun's CNN: backprop and gradient-based learning for NNs
2012: Hinton's lab applies deep NNs to acoustic modeling / speech recognition
2012: AlexNet wins the ImageNet classification challenge with a deep CNN
This is how CNNs became famous.
ConvNets are used everywhere
Convolutional Neural Network
fully connected layer
a 32 x 32 x 3 image => stretch to 3072 x 1
W : 10 x 3072
Wx : (10 x 3072) x (3072 x 1) => 10 x 1
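A minimal NumPy sketch of the shapes above (random values are just to show dimensions, not a trained model):

```python
import numpy as np

# 32x32x3 CIFAR-style image stretched into a 3072-dim column
x = np.random.randn(32, 32, 3).reshape(3072, 1)  # stretch to 3072 x 1
W = np.random.randn(10, 3072)                    # one row of weights per class

scores = W @ x        # (10, 3072) x (3072, 1) -> (10, 1)
print(scores.shape)   # (10, 1): one score per class
```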
convolution layer
keep the input as a 32 x 32 x 3 volume => preserve spatial structure
convolve the 32 x 32 x 3 input with a 5 x 5 x 3 filter (filter depth always matches input depth)
Convolution layer
convolve (slide) the filter over all spatial locations; each filter produces one activation map (see the sketch below)
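A naive sketch of one filter sliding over all spatial locations (the function name and loop structure are mine; real implementations vectorize this):

```python
import numpy as np

def conv_single_filter(x, w, b, stride=1):
    """Slide one filter over all spatial locations of x.

    x: (H, W, C) input volume, w: (F, F, C) filter, b: scalar bias.
    Returns one (H_out, W_out) activation map.
    """
    H, W_in, _ = x.shape
    F = w.shape[0]
    H_out = (H - F) // stride + 1
    W_out = (W_in - F) // stride + 1
    out = np.zeros((H_out, W_out))
    for i in range(H_out):
        for j in range(W_out):
            patch = x[i*stride:i*stride+F, j*stride:j*stride+F, :]
            out[i, j] = np.sum(patch * w) + b  # dot product at this location
    return out

x = np.random.randn(32, 32, 3)
w = np.random.randn(5, 5, 3)
print(conv_single_filter(x, w, b=0.0).shape)  # (28, 28) with stride 1, no pad
```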
output volume size = (N + 2 x pad - filter) / stride + 1
Examples:
Input volume : 32 x 32 x 3
10 5x5 filters with stride 1, pad 2
output volume size?
32 x 32 x 10, since (32 + 2*2 - 5) / 1 + 1 = 32 and there are 10 filters
number of parameters in this layer?
each filter has 5*5*3 + 1 = 76 params (the +1 is the bias)
=> 76 * 10 = 760
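The same arithmetic as a small helper (function names are mine, for illustration):

```python
def conv_output_size(n, filter_size, stride, pad):
    # (N + 2*pad - F) / stride + 1
    return (n + 2 * pad - filter_size) // stride + 1

# The example from the notes: 32x32x3 input, ten 5x5 filters, stride 1, pad 2
side = conv_output_size(32, 5, 1, 2)          # 32 -> output volume is 32x32x10
params_per_filter = 5 * 5 * 3 + 1             # weights + bias = 76
total_params = params_per_filter * 10         # 760
print(side, params_per_filter, total_params)  # 32 76 760
```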
Pooling Layer
make the representations smaller and more manageable (down sampling)
operates over each activation map independently
(pooling regions usually do not overlap when downsampling)
why is max pooling used rather than average pooling?
each value in an activation map says how strongly the filter fired at that location
how strong the activation is matters more than its exact location, so the max preserves the strongest detection (see the sketch below)
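A minimal sketch of 2x2 max pooling with stride 2, assuming non-overlapping regions as in the notes (the function name is mine):

```python
import numpy as np

def max_pool(x, pool=2, stride=2):
    """2x2 max pooling, stride 2: each activation map is pooled independently.

    x: (H, W, D) stack of D activation maps.
    """
    H, W, D = x.shape
    H_out = (H - pool) // stride + 1
    W_out = (W - pool) // stride + 1
    out = np.zeros((H_out, W_out, D))
    for i in range(H_out):
        for j in range(W_out):
            window = x[i*stride:i*stride+pool, j*stride:j*stride+pool, :]
            out[i, j, :] = window.max(axis=(0, 1))  # keep the strongest activation
    return out

a = np.random.randn(32, 32, 10)
print(max_pool(a).shape)  # (16, 16, 10): spatially halved, depth unchanged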
stacks of conv, ReLU, and pooling layers (with FC layers at the end) => typical CNN architecture (shape trace below)
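A sketch tracing output shapes through such a stack; the layer sizes are illustrative, not from the lecture:

```python
def trace_shapes(side, depth, layers):
    # Each layer: (kind, filter_size, stride, pad, num_filters).
    # ReLU is elementwise, so it never changes the shape.
    for kind, f, s, p, d in layers:
        if kind == "conv":
            side = (side + 2 * p - f) // s + 1
            depth = d
        elif kind == "pool":
            side = (side - f) // s + 1
        print(kind, (side, side, depth))
    return side, side, depth

layers = [
    ("conv", 5, 1, 2, 10),   # 10 5x5 filters, stride 1, pad 2
    ("pool", 2, 2, 0, None), # 2x2 max pool, stride 2
    ("conv", 5, 1, 2, 20),
    ("pool", 2, 2, 0, None),
]
trace_shapes(32, 3, layers)  # ends at (8, 8, 20) before the FC layers
```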