

CS231n - Lec3. Loss Functions and Optimization


Loss Function

Loss function (cost function): a way of evaluating how well your machine learning algorithm models the given data set; a measurement of how good your model is at predicting the expected outcome.

Linear Classifier

To do:

1. Define a loss function that quantifies our unhappiness with the scores across the training data

2. Come up with a way of efficiently finding the parameters that minimize the loss function (optimization)

Multiclass SVM loss (Support Vector Machine):

L_i = sum_{j != y_i} max(0, s_j - s_{y_i} + 1)

 

+1 is the safety margin.

How do we choose the +1? It doesn't really matter: it washes out with the overall setting of the scale of W.
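A minimal numpy sketch of this per-example loss (my own illustration, not code from the lecture; scores is the score vector for one example and y its correct class index):

import numpy as np

def svm_loss_i(scores, y, margin=1.0):
    # hinge loss for one example: sum over wrong classes of max(0, s_j - s_y + margin)
    margins = np.maximum(0, scores - scores[y] + margin)
    margins[y] = 0  # the correct class is not counted
    return np.sum(margins)

# example: the correct class (index 0) already wins by more than the margin -> loss 0
print(svm_loss_i(np.array([5.0, 1.0, 2.0]), y=0))  # 0.0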

 

Q1. What happens to the loss if the car scores change a bit?

   If the car score already beats the other scores by more than the margin, jiggling it a little does not change the loss.

Q2. What is the min/max possible loss? 

   min = 0, max = infinity (the per-class loss has the hinge shape __/)

Q3. At initialization W is small, so all s ≈ 0. What is the loss?

    (number of classes) - 1: each of the C - 1 wrong classes contributes max(0, 0 - 0 + 1) = 1

   => a useful thing to check in practice (sanity check)

Q4. What if the sum was over all classes? (including j = y_i)

   the loss goes up by 1 (the correct class adds max(0, 0 + 1) = 1)

   nothing changes significantly, but the convention is to omit the correct class so that the minimum loss is 0

Q5. What if we used mean instead of sum? 

   doesn't change anything meaningful, it just rescales the loss by a constant

Q6. What if we used a squared term: sum_{j != y_i} max(0, s_j - s_{y_i} + 1)^2 ?

   this is a different algorithm: the penalty is now non-linear in the margin violation

   how to choose between linear and squared?

   squared hinge loss: use it when we really don't want any big mistakes but are okay with many small ones
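A tiny numeric illustration of the difference (made-up margin-violation values):

import numpy as np

# three margin violations: two small ones and one big one
violations = np.array([0.1, 0.1, 5.0])
print(np.sum(violations))       # hinge:         5.2
print(np.sum(violations ** 2))  # squared hinge: 25.02 -> the single big mistake dominates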

 

Suppose we found a W such that loss = 0. Is this W unique?

No! 2W, 3W, 4W, ... also work: scaling W scales every score difference, which only makes the already-satisfied margins larger.

 

minimizing the loss on the training data alone is not what we really want (it can overfit)

add regularization to push toward a simpler model (Occam's Razor: the simplest explanation is the best)

lambda: regularization strength

keeps the model (e.g., a polynomial fit) from growing too complex (L1 drives individual weights to exactly 0; L2 drives the sum of squared weights toward 0)

L2 is often preferred over L1 (L1 can remove features you actually wanted, while L2 keeps every feature but shrinks all of them)
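A sketch of the full regularized loss (assuming rows of X are examples, W is the weight matrix, and lam is the lambda above); note that among all W with zero data loss, the L2 term prefers the smallest one, which resolves the 2W / 3W ambiguity:

import numpy as np

def svm_loss_regularized(W, X, y, lam=1e-3, margin=1.0):
    # data loss: mean multiclass SVM loss over all N examples
    scores = X.dot(W)                                    # (N, C)
    correct = scores[np.arange(len(y)), y][:, None]      # score of the true class, (N, 1)
    margins = np.maximum(0, scores - correct + margin)
    margins[np.arange(len(y)), y] = 0
    data_loss = np.mean(np.sum(margins, axis=1))
    # regularization loss: L2 penalty on the weights, scaled by lambda
    reg_loss = lam * np.sum(W * W)
    return data_loss + reg_loss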

Softmax Classifier

in a plain linear classifier we never say what the scores actually mean

but for the multinomial logistic regression (softmax) classifier, the scores do have a meaning: they are unnormalized log-probabilities of the classes

exponentiate the scores and normalize them: P(class k) = e^{s_k} / sum_j e^{s_j}, and the loss is L_i = -log P(correct class)

probability = 1 -> -log = 0 -> loss = 0

probability = 0.01 -> -log(0.01) ≈ 4.6 -> large loss
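A minimal numpy sketch of this softmax (cross-entropy) loss for one example (again my own illustration; the max-shift keeps the exponentials numerically stable without changing the probabilities):

import numpy as np

def softmax_loss_i(scores, y):
    shifted = scores - np.max(scores)                   # stability shift, softmax is unchanged
    probs = np.exp(shifted) / np.sum(np.exp(shifted))   # exponentiate and normalize
    return -np.log(probs[y])                            # -log probability of the correct class

# with all scores equal (e.g., at initialization) the loss is -log(1/C)
print(softmax_loss_i(np.zeros(3), y=1))  # ~1.0986 = -log(1/3)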

Q1. What is the min/max possible loss L_i? 

    min : 0 (-log 1),  max : infinity (-log 0)

    to actually reach 0 loss, the correct-class score would have to go to +infinity and every wrong-class score to -infinity

    so in practice we never get exactly 0 loss

Q2. Usually at initialization W is small, so all s ≈ 0. What is the loss?

   -log(1/C) = log(C); also used as a sanity check

Q3. Suppose I take a datapoint and jiggle it a bit (changing its score slightly). What happens to the loss in each case?

   SVM only wants the correct score to stay higher than the others by the margin; once that holds, jiggling does not affect the loss. Softmax, in contrast, always wants to push the correct score toward +infinity and the wrong scores toward -infinity, so jiggling does change the loss.

 

Optimization

how to minimize loss? 

Strategy #1 : Random search

Strategy #2 : Follow the slope - GRADIENT DESCENT 

numerical gradient : easy to write but slow, approximate 

compute the gradient dW from how the loss changes when W is perturbed

W => W+h => dW  (sometimes used for debugging - gradient check)

analytic gradient : use calculus, fast, exact, but error-prone

 in practice : derive the analytic gradient, then check your implementation against the numerical gradient (gradient check)
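A sketch of the numerical side of that check, using centered finite differences (assuming loss_fn(W) returns the scalar loss):

import numpy as np

def numerical_gradient(loss_fn, W, h=1e-5):
    # perturb each weight by +/- h and measure how the loss changes
    grad = np.zeros_like(W)
    it = np.nditer(W, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        orig = W[idx]
        W[idx] = orig + h
        loss_plus = loss_fn(W)
        W[idx] = orig - h
        loss_minus = loss_fn(W)
        W[idx] = orig                                   # restore the original value
        grad[idx] = (loss_plus - loss_minus) / (2 * h)
        it.iternext()
    return grad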

 

once we know the gradient, we use GRADIENT DESCENT

weight += - step_size * weight_grad

the minus sign steps toward the minimum; step_size is the learning rate
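Put together as a loop, vanilla gradient descent looks roughly like this (a sketch with hypothetical names: loss_and_grad(W, X, y) is assumed to return the loss and the gradient dW, and W, X, y are assumed to exist):

# vanilla gradient descent (sketch)
step_size = 1e-3                              # learning rate
for _ in range(1000):
    loss, weight_grad = loss_and_grad(W, X, y)
    W += -step_size * weight_grad             # step toward lower loss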

 

Stochastic Gradient Descent(SGD)

updating W from the full dataset every time is too slow; instead split the data into minibatches (32, 64, 128, ...), update W from one minibatch, and repeat

update W using the gradient computed on each minibatch (an estimate of the full-data gradient)
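The same loop with minibatches (sketch; loss_and_grad is the same hypothetical helper as above):

import numpy as np

batch_size = 128
for _ in range(1000):
    idx = np.random.choice(len(X), batch_size, replace=False)   # sample a minibatch
    loss, weight_grad = loss_and_grad(W, X[idx], y[idx])         # gradient on the minibatch only
    W += -step_size * weight_grad                                # approximates the full-data step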

 

For Images

1. Color Histogram (a rough sketch appears after this list)

2. Histogram of Oriented Gradient (HoG)

3. Bag of Words

cut images into patches and run unsupervised learning / clustering on them; features such as edge orientations and colors pop out => when a new image comes in, compare it against this learned vocabulary to see which features it contains
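For the first feature above, a crude color-histogram sketch (assuming img is an H x W x 3 uint8 array; spatial positions are thrown away, only color counts remain):

import numpy as np

def color_histogram(img, bins=8):
    # bin every pixel's RGB value and count how often each bin occurs
    hist, _ = np.histogramdd(img.reshape(-1, 3),
                             bins=(bins, bins, bins),
                             range=((0, 256),) * 3)
    return hist.ravel() / (img.shape[0] * img.shape[1])   # normalize by pixel count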

 

CNN: instead of extracting features by hand and feeding them in, the network learns to extract the features itself from the raw input image.
