Table of Contents
0:00 Recap
2:25 How to choose your loss?
3:18 A probabilistic model for linear regression
7:50 Gradient descent, learning rate, SGD
11:30 PyTorch code for gradient descent
15:15 A probabilistic model for logistic regression
17:27 Notations (information theory)
20:58 Likelihood for logistic regression
22:43 BCELoss
23:41 BCEWithLogitsLoss
25:37 Beware of the reduction parameter
27:27 Softmax regression
30:52 NLLLoss
34:48 Classification in PyTorch
36:36 Why is maximizing accuracy directly hard?
38:24 Classification in deep learning
40:50 Regression without knowing the underlying model
42:58 Overfitting in polynomial regression
45:20 Validation set
48:55 Notion of risk and hypothesis space
54:40 Estimation error and approximation error
BCELoss
import torch
import torch.nn as nn

m = nn.Sigmoid()      # squashes logits into (0, 1)
loss = nn.BCELoss()   # expects probabilities and targets in [0, 1]
input = torch.randn(3, 4, 5)
target = torch.empty(3, 4, 5).random_(2)  # random 0/1 labels; BCELoss targets must lie in [0, 1]
loss(m(input), target)
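
The BCEWithLogitsLoss and reduction entries in the table of contents can be illustrated with a minimal sketch, reusing input, target, m and loss from the snippet above:

# BCEWithLogitsLoss fuses the sigmoid and the binary cross-entropy in one, numerically more stable, call
loss_logits = nn.BCEWithLogitsLoss()
assert torch.allclose(loss_logits(input, target), loss(m(input), target))

# the reduction parameter controls how per-element losses are aggregated
loss_none = nn.BCEWithLogitsLoss(reduction='none')  # one loss per element, same shape as input
loss_mean = nn.BCEWithLogitsLoss(reduction='mean')  # default: mean over all elements
assert torch.allclose(loss_none(input, target).mean(), loss_mean(input, target))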
NLLLoss and CrossEntropyLoss
import torch
import torch.nn as nn

m = nn.LogSoftmax(dim=1)       # log-probabilities over the C classes
loss1 = nn.NLLLoss()           # expects log-probabilities
loss2 = nn.CrossEntropyLoss()  # expects raw logits: applies LogSoftmax + NLLLoss internally
C = 4
input = torch.randn(3, C)                                 # batch of 3 samples, C logits each
target = torch.empty(3, dtype=torch.long).random_(0, C)   # class indices in {0, ..., C-1}
assert torch.allclose(loss1(m(input), target), loss2(input, target))
Because its gradient has a simple form: for the cross-entropy loss, the gradient with respect to the logits is softmax(z) - y, i.e. the predicted probabilities minus the one-hot encoding of the target.
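
That claim can be checked numerically with autograd; a small sketch reusing input, target and C from the snippet above (reduction='sum' is used so the gradient is exactly softmax minus one-hot, without the 1/batch-size factor of the default 'mean'):

# gradient of CrossEntropyLoss (summed over the batch) with respect to the logits
input.requires_grad_(True)
nn.CrossEntropyLoss(reduction='sum')(input, target).backward()

# should match softmax(input) minus the one-hot encoding of target
one_hot = nn.functional.one_hot(target, num_classes=C).float()
assert torch.allclose(input.grad, torch.softmax(input, dim=1) - one_hot)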