Autograd¶

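Autograd, short for automatic differentiation, is built into every major deep learning library.
It records the operations of the forward pass so that gradients can be computed in a backward pass.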

Loading the libraries¶

import numpy as np
import torch

Computing the gradient of y with respect to x¶

x=torch.randn(2,requires_grad=True)
print(x.grad)

None

x=torch.randn(2,requires_grad=True)
y=x*3
gradients=torch.tensor([100,0.1],dtype=torch.float)
y.backward(gradients)
print(x.grad)
print(y.grad)

tensor([300.0000,   0.3000])
None

C:\Users\won\anaconda3\envs\pydatavenv\lib\site-packages\ipykernel_launcher.py:6: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations.
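The warning above points to retain_grad() as the way to keep the gradient of a non-leaf tensor. A minimal sketch of that suggestion follows; the fixed values for x are an assumption chosen so the result is reproducible (the cell above uses random values).

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)  # leaf tensor
y = x * 3                                          # non-leaf tensor (has a grad_fn)
y.retain_grad()                                    # ask autograd to keep y.grad as well

gradients = torch.tensor([100, 0.1], dtype=torch.float)
y.backward(gradients)

print(x.grad)  # gradient accumulated on the leaf
print(y.grad)  # now populated: the gradient that flowed into y
```

With retain_grad() in place, y.grad holds the vector passed to backward(), and the warning is no longer raised.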

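y.backward() computes the partial derivatives of y with respect to every leaf tensor that took part in computing y.
The argument passed to backward() is multiplied into those partial derivatives (a vector-Jacobian product); it is required here because y is not a scalar.
The result is stored in the .grad attribute of each leaf tensor, which is why x.grad is populated while y.grad (a non-leaf tensor) stays None.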
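To make the multiplication concrete, the sketch below compares x.grad with a hand-written vector-Jacobian product. The Jacobian J is written out by hand under the assumption that y = 3x, so dy/dx is 3 times the identity; the names v, J, and expected are illustrative only.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x * 3                         # dy_i/dx_j = 3 if i == j else 0
v = torch.tensor([100.0, 0.1])    # the vector passed to backward()

y.backward(v)                     # stores v^T J in x.grad

J = 3 * torch.eye(2)              # Jacobian of y = 3x, written out by hand
expected = v @ J                  # explicit vector-Jacobian product

print(x.grad)
print(expected)
```

Both prints show the same values, which is the sense in which the argument is "multiplied into" the partial derivatives.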

requires_grad¶

x=torch.randn(2,requires_grad=True)
y=x*3
gradients=torch.tensor([100,0.1],dtype=torch.float)
y.backward(gradients)
print(x.grad)

tensor([300.0000,   0.3000])

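The requires_grad argument declares a tensor as one for which gradients should be computed and stored.
Only when requires_grad is True can backward() compute a gradient for the tensor and store it in its .grad attribute.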

x=torch.randn(2,requires_grad=False)
y=x*3
gradients=torch.tensor([100,0.1],dtype=torch.float)
y.backward(gradients)
print(x.grad)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-5-f6b08f09186b> in <module>
      2 y=x*3
      3 gradients=torch.tensor([100,0.1],dtype=torch.float)
----> 4 y.backward(gradients)
      5 print(x.grad)

~\anaconda3\envs\pydatavenv\lib\site-packages\torch\tensor.py in backward(self, gradient, retain_graph, create_graph)
    219                 retain_graph=retain_graph,
    220                 create_graph=create_graph)
--> 221         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    222 
    223     def register_hook(self, hook):

~\anaconda3\envs\pydatavenv\lib\site-packages\torch\autograd\__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
    130     Variable._execution_engine.run_backward(
    131         tensors, grad_tensors_, retain_graph, create_graph,
--> 132         allow_unreachable=True)  # allow_unreachable flag
    133 
    134 

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

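When requires_grad is False, calling backward() raises an error.
The cause: because no input requires a gradient, autograd never recorded a graph for y (it has no grad_fn), so when backward() runs there is nothing to walk back through and nowhere to store a gradient.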
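One way to avoid this error is to check whether a tensor is attached to a graph before calling backward(). The sketch below is only an illustration; the variable names a, b, c, d are made up for this example.

```python
import torch

a = torch.randn(2, requires_grad=False)
b = a * 3
print(b.requires_grad, b.grad_fn)              # no graph was recorded for b

c = torch.randn(2, requires_grad=True)
d = c * 3
print(d.requires_grad, d.grad_fn is not None)  # d is attached to a graph

# Guard: only call backward() when the tensor is part of a recorded graph.
if d.requires_grad:
    d.backward(torch.ones_like(d))
print(c.grad)
```

requires_grad propagates through operations: a result requires a gradient as soon as at least one of its inputs does.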

backward¶

x=torch.randn(2,requires_grad=True)
y=x*3
gradients=torch.tensor([100,0.1],dtype=torch.float)
y.backward(gradients)
print(x.grad)
y.backward(gradients)
print(x.grad)

tensor([300.0000,   0.3000])

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-6-0b8751cbd2d6> in <module>
      4 y.backward(gradients)
      5 print(x.grad)
----> 6 y.backward(gradients)
      7 print(x.grad)

~\anaconda3\envs\pydatavenv\lib\site-packages\torch\tensor.py in backward(self, gradient, retain_graph, create_graph)
    219                 retain_graph=retain_graph,
    220                 create_graph=create_graph)
--> 221         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    222 
    223     def register_hook(self, hook):

~\anaconda3\envs\pydatavenv\lib\site-packages\torch\autograd\__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
    130     Variable._execution_engine.run_backward(
    131         tensors, grad_tensors_, retain_graph, create_graph,
--> 132         allow_unreachable=True)  # allow_unreachable flag
    133 
    134 

RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.

Calling backward() twice on the same graph raises an error.
By default, backward() frees the intermediate buffers of the graph after it has run once.

x=torch.randn(2,requires_grad=True)
y=x*3
gradients=torch.tensor([100,0.1],dtype=torch.float)
y.backward(gradients,retain_graph=True)
print(x.grad)
y.backward(gradients)
print(x.grad)

tensor([300.0000,   0.3000])
tensor([600.0000,   0.6000])

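To call backward() a second time, pass retain_graph=True to the first call so that the intermediate buffers are not freed.
Each call adds to .grad rather than overwriting it, so after two calls the values are accumulated (doubled here).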
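If accumulation is not wanted, the stored gradient can be reset between calls. A minimal sketch, assuming fixed values for x so that the two results can be compared:

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
gradients = torch.tensor([100, 0.1], dtype=torch.float)
y = x * 3

y.backward(gradients, retain_graph=True)
first = x.grad.clone()

x.grad.zero_()             # reset the accumulated gradient in place
y.backward(gradients)
second = x.grad.clone()

print(first)
print(second)              # same as `first`, not doubled
```

This is the same reason training loops usually zero the gradients before each backward pass.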

grad_fn¶

x=torch.randn(2,requires_grad=True)
y=x*3
z=x/2
w=x+y
w,y,z

(tensor([ 8.2057, -6.5477], grad_fn=<AddBackward0>),
 tensor([ 6.1543, -4.9108], grad_fn=<MulBackward0>),
 tensor([ 1.0257, -0.8185], grad_fn=<DivBackward0>))

Printing each tensor shows its computed values.
The grad_fn attribute records which operation produced the tensor,
so that backward() can refer to it later.
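The recorded graph can be inspected directly. The sketch below looks at grad_fn on a leaf and on two results; next_functions is used only to peek at the nodes that a backward node points to.

```python
import torch

x = torch.randn(2, requires_grad=True)
y = x * 3
w = x + y

print(x.is_leaf, x.grad_fn)          # a leaf tensor has no grad_fn
print(type(y.grad_fn).__name__)      # node recorded for the multiplication
print(type(w.grad_fn).__name__)      # node recorded for the addition

# Each backward node links to the nodes of its inputs; backward() follows these links.
for fn, _ in w.grad_fn.next_functions:
    print(type(fn).__name__)
```

The links from w lead to the node that accumulates into x.grad and to the multiplication node that produced y.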

register_forward_hook¶

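Hooks are useful when you need gradients or activations only up to an intermediate layer (computing CAM requires this).
A hook pulls out the intermediate values at the point where it is registered.
Tensors have a related method, register_hook.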
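As a small illustration of the tensor-level variant, the sketch below uses Tensor.register_hook to capture the gradient that reaches an intermediate tensor during backward(). The list name captured and the toy computation are assumptions made for this example.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x * 3                  # intermediate (non-leaf) tensor
z = (y ** 2).sum()

captured = []
# The hook receives the gradient of z with respect to y; returning None leaves it unchanged.
handle = y.register_hook(lambda grad: captured.append(grad.clone()))

z.backward()
handle.remove()

print(captured[0])         # dz/dy = 2 * y
print(x.grad)              # dz/dx = 3 * dz/dy
```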

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

First, let's build a simple network¶

class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet,self).__init__()
        self.conv1 = nn.Conv2d(1,10,5)
        self.pool1 = nn.MaxPool2d(2,2)
        
        self.conv2 = nn.Conv2d(10,20,5)
        self.pool2 = nn.MaxPool2d(2,2)
        
        self.fc = nn.Linear(320,50)
        self.out = nn.Linear(50,10)
        
    def forward(self, input):
        x=self.pool1(F.relu(self.conv1(input)))
        x=self.pool2(F.relu(self.conv2(x)))
        x=x.view(x.size(0),-1)
        x=F.relu(self.fc(x))
        x=F.relu(self.out(x))
        return x

Define the function that the hook will call¶

def hook_func(self,input,output):
    print('Inside '+self.__class__.__name__+' forwrd')
    print('')
    print('input:',type(input))
    print('input[0] shape:',input[0].shape)
    print('output shape:',output.shape)
    print('')

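The first argument (self here) is the module the hook is registered on, not a Tensor; input is a tuple holding the module's positional inputs, and output is the Tensor the module returned.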

Now register the hook function (register_forward_hook)¶

net=SimpleNet()

# register on conv1
net.conv1.register_forward_hook(hook_func)

<torch.utils.hooks.RemovableHandle at 0x22aeef67508>

# register on conv2
net.conv2.register_forward_hook(hook_func)

<torch.utils.hooks.RemovableHandle at 0x22aeef678c8>

A hook registered with register_forward_hook fires each time the module's forward computation runs, right after the output has been produced.

The hook function is called automatically during the forward pass¶

input=torch.randn(1,1,28,28)
out=net(input)

Inside Conv2d forwrd

input: <class 'tuple'>
input[0] shape: torch.Size([1, 1, 28, 28])
output shape: torch.Size([1, 10, 24, 24])

Inside Conv2d forwrd

input: <class 'tuple'>
input[0] shape: torch.Size([1, 10, 12, 12])
output shape: torch.Size([1, 20, 8, 8])

register_forward_pre_hook¶

def hook_pre(self,input):
    print('Inside '+self.__class__.__name__+' forward')
    print()
    print('input: ',type(input))
    print('input[0] shape: ',input[0].shape)

net=SimpleNet()
net.conv1.register_forward_pre_hook(hook_pre)

input=torch.randn(1,1,28,28)
out=net(input)

Inside Conv2d forward

input:  <class 'tuple'>
input[0] shape:  torch.Size([1, 1, 28, 28])

A hook registered with register_forward_pre_hook fires just before the layer's forward runs.
For that reason it receives only the input as an argument; no output exists yet.
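A forward pre-hook can also replace the input by returning a new value. The sketch below is a toy illustration, assuming nn.Identity as a stand-in layer and a hypothetical hook named scale_input, so that the effect on the input is directly visible in the output.

```python
import torch
import torch.nn as nn

def scale_input(module, inputs):
    # `inputs` is a tuple of positional arguments; return a tuple to replace it.
    return (inputs[0] * 2,)

layer = nn.Identity()                      # passes its input through unchanged
handle = layer.register_forward_pre_hook(scale_input)

x = torch.ones(1, 3)
out = layer(x)                             # the hook doubles the input first
handle.remove()
out_after = layer(x)                       # hook removed: input is untouched

print(out)
print(out_after)
```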

register_backward_hook¶

def hook_grad(self,grad_input,grad_output):
    print('Inside '+self.__class__.__name__+' backward')
    print()
    print('grad_input len: ',len(grad_input))
    print('grad_input[1] shape: ',grad_input[1].shape)
    print('grad_input[2] shape: ',grad_input[2].shape)
    
    print('grad_output len: ',len(grad_output))
    print('grad_output[0] shape: ',grad_output[0].shape)

Inside this hook, the grad_input and grad_output arguments must not be modified.
If you do want to change the gradient, the hook can optionally return a new gradient that is used in place of grad_input.

net=SimpleNet()
net.conv1.register_backward_hook(hook_grad)

input=torch.randn(1,1,28,28)
out=net(input)

target=torch.tensor([3],dtype=torch.long)
loss_fn=nn.CrossEntropyLoss()
err=loss_fn(out,target)
err.backward()

Inside Conv2d backward

grad_input len:  3
grad_input[1] shape:  torch.Size([10, 1, 5, 5])
grad_input[2] shape:  torch.Size([10])
grad_output len:  1
grad_output[0] shape:  torch.Size([1, 10, 24, 24])
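Recent PyTorch releases deprecate register_backward_hook in favor of register_full_backward_hook, where grad_input holds only the gradients with respect to the module's inputs (not its weight and bias). The sketch below assumes a PyTorch version that provides register_full_backward_hook (1.8 or later) and uses a standalone Conv2d layer; the names hook_grad_full and shapes are illustrative.

```python
import torch
import torch.nn as nn

shapes = {}

def hook_grad_full(module, grad_input, grad_output):
    # Record shapes only; returning None leaves the gradients unchanged.
    shapes['grad_input'] = [None if g is None else tuple(g.shape) for g in grad_input]
    shapes['grad_output'] = [tuple(g.shape) for g in grad_output]

conv = nn.Conv2d(1, 10, 5)
handle = conv.register_full_backward_hook(hook_grad_full)

# requires_grad=True on the input so that a gradient with respect to it exists.
inp = torch.randn(1, 1, 28, 28, requires_grad=True)
conv(inp).sum().backward()
handle.remove()

print(shapes)
```

Here grad_input has a single entry matching the input shape, in contrast to the three entries printed by the older hook above.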

Removing a hook¶

net=SimpleNet()
h=net.conv1.register_forward_hook(hook_func)
input=torch.randn(1,1,28,28)
out=net(input)

Inside Conv2d forwrd

input: <class 'tuple'>
input[0] shape: torch.Size([1, 1, 28, 28])
output shape: torch.Size([1, 10, 24, 24])

h.remove()
out=net(input)

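First, keep the return value of the register call; it returns a handle object.
Calling remove() on that handle removes the hook, which is why the second forward pass above prints nothing.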

✅ Autograd example¶

Here we use a hook (register_forward_hook) to save the activation of an intermediate layer.

Define the hook function¶

save_feat=[]
def hook_feat(module,input,output):
    save_feat.append(output)
    return output

Register the hook function¶

for name, module in model.get_model_shortcuts():
    if(name=='target_layer_name'):
        module.register_forward_hook(hook_feat)

forward pass¶

img=img.unsqueeze(0)
s=model(img)[0]
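The cells above depend on a model, an img, and a get_model_shortcuts() method that are not defined in this notebook. A self-contained sketch of the same three steps follows, under these assumptions: a small nn.Sequential stands in for the model, the standard named_modules() replaces get_model_shortcuts(), and the layer name '3' (the second convolution) is a hypothetical target.

```python
import torch
import torch.nn as nn

# Stand-in model; nn.Sequential names its children '0', '1', '2', ...
model = nn.Sequential(
    nn.Conv2d(1, 10, 5), nn.ReLU(), nn.MaxPool2d(2, 2),
    nn.Conv2d(10, 20, 5), nn.ReLU(),
)
target_layer_name = '3'                    # hypothetical target: second conv layer

save_feat = []

def hook_feat(module, input, output):
    # Detached copy, since the activation is only inspected here.
    save_feat.append(output.detach())

handles = []
for name, module in model.named_modules():
    if name == target_layer_name:
        handles.append(module.register_forward_hook(hook_feat))

img = torch.randn(1, 28, 28)               # one image as (C, H, W)
out = model(img.unsqueeze(0))              # add the batch dimension

for h in handles:                          # remove hooks once the activation is saved
    h.remove()

print(save_feat[0].shape)
```

For a method such as CAM that also needs gradients with respect to the saved activation, the detach() call would have to be dropped.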