[PyTorch] 01. Tensor와 연산

Updated: January 27, 2022

PyTorch Documentation: A torch.Tensor is a multi-dimensional matrix containing elements of a single data type.

PyTorch에서 쓰이는 기본 연산 단위. NumPy의 ndarray와 비슷하게 동작한다.

PyTorch를 사용하기 전에 NumPy를 먼저 공부하고 PyTorch를 익히는 것을 권장한다.

Tensor의 종류와 생성 attribute, method에 대해 소개하는 시간을 가져보고자 한다.

1. PyTorch 설치

먼저 PyTorch를 사용하려면 PyTorch를 설치해주어야한다.

anaconda를 쓰면 conda install을 이용해서 pip를 사용한다면 pip3 install을 이용해서 설치할 수 있다.

PyTorch 홈페이지에 설명이 잘되어있다.

설치가 완료되면 torch라는 이름으로 import 해올 수 있다.

import torch

2. Data Type

float, complex, integer(byte, char, long, short), boolean 타입의 tensor를 만들 수 있다.

PyTorch Documentation에 해당 내용이 표로 잘 정리되어 있다.

표를 살펴보면 같은 tensor라도 두 종류로 나뉜 것을 확인할 수 있다. PyTorch는 다른 딥러닝 프레임워크와 동일하게 GPU 연산을 이용한 병렬 처리를 지원한다. 따라서 GPU 연산이 가능한 tensor data type이 존재한다. torch.cuda 아래 있는 tensor는 GPU 연산이 가능한 tensor이고 그 왼쪽은 CPU에서 사용가능한 tensor임을 알 수 있다. PyTorch에서는 GPU 연산이 가능한 것을 가리켜 cuda라고 지칭되어있다. 나중에 딥러닝 모델을 만들고 학습할 때 자주 보이므로 기억해두자. CUDA 용어에 대해서는 링크를 참조.

PyTorch는 다른 딥러닝 프레임워크와 동일하게 GPU 연산을 이용한 병렬 처리를 지원한다. 따라서 GPU 연산이 가능한 tensor data type이 존재한다.

언제 GPU tensor를 쓸 지 헷갈릴 수도 있는데 걱정 말자. tensor 뒤에 to method를 사용하여 CPU tensor를 GPU tensor로 변환이 가능하다 이 내용은 다음에 다루겠다.

3. Tensor 생성

Tensor는 NumPy의 ndarray와 유사하며 기본적인 속성은 다음과 같다.

grad: Tensor가 가진 gradient value 반환
shape: Tensor가 가진 size 반환
size(): Tensor가 가진 size 반환. shape과 동일
dim(): Tensor가 가진 dimension 반환

t = torch.tensor([[1,2,3],[4,5,6]])

print(t.dim())
print(t.shape)
print(t.size())
print(t.grad)

2
torch.Size([2, 3])
torch.Size([2, 3])
None

tensor를 생성하는 방법은 PyTorch Documentation에 4가지 정도 소개 되어 있다.

torch.tensor() 사용
torch.IntTensor(), torch.FloatTensor() 등 Tensor Class를 이용한 생성
size를 명시하는 함으로써 생성하는 방법 e.g.) torch.ones(), torch.zeros(), …
특정 tensor와 같은 size의 tensor를 만드는 방법 e.g.) torch.ones_like(), torch.zeros_like(), …
‘Tensor.new_*‘를 사용해 생성

위 4가지를 사용해 생성하는 방법을 알아보자.

3.1. torch.tensor()

torch.tensor(data, requires_grad=False, *kwargs)

인자로 들어온 data를 Tensor로 변환해주는 메소드. torch.Tensor클래스와 혼동하면 안된다.
Parameters
- data: array_like 형태의 data. list, ndarray, scalar, …
Keyword Arguments:
- requires_grad: gradient를 구해야한다면 True, gradient를 구하지 않아도 된다면 False
- dtype: data type
- 그 밖에 여러 keyword argument 존재

import numpy as np

t1 = torch.tensor(list(range(5)))
t2 = torch.tensor(np.arange(5))

print(t1)
print(t2)

tensor([0, 1, 2, 3, 4])
tensor([0, 1, 2, 3, 4])

t3 = torch.tensor([[1,2],[3,4]])
t4 = torch.tensor([])

print(t3)
print(t4)

tensor([[1, 2],
        [3, 4]])
tensor([])

3.2. Tensor Class

PyTorch Documentation에 정리된 것처럼 미리 정의된 Tensor class를 이용해 생성할 수 있다.

t1 = torch.FloatTensor([[1,2],[3,4]])
t2 = torch.IntTensor([[1,2],[3,4]])

print(t1)
print(t2)

tensor([[1., 2.],
        [3., 4.]])
tensor([[1, 2],
        [3, 4]], dtype=torch.int32)

3.3 size를 명시하는 방법으로 생성

ones, zeros, empty, full

torch.zeros(size): 원소가 모두 0이고, 크기가 size인 Tensor 생성
torch.ones(size): 원소가 모두 1이고, 크기가 size인 Tensor 생성
torch.empty(size): 크기가 size인 비어있는 Tensor 생성. (실제 출력시 쓰레기값이 들어있다.)
torch.full(size, fill_value): 원소가 모두 fill_value 이고, 크기가 size인 Tensor 생성

t = torch.zeros((3,3))

t

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

t = torch.ones((3,3))

t

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

t = torch.full((3,3),1.2)

t

tensor([[1.2000, 1.2000, 1.2000],
        [1.2000, 1.2000, 1.2000],
        [1.2000, 1.2000, 1.2000]])

t = torch.empty((3,3))

t

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

arange, linspace

torch.arange(start, end): [start,end) 구간의 1차원 Tensor 생성
torch.linspace(start, end, steps): [start, end] 구간의 steps 크기의 1차원 Tensor 생성

t = torch.arange(6)

t

tensor([0, 1, 2, 3, 4, 5])

t = torch.linspace(1,6,5)

t

tensor([1.0000, 2.2500, 3.5000, 4.7500, 6.0000])

random sampling을 이용한 생성

torch.rand(size): [0,1)인 uniform 분포의 랜덤 숫자를 가지고 크기가 size인 Tensor 생성
torch.randn(size): 표준 정규 분포의 랜덤 숫자를 가지고, 크기가 size인 Tensor 생성
torch.randint(size): [low, high) 구간의 정수를 가지고, 크기가 size인 Tensor 생성
torch.normal(mean,std): mena, std의 정규 분포를 가지는 Tensor 생성

t = torch.rand(3,3)

t

tensor([[0.8797, 0.1597, 0.0158],
        [0.5165, 0.1912, 0.5431],
        [0.9048, 0.6050, 0.7337]])

t = torch.randn(3,3)

t

tensor([[-0.6680,  0.8055, -0.8857],
        [-0.2097, -0.9801, -0.4376],
        [-0.8114, -0.4007, -1.0440]])

t = torch.randint(1,6,(3,3))

t

tensor([[4, 4, 3],
        [1, 1, 4],
        [3, 1, 4]])

t = torch.normal(mean=torch.arange(1., 11.), std=torch.arange(1, 0, -0.1))

t

tensor([1.6487, 1.0866, 1.2763, 4.6704, 4.8656, 6.0959, 7.1062, 8.0930, 9.0874,
        9.9542])

3.4. 특정 Tensor와 같은 size를 가지는 Tensor 생성

torch.zeros_like(input): input과 동일한 size의 torch.zeros() 생성
torch.ones_like(input): input과 동일한 size의 torch.ones() 생성
torch.full_like(input,fill_value): input과 동일한 size를 가지고 fill_value로 이루어진 torch.full() 생성
torch.empty_like(input): input과 동일한 size의 torch.empty() 생성

a = torch.tensor([[1,2,3],[4,5,6]],dtype=torch.float32)

a

tensor([[1., 2., 3.],
        [4., 5., 6.]])

t = torch.zeros_like(a)

t

tensor([[0., 0., 0.],
        [0., 0., 0.]])

t = torch.ones_like(a)

t

tensor([[1., 1., 1.],
        [1., 1., 1.]])

t = torch.full_like(a,3.14)

t

tensor([[3.1400, 3.1400, 3.1400],
        [3.1400, 3.1400, 3.1400]])

t = torch.empty_like(a)

t

tensor([[ 0.0000e+00, -8.5899e+09,  0.0000e+00],
        [-8.5899e+09,  1.2612e-44,  0.0000e+00]])

4. Tensor 연산

4.1. shape 변경

Tensor.view(*shape)

Tensor를 해당 shape으로 바꾼 새로운 Tensor 반환

t = torch.arange(6)
t2 = t.view(2,3)

print(t)
print(t2)

tensor([0, 1, 2, 3, 4, 5])
tensor([[0, 1, 2],
        [3, 4, 5]])

shape 변경시 -1을 쓸 때가 있다. -1은 더이상 계산하지 않아도 확실하게 shape을 알 수 있을 때 쓴다.

size가 12인 tensor가 존재한다고 가정해보자. shape을 (2,6), (3,4), (4,3), (1,12) 등으로 다양하게 구성할 수 있을 것이다. 이 때 (2,-1)이라고 shape을 표현해보자. size가 12이고 shape중 첫번째 자리가 2로 결정되어있기 때문에 나머지 2번째 자리는 6인것을 쉽게 알 수 있다. (-1,2)이라고 shape을 표현해보자. size가 12이고 두번째 자리가 2로 결정되어있기 때문에 첫번째 자리가 6인것을 쉽게 알 수가 있다.

이렇듯 -1은 더이상 계산하지 않아도 나머지 shape 자리를 알 수 있을 때 쓴다. 그 자리를 -1로 대체해서 쓴다고 보면 된다.

-1은 특정 shape 부분을 강조하거나 size가 클 때 자주 쓰인다.

t = torch.arange(9).view(3,-1)

t

tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])

torch.ravel(input)

input을 1차원 Tensor로 변환한 새로운 Tensor 반환
start_dim, end_dim 값이 들어온다면 start_dim ~ end_dim까지 flat하게 변함

torch.flatten(input, start_dim=0, end_dim=-1)

input을 1차원 Tensor로 변환한 새로운 Tensor 반환
start_dim, end_dim 값이 들어온다면 start_dim ~ end_dim까지 flat하게 변함

t = torch.randint(1,8,(2,2,2))
print(t)

ravel_t = torch.ravel(t)
print(ravel_t)

flat_t = torch.flatten(t)
print(flat_t)

flat_t = torch.flatten(t,start_dim=1)
print(flat_t)

tensor([[[3, 6],
         [4, 4]],

        [[6, 6],
         [4, 1]]])
tensor([3, 6, 4, 4, 6, 6, 4, 1])
tensor([3, 6, 4, 4, 6, 6, 4, 1])
tensor([[3, 6, 4, 4],
        [6, 6, 4, 1]])

4.2. indexing, slicing

torch.gather(input, dim, index)

dim을 기준으로 index에 지정된 값을 통해 input에서 원소들을 모은다.
index가 input과 차원이 동일해하고 shape보다 크면 안된다.
3차원 input 예시
- out[i][j][k] = input[index[i][j][k]][j][k] # if dim == 0
- out[i][j][k] = input[i]index[i][j][k]][k] # if dim == 1
- out[i][j][k] = input[i][j][index[i][j][k]] # if dim == 2

t = torch.tensor([[1, 2], [3, 4]])

t1 = torch.gather(t,0,torch.tensor([[0,1]]))
t2 = torch.gather(t,0,torch.tensor([[0,1],[1,1]]))
print(t1)
print(t2)

tensor([[1, 4]])
tensor([[1, 4],
        [3, 4]])

t = torch.tensor([[1, 2], [3, 4]])

t1 = torch.gather(t,1,torch.tensor([[0,0],[1,0]]))
print(t1)

tensor([[1, 1],
        [4, 3]])

Tensor.scatter_(dim, index, src)

src의 값을 index에 지정된 값을 이용해 Tensor에 기록한다.
method에 _이 붙은 것은 in-place method라고 생각하면 된다.
3차원 Tensor 예시
- self[index[i][j][k]][j][k] = src[i][j][k] # if dim == 0
- self[i]index[i][j][k]][k] = src[i][j][k] # if dim == 1
- self[i][j][index[i][j][k]] = src[i][j][k] # if dim == 2

t = torch.arange(10).view(2,-1).float()

t

tensor([[0., 1., 2., 3., 4.],
        [5., 6., 7., 8., 9.]])

index = torch.tensor([[0,1,0,1]])
print(torch.zeros(2,4).scatter_(0,index,t))

tensor([[0., 0., 2., 0.],
        [0., 1., 0., 3.]])

index = torch.tensor([[0,1,0,1]])
print(torch.zeros(2,4).scatter_(1,index,t))

tensor([[2., 3., 0., 0.],
        [0., 0., 0., 0.]])

index = torch.tensor([[0, 1, 2], [0, 1, 3]])
print(torch.zeros(2,4).scatter_(1,index,t))

tensor([[0., 1., 2., 0.],
        [5., 6., 0., 7.]])

torch.index_select(input, dim, index)

‘index’를 이용하여 input에서 dim 차원을 따라 indexing을 한 새로운 Tensor를 반환한다.

t = torch.arange(6).view(2,-1)
print(t)

t2 = torch.index_select(t,1,torch.tensor([0,2]))
print(t2)

tensor([[0, 1, 2],
        [3, 4, 5]])
tensor([[0, 2],
        [3, 5]])

4.3. data type 변경

Tensor.float(): float 형변환
Tensor.int(): integer 형변환

t = torch.arange(6)

t

tensor([0, 1, 2, 3, 4, 5])

t2 = t.float()

t2

tensor([0., 1., 2., 3., 4., 5.])

t3 = t2.int()

t3

tensor([0, 1, 2, 3, 4, 5], dtype=torch.int32)

4.4. sorting

torch.sort(input, dim=- 1)

주어진 dim을 따라 input을 정렬한다.

torch.argsort(input, dim=- 1)

주어진 dim을 따라 정렬된 input의 index를 반환한다.

t = torch.rand(6)
print(t,'\n')

t1 = torch.sort(t)
print(t1,'\n')

sorted_index = torch.argsort(t)
print(sorted_index)

tensor([0.7023, 0.7228, 0.1764, 0.1143, 0.6924, 0.7647])

torch.return_types.sort(
values=tensor([0.1143, 0.1764, 0.6924, 0.7023, 0.7228, 0.7647]),
indices=tensor([3, 2, 4, 0, 1, 5]))

tensor([3, 2, 4, 0, 1, 5])

4.5. mathematical function

NumPy ndarray와 마찬가지로 dim을 이용해 차원을 따라 연산 적용이 가능하고 element-wise이다.

+, -, *, /, ** 와 같은 사칙 연산 및 제곱 연산도 NumPy ndarray와 마찬가지로 element-wise이다.

그 외 다른 함수들은 PyTorch Documentation에서 찾을 수 있다.

예시는 따로 적지 않겠다.

torch.sum(): 합
torch.prod(): 곱
torch.mean(): 평균
torch.std(): 표준편차
torch.var(): 분산
torch.median(): 중앙값
torch.max(): 최댓값
torch.min(): 최솟값
torch.trunc(): 소수점이하 버림
torch.square(): 제곱
torch.sqrt(): 제곱근
torch.pow(): n제곱
torch.exp(): 자연상수
torch.log(): 자연로그

4.6. linear algebra

torch.linalg이 따로 구현되어있을만큼 PyTorch에서는 linear algebra 관련 다양한 함수들을 제공한다. 자세한 내용은 torch.linalg에서 찾아보면 된다.

아래는 간단히 쓸 수 있는 matrix 관련 함수들을 모아놓았다. 예시는 따로 적지 않겠다.

torch.mm(): 행렬곱
torch.dot(): 벡터 내적
torch.norm(): norm

nkw