
step45. A Layer That Collects Layers

_๋„๋… 2023. 2. 24. 00:36

📢 This post is based on the book Deep Learning from Scratch 3. It is written to record what I have learned and for my own study; for the full details, I strongly recommend purchasing the book.

 

 

In the previous step, the Layer class let us stop handling parameters directly. In this step we also make managing Layer instances more convenient. The current Layer class can hold multiple Parameters; here we extend it so that a Layer can hold other Layers as well.

 

We improve the class in two places. First, when an instance variable is set, its name is added to _params not only for Parameter instances but also for Layer instances. The other change is in the code that extracts parameters: when the object corresponding to name is fetched from _params, if that object is a Layer instance, its parameters are retrieved recursively from the Layer inside the Layer.

class Layer:
    def __init__(self):
        self._params = set()
    
    def __setattr__(self, name, value):
        if isinstance(value, (Parameter, Layer)):  # now also registers Layer instances
            self._params.add(name)
        super().__setattr__(name, value)

    def params(self):
        for name in self._params:
            obj = self.__dict__[name]

            if isinstance(obj, Layer):  # recursively pull parameters out of the nested Layer
                yield from obj.params()
            else:
                yield obj
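
As a quick check of the recursive params() traversal, here is a minimal sketch (not code from the book; it assumes Layer and Parameter are importable from the dezero package built up in earlier steps): nesting one Layer inside another and iterating over params() also yields the parameters held by the inner Layer.

import numpy as np
from dezero import Layer, Parameter  # assumes the DeZero package from earlier steps

layer = Layer()
layer.p1 = Parameter(np.array(1.0), name='p1')

sub = Layer()
sub.p2 = Parameter(np.array(2.0), name='p2')
layer.sub = sub  # the nested Layer itself is registered in layer._params

for p in layer.params():
    print(p)  # iterates over p1 and p2, including the parameter inside the nested Layer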

 

๋ชจ๋ธ ํ˜น์€ model์ด๋ž€ ์‚ฌ๋ฌผ์˜ ๋ณธ์งˆ์„ ๋‹จ์ˆœํ•˜๊ฒŒ ํ‘œํ˜„ํ•œ ๊ฒƒ์ด๋ผ๋Š” ๋œป์„ ๊ฐ€์ง€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. ๋จธ์‹ ๋Ÿฌ๋‹์— ์‚ฌ์šฉ๋˜๋Š” ๋ชจ๋ธ ์—ญ์‹œ ๋ณต์žกํ•œ ํŒจํ„ด์ด๋‚˜ ๊ทœ์น™์ด ์ˆจ์–ด ์žˆ๋Š” ํ˜„์ƒ์„ ์ˆ˜์‹์„ ์‚ฌ์šฉํ•˜์—ฌ ๋‹จ์ˆœํ•˜๊ฒŒ ํ‘œํ˜„ํ•œ ๊ฒƒ์„ ๋งํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ ‡๋‹ค๋ฉด ๋ชจ๋ธ์„ ํ‘œํ˜„ํ•˜๊ธฐ ์œ„ํ•ด Model ํด๋ž˜์Šค๋ฅผ ๋งŒ๋“ค๊ฒ ์Šต๋‹ˆ๋‹ค. Layer ํด๋ž˜์Šค๋ฅผ ์ƒ์† ๋ฐ›์•„ ์‹œ๊ฐํ™” ๋ฉ”์„œ๋“œ๋ฅผ ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค. ์ด๋•Œ Model ํด๋ž˜์Šค๋Š” Layer๋ฅผ ์ƒ์† ๋ฐ›๊ธฐ ๋•Œ๋ฌธ์— Layer ํด๋ž˜์Šค์ฒ˜๋Ÿผ ํ™œ์šฉ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.

from dezero import Layer
from dezero import utils  # provides plot_dot_graph for drawing the computational graph


class Model(Layer):
    def plot(self, *inputs, to_file='model.png'):
        y = self.forward(*inputs)
        return utils.plot_dot_graph(y, verbose=True, to_file=to_file)

 

Now let's solve again the regression problem on the dataset generated from the sin function that we handled earlier. Compared with before, the code is more concise and we no longer need to manage the parameters ourselves, which accomplishes the main goal of this step.

import numpy as np
from dezero import Model
import dezero.layers as L
import dezero.functions as F


# dataset
np.random.seed(0)
x = np.random.rand(100, 1)
y = np.sin(2 * np.pi * x) + np.random.rand(100, 1)

# Hyperparameter setting
lr = 0.2
max_iters = 10000
hidden_size = 10

# Model definition
class TwoLayerNet(Model):
    def __init__(self, hidden_size, out_size):
        super().__init__()
        self.l1 = L.Linear(hidden_size)
        self.l2 = L.Linear(out_size)

    def forward(self, x):
        y = F.sigmoid(self.l1(x))
        y = self.l2(y)
        return y


model = TwoLayerNet(hidden_size, 1)

for i in range(max_iters):
    y_pred = model(x)
    loss = F.mean_squared_error(y, y_pred)

    model.cleargrads()
    loss.backward()

    for p in model.params():
        p.data -= lr * p.grad.data
    if i % 1000 == 0:
        print(loss)
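
Because TwoLayerNet inherits from Model, the plot method defined above can also visualize its computation graph. A minimal usage sketch (rendering the image assumes Graphviz is installed, which dezero.utils.plot_dot_graph relies on):

# Draw the model's computation graph to an image file.
# Requires the Graphviz 'dot' command to be available on the system.
model.plot(x, to_file='two_layer_net.png')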

 

In addition, let's implement a general fully connected neural network. The initializer takes fc_output_sizes and activation as arguments; here fc stands for fully connected. fc_output_sizes specifies, as a tuple or list, the output sizes of the fully connected layers that make up the network. A simple example is shown below.

class MLP(Model):
    def __init__(self, fc_output_sizes, activation=F.sigmoid):
        super().__init__()
        self.activation = activation
        self.layers = []
    
        for i, out_size in enumerate(fc_output_sizes):
            layer = L.Linear(out_size)
            setattr(self, 'l' + str(i), layer)
            self.layers.append(layer)
    
    def forward(self, x):
        for l in self.layers[:-1]:
            x = self.activation(l(x))
        return self.layers[-1](x)
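
As a quick sanity check (a sketch, not code from the book; it assumes the MLP class above has been added to dezero/models.py, as the book does), a forward pass through a freshly built MLP looks like this:

import numpy as np
from dezero.models import MLP  # assumes MLP was added to dezero/models.py

x = np.random.rand(5, 10)  # batch of 5 samples with 10 features each
model = MLP((100, 10))     # one hidden layer of 100 units, 10 outputs
y = model(x)
print(y.shape)             # (5, 10)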

 

๋‹ค์Œ๊ณผ ๊ฐ™์ด ์‹ ๊ฒฝ๋ง์˜ ๊ณ„์ธต์„ ๊ตฌ์„ฑํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด์ƒ์œผ๋กœ ์ด๋ฒˆ ๋‹จ๊ณ„๋ฅผ ๋งˆ์น˜๊ฒ ์Šต๋‹ˆ๋‹ค.

model = MLP((10, 1))              # 2-layer network (hidden size 10, output size 1)
model = MLP((10, 20, 30, 40, 1))  # 5-layer network