DonHurry

step14. Using the Same Variable Repeatedly

_๋„๋… 2023. 1. 15. 00:01

📢 This post is based on Deep Learning from Scratch 3 (밑바닥부터 시작하는 딥러닝3). I write these posts to record what I have learned and for my own study. For the full details, I strongly recommend buying the book.

 

 

The current DeZero does not behave as intended when the same variable is used more than once. Consider the example below.

 

y์˜ ๊ฐ’์€ ์ •์ƒ์ ์œผ๋กœ ๊ณ„์‚ฐ๋˜์ง€๋งŒ, ์—ญ์ „ํŒŒ์—์„œ ์ž˜๋ชป๋œ ๊ฒฐ๊ณผ๊ฐ€ ๋‚˜์™”์Šต๋‹ˆ๋‹ค. ์ œ๋Œ€๋กœ ๊ณ„์‚ฐํ•˜๋ฉด $y=x+x$์ผ ๋•Œ $y=2x$๊ฐ€ ๋˜๋ฏ€๋กœ ๋ฏธ๋ถ„๊ฐ’์€ 2๊ฐ€ ๋˜์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

x = Variable(np.array(3.0))
y = add(x, x)
print(y.data)  # 6.0

y.backward()
print(x.grad)  # 1.0
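
As a quick sanity check on the expected value of 2, the derivative can be approximated numerically. The sketch below uses plain Python floats rather than DeZero's Variable, and `numerical_diff` and `f` are helper names assumed here for illustration only.

```python
def numerical_diff(f, x, eps=1e-4):
    # Central difference approximation of df/dx at x (illustrative helper).
    return (f(x + eps) - f(x - eps)) / (2 * eps)


def f(x):
    # Same computation as add(x, x), written with plain floats.
    return x + x


approx = numerical_diff(f, 3.0)
print(approx)  # approximately 2.0, up to floating-point error
```

The approximation lands close to 2, which is the value x.grad should hold after backpropagation.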

 

The cause of the problem lies in the Variable class. In the previous code, when the same variable is used repeatedly, each gradient propagated to it overwrites the one stored before. The fix is to add the newly propagated gradient to the existing one whenever the variable already has a gradient.

# Before the change
def backward(self):
        ...
            for x, gx in zip(f.inputs, gxs):
                x.grad = gx
            
                if x.creator is not None:
                    funcs.append(x.creator)

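# After the change
def backward(self):
        ...
            for x, gx in zip(f.inputs, gxs):
                if x.grad is None:
                    # First gradient to reach x: store it as-is
                    x.grad = gx
                else:
                    # x already has a gradient: accumulate instead of overwriting
                    x.grad = x.grad + gx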
            
                if x.creator is not None:
                    funcs.append(x.creator)
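
One way to see why adding is the right fix, as a sketch: label the two inputs of add as $x_0$ and $x_1$ (names introduced here only for illustration), both equal to $x$. The chain rule then sums the contribution from each path through which $x$ reaches $y$:

$$
\frac{dy}{dx} = \frac{\partial y}{\partial x_0}\frac{dx_0}{dx} + \frac{\partial y}{\partial x_1}\frac{dx_1}{dx} = 1 \cdot 1 + 1 \cdot 1 = 2
$$

Overwriting keeps only one of the two terms, which is why the earlier code reported 1.0.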

 

Another problem remains: using the same variable for a different computation. When x is reused, the new gradient is added to the gradient left over from the first computation, so the wrong value is stored. The result should be 3.0, but we get 5.0 (the leftover 2.0 plus the new 3.0).

x = Variable(np.array(3.0))
y = add(x, x)
y.backward()
print(y.data)  # 6.0
print(x.grad)  # 2.0


y = add(add(x, x), x)
y.backward()
print(y.data)  # 9.0
print(x.grad)  # 5.0

 

To solve this problem, we add a method to the Variable class that resets the gradient.

class Variable:
    ...
    
    def cleargrad(self):
        self.grad = None

 

When reusing the same variable, call the cleargrad method to reset the gradient first, as shown below. Running the test confirms that the output is now correct.

x = Variable(np.array(3.0))
y = add(x, x)
y.backward()
print(y.data)  # 6.0
print(x.grad)  # 2.0

x.cleargrad()  # or x = Variable(np.array(3.0))
y = add(add(x, x), x)
y.backward()
print(y.data)  # 9.0
print(x.grad)  # 3.0
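
To try the behavior end to end without the rest of DeZero, here is a minimal self-contained sketch. It is a simplification written for this post, not the book's actual implementation: the `Add` class below handles only two-input addition, keeps a single output, and the names `grad_first`, `grad_stale`, and `grad_clean` are assumptions made for the demonstration.

```python
import numpy as np


class Variable:
    def __init__(self, data):
        self.data = data
        self.grad = None
        self.creator = None

    def cleargrad(self):
        # Reset the gradient so the variable can be reused in a new computation.
        self.grad = None

    def backward(self):
        if self.grad is None:
            self.grad = np.ones_like(self.data)
        funcs = [self.creator] if self.creator is not None else []
        while funcs:
            f = funcs.pop()
            gxs = f.backward(f.output.grad)
            for x, gx in zip(f.inputs, gxs):
                if x.grad is None:
                    x.grad = gx
                else:
                    # Accumulate gradients arriving from multiple uses of x.
                    x.grad = x.grad + gx
                if x.creator is not None:
                    funcs.append(x.creator)


class Add:
    # Simplified two-input addition (illustrative, not DeZero's Function class).
    def __call__(self, x0, x1):
        self.inputs = (x0, x1)
        self.output = Variable(np.asarray(x0.data + x1.data))
        self.output.creator = self
        return self.output

    def backward(self, gy):
        # d(x0 + x1)/dx0 = d(x0 + x1)/dx1 = 1, so gy passes through unchanged.
        return gy, gy


def add(x0, x1):
    return Add()(x0, x1)


x = Variable(np.array(3.0))
add(x, x).backward()
grad_first = x.grad  # gradient of x + x

add(add(x, x), x).backward()  # x reused without clearing
grad_stale = x.grad  # leftover gradient plus the new one

x.cleargrad()
add(add(x, x), x).backward()
grad_clean = x.grad  # gradient of x + x + x alone

print(grad_first, grad_stale, grad_clean)  # 2.0 5.0 3.0
```

Keeping accumulation as the default and making the reset explicit means the caller decides when a gradient is stale, at the cost of having to remember to call cleargrad before each new computation.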