Gradient Descent / Newton Method

I'm Lim

imlim 2022. 12. 28. 20:52

Introduction

Deep learning uses Gradient Descent based methods as its optimizers. However, the Quasi-Newton Method exists as an alternative. This post looks at why Gradient Descent is used rather than the Quasi-Newton Method.

Gradient Descent

The basic update rule of Gradient Descent is as follows.

$\large {\theta = \theta - \eta \nabla_{\theta} J(\theta)}$

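The goal of Gradient Descent is to find a local minimum. As the update rule shows, once the gradient $\nabla_{\theta} J(\theta)$ becomes 0, $\theta$ no longer changes, which means a stationary point has been reached. In addition, because $\theta$ is updated in the direction opposite to the gradient, the method moves toward a local minimum rather than a local maximum.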
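The update rule can be sketched in a few lines of Python. This is only a minimal illustration: the objective $J(\theta) = (\theta - 3)^2$, the learning rate, and the function name `gradient_descent` are assumptions made for the example, not something from the original post.

```python
def gradient_descent(grad, theta0, lr=0.1, steps=100, tol=1e-10):
    """Repeat theta <- theta - lr * grad(theta) until the gradient is ~0."""
    theta = theta0
    for _ in range(steps):
        g = grad(theta)
        if abs(g) < tol:  # gradient is (numerically) zero: stationary point
            break
        theta = theta - lr * g  # step in the direction opposite to the gradient
    return theta


# Hypothetical objective J(theta) = (theta - 3)^2, so grad J = 2 * (theta - 3).
theta_min = gradient_descent(lambda t: 2.0 * (t - 3.0), theta0=0.0)
print(theta_min)  # close to 3.0, the minimizer of J
```

Each step shrinks the distance to the minimizer by a constant factor here, so the iterate approaches 3 gradually rather than jumping there at once.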

Newton Method

The Newton Method finds the point where the slope (first derivative) of a function becomes 0. The update rule is as follows.

$\large {x_{n+1} = x_n - \dfrac {f'(x_n)}{f''(x_n)}}$
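One way to see where this step comes from, assuming $f$ is twice differentiable and $f''(x_n) \neq 0$, is to approximate $f$ near $x_n$ by its second-order Taylor expansion:

$\large {f(x) \approx f(x_n) + f'(x_n)(x - x_n) + \dfrac{1}{2} f''(x_n)(x - x_n)^2}$

Setting the derivative of this quadratic with respect to $x$ to zero gives:

$\large {f'(x_n) + f''(x_n)(x - x_n) = 0 \;\Rightarrow\; x = x_n - \dfrac{f'(x_n)}{f''(x_n)}}$

Taking this $x$ as the next iterate $x_{n+1}$ gives the Newton step.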

Extending this to several variables (vector and matrix form) gives the following.

$\large {X_{n+1} = X_n - \left[ \nabla^2 f(X_n) \right]^{-1} \nabla f(X_n)}$
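As a rough sketch of the multivariate step, the snippet below applies one Newton update to a two-variable function, solving the 2x2 system in closed form. The quadratic $f(x, y) = x^2 + xy + 2y^2 - 4x - 9y$ and the helper name `newton_step_2d` are hypothetical choices for illustration.

```python
def newton_step_2d(x, grad, hess):
    """One Newton step x - H^{-1} g for a function of two variables."""
    g1, g2 = grad(x)
    (a, b), (c, d) = hess(x)
    det = a * d - b * c
    if det == 0:
        raise ZeroDivisionError("Hessian is singular")
    # delta = H^{-1} g, using the closed-form inverse of a 2x2 matrix.
    d1 = (d * g1 - b * g2) / det
    d2 = (-c * g1 + a * g2) / det
    return (x[0] - d1, x[1] - d2)


# Hypothetical quadratic f(x, y) = x^2 + x*y + 2*y^2 - 4*x - 9*y.
grad = lambda p: (2 * p[0] + p[1] - 4, p[0] + 4 * p[1] - 9)
hess = lambda p: ((2.0, 1.0), (1.0, 4.0))

x1 = newton_step_2d((10.0, -7.0), grad, hess)
print(x1)  # the gradient vanishes at (1, 2), the minimizer of f
```

Because this example function is exactly quadratic, a single step lands on the minimizer from any starting point. That is the best case for the method and does not hold for general functions.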

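The big problem with this update is that it requires second derivatives: for $n$ parameters the matrix $\nabla^2 f$ has $n^2$ entries, which greatly increases the computational cost. The Quasi-Newton Method is the idea of approximating these second derivatives instead of computing them exactly.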
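In one dimension, the idea of approximating the second derivative reduces to the secant method applied to $f'$: the slope between two successive first-derivative values stands in for $f''$. The sketch below is an illustration under that simplification. The objective $f(x) = x^4/4 - 2x$ and the name `secant_minimize` are assumptions for the example; practical quasi-Newton methods such as BFGS build a matrix approximation from successive gradients instead.

```python
def secant_minimize(df, x0, x1, steps=50, tol=1e-12):
    """1-D quasi-Newton sketch: replace f'' with a secant slope of f'."""
    for _ in range(steps):
        g0, g1 = df(x0), df(x1)
        if abs(g1) < tol or g1 == g0:  # converged, or slope estimate unusable
            break
        approx_f2 = (g1 - g0) / (x1 - x0)  # secant estimate of f''(x1)
        x0, x1 = x1, x1 - g1 / approx_f2   # Newton-like step with the estimate
    return x1


# Hypothetical objective f(x) = x^4 / 4 - 2x, so f'(x) = x^3 - 2.
x_star = secant_minimize(lambda x: x**3 - 2.0, x0=1.0, x1=2.0)
print(x_star)  # close to 2 ** (1/3), where f'(x) = 0
```

Only first derivatives are evaluated here, which is the point of the approximation: no second derivative is ever computed.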

Quasi-Newton Method

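So, now that the second-derivative problem is solved, why not just use the Quasi-Newton Method? That question naturally comes up. However, the Quasi-Newton Method is reported to be unstable unless the loss function is a perfect quadratic [ 1 ]. This seems to be why deep learning, where most loss functions are not quadratic, adopted Gradient Descent as its optimizer.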
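A toy example can show how curvature-based steps misbehave on a non-quadratic function. Note the assumptions: this uses plain Newton steps rather than a quasi-Newton approximation, and the function $f(x) = \sqrt{1 + x^2}$, the starting point 1.5, and the step size 1.0 are hypothetical choices, not taken from the post or from [ 1 ].

```python
import math


def df(x):
    """First derivative of f(x) = sqrt(1 + x^2)."""
    return x / math.sqrt(1.0 + x * x)


def d2f(x):
    """Second derivative of f(x) = sqrt(1 + x^2)."""
    return (1.0 + x * x) ** -1.5


x_newton = x_gd = 1.5  # same starting point for both methods
for _ in range(5):
    # Newton step; for this f it simplifies algebraically to -x^3.
    x_newton = x_newton - df(x_newton) / d2f(x_newton)
    # Gradient descent step with step size 1.0.
    x_gd = x_gd - 1.0 * df(x_gd)

print(abs(x_newton), abs(x_gd))
```

For this function the Newton step moves away from the minimum at 0 whenever the starting point satisfies $|x| > 1$, while the gradient step keeps shrinking $|x|$. It is a single hand-picked case, so it illustrates the concern rather than proving it for deep learning losses.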

Reference

[ 1 ] https://stats.stackexchange.com/questions/253632/why-is-newtons-method-not-widely-used-in-machine-learning
