I understand what gradient descent does. Basically it tries to move towards a local optimum by slowly moving down the curve. I am trying to understand: what is the actual difference between plain gradient descent and Newton's method?
From Wikipedia, I read this short line "Newton's method uses curvature information to take a more direct route." What does this intuitively mean?
If you simply compare gradient descent and Newton's method, the purposes of the two methods are different.
Gradient descent is used to find (approximate) a local maximum or minimum (the x that minimizes or maximizes f(x)), while Newton's method is used to find (approximate) the root of a function, i.e. the x that makes f(x) = 0.
In this sense, they are used to solve different problems. However, Newton's method can also be used in the context of optimization (the realm that GD is solving), because finding a maximum or minimum can be approached by solving f'(x) = 0, which is exactly what Newton's method is used for.
In conclusion, two approaches can be used in optimization: 1) GD, and 2) finding the x such that f'(x) = 0 — and Newton's method is just one way to solve that second problem.
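To make the root-finding role concrete, here is a minimal sketch of Newton's method in its original job (my own toy example, not from the answer: computing sqrt(2) as the root of f(x) = x**2 - 2; the starting point and step count are arbitrary):

```python
# Sketch of Newton's root-finding iteration (hypothetical example:
# the root of f(x) = x**2 - 2 is sqrt(2)).

def f(x):
    return x**2 - 2

def f_prime(x):  # derivative used to linearize f at the current point
    return 2 * x

def newton_root(x, steps=10):
    for _ in range(steps):
        x -= f(x) / f_prime(x)  # jump to where the tangent line hits zero
    return x

print(newton_root(1.0))  # ≈ 1.41421356..., i.e. sqrt(2)
```

For optimization (approach 2 above), you would run the same iteration on f' instead of f, with f'' in the denominator.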
Put simply: with gradient descent you take a small step towards where you think the zero is and then recalculate; with Newton's method, you go all the way there.
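The "small step" versus "all the way there" contrast can be sketched numerically (a hypothetical example of my own; the function, learning rate, and iteration counts are arbitrary choices, not from the answers):

```python
# Toy comparison of the two update rules on f(x) = x**4 - 3*x**2 + 2,
# which has a local minimum at x = sqrt(3/2) ≈ 1.2247.

def f_prime(x):         # f'(x) = 4x^3 - 6x
    return 4 * x**3 - 6 * x

def f_double_prime(x):  # f''(x) = 12x^2 - 6
    return 12 * x**2 - 6

def gradient_descent(x, lr=0.01, steps=2000):
    # Take a small step downhill, then recalculate.
    for _ in range(steps):
        x -= lr * f_prime(x)
    return x

def newton(x, steps=20):
    # Go "all the way" to the root of the local linear model of f'.
    for _ in range(steps):
        x -= f_prime(x) / f_double_prime(x)
    return x

# Both converge to the same minimum, but Newton typically
# needs far fewer iterations.
print(gradient_descent(2.0))
print(newton(2.0))
```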
Edit 2017: the original link is dead - but the Wayback Machine still has it :) https://web.archive.org/web/20151122203025/http://www.cs.colostate.edu/~anderson/cs545/Lectures/week6day2/week6day2.pdf
In this PowerPoint the main ideas are explained simply: http://www.cs.colostate.edu/~anderson/cs545/Lectures/week6day2/week6day2.pdf
I hope this helps :)
At a local minimum (or maximum) x, the derivative of the target function f vanishes: f'(x) = 0 (assuming sufficient smoothness of f).

Gradient descent tries to find such a minimum x by using information from the first derivative of f: it simply follows the steepest descent from the current point. This is like rolling a ball down the graph of f until it comes to rest (while neglecting inertia).

Newton's method tries to find a point x satisfying f'(x) = 0 by approximating f' with a linear function g and then solving for the root of that function explicitly (this is called Newton's root-finding method). The root of g is not necessarily the root of f', but it is under many circumstances a good guess (the Wikipedia article on Newton's method for root finding has more information on convergence criteria). While approximating f', Newton's method makes use of f'' (the curvature of f). This means it has higher requirements on the smoothness of f, but it also means that (by using more information) it often converges faster.
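The linearization step can be sketched concretely (my own hypothetical quadratic example; the function and starting point are illustrative, not from the answer). For a quadratic f, f' is already linear, so g coincides with f' and a single Newton step lands exactly on the minimum:

```python
# Sketch: the Newton step is exactly the root of the linearization g of f'.

def f_prime(x):         # f'(x) for the toy function f(x) = 3*x**2 - 12*x + 1
    return 6 * x - 12

def f_double_prime(x):  # f''(x), the curvature used to build g
    return 6.0

x0 = 10.0  # arbitrary starting point

def g(x):  # linear approximation of f' around x0
    return f_prime(x0) + f_double_prime(x0) * (x - x0)

x1 = x0 - f_prime(x0) / f_double_prime(x0)  # Newton step = root of g

print(x1)           # 2.0, the minimizer of f
print(g(x1))        # 0.0: g vanishes at the Newton step by construction
print(f_prime(x1))  # 0.0: here g == f', so one step already solves f'(x) = 0
```

For a non-quadratic f the root of g would only approximate the root of f', which is why the iteration is repeated.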