Implementing Gradient Descent Algorithm in Hadoop for large scale data

In this post I will be exploring how can we use MapReduce to implement Gradient Descent algorithm in Hadoop for large scale data. As we know Hadoop is capable of handling peta-byte scale/size of the data.

Before starting, first we need to understand what is Gradient Descent and where can we use it. Below is an excerpt from Wikipedia:

Gradient descent is a first-order iterative optimization algorithm. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.… [Keep reading] “Implementing Gradient Descent Algorithm in Hadoop for large scale data”