What is Gradient Boosting?
Gradient boosting is an ensemble method that combines many weak learners (typically shallow decision trees) into a single strong learner. Rather than minimizing "the loss of the preceding model," each new weak learner is fit to the negative gradient of the loss function, such as mean squared error or cross-entropy, evaluated at the current ensemble's predictions. In each iteration the algorithm computes this gradient (often called the pseudo-residuals), trains a new weak model to approximate it, and adds the new model's predictions, usually scaled by a learning rate, to the ensemble. The procedure repeats until a stopping criterion is satisfied, such as a fixed number of rounds or no further improvement on a validation set. Gradient boosting can predict both continuous and categorical target variables (as a regressor or a classifier). When used as a regressor, the cost function is typically Mean Squared Error (MSE); when used as a classifier, it is typically log loss.
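The loop described above can be sketched from scratch for the squared-error case, where the negative gradient is simply the residual `y - prediction`. This is a minimal illustration, not a production implementation; the function names, the depth-2 trees, and the learning rate of 0.1 are all choices made for the example, and scikit-learn's `DecisionTreeRegressor` is used as the weak learner.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_fit(X, y, n_rounds=50, learning_rate=0.1):
    """Fit a gradient-boosted ensemble for squared-error loss."""
    # Initial model: the constant that minimizes MSE is the mean of y.
    f0 = float(y.mean())
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_rounds):
        # Negative gradient of 0.5*(y - pred)^2 w.r.t. pred = the residuals.
        residuals = y - pred
        # Fit a weak learner (shallow tree) to the pseudo-residuals.
        tree = DecisionTreeRegressor(max_depth=2)
        tree.fit(X, residuals)
        # Update the ensemble, damped by the learning rate.
        pred += learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, trees

def gradient_boost_predict(X, f0, trees, learning_rate=0.1):
    """Sum the initial constant and each tree's scaled contribution."""
    pred = np.full(X.shape[0], f0)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred
```

For practical use, scikit-learn's `GradientBoostingRegressor` and `GradientBoostingClassifier` implement the same idea with additional features such as subsampling and early stopping.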