Gradient Boosting

Gradient Boosting is another powerful boosting algorithm. The sklearn documentation for GradientBoostingClassifier covers its full set of hyper-parameters.
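The snippets below assume the data has already been split into training, validation, and test sets. As a minimal sketch of such a setup (the make_moons dataset and the split sizes are only placeholder assumptions so the code runs end to end):

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

# hypothetical toy dataset, used only so the snippets below are runnable
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

# hold out a test set, then carve a validation set out of the remaining data
X_train_full, X_test, y_train_full, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train_full, y_train_full, test_size=0.25, random_state=42)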

To train, we run

from sklearn.ensemble import GradientBoostingClassifier

gbrt = GradientBoostingClassifier(max_depth=2, n_estimators=3, learning_rate=1.0, random_state=42)
gbrt.fit(X_train, y_train)

and generate predictions as usual:

from sklearn.metrics import accuracy_score

y_pred = gbrt.predict(X_test)
accuracy_score(y_test, y_pred)
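To build intuition for what boosting does, it helps to see the idea on a toy regression problem: each new tree is fit to the residual errors of the ensemble built so far, and the final prediction is the sum of all trees. The sketch below only illustrates that idea (the data and tree depth are arbitrary assumptions); GradientBoostingClassifier applies the same principle internally with its own loss function.

from sklearn.tree import DecisionTreeRegressor
import numpy as np

# toy 1-D regression data (assumed only for illustration)
rng = np.random.RandomState(42)
X_toy = rng.rand(100, 1) - 0.5
y_toy = 3 * X_toy[:, 0] ** 2 + 0.05 * rng.randn(100)

# first tree fits the targets directly
tree1 = DecisionTreeRegressor(max_depth=2, random_state=42)
tree1.fit(X_toy, y_toy)

# second tree fits the residuals of the first
y2 = y_toy - tree1.predict(X_toy)
tree2 = DecisionTreeRegressor(max_depth=2, random_state=42)
tree2.fit(X_toy, y2)

# third tree fits the residuals of the first two
y3 = y2 - tree2.predict(X_toy)
tree3 = DecisionTreeRegressor(max_depth=2, random_state=42)
tree3.fit(X_toy, y3)

# the ensemble prediction is the sum of all trees' predictions
X_new = np.array([[0.05]])
sum(tree.predict(X_new) for tree in (tree1, tree2, tree3))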

Gradient Boosting with Early Stopping

Early stopping is one of the key methods we use to prevent overfitting. Below is example code showing how to perform Gradient Boosting with early stopping.

SEED = 42

gbrt = GradientBoostingClassifier(max_depth=2, subsample=0.5, tol=0.01, n_estimators=50, random_state=SEED)
gbrt.fit(X_train, y_train)
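One thing to be aware of: in scikit-learn, tol only takes effect when n_iter_no_change is also set; with the default n_iter_no_change=None, the model simply fits all n_estimators trees. A minimal sketch of the built-in early stopping (the specific values below are assumptions, not recommendations):

gbrt_es = GradientBoostingClassifier(
    max_depth=2,
    n_estimators=500,           # upper bound; training may stop earlier
    validation_fraction=0.1,    # fraction of the training data held out internally
    n_iter_no_change=5,         # stop if the held-out score stalls for 5 iterations
    tol=0.01,
    random_state=42,
)
gbrt_es.fit(X_train, y_train)
gbrt_es.n_estimators_           # number of trees actually fitted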

Find the optimal n_estimators, i.e. the number of trees with the lowest error on the validation set:

import numpy as np
from sklearn.metrics import log_loss

# validation log-loss after each successive tree
errors = [log_loss(y_val, y_pred)
          for y_pred in gbrt.staged_predict_proba(X_val)]
bst_n_estimators = np.argmin(errors) + 1

gbrt_best = GradientBoostingClassifier(max_depth=2, n_estimators=bst_n_estimators, random_state=42)
gbrt_best.fit(X_train, y_train)
min_error = np.min(errors)
min_error

Since the number of estimators explored here is small, early stopping does not really get exercised. However, you can try a larger value for n_estimators to see its impact. The plot below shows the validation error as a function of the number of trees, with the minimum marked.

import matplotlib.pyplot as plt

plt.figure(figsize=(11, 4))

plt.subplot(121)
plt.plot(errors, "b.-")
# mark the optimal number of trees and the minimum validation error
plt.plot([bst_n_estimators, bst_n_estimators], [0, min_error], "k--")
plt.plot([0, 10], [min_error, min_error], "k--")
plt.plot(bst_n_estimators, min_error, "ko")
plt.axis([0, 10, 0, 3])
plt.xlabel("Number of trees")
plt.title("Validation error", fontsize=14)
plt.show()
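If you want to push n_estimators much higher without refitting a model from scratch for every candidate value, another option is incremental training with warm_start=True, stopping as soon as the validation error has not improved for several iterations in a row. A rough sketch (the cap of 200 trees and the patience of 5 iterations are arbitrary assumptions):

gbrt_inc = GradientBoostingClassifier(max_depth=2, warm_start=True, random_state=42)

min_val_error = float("inf")
error_going_up = 0
for n_estimators in range(1, 200):
    gbrt_inc.n_estimators = n_estimators
    gbrt_inc.fit(X_train, y_train)   # warm_start keeps the trees already grown
    val_error = log_loss(y_val, gbrt_inc.predict_proba(X_val))
    if val_error < min_val_error:
        min_val_error = val_error
        error_going_up = 0
    else:
        error_going_up += 1
        if error_going_up == 5:
            break                    # early stopping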

Question

Tune the hyper-parameters of Gradient Boosting. Try your best to optimize the model's performance.
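As a possible starting point (the parameter grid below is only a suggestion, not a prescribed answer), a cross-validated grid search over a few key hyper-parameters might look like this:

from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [2, 3, 4],
    "learning_rate": [0.05, 0.1, 0.5, 1.0],
    "subsample": [0.5, 0.8, 1.0],
}
grid = GridSearchCV(GradientBoostingClassifier(random_state=42),
                    param_grid, cv=3, scoring="accuracy", n_jobs=-1)
grid.fit(X_train, y_train)
grid.best_params_, grid.best_score_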