# Introduction
XGBoost (Extreme Gradient Boosting) is a powerful implementation of gradient-boosted decision trees: an ensemble method that combines many weak learners into a single strong predictive model. These ensembles are highly popular due to their accuracy, efficiency, and strong performance on structured (tabular) data. While the widely used machine learning library scikit-learn does not provide a native implementation of XGBoost, there is a separate library, fittingly called XGBoost, that offers an API compatible with scikit-learn.
All you need to do is import it as follows:
from xgboost import XGBClassifier
Below, we outline 7 Python tricks that can help you make the most of this standalone implementation of XGBoost, particularly when aiming to build more accurate predictive models.
To illustrate these tricks, we will use the Breast Cancer dataset freely available in scikit-learn and define a baseline model with largely default settings. Be sure to run this code first before experimenting with the seven tricks that follow:
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier
# Data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Baseline model
model = XGBClassifier(eval_metric="logloss", random_state=42)
model.fit(X_train, y_train)
print("Baseline accuracy:", accuracy_score(y_test, model.predict(X_test)))
# 1. Tuning Learning Rate And Number Of Estimators
While not a universal rule, explicitly reducing the learning rate while increasing the number of estimators (trees) in an XGBoost ensemble often improves accuracy. The smaller learning rate allows the model to learn more gradually, while additional trees compensate for the reduced step size.
Here is an example. Try it yourself and compare the resulting accuracy to the initial baseline:
model = XGBClassifier(
    learning_rate=0.01,
    n_estimators=5000,
    eval_metric="logloss",
    random_state=42
)
model.fit(X_train, y_train)
print("Model accuracy:", accuracy_score(y_test, model.predict(X_test)))
For clarity, the final print() statement will be omitted in the remaining examples. Simply append it to any of the snippets below when testing them yourself.
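If you would rather not retype that line each time, one option is to wrap it in a small helper. The evaluate() function below is a hypothetical convenience for this article's examples, not part of XGBoost or scikit-learn:
def evaluate(model):
    # Print test-set accuracy for a fitted classifier (uses X_test and y_test defined above)
    print("Model accuracy:", accuracy_score(y_test, model.predict(X_test)))
You can then call evaluate(model) after any of the fit() calls that follow.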
# 2. Adjusting The Maximum Depth Of Trees
The max_depth argument is a crucial hyperparameter inherited from classic decision trees. It limits how deep each tree in the ensemble can grow. Restricting tree depth may seem simplistic, but surprisingly, shallow trees often generalize better than deeper ones.
This example constrains the trees to a maximum depth of 2:
model = XGBClassifier(
    max_depth=2,
    eval_metric="logloss",
    random_state=42
)
model.fit(X_train, y_train)
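To see the effect for yourself, a quick sweep over a few depth values makes the comparison explicit. This is a minimal sketch that reuses the train/test split and imports from the baseline code:
# Compare test accuracy across a few maximum depths
for depth in [2, 3, 4, 6, 8]:
    m = XGBClassifier(max_depth=depth, eval_metric="logloss", random_state=42)
    m.fit(X_train, y_train)
    print(f"max_depth={depth}: accuracy={accuracy_score(y_test, m.predict(X_test)):.4f}")
On many tabular datasets, accuracy plateaus or even drops as depth grows, which is the overfitting effect described above.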
# 3. Reducing Overfitting By Subsampling
The subsample argument randomly samples a proportion of the training rows (for example, 80%) before growing each tree in the ensemble, and the related colsample_bytree argument does the same for the feature columns used by each tree. Both are simple yet effective regularization techniques that help prevent overfitting.
If not specified, these hyperparameters default to 1.0, meaning 100% of the rows and columns are used:
model = XGBClassifier(
    subsample=0.8,
    colsample_bytree=0.8,
    eval_metric="logloss",
    random_state=42
)
model.fit(X_train, y_train)
Keep in mind that this approach is most effective for reasonably sized datasets. If the dataset is already small, aggressive subsampling may lead to underfitting.
# 4. Adding Regularization Terms
To further control overfitting, complex trees can be penalized using traditional regularization strategies such as L1 (Lasso) and L2 (Ridge). In XGBoost, these are controlled by the reg_alpha and reg_lambda parameters, respectively (by default, reg_alpha is 0 and reg_lambda is 1, so some L2 regularization is already applied).
model = XGBClassifier(
    reg_alpha=0.2,   # L1
    reg_lambda=0.5,  # L2
    eval_metric="logloss",
    random_state=42
)
model.fit(X_train, y_train)
# 5. Using Early Stopping
Early stopping is an efficiency-oriented mechanism that halts training when performance on a validation set stops improving over a specified number of rounds.
Depending on your coding environment and the version of the XGBoost library you are using, you may need to upgrade to a more recent version to use the implementation shown below. Also, ensure that early_stopping_rounds is specified during model initialization rather than passed to the fit() method.
model = XGBClassifier(
    n_estimators=1000,
    learning_rate=0.05,
    eval_metric="logloss",
    early_stopping_rounds=20,
    random_state=42
)
model.fit(
    X_train, y_train,
    eval_set=[(X_test, y_test)],
    verbose=False
)
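After training stops, it is often worth checking how many boosting rounds were actually used. In recent versions of the library, the fitted estimator exposes the best iteration and the corresponding validation score (attribute availability can vary across versions):
# Inspect where early stopping kicked in
print("Best iteration:", model.best_iteration)
print("Best validation score:", model.best_score)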
If you need to upgrade the library, run:
!pip uninstall -y xgboost
!pip install xgboost --upgrade
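After reinstalling, you can confirm which version you are running by printing the library's version string:
import xgboost
print(xgboost.__version__)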
# 6. Performing Hyperparameter Search
For a more systematic approach, hyperparameter search can help identify combinations of settings that maximize model performance. Below is an example using grid search to explore combinations of three key hyperparameters introduced earlier:
param_grid = {
    "max_depth": [3, 4, 5],
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators": [200, 500]
}

grid = GridSearchCV(
    XGBClassifier(eval_metric="logloss", random_state=42),
    param_grid,
    cv=3,
    scoring="accuracy"
)
grid.fit(X_train, y_train)
print("Best params:", grid.best_params_)

best_model = XGBClassifier(
    **grid.best_params_,
    eval_metric="logloss",
    random_state=42
)
best_model.fit(X_train, y_train)
print("Tuned accuracy:", accuracy_score(y_test, best_model.predict(X_test)))
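Note that GridSearchCV refits the best configuration on the full training set by default (refit=True), so the explicit retraining step above is optional and grid.best_estimator_ can be evaluated directly:
# grid.best_estimator_ is already refit on X_train with the best parameters
print("Tuned accuracy (best_estimator_):",
      accuracy_score(y_test, grid.best_estimator_.predict(X_test)))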
# 7. Adjusting For Class Imbalance
This final trick is particularly useful when working with strongly class-imbalanced datasets (the Breast Cancer dataset is relatively balanced, so do not be concerned if you observe minimal changes). The scale_pos_weight parameter is especially helpful when class proportions are highly skewed, such as 90/10, 95/5, or 99/1.
Here is how to compute and apply it based on the training data:
ratio = np.sum(y_train == 0) / np.sum(y_train == 1)
model = XGBClassifier(
    scale_pos_weight=ratio,
    eval_metric="logloss",
    random_state=42
)
model.fit(X_train, y_train)
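Because overall accuracy can hide poor minority-class recall, a per-class breakdown is a more informative check on imbalanced data. Here is a minimal sketch using scikit-learn's classification_report, which is not part of the original snippet:
from sklearn.metrics import classification_report

# Per-class precision, recall, and F1 on the held-out test set
print(classification_report(y_test, model.predict(X_test)))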
# Wrapping Up
In this article, we explored seven practical tricks to enhance XGBoost ensemble models using its dedicated Python library. Thoughtful tuning of learning rates, tree depth, sampling strategies, regularization, and class weighting — combined with systematic hyperparameter search — often makes the difference between a decent model and a highly accurate one.
Iván Palomares Carrascosa is a leader, writer, speaker, and adviser in AI, machine learning, deep learning & LLMs. He trains and guides others in harnessing AI in the real world.

