Model
Slice tree model
When booster is set to gbtree or dart, XGBoost builds a tree model, which is a list of trees and can be sliced into multiple sub-models.
import xgboost as xgb
from sklearn.datasets import make_classification

num_classes = 3
X, y = make_classification(n_samples=1000, n_informative=5,
                           n_classes=num_classes)
dtrain = xgb.DMatrix(data=X, label=y)
num_parallel_tree = 4
num_boost_round = 16
# The total number of built trees is num_parallel_tree * num_classes * num_boost_round.

# We build a boosted random forest for classification here.
# The multi-class objective is stated explicitly so the intent is unambiguous.
booster = xgb.train(
    {'num_parallel_tree': num_parallel_tree, 'subsample': 0.5,
     'num_class': num_classes, 'objective': 'multi:softprob'},
    num_boost_round=num_boost_round, dtrain=dtrain)

# This is the sliced model, containing the forests of rounds [3, 7).
# A step is also supported, with some limitations: a negative step is invalid.
sliced: xgb.Booster = booster[3:7]

# Iterate over the booster to access each individual tree layer (one per boosting round).
trees = [layer for layer in booster]
assert len(trees) == num_boost_round
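The sliced model is a copy of the selected trees, which means the original model is left unchanged by slicing. This property is the basis of the save_best option in the early stopping callback. For a worked example of how to combine prediction with sliced trees, see the demo for prediction using individual trees and model slices.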
Note
The returned model slice does not contain attributes such as best_iteration and best_score.