make_reduction

make_reduction(estimator, strategy='recursive', window_length=10, scitype='infer', transformers=None, pooling='local', windows_identical=True)

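Make a forecaster based on reduction to tabular or time series regression.

During fitting, a sliding-window approach first transforms the time series into tabular or panel data, which is then used to fit a tabular or time series regression estimator. During prediction, the last available data is passed as input to the fitted regression estimator to generate forecasts.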
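
The prediction step for the recursive strategy can be sketched in plain Python. This is a minimal illustration and not sktime's implementation: `recursive_forecast` and `last_difference` are hypothetical names, and `last_difference` merely stands in for a fitted one-step regressor.

```python
def recursive_forecast(y, window_length, predict_one, fh):
    """Forecast the steps listed in `fh` (1-based) by iterating one-step predictions."""
    history = list(y)
    forecasts = {}
    for step in range(1, max(fh) + 1):
        window = history[-window_length:]  # last available values
        y_hat = predict_one(window)        # one-step-ahead prediction
        history.append(y_hat)              # feed the prediction back as input
        if step in fh:
            forecasts[step] = y_hat
    return forecasts


# Hypothetical stand-in for a fitted regressor: repeat the last observed difference.
def last_difference(window):
    return window[-1] + (window[-1] - window[-2])


y = [1.0, 2.0, 3.0, 4.0, 5.0]
result = recursive_forecast(y, window_length=3, predict_one=last_difference, fh=[2, 4])
print(result)  # {2: 7.0, 4: 9.0}
```

Only the requested horizons are returned, but every intermediate step is still predicted, because each later window depends on the earlier predictions.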

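The diagrams below illustrate the make_reduction logic, using the following symbols:

y = forecast target.
x = past values of y that are used as features (X) to forecast y.
* = observations, past or future, that are part of neither the window nor the forecast.

Assume we have the following training data (14 observations):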

|----------------------------|
| * * * * * * * * * * * * * *|
|----------------------------|

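and want to forecast with window_length = 9 and fh = [2, 4].

By construction, a recursive reducer always targets the first data point after the window, irrespective of the forecasting horizons requested. In this example, the following 5 windows are created: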

|----------------------------|
| x x x x x x x x x y * * * *|
| * x x x x x x x x x y * * *|
| * * x x x x x x x x x y * *|
| * * * x x x x x x x x x y *|
| * * * * x x x x x x x x x y|
|----------------------------|
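
The five windows above can be reproduced with a few lines of plain Python. This is a sketch of the sliding-window idea only; `recursive_windows` is a hypothetical helper, not an sktime function.

```python
def recursive_windows(y, window_length):
    """Return (X, targets): each row of X holds `window_length` consecutive
    values, and the matching target is the observation right after the window."""
    X, targets = [], []
    for start in range(len(y) - window_length):
        X.append(y[start:start + window_length])
        targets.append(y[start + window_length])
    return X, targets


y = list(range(1, 15))  # 14 toy observations: 1, 2, ..., 14
X, targets = recursive_windows(y, window_length=9)

print(len(X))   # 5
print(X[0])     # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(targets)  # [10, 11, 12, 13, 14]
```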

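Direct reducers create multiple models, one for each forecasting horizon. With the argument windows_identical = True (default), the windows used to train the models are defined by the maximum forecasting horizon. In this example, only two complete windows can be defined for fh = 4 (the maximum of fh = [2, 4]):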

|----------------------------|
| x x x x x x x x x * * * y *|
| * x x x x x x x x x * * * y|
|----------------------------|

All other forecasting horizons also use these two (maximal) windows. fh = 2:

|----------------------------|
| x x x x x x x x x * y * * *|
| * x x x x x x x x x * y * *|
|----------------------------|

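With windows_identical = False, we drop the requirement that every direct model uses the same windows, so more windows can be created for horizons other than the maximum forecasting horizon. fh = 2: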

|----------------------------|
| x x x x x x x x x * y * * *|
| * x x x x x x x x x * y * *|
| * * x x x x x x x x x * y *|
| * * * x x x x x x x x x * y|
|----------------------------|

fh = 4:

|----------------------------|
| x x x x x x x x x * * * y *|
| * x x x x x x x x x * * * y|
|----------------------------|

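Use windows_identical = True if you want to compare forecasting performance across different horizons, since all trained models use the same windows. Use windows_identical = False if you want the highest forecasting accuracy for each forecasting horizon.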
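
The effect of windows_identical on the direct training sets can be checked with a small pure-Python sketch. The helpers `direct_windows` and `direct_training_sets` are hypothetical and only mirror the diagrams above; they are not part of sktime.

```python
def direct_windows(y, window_length, horizon, n_windows=None):
    """Pair each window of past values with the value `horizon` steps after it."""
    if n_windows is None:
        # Largest number of complete windows available for this horizon.
        n_windows = len(y) + 1 - window_length - horizon
    pairs = []
    for start in range(n_windows):
        window = y[start:start + window_length]
        target = y[start + window_length + horizon - 1]
        pairs.append((window, target))
    return pairs


def direct_training_sets(y, window_length, fh, windows_identical=True):
    """Build one training set per horizon in `fh`."""
    # With identical windows, every horizon is limited by the largest one.
    shared = len(y) + 1 - window_length - max(fh) if windows_identical else None
    return {h: direct_windows(y, window_length, h, shared) for h in fh}


y = list(range(1, 15))  # 14 toy observations: 1, 2, ..., 14
same = direct_training_sets(y, 9, [2, 4], windows_identical=True)
diff = direct_training_sets(y, 9, [2, 4], windows_identical=False)

print({h: len(pairs) for h, pairs in same.items()})  # {2: 2, 4: 2}
print({h: len(pairs) for h, pairs in diff.items()})  # {2: 4, 4: 2}
```

In both settings the windows for fh = 4 are the same; only the shorter horizon gains extra training rows when windows_identical = False.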

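Parameters:

estimator: an estimator instance, can be:

scikit-learn regressor or interface compatible
sktime time series regressor
skpro tabular probabilistic supervised regressor, only for direct reduction; this will result in a probabilistic forecaster

strategy: str, optional (default="recursive")

The strategy used to generate forecasts. Must be one of "direct", "recursive" or "multioutput".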

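window_length: int, optional (default=10)

Window length used in the sliding-window transformation.

scitype: str, optional (default="infer")

Legacy argument kept for backwards compatibility; it should not be used. make_reduction automatically infers the correct type of estimator. This internal inference can be force-overridden with the scitype argument. Must be one of "infer", "tabular-regressor" or "time-series-regressor". If the scitype cannot be inferred, this is a bug and should be reported.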

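transformers: list of transformers (default = None)

A suitable list of transformers that allows an en-bloc approach with make_reduction. Instead of using the raw past observations of y across the window length, suitable features are generated directly from the past raw observations. Currently only WindowSummarizer (or a list of WindowSummarizers) is supported for generating features, e.g. the mean of the past 7 observations. Currently only works for RecursiveTimeSeriesRegressionForecaster.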
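
To make the idea of summary features concrete, here is a plain-Python sketch that replaces the raw window with a lag and a rolling mean. It does not use or reproduce the WindowSummarizer API; `summary_features` and its feature names are hypothetical.

```python
def summary_features(y, t, mean_window=7):
    """Features for predicting y[t], built only from values before index t."""
    if t < mean_window:
        raise ValueError("not enough past observations for the rolling mean")
    past = y[t - mean_window:t]  # the `mean_window` most recent past values
    return {
        "lag_1": y[t - 1],
        f"mean_last_{mean_window}": sum(past) / len(past),
    }


y = [3, 5, 4, 6, 8, 7, 9, 10]
features = summary_features(y, t=7)
print(features)  # {'lag_1': 9, 'mean_last_7': 6.0}
```

The regressor then sees two columns per row instead of one column per past observation.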

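pooling: str {"local", "global"}, optional (default="local")

Specifies whether a separate model is fit for each instance ("local") or a single model is fit to all instances ("global"). Currently only works for RecursiveTimeSeriesRegressionForecaster.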
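
The difference between the two pooling modes can be sketched as the choice between one training table per instance and one shared table. This is an illustration in plain Python under that assumption; `windows`, `training_tables` and the toy instance names are hypothetical and do not reflect sktime internals.

```python
def windows(y, window_length):
    """Sliding windows paired with the observation that follows each window."""
    return [
        (y[i:i + window_length], y[i + window_length])
        for i in range(len(y) - window_length)
    ]


def training_tables(panel, window_length, pooling="local"):
    """Group training rows per instance ("local") or into one table ("global")."""
    per_instance = {name: windows(y, window_length) for name, y in panel.items()}
    if pooling == "local":
        return per_instance  # one table, hence one model, per instance
    if pooling == "global":
        pooled = [row for rows in per_instance.values() for row in rows]
        return {"all": pooled}  # one table, hence one model, for all instances
    raise ValueError('pooling must be "local" or "global"')


panel = {"store_a": [1, 2, 3, 4, 5], "store_b": [10, 20, 30, 40]}
local_tables = training_tables(panel, window_length=2, pooling="local")
global_tables = training_tables(panel, window_length=2, pooling="global")

print({name: len(rows) for name, rows in local_tables.items()})  # {'store_a': 3, 'store_b': 2}
print(len(global_tables["all"]))  # 5
```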

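windows_identical: bool, optional (default = True)

Direct forecasting only. Specifies whether all direct models use the same X windows from y (True: number of windows = total observations + 1 - window_length - maximum forecasting horizon) or a different number of X windows depending on the forecasting horizon (False: number of windows = total observations + 1 - window_length - forecasting horizon). See the diagrams above for more information.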

Returns:

forecaster: an sktime forecaster object. The reduction forecaster wrapping estimator; its class is determined by the strategy argument and the type of estimator.

References

[1]

Bontempi, Gianluca & Ben Taieb, Souhaib & Le Borgne, Yann-Aël. (2013). Machine Learning Strategies for Time Series Forecasting.

Examples

>>> from sktime.forecasting.compose import make_reduction
>>> from sktime.datasets import load_airline
>>> from sklearn.ensemble import GradientBoostingRegressor
>>> y = load_airline()
>>> regressor = GradientBoostingRegressor()
>>> forecaster = make_reduction(regressor, window_length=15, strategy="recursive")
>>> forecaster.fit(y)
RecursiveTabularRegressionForecaster(...)
>>> y_pred = forecaster.predict(fh=[1,2,3])