
# Displaying Pipelines

The default configuration for displaying a pipeline in a Jupyter Notebook is `'diagram'`, i.e. `set_config(display='diagram')`. To deactivate the HTML representation, use `set_config(display='text')`.

To see the details of a step in the pipeline diagram, click on that step.


## Displaying a Pipeline with a Preprocessing Step and Classifier



This section constructs a :class:`~sklearn.pipeline.Pipeline` with a preprocessing step, :class:`~sklearn.preprocessing.StandardScaler`, and a classifier, :class:`~sklearn.linear_model.LogisticRegression`, and displays its visual representation.



In [None]:
from sklearn import set_config
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

steps = [
    ("preprocessing", StandardScaler()),
    ("classifier", LogisticRegression()),
]
pipe = Pipeline(steps)

To render the pipeline as a diagram, use `display='diagram'`, which is the default.



In [None]:
set_config(display="diagram")
pipe  # click on the diagram below to see the details of each step
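
Outside a notebook, the same diagram can be obtained as an HTML string. The following is a minimal sketch, assuming `sklearn.utils.estimator_html_repr` is available in the installed scikit-learn version; the output file name `pipeline_diagram.html` and the temporary directory are illustrative choices, not part of the original example.

```python
import tempfile
from pathlib import Path

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils import estimator_html_repr

pipe = Pipeline(
    [
        ("preprocessing", StandardScaler()),
        ("classifier", LogisticRegression()),
    ]
)

# Render the estimator diagram as a standalone HTML string.
html = estimator_html_repr(pipe)

# Hypothetical output location: a file in the system temporary directory.
out_path = Path(tempfile.gettempdir()) / "pipeline_diagram.html"
out_path.write_text(html, encoding="utf-8")
print(f"Wrote {len(html)} characters to {out_path}")
```

Opening the written file in a browser should show the same clickable diagram that the notebook renders.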

To view the text representation of the pipeline, change to `display='text'`.



In [None]:
set_config(display="text")
pipe

Put back the default display:



In [None]:
set_config(display="diagram")
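
Switching the display mode and then restoring it by hand can be avoided with a context manager. The sketch below assumes `sklearn.config_context` and `sklearn.get_config`, which change and read the global configuration; the variable names are illustrative.

```python
from sklearn import config_context, get_config, set_config
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline(
    [
        ("preprocessing", StandardScaler()),
        ("classifier", LogisticRegression()),
    ]
)

set_config(display="diagram")  # start from the default display mode

# Inside the block the display mode is "text"; it is restored on exit.
with config_context(display="text"):
    display_inside = get_config()["display"]
    text_repr = repr(pipe)  # plain-text representation of the pipeline

display_after = get_config()["display"]
print(display_inside, display_after)
print(text_repr)
```

Because the previous setting is restored automatically when the block exits, no explicit `set_config(display="diagram")` call is needed afterwards.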

## Displaying a Pipeline Chaining Multiple Preprocessing Steps and a Classifier



This section constructs a :class:`~sklearn.pipeline.Pipeline` with multiple preprocessing steps, :class:`~sklearn.preprocessing.PolynomialFeatures` and :class:`~sklearn.preprocessing.StandardScaler`, and a classifier step, :class:`~sklearn.linear_model.LogisticRegression`, and displays its visual representation.



In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

steps = [
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
]
pipe = Pipeline(steps)
pipe  # click on the diagram below to see the details of each step
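
The details revealed by clicking a step in the diagram can also be read programmatically. This is a small sketch using `Pipeline.named_steps`, indexing by step name, and `get_params`; it rebuilds the same pipeline so that it runs on its own.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

pipe = Pipeline(
    [
        ("standard_scaler", StandardScaler()),
        ("polynomial", PolynomialFeatures(degree=3)),
        ("classifier", LogisticRegression(C=2.0)),
    ]
)

# Access a step by name, either through named_steps or by indexing.
poly = pipe.named_steps["polynomial"]
clf = pipe["classifier"]
print(poly.degree, clf.C)

# Nested parameters use the "<step name>__<parameter>" convention.
params = pipe.get_params()
print(params["polynomial__degree"], params["classifier__C"])
```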

## Displaying a Pipeline with Dimensionality Reduction and a Classifier



This section constructs a :class:`~sklearn.pipeline.Pipeline` with a dimensionality reduction step, :class:`~sklearn.decomposition.PCA`, and a classifier, :class:`~sklearn.svm.SVC`, and displays its visual representation.



In [None]:
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

steps = [("reduce_dim", PCA(n_components=4)), ("classifier", SVC(kernel="linear"))]
pipe = Pipeline(steps)
pipe  # click on the diagram below to see the details of each step
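
The pipeline above is only constructed, not trained. As an illustration, the sketch below fits it on the iris dataset, an assumption made here because iris has exactly four features, which matches `n_components=4`; the original example does not fit the pipeline.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Illustrative data only: iris has 150 samples and 4 features.
X, y = load_iris(return_X_y=True)

pipe = Pipeline(
    [("reduce_dim", PCA(n_components=4)), ("classifier", SVC(kernel="linear"))]
)
pipe.fit(X, y)

# Fitted attributes become available on the individual steps.
n_kept = pipe.named_steps["reduce_dim"].n_components_
train_accuracy = pipe.score(X, y)
print(n_kept, round(train_accuracy, 3))
```

In a notebook, evaluating `pipe` after fitting still renders the diagram.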

## Displaying a Complex Pipeline Chaining a Column Transformer



This section constructs a complex :class:`~sklearn.pipeline.Pipeline` with a :class:`~sklearn.compose.ColumnTransformer` and a classifier, :class:`~sklearn.linear_model.LogisticRegression`, and displays its visual representation.



In [None]:
import numpy as np

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)

categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)

preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)

pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))
pipe  # click on the diagram below to see the details of each step
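
To make the column names concrete, here is a sketch that fits the same kind of pipeline on a tiny, made-up `pandas` DataFrame. The data values and the target `y` are hypothetical and exist only to show how the `state`, `gender`, `age`, and `weight` columns are routed; `pandas` is assumed to be installed.

```python
import numpy as np
import pandas as pd

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data with one missing value per column.
X = pd.DataFrame(
    {
        "state": ["CA", "NY", np.nan, "CA", "TX", "NY"],
        "gender": ["F", "M", "F", np.nan, "M", "F"],
        "age": [25.0, 40.0, np.nan, 33.0, 51.0, 29.0],
        "weight": [60.0, 82.0, 70.0, np.nan, 90.0, 55.0],
    }
).astype({"state": object, "gender": object})
y = np.array([0, 1, 0, 1, 1, 0])

numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)
categorical_preprocessor = Pipeline(
    steps=[
        ("imputation_constant", SimpleImputer(fill_value="missing", strategy="constant")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)

pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))
pipe.fit(X, y)

# The first step is the fitted ColumnTransformer: one-hot columns for the
# categorical features plus the two scaled numeric columns.
transformed_shape = pipe[0].transform(X).shape
predictions = pipe.predict(X)
print(transformed_shape, predictions.shape)
```

Each categorical column contributes one output column per category, including the `"missing"` placeholder introduced by the constant imputer.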

## Displaying a Grid Search over a Pipeline with a Classifier

This section constructs a :class:`~sklearn.model_selection.GridSearchCV` over a :class:`~sklearn.pipeline.Pipeline` with a :class:`~sklearn.ensemble.RandomForestClassifier` and displays its visual representation.



In [None]:
import numpy as np

from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)

categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)

preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)

pipe = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", RandomForestClassifier())]
)

param_grid = {
    "classifier__n_estimators": [200, 500],
    # Recent scikit-learn releases no longer accept "auto"; for this
    # classifier it was equivalent to "sqrt".
    "classifier__max_features": ["sqrt", "log2"],
    "classifier__max_depth": [4, 5, 6, 7, 8],
    "classifier__criterion": ["gini", "entropy"],
}

grid_search = GridSearchCV(pipe, param_grid=param_grid, n_jobs=1)
grid_search  # click on the diagram below to see the details of each step
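
The `classifier__` prefix in the grid keys refers to the pipeline step named `"classifier"`. The sketch below checks that every key is a valid pipeline parameter and counts the candidate combinations with `ParameterGrid`. It is a simplification under stated assumptions: a `StandardScaler` stands in for the column transformer so the block is self-contained, and `max_features` is limited to `"sqrt"` and `"log2"`.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ParameterGrid
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simplified stand-in pipeline; only the step names matter for this check.
pipe = Pipeline(
    steps=[("preprocessor", StandardScaler()), ("classifier", RandomForestClassifier())]
)

param_grid = {
    "classifier__n_estimators": [200, 500],
    "classifier__max_features": ["sqrt", "log2"],
    "classifier__max_depth": [4, 5, 6, 7, 8],
    "classifier__criterion": ["gini", "entropy"],
}

# Every grid key must be a parameter name exposed by the pipeline.
valid_names = pipe.get_params()
unknown = [name for name in param_grid if name not in valid_names]

# Number of parameter combinations the grid search would evaluate.
n_candidates = len(ParameterGrid(param_grid))
print(unknown, n_candidates)
```

With this grid there are 2 × 2 × 5 × 2 = 40 candidates, and each one is fitted once per cross-validation fold, so a full search can be expensive.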