mlxtend version: 0.23.1
CopyTransformer
CopyTransformer()
Transformer that returns a copy of the input array.
For usage examples, please see
https://rasbt.github.io/mlxtend/user_guide/preprocessing/CopyTransformer/
Methods
fit(X, y=None)
Mock method. Does nothing.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Training vectors, where n_samples is the number of samples and n_features is the number of features.
-
y
: array-like, shape = [n_samples] (default: None)
Returns
self
fit_transform(X, y=None)
Return a copy of the input array.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Training vectors, where n_samples is the number of samples and n_features is the number of features.
-
y
: array-like, shape = [n_samples] (default: None)
Returns
X_copy
: Copy of the input X array.
get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
Returns
-
routing
: MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating routing information.
get_params(deep=True)
Get parameters for this estimator.
Parameters
-
deep
: bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns
-
params
: dict
Parameter names mapped to their values.
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
Parameters
-
**params
: dict
Estimator parameters.
Returns
-
self
: estimator instance
Estimator instance.
transform(X, y=None)
Return a copy of the input array.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Training vectors, where n_samples is the number of samples and n_features is the number of features.
-
y
: array-like, shape = [n_samples] (default: None)
Returns
X_copy
: Copy of the input X array.
DenseTransformer
DenseTransformer(return_copy=True)
Convert a sparse array into a dense array.
For usage examples, please see
https://rasbt.github.io/mlxtend/user_guide/preprocessing/DenseTransformer/
Methods
fit(X, y=None)
Mock method. Does nothing.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Training vectors, where n_samples is the number of samples and n_features is the number of features.
-
y
: array-like, shape = [n_samples] (default: None)
Returns
self
fit_transform(X, y=None)
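Return a dense version of the input array.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Training vectors, where n_samples is the number of samples and n_features is the number of features.
-
y
: array-like, shape = [n_samples] (default: None)
Returns
X_dense
: Dense version of the input X array.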
get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
Returns
-
routing
: MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating routing information.
get_params(deep=True)
Get parameters for this estimator.
Parameters
-
deep
: bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns
-
params
: dict
Parameter names mapped to their values.
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
Parameters
-
**params
: dict
Estimator parameters.
Returns
-
self
: estimator instance
Estimator instance.
transform(X, y=None)
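Return a dense version of the input array.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Training vectors, where n_samples is the number of samples and n_features is the number of features.
-
y
: array-like, shape = [n_samples] (default: None)
Returns
X_dense
: Dense version of the input X array.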
MeanCenterer
MeanCenterer()
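Column centering of vectors and matrices.
Attributes
-
col_means
: numpy.ndarray [n_columns]
NumPy array storing the mean values used for centering after fitting the MeanCenterer object.
Examples
For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/preprocessing/MeanCenterer/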
Methods
fit(X)
Gets the column means for mean centering.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Array of data vectors, where n_samples is the number of samples and n_features is the number of features.
Returns
self
fit_transform(X)
Fits and transforms an array.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Array of data vectors, where n_samples is the number of samples and n_features is the number of features.
Returns
-
X_tr
: {array-like, sparse matrix}, shape = [n_samples, n_features]
A copy of the input array with the columns centered.
transform(X)
Centers a NumPy array.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Array of data vectors, where n_samples is the number of samples and n_features is the number of features.
Returns
-
X_tr
: {array-like, sparse matrix}, shape = [n_samples, n_features]
A copy of the input array with the columns centered.
TransactionEncoder
TransactionEncoder()
Encoder class for transaction data in Python lists
Parameters
None
Attributes
columns_: list
List of unique names in the X input list of lists
Examples
For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/preprocessing/TransactionEncoder/
Methods
fit(X)
Learn unique column names from the transaction data.
Parameters
-
X
: list of lists
A Python list of lists, where the outer list stores the n transactions and the inner list stores the items in each transaction.
For example, [['Apple', 'Beer', 'Rice', 'Chicken'], ['Apple', 'Beer', 'Rice'], ['Apple', 'Beer'], ['Apple', 'Bananas'], ['Milk', 'Beer', 'Rice', 'Chicken'], ['Milk', 'Beer', 'Rice'], ['Milk', 'Beer'], ['Apple', 'Bananas']]
fit_transform(X, sparse=False)
Fit a TransactionEncoder encoder and transform a dataset.
get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
Returns
-
routing
: MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating routing information.
get_params(deep=True)
Get parameters for this estimator.
Parameters
-
deep
: bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns
-
params
: dict
Parameter names mapped to their values.
inverse_transform(array)
Transforms an encoded NumPy array back into transactions.
Parameters
-
array
: NumPy array [n_transactions, n_unique_items]
The NumPy one-hot encoded boolean array of the input transactions, where the columns represent the unique items found in the input array in alphabetic order.
For example,
array([[True, False, True, True, False, True], [True, False, True, False, False, True], [True, False, True, False, False, False], [True, True, False, False, False, False], [False, False, True, True, True, True], [False, False, True, False, True, True], [False, False, True, False, True, False], [True, True, False, False, False, False]])
The corresponding column labels are available as self.columns_, e.g., ['Apple', 'Bananas', 'Beer', 'Chicken', 'Milk', 'Rice']
Returns
-
X
: list of lists
A Python list of lists, where the outer list stores the n transactions and the inner list stores the items in each transaction.
For example,
[['Apple', 'Beer', 'Rice', 'Chicken'], ['Apple', 'Beer', 'Rice'], ['Apple', 'Beer'], ['Apple', 'Bananas'], ['Milk', 'Beer', 'Rice', 'Chicken'], ['Milk', 'Beer', 'Rice'], ['Milk', 'Beer'], ['Apple', 'Bananas']]
set_inverse_transform_request(self: mlxtend.preprocessing.transactionencoder.TransactionEncoder, *, array: Union[bool, NoneType, str] = '$UNCHANGED$') -> mlxtend.preprocessing.transactionencoder.TransactionEncoder
Request metadata passed to the ``inverse_transform`` method.
Note that this method is only relevant if
``enable_metadata_routing=True`` (see :func:`sklearn.set_config`).
Please see :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
The options for each parameter are:
- ``True``: metadata is requested, and passed to ``inverse_transform`` if provided. The request is ignored if metadata is not provided.
- ``False``: metadata is not requested and the meta-estimator will not pass it to ``inverse_transform``.
- ``None``: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- ``str``: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (``sklearn.utils.metadata_routing.UNCHANGED``) retains the
existing request. This allows you to change the request for some
parameters and not others.
.. versionadded:: 1.3
.. note::
This method is only relevant if this estimator is used as a
sub-estimator of a meta-estimator, e.g. used inside a
:class:`~sklearn.pipeline.Pipeline`. Otherwise it has no effect.
Parameters
-
array
: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the ``array`` parameter in ``inverse_transform``.
Returns
-
self
: object
The updated object.
set_output(*, transform=None)
Set output container.
See :ref:`sphx_glr_auto_examples_miscellaneous_plot_set_output.py`
for an example on how to use the API.
Parameters
-
transform
: {"default", "pandas", "polars"}, default=None
Configure output of ``transform`` and ``fit_transform``.
- ``"default"``: Default output format of a transformer
- ``"pandas"``: DataFrame output
- ``"polars"``: Polars output
- ``None``: Transform configuration is unchanged
.. versionadded:: 1.4
``"polars"`` option was added.
Returns
-
self
: estimator instance
Estimator instance.
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
Parameters
-
**params
: dict
Estimator parameters.
Returns
-
self
: estimator instance
Estimator instance.
set_transform_request(self: mlxtend.preprocessing.transactionencoder.TransactionEncoder, *, sparse: Union[bool, NoneType, str] = '$UNCHANGED$') -> mlxtend.preprocessing.transactionencoder.TransactionEncoder
Request metadata passed to the ``transform`` method.
Note that this method is only relevant if
``enable_metadata_routing=True`` (see :func:`sklearn.set_config`).
Please see :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
The options for each parameter are:
- ``True``: metadata is requested, and passed to ``transform`` if provided. The request is ignored if metadata is not provided.
- ``False``: metadata is not requested and the meta-estimator will not pass it to ``transform``.
- ``None``: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- ``str``: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (``sklearn.utils.metadata_routing.UNCHANGED``) retains the
existing request. This allows you to change the request for some
parameters and not others.
.. versionadded:: 1.3
.. note::
This method is only relevant if this estimator is used as a
sub-estimator of a meta-estimator, e.g. used inside a
:class:`~sklearn.pipeline.Pipeline`. Otherwise it has no effect.
Parameters
-
sparse
: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the ``sparse`` parameter in ``transform``.
Returns
-
self
: object
The updated object.
transform(X, sparse=False)
Transform transactions into a one-hot encoded NumPy array.
Parameters
-
X
: list of lists
A Python list of lists, where the outer list stores the n transactions and the inner list stores the items in each transaction.
For example, [['Apple', 'Beer', 'Rice', 'Chicken'], ['Apple', 'Beer', 'Rice'], ['Apple', 'Beer'], ['Apple', 'Bananas'], ['Milk', 'Beer', 'Rice', 'Chicken'], ['Milk', 'Beer', 'Rice'], ['Milk', 'Beer'], ['Apple', 'Bananas']]
-
sparse
: bool (default=False)
If True, transform returns a compressed sparse row (CSR) matrix instead of a regular NumPy array.
Returns
-
array
: NumPy array [n_transactions, n_unique_items] if sparse=False (default), compressed sparse row matrix otherwise
The one-hot encoded boolean array of the input transactions, where the columns represent the unique items found in the input array in alphabetic order. The exact representation depends on the sparse argument.
For example, array([[True, False, True, True, False, True], [True, False, True, False, False, True], [True, False, True, False, False, False], [True, True, False, False, False, False], [False, False, True, True, True, True], [False, False, True, False, True, True], [False, False, True, False, True, False], [True, True, False, False, False, False]])
The corresponding column labels are available as self.columns_, e.g., ['Apple', 'Bananas', 'Beer', 'Chicken', 'Milk', 'Rice']
minmax_scaling
minmax_scaling(array, columns, min_val=0, max_val=1)
Min-max scaling of pandas DataFrames.
Parameters
-
array
: pandas DataFrame or NumPy ndarray, shape = [n_rows, n_columns].
-
columns
: array-like, shape = [n_columns]
Array-like with column names, e.g., ['col1', 'col2', ...] or column indices [0, 2, 4, ...]
-
min_val
: int or float, optional (default=0)
Minimum value after rescaling.
-
max_val
: int or float, optional (default=1)
Maximum value after rescaling.
Returns
-
df_new
: pandas DataFrame object.
Copy of the array or DataFrame with rescaled columns.
Examples
For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/preprocessing/minmax_scaling/
one_hot
one_hot(y, num_labels='auto', dtype='float')
One-hot encoding of class labels
Parameters
-
y
: array-like, shape = [n_classlabels]
Python list or NumPy array consisting of class labels.
-
num_labels
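: int or 'auto'
Number of unique labels in the class label array. If set to 'auto', the number of unique labels is inferred from the input array.
-
dtype
: str
NumPy array type (float, float32, float64) of the output array.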
Returns
-
ary
: numpy.ndarray, shape = [n_classlabels]
One-hot encoded array, where each sample is represented as a row vector in the returned array.
Examples
For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/preprocessing/one_hot/
shuffle_arrays_unison
shuffle_arrays_unison(arrays, random_seed=None)
Shuffle NumPy arrays in unison.
Parameters
-
arrays
: array-like, shape = [n_arrays]
A list of NumPy arrays.
-
random_seed
: int (default: None)
Sets the random state.
Returns
shuffled_arrays
: A list of NumPy arrays after shuffling.
Examples
```
>>> import numpy as np
>>> from mlxtend.preprocessing import shuffle_arrays_unison
>>> X1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> y1 = np.array([1, 2, 3])
>>> X2, y2 = shuffle_arrays_unison(arrays=[X1, y1], random_seed=3)
>>> assert(X2.all() == np.array([[4, 5, 6], [1, 2, 3], [7, 8, 9]]).all())
>>> assert(y2.all() == np.array([2, 1, 3]).all())
```
For more usage examples, please see
https://rasbt.github.io/mlxtend/user_guide/preprocessing/shuffle_arrays_unison/
standardize
standardize(array, columns=None, ddof=0, return_params=False, params=None)
Standardize columns in pandas DataFrames.
Parameters
-
array
: pandas DataFrame or NumPy ndarray, shape = [n_rows, n_columns].
-
columns
: array-like, shape = [n_columns] (default: None)
Array-like with column names, e.g., ['col1', 'col2', ...] or column indices [0, 2, 4, ...]. If None, all columns are standardized.
-
ddof
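: int (default: 0)
Delta degrees of freedom. The divisor used in calculations is N - ddof, where N represents the number of elements.
-
return_params
: bool (default: False)
If set to True, a dictionary is returned in addition to the standardized array. The parameter dictionary contains the column means ('avgs') and standard deviations ('stds') of the individual columns.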
-
params
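: dict (default: None)
A dictionary with column means and standard deviations, as returned by the standardize function when return_params is set to True. If a params dictionary is provided, the standardize function uses these parameters instead of computing them from the current array.
Notes
If all values in a given column are the same, these values are all set to 0.0. The standard deviation in the parameters dictionary is consequently set to 1.0 to avoid dividing by zero.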
Returns
-
df_new
: pandas DataFrame object.
Copy of the array or DataFrame with standardized columns.
Examples
For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/preprocessing/standardize/