dask.dataframe.to_numeric

dask.dataframe.to_numeric(arg, errors='raise', meta=None)

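Convert argument to a numeric type.

This docstring was copied from pandas.to_numeric.

Some inconsistencies with the Dask version may exist.

Return type depends on input. Delayed if scalar, otherwise same as input. For errors, only "raise" and "coerce" are allowed.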

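The default return dtype is float64 or int64 depending on the data supplied. Use the downcast parameter to obtain other dtypes.

Please note that precision loss may occur if really large numbers are passed in. Due to the internal limitations of ndarray, if numbers smaller than -9223372036854775808 (np.iinfo(np.int64).min) or larger than 18446744073709551615 (np.iinfo(np.uint64).max) are passed in, it is very likely they will be converted to float so that they can be stored in an ndarray. These warnings apply similarly to Series since it internally leverages ndarray.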
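The precision loss mentioned above can be illustrated without calling to_numeric at all. The following sketch uses only NumPy and plain Python integers to show the two limits and what happens when an integer just beyond the uint64 range is stored as a 64-bit float. The variable names are illustrative.

```python
import numpy as np

# The integer limits quoted in the text.
int64_min = np.iinfo(np.int64).min
uint64_max = np.iinfo(np.uint64).max
print(int64_min)   # -9223372036854775808
print(uint64_max)  # 18446744073709551615

# An integer just beyond the uint64 range: 2**64 + 1.
too_big = int(uint64_max) + 2

# A 64-bit float has a 53-bit significand, so nearby integers at this
# magnitude collapse onto the same float value.
as_float = float(too_big)
print(as_float == float(2**64))  # True
print(int(as_float) == too_big)  # False: the trailing +1 was lost
```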

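Parameters

arg : scalar, list, tuple, 1-d array, or Series

Argument to be converted.

errors : {'ignore', 'raise', 'coerce'}, default 'raise'

If 'raise', then invalid parsing will raise an exception.
If 'coerce', then invalid parsing will be set as NaN.
If 'ignore', then invalid parsing will return the input.

Changed in version 2.2.

"ignore" is deprecated. Catch exceptions explicitly instead.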
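One way to follow the advice to catch exceptions explicitly is sketched below in plain pandas. The helper `to_numeric_or_original` is hypothetical and not part of pandas or Dask; it simply returns the input unchanged when parsing fails, which approximates what errors='ignore' used to do.

```python
import pandas as pd


def to_numeric_or_original(values):
    """Hypothetical helper: return the input unchanged if parsing fails."""
    try:
        return pd.to_numeric(values)
    except (ValueError, TypeError):
        # Parsing failed, so hand back the original object untouched.
        return values


good = to_numeric_or_original(pd.Series(["1", "2"]))
print(good.tolist())  # [1, 2]

bad_input = pd.Series(["apple", "2"])
bad = to_numeric_or_original(bad_input)
print(bad is bad_input)  # True: the original Series is returned
```

Catching the exception at the call site also makes it possible to log or count failures, which a silent fallback would hide.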

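downcast : str, default None (Not supported in Dask)

Can be 'integer', 'signed', 'unsigned', or 'float'. If not None, and if the data has been successfully cast to a numerical dtype (or if the data was numeric to begin with), downcast that resulting data to the smallest numerical dtype possible according to the following rules:

'integer' or 'signed': smallest signed int dtype (min.: np.int8)
'unsigned': smallest unsigned int dtype (min.: np.uint8)
'float': smallest float dtype (min.: np.float32)

As this behaviour is separate from the core conversion to numeric values, any errors raised during the downcasting will be surfaced regardless of the value of the 'errors' input.

In addition, downcasting will only occur if the size of the resulting data's dtype is strictly larger than the dtype it is to be cast to, so if none of the dtypes checked satisfy that specification, no downcasting will be performed on the data.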
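The downcasting rules above can be seen in plain pandas, where the downcast parameter is available. This is a small sketch with illustrative values. It shows that the chosen dtype is the smallest one that holds the data, and that nothing happens when no candidate dtype fits.

```python
import pandas as pd

# All values fit in 8 bits, so the smallest signed integer dtype is used.
small = pd.to_numeric(pd.Series([1, 2, 3]), downcast="integer")
print(small.dtype)  # int8

# 300 does not fit in int8, so the next larger signed dtype is chosen.
medium = pd.to_numeric(pd.Series([1, 2, 300]), downcast="integer")
print(medium.dtype)  # int16

# The same values with 'unsigned' give the smallest unsigned dtype that fits.
unsigned = pd.to_numeric(pd.Series([1, 2, 300]), downcast="unsigned")
print(unsigned.dtype)  # uint16

# Fractional values cannot be represented by any integer dtype,
# so no downcasting is performed and the data stays float64.
fractional = pd.to_numeric(pd.Series([1.5, 2.0]), downcast="integer")
print(fractional.dtype)  # float64
```

Because the parameter is marked as not supported in Dask, an explicit .astype(...) on the converted collection is one possible substitute when a smaller dtype is needed there.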

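dtype_backend : {'numpy_nullable', 'pyarrow'}, default 'numpy_nullable' (Not supported in Dask)

Back-end data type applied to the resultant DataFrame (still experimental). Behaviour is as follows:

"numpy_nullable": returns a nullable-dtype-backed DataFrame (default).
"pyarrow": returns a pyarrow-backed nullable ArrowDtype DataFrame.

New in version 2.0.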
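For reference, the following sketch shows the "numpy_nullable" back end in plain pandas, since the parameter is marked as not supported in Dask. It assumes pandas 2.0 or later. The input values are illustrative.

```python
import pandas as pd

# An object-dtype Series with an integer and a missing entry.
ser = pd.Series([1, None], dtype=object)

# With the nullable back end the missing entry is kept as <NA>
# and the integer does not need to be converted to float.
result = pd.to_numeric(ser, dtype_backend="numpy_nullable")
print(result.dtype)  # Int64
print(result.isna().tolist())  # [False, True]
```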

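Returns

ret: Numeric if parsing succeeded. Return type depends on input. Series if Series, otherwise ndarray.

See also

DataFrame.astype: Cast argument to a specified dtype.
to_datetime: Convert argument to datetime.
to_timedelta: Convert argument to timedelta.
numpy.ndarray.astype: Cast a numpy array to a specified type.
DataFrame.convert_dtypes: Convert dtypes.

Examples

Take separate series and convert to numeric, coercing when told to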

>>> s = pd.Series(['1.0', '2', -3])  
>>> pd.to_numeric(s)  
0    1.0
1    2.0
2   -3.0
dtype: float64
>>> pd.to_numeric(s, downcast='float')  
0    1.0
1    2.0
2   -3.0
dtype: float32
>>> pd.to_numeric(s, downcast='signed')  
0    1
1    2
2   -3
dtype: int8
>>> s = pd.Series(['apple', '1.0', '2', -3])  
>>> pd.to_numeric(s, errors='coerce')  
0    NaN
1    1.0
2    2.0
3   -3.0
dtype: float64

Downcasting of nullable integer and floating dtypes is supported:

>>> s = pd.Series([1, 2, 3], dtype="Int64")  
>>> pd.to_numeric(s, downcast="integer")  
0    1
1    2
2    3
dtype: Int8
>>> s = pd.Series([1.0, 2.1, 3.0], dtype="Float64")  
>>> pd.to_numeric(s, downcast="float")  
0    1.0
1    2.1
2    3.0
dtype: Float32
