pandas.Index.to_numpy

Index.to_numpy(dtype=None, copy=False, na_value=<no_default>, **kwargs)

A NumPy ndarray representing the values in this Series or Index.

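Parameters:

dtype : str or numpy.dtype, optional: The dtype to pass to numpy.asarray().
copy : bool, default False: Whether to ensure that the returned value is not a view on another array. Note that copy=False does not guarantee that to_numpy() avoids a copy. Rather, copy=True guarantees that a copy is made, even when one is not strictly necessary.
na_value : Any, optional: The value to use for missing values. The default value depends on dtype and the type of the array.
**kwargs: Additional keywords passed through to the to_numpy method of the underlying array (for extension arrays).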
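
As a rough illustration of how these parameters interact, the sketch below converts a nullable integer Series to float64 with NaN for the missing entry, and takes an explicit copy of an Index. The data and variable names are made up for the example.

```python
import numpy as np
import pandas as pd

# Illustrative data: a nullable integer Series with one missing entry.
ser = pd.Series([1, 2, None], dtype="Int64")

# Request float64 and represent the missing entry as NaN.
arr = ser.to_numpy(dtype="float64", na_value=np.nan)

# copy=True guarantees the result does not share memory with the Index.
idx = pd.Index([10, 20, 30])
copied = idx.to_numpy(copy=True)
copied[0] = -1  # modifies only the copy; idx is unaffected
```

Passing both dtype and na_value explicitly avoids relying on the defaults, which depend on the array type.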

Returns:

numpy.ndarray: A NumPy ndarray containing the values of this Series or Index. The dtype of the array may differ. See Notes.

See also

Series.array: Get the actual data stored within.
Index.array: Get the actual data stored within.
DataFrame.to_numpy: Similar method for DataFrame.

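Notes

The returned array will be the same up to equality (values that are equal in self will be equal in the returned array; likewise for values that are not equal). When self contains an ExtensionArray, the dtype may be different. For example, for a Series with category dtype, to_numpy() will return a NumPy array and the categorical dtype will be lost.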

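For NumPy dtypes, this will be a reference to the actual data stored in this Series or Index (assuming copy=False). Modifying the result in place will modify the data stored in the Series or Index (not that we recommend doing that).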
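
One way to observe the view-versus-copy behaviour without mutating anything is numpy.shares_memory. This is a minimal sketch with made-up data. Depending on the pandas version and its Copy-on-Write settings, the array returned with copy=False may be read-only, so the example only inspects memory sharing.

```python
import numpy as np
import pandas as pd

idx = pd.Index([1.5, 2.5, 3.5])  # backed by a float64 NumPy array

view = idx.to_numpy()               # copy=False: expected to reference the Index data
independent = idx.to_numpy(copy=True)  # always a fresh array

# Compare each result against another no-copy conversion of the same Index.
shares_with_view = np.shares_memory(view, idx.to_numpy())
shares_with_copy = np.shares_memory(independent, idx.to_numpy())

print(shares_with_view, shares_with_copy)
```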

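For extension types, to_numpy() may require copying data and coercing the result to a NumPy type (possibly object), which may be expensive. When you need a no-copy reference to the underlying data, use Series.array instead.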
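
The difference between the two accessors can be sketched with a categorical Series (illustrative data): Series.array hands back the extension array itself, while to_numpy() materializes a plain ndarray.

```python
import numpy as np
import pandas as pd

ser = pd.Series(pd.Categorical(["a", "b", "a"]))

ext = ser.array       # the underlying Categorical, dtype preserved
arr = ser.to_numpy()  # a plain NumPy array; the categorical dtype is lost

print(type(ext).__name__, ext.dtype)
print(type(arr).__name__, arr.dtype)
```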

This table lays out the default return types of to_numpy() for various dtypes within pandas.

dtype	array type
category[T]	ndarray[T] (same dtype as input)
period	ndarray[object] (Periods)
interval	ndarray[object] (Intervals)
IntegerNA	ndarray[object]
datetime64[ns]	datetime64[ns]
datetime64[ns, tz]	ndarray[object] (Timestamps)
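
Several rows of the table can be checked directly. The sketch below builds one small Series per dtype (the sample values are arbitrary) and records the dtype that to_numpy() returns by default. The IntegerNA row is left out here, since the default for nullable integers may differ between pandas versions.

```python
import pandas as pd

# One small, arbitrary Series per pandas dtype.
examples = {
    "category[int64]": pd.Series(pd.Categorical([1, 2, 1])),
    "period": pd.Series(pd.period_range("2000-01", periods=2, freq="M")),
    "interval": pd.Series(pd.interval_range(start=0, end=2)),
    "datetime64 (naive)": pd.Series(pd.date_range("2000", periods=2)),
    "datetime64 (tz-aware)": pd.Series(pd.date_range("2000", periods=2, tz="CET")),
}

# Default conversion, with no dtype argument.
result_dtypes = {name: ser.to_numpy().dtype for name, ser in examples.items()}

for name, dt in result_dtypes.items():
    print(f"{name:>22} -> {dt}")
```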

Examples

>>> ser = pd.Series(pd.Categorical(["a", "b", "a"]))
>>> ser.to_numpy()
array(['a', 'b', 'a'], dtype=object)

Specify the dtype to control how timezone-aware datetime data is represented. Use dtype=object to return an ndarray of pandas Timestamp objects, each with the correct tz.

>>> ser = pd.Series(pd.date_range("2000", periods=2, tz="CET"))
>>> ser.to_numpy(dtype=object)
array([Timestamp('2000-01-01 00:00:00+0100', tz='CET'),
       Timestamp('2000-01-02 00:00:00+0100', tz='CET')],
      dtype=object)

Or use dtype='datetime64[ns]' to return an ndarray of native datetime64 values. The values are converted to UTC and the timezone information is dropped.

>>> ser.to_numpy(dtype="datetime64[ns]")
array(['1999-12-31T23:00:00.000000000', '2000-01-01T23:00:00...'],
      dtype='datetime64[ns]')