dask.dataframe.DataFrame.where

DataFrame.where(cond, other=nan)

Replace values where the condition is False.

This docstring was copied from pandas.core.frame.DataFrame.where.

Some inconsistencies with the Dask version may exist.

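Parameters

cond (bool Series/DataFrame, array-like, or callable): Where cond is True, keep the original value. Where False, replace with the corresponding value from other. If cond is callable, it is computed on the Series/DataFrame and should return a boolean Series/DataFrame or array. The callable must not change the input Series/DataFrame (though pandas doesn't check it).
other (scalar, Series/DataFrame, or callable): Entries where cond is False are replaced with the corresponding value from other. If other is callable, it is computed on the Series/DataFrame and should return a scalar or Series/DataFrame. The callable must not change the input Series/DataFrame (though pandas doesn't check it). If not specified, entries are filled with the corresponding NULL value (np.nan for numpy dtypes, pd.NA for extension dtypes).
inplace (bool, default False; not supported in Dask): Whether to perform the operation in place on the data.
axis (int, default None; not supported in Dask): Alignment axis if needed. For Series this parameter is unused and defaults to 0.
level (int, default None; not supported in Dask): Alignment level if needed.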
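
The callable forms of cond and other described above can be sketched in pandas as follows. This is an illustration of the pandas behavior the docstring was copied from; whether the Dask version accepts callables in the same way is not stated here and should be checked separately.

```python
import pandas as pd

s = pd.Series(range(5))

# cond is a callable returning a boolean Series (True for even values).
# other is a callable returning a Series of replacement values.
kept = s.where(lambda x: x % 2 == 0, lambda x: x * 10)

print(kept.tolist())  # [0, 10, 2, 30, 4]
```

Both callables receive the original Series and return new objects without modifying their input, as the parameter description requires.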

Returns

Same type as caller, or None if inplace=True.

See also

DataFrame.mask(): Return an object of same shape as self.

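Notes

The where method is an application of the if-then idiom. For each element in the calling DataFrame, if cond is True the element is used; otherwise the corresponding element from the DataFrame other is used. If the axis of other does not align with the axis of the cond Series/DataFrame, the misaligned index positions will be filled with False.

The signature of DataFrame.where() differs from that of numpy.where(). Roughly, df1.where(m, df2) is equivalent to np.where(m, df1, df2).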
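
One reason the equivalence is only rough: np.where returns a plain array, while DataFrame.where returns a labelled DataFrame. A small pandas sketch (the frames df1, df2 and the mask m are made up for illustration):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"A": [10, 20], "B": [30, 40]})
m = df1 > 2  # False for column A, True for column B

labelled = df1.where(m, df2)  # DataFrame: index and column labels are kept
raw = np.where(m, df1, df2)   # numpy.ndarray: same values, no labels

print(type(labelled).__name__, type(raw).__name__)
print((labelled.to_numpy() == raw).all())
```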

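For further details and examples, see the where documentation in indexing.

The dtype of the object takes precedence. The fill value is cast to the object's dtype if this can be done losslessly.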
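
The dtype rule can be illustrated with a short pandas sketch. The variable names are illustrative, and the upcast in the second call is the behavior of the pandas versions this sketch assumes, not something the note above spells out.

```python
import pandas as pd

s = pd.Series([1, 2, 3])  # int64

# An integer fill value fits int64 losslessly, so the dtype is unchanged.
same = s.where(s > 1, 0)

# 0.5 cannot be stored in int64 without loss, so the result is upcast.
upcast = s.where(s > 1, 0.5)

print(same.dtype, upcast.dtype)
```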

Examples

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(range(5))
>>> s.where(s > 0)  
0    NaN
1    1.0
2    2.0
3    3.0
4    4.0
dtype: float64
>>> s.mask(s > 0)  
0    0.0
1    NaN
2    NaN
3    NaN
4    NaN
dtype: float64

>>> s = pd.Series(range(5))  
>>> t = pd.Series([True, False])  
>>> s.where(t, 99)  
0     0
1    99
2    99
3    99
4    99
dtype: int64
>>> s.mask(t, 99)  
0    99
1     1
2    99
3    99
4    99
dtype: int64

>>> s.where(s > 1, 10)
0    10
1    10
2     2
3     3
4     4
dtype: int64
>>> s.mask(s > 1, 10)
0     0
1     1
2    10
3    10
4    10
dtype: int64

>>> df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])  
>>> df  
   A  B
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9
>>> m = df % 3 == 0  
>>> df.where(m, -df)  
   A  B
0  0 -1
1 -2  3
2 -4 -5
3  6 -7
4 -8  9
>>> df.where(m, -df) == np.where(m, df, -df)  
      A     B
0  True  True
1  True  True
2  True  True
3  True  True
4  True  True
>>> df.where(m, -df) == df.mask(~m, -df)  
      A     B
0  True  True
1  True  True
2  True  True
3  True  True
4  True  True
