dask.dataframe.Index.nlargest

Index.nlargest(n=5, split_every=None)

Return the largest n elements.

This docstring was copied from pandas.core.series.Series.nlargest.

Some inconsistencies with the Dask version may exist.
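
The signature lists split_every, which this page does not describe. In Dask reductions a parameter of this name generally controls how many partition results are combined at each step of a tree reduction. The following is a minimal plain-Python sketch of that idea, not Dask's actual implementation; tree_nlargest and fan_in are hypothetical names introduced only for illustration.

```python
import heapq


def tree_nlargest(partitions, n=5, fan_in=2):
    """Top-n over partitioned data via a tree reduction (illustrative only)."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    # Step 1: reduce every partition to its own n largest values.
    candidates = [heapq.nlargest(n, part) for part in partitions]
    # Step 2: repeatedly merge groups of `fan_in` candidate lists,
    # keeping only the n largest values of each merged group.
    while len(candidates) > 1:
        merged = []
        for start in range(0, len(candidates), fan_in):
            group = candidates[start:start + fan_in]
            pooled = [value for part in group for value in part]
            merged.append(heapq.nlargest(n, pooled))
        candidates = merged
    return candidates[0] if candidates else []


# Hypothetical partitions holding the population figures used in the examples.
partitions = [
    [59000000, 65000000, 434000],
    [434000, 434000, 337000],
    [11300, 11300, 11300, 5200],
]
top3 = tree_nlargest(partitions, n=3)
print(top3)  # [65000000, 59000000, 434000]
```

Because each partition can contribute at most n values to the final answer, discarding everything else early keeps the intermediate results small regardless of how the partitions are grouped.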

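Parameters

n : int, default 5

Return this many descending sorted values.

keep : {‘first’, ‘last’, ‘all’}, default ‘first’ (Not supported in Dask)

When there are duplicate values that cannot all fit in a Series of n elements:

first : return the first n occurrences in order of appearance.
last : return the last n occurrences in reverse order of appearance.
all : keep all occurrences. This can result in a Series of size larger than n.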
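
The three tie-breaking rules can be made concrete with a small plain-Python sketch. It works on (label, value) pairs rather than a real Series and is not how pandas implements keep; nlargest_pairs is a hypothetical helper, and missing values are not handled.

```python
def nlargest_pairs(items, n=5, keep="first"):
    """Illustrate the `keep` tie rules on a list of (label, value) pairs."""
    indexed = list(enumerate(items))  # remember the original position
    if keep == "last":
        # Among equal values, prefer the later position.
        ranked = sorted(indexed, key=lambda t: (-t[1][1], -t[0]))
        return [item for _, item in ranked[:n]]
    # Among equal values, prefer the earlier position.
    ranked = sorted(indexed, key=lambda t: (-t[1][1], t[0]))
    if keep == "first":
        return [item for _, item in ranked[:n]]
    if keep == "all":
        if n <= 0 or not ranked:
            return []
        if len(ranked) <= n:
            return [item for _, item in ranked]
        # Keep every item tied with the n-th largest value.
        cutoff = ranked[n - 1][1][1]
        return [item for _, item in ranked if item[1] >= cutoff]
    raise ValueError("keep must be 'first', 'last' or 'all'")


population = [
    ("Italy", 59000000), ("France", 65000000), ("Malta", 434000),
    ("Maldives", 434000), ("Brunei", 434000), ("Iceland", 337000),
]
print([name for name, _ in nlargest_pairs(population, 3, keep="first")])
print([name for name, _ in nlargest_pairs(population, 3, keep="last")])
print([name for name, _ in nlargest_pairs(population, 3, keep="all")])
```

With this data the third slot is contested by three countries with the same value, so the three calls differ only in which of them are returned.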

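Returns

Series: The n largest values in the Series, sorted in decreasing order.

See also

Series.nsmallest: Get the n smallest elements.
Series.sort_values: Sort Series by values.
Series.head: Return the first n rows.

Notes

Faster than .sort_values(ascending=False).head(n) for small n relative to the size of the Series object.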
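
The reasoning behind this note can be illustrated with the standard library alone. A full sort of N values costs O(N log N), whereas selecting the n largest with a bounded heap costs roughly O(N log n), which is cheaper when n is much smaller than N. The sketch below uses heapq as a stand-in for a selection-based approach; it is an analogy under that assumption and makes no claim about how pandas or Dask implement nlargest internally.

```python
import heapq
import random

random.seed(0)
values = [random.random() for _ in range(10_000)]
n = 5

# Full sort, then take the head: analogous to sort_values(ascending=False).head(n).
by_full_sort = sorted(values, reverse=True)[:n]

# Selection with a bounded heap: only n candidates are tracked at a time.
by_selection = heapq.nlargest(n, values)

# Both approaches return the same values in the same (descending) order.
assert by_selection == by_full_sort
```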

Examples

>>> import pandas as pd
>>> countries_population = {"Italy": 59000000, "France": 65000000,
...                         "Malta": 434000, "Maldives": 434000,
...                         "Brunei": 434000, "Iceland": 337000,
...                         "Nauru": 11300, "Tuvalu": 11300,
...                         "Anguilla": 11300, "Montserrat": 5200}
>>> s = pd.Series(countries_population)  
>>> s  
Italy       59000000
France      65000000
Malta         434000
Maldives      434000
Brunei        434000
Iceland       337000
Nauru          11300
Tuvalu         11300
Anguilla       11300
Montserrat      5200
dtype: int64

The n largest elements where n=5 by default.

>>> s.nlargest()  
France      65000000
Italy       59000000
Malta         434000
Maldives      434000
Brunei        434000
dtype: int64

The n largest elements where n=3. The default keep value is ‘first’, so Malta will be kept.

>>> s.nlargest(3)  
France    65000000
Italy     59000000
Malta       434000
dtype: int64

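The n largest elements where n=3, keeping the last duplicates. Brunei will be kept because, based on the index order, it is the last element with the value 434000.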

>>> s.nlargest(3, keep='last')  
France      65000000
Italy       59000000
Brunei        434000
dtype: int64

The n largest elements where n=3 with all duplicates kept. Note that the returned Series has five elements because of the three duplicates.

>>> s.nlargest(3, keep='all')  
France      65000000
Italy       59000000
Malta         434000
Maldives      434000
Brunei        434000
dtype: int64
