pandas.DataFrame.nsmallest

DataFrame.nsmallest(n, columns, keep='first')

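Return the first n rows ordered by columns in ascending order.

Return the first n rows with the smallest values in columns, in ascending order. Columns that are not specified are returned as well, but are not used for ordering.

This method is equivalent to df.sort_values(columns, ascending=True).head(n), but more performant.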
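The equivalence stated above can be checked with a minimal sketch. The small frame below is illustrative data chosen to have no tied values in "population", so the comparison does not depend on how either method orders ties; the variable names are assumptions, not part of the pandas API.

```python
import pandas as pd

# Illustrative data with no duplicate "population" values.
df = pd.DataFrame(
    {
        "population": [59000000, 65000000, 434000, 337000, 11300],
        "GDP": [1937894, 2583560, 12011, 17036, 38],
    },
    index=["Italy", "France", "Malta", "Iceland", "Tuvalu"],
)

# Select the three rows with the smallest population in two ways.
via_nsmallest = df.nsmallest(3, "population")
via_sort = df.sort_values("population", ascending=True).head(3)

# True when both frames hold the same rows in the same order.
same = via_nsmallest.equals(via_sort)
print(same)
```

nsmallest only needs the n smallest rows, so it can avoid sorting the whole frame, which is where the performance difference is expected to come from.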

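Parameters:

n : int

Number of items to retrieve.

columns : list or str

Column name or names to order by.

keep : {'first', 'last', 'all'}, default 'first'

Where there are duplicate values:

first : take the first occurrence.
last : take the last occurrence.
all : keep all the ties of the largest item, even if it means selecting more than n items.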

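Returns:

DataFrame: The first n rows ordered by columns in ascending order.

See also

DataFrame.nlargest: Return the first n rows ordered by columns in descending order.
DataFrame.sort_values: Sort DataFrame by the values.
DataFrame.head: Return the first n rows without re-ordering.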

Examples

>>> df = pd.DataFrame(
...     {
...         "population": [
...             59000000,
...             65000000,
...             434000,
...             434000,
...             434000,
...             337000,
...             337000,
...             11300,
...             11300,
...         ],
...         "GDP": [1937894, 2583560, 12011, 4520, 12128, 17036, 182, 38, 311],
...         "alpha-2": ["IT", "FR", "MT", "MV", "BN", "IS", "NR", "TV", "AI"],
...     },
...     index=[
...         "Italy",
...         "France",
...         "Malta",
...         "Maldives",
...         "Brunei",
...         "Iceland",
...         "Nauru",
...         "Tuvalu",
...         "Anguilla",
...     ],
... )
>>> df
          population      GDP alpha-2
Italy       59000000  1937894      IT
France      65000000  2583560      FR
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN
Iceland       337000    17036      IS
Nauru         337000      182      NR
Tuvalu         11300       38      TV
Anguilla       11300      311      AI

In the following example, we use nsmallest to select the three rows having the smallest values in column "population".

>>> df.nsmallest(3, "population")
          population    GDP alpha-2
Tuvalu         11300     38      TV
Anguilla       11300    311      AI
Iceland       337000  17036      IS

When using keep='last', ties are resolved in reverse order:

>>> df.nsmallest(3, "population", keep="last")
          population  GDP alpha-2
Anguilla       11300  311      AI
Tuvalu         11300   38      TV
Nauru         337000  182      NR

When using keep='all', the number of rows kept can exceed n if there are duplicate values for the largest element; all the ties are kept.

>>> df.nsmallest(3, "population", keep="all")
          population    GDP alpha-2
Tuvalu         11300     38      TV
Anguilla       11300    311      AI
Iceland       337000  17036      IS
Nauru         337000    182      NR

However, nsmallest does not keep n distinct smallest elements:

>>> df.nsmallest(4, "population", keep="all")
          population    GDP alpha-2
Tuvalu         11300     38      TV
Anguilla       11300    311      AI
Iceland       337000  17036      IS
Nauru         337000    182      NR
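If the rows for the n smallest distinct values are what is needed, one possible workaround (a sketch, not something nsmallest provides directly) is to pick the distinct values first and then filter on them. The names smallest_values and rows are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "population": [59000000, 65000000, 434000, 434000, 434000,
                       337000, 337000, 11300, 11300],
        "GDP": [1937894, 2583560, 12011, 4520, 12128, 17036, 182, 38, 311],
    },
    index=["Italy", "France", "Malta", "Maldives", "Brunei",
           "Iceland", "Nauru", "Tuvalu", "Anguilla"],
)

# The 3 smallest *distinct* population values.
smallest_values = df["population"].drop_duplicates().nsmallest(3)

# Every row whose population is one of those values, in ascending order.
# kind="stable" keeps tied rows in their original relative order.
rows = df[df["population"].isin(smallest_values)].sort_values(
    "population", kind="stable"
)
print(rows)
```

Here three distinct values map to seven rows, because each value is shared by several rows.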

To order by the smallest values in column "population" and then by "GDP", we can specify multiple columns, as in the next example.

>>> df.nsmallest(3, ["population", "GDP"])
          population  GDP alpha-2
Tuvalu         11300   38      TV
Anguilla       11300  311      AI
Nauru         337000  182      NR
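The effect of the second column can be seen by comparing the single-column and multi-column calls side by side. This is a small sketch on the same data as above (the "alpha-2" column is omitted for brevity); the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "population": [59000000, 65000000, 434000, 434000, 434000,
                       337000, 337000, 11300, 11300],
        "GDP": [1937894, 2583560, 12011, 4520, 12128, 17036, 182, 38, 311],
    },
    index=["Italy", "France", "Malta", "Maldives", "Brunei",
           "Iceland", "Nauru", "Tuvalu", "Anguilla"],
)

# Ties on "population" alone are resolved by order of appearance (keep='first').
by_population = df.nsmallest(3, "population")

# With "GDP" as a second column, ties on "population" are resolved by GDP.
by_population_then_gdp = df.nsmallest(3, ["population", "GDP"])

print(list(by_population.index))
print(list(by_population_then_gdp.index))
```

Iceland and Nauru share the same population, so the single-column call keeps Iceland (it appears first), while the two-column call keeps Nauru (it has the smaller GDP).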