pandas.core.groupby.DataFrameGroupBy.corr

DataFrameGroupBy.corr(method='pearson', min_periods=1, numeric_only=False)

Compute pairwise correlation of columns, excluding NA/null values.

Parameters:

method : {‘pearson’, ‘kendall’, ‘spearman’} or callable

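Method of correlation:

pearson : standard correlation coefficient
kendall : Kendall Tau correlation coefficient
spearman : Spearman rank correlation
callable : callable that takes two 1-D ndarrays as input
and returns a float. Note that the matrix returned by corr will have 1 along the diagonal and will be symmetric regardless of the callable's behavior.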
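
The note on callables can be illustrated with a deliberately asymmetric function. This is a minimal sketch on a plain DataFrame; the helper name `first_minus_second` and the column names are hypothetical and chosen only for illustration.

```python
import pandas as pd


def first_minus_second(a, b):
    # Deliberately asymmetric: swapping a and b flips the sign.
    return float((a - b).sum())


df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [2.0, 4.0, 6.0]})
result = df.corr(method=first_minus_second)

# The diagonal is forced to 1 and the matrix is mirrored, so both
# off-diagonal entries hold the same value even though the callable
# itself is not symmetric.
print(result)
```

The callable here is not a meaningful correlation measure; it only shows that corr fills the diagonal and mirrors the off-diagonal entries itself.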

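min_periods : int, optional

Minimum number of observations required per pair of columns to have a valid result. Currently only available for Pearson and Spearman correlation.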

numeric_only : bool, default False

Include only float, int or boolean data.

Added in version 1.5.0.

Changed in version 2.0.0: The default value of numeric_only is now False.
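
Because numeric_only now defaults to False, non-numeric columns are no longer dropped silently. The following is a minimal sketch, assuming a hypothetical frame with a string column `label` next to two numeric columns, of passing numeric_only=True so that only the numeric columns enter each group's correlation matrix.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b", "b"],
        "x": [1.0, 2.0, 4.0, 1.0, 3.0, 2.0],
        "y": [1.0, 3.0, 2.0, 2.0, 1.0, 3.0],
        "label": ["p", "q", "r", "s", "t", "u"],  # non-numeric column
    }
)

# numeric_only=True restricts each per-group matrix to the numeric columns.
result = df.groupby("group").corr(numeric_only=True)
print(list(result.columns))
```

An alternative with the same effect is to select the numeric columns explicitly before calling corr, for example `df.groupby("group")[["x", "y"]].corr()`.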

Returns:

DataFrame: Correlation matrix.

See also

DataFrame.corrwith: Compute pairwise correlation with another DataFrame or Series.
Series.corr: Compute the correlation between two Series.

Notes

Pearson, Kendall and Spearman correlation are currently computed using pairwise complete observations.
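
"Pairwise complete" means that each pair of columns uses every row in which both of those columns are non-null, regardless of missing values in other columns. The sketch below, on a hypothetical three-column frame, contrasts this with dropping every row that contains any missing value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "x": [1.0, 2.0, 3.0, 4.0, 5.0],
        "y": [2.0, 1.0, np.nan, 5.0, 4.0],
        "z": [np.nan, 3.0, 1.0, 2.0, np.nan],
    }
)

# Pairwise: the (x, y) entry uses the four rows where x and y are both
# present, even though z is missing in two of them.
pairwise = df.corr().loc["x", "y"]  # about 0.8

# Same value computed by hand on exactly those rows.
xy_rows = df[["x", "y"]].dropna()
by_hand = xy_rows["x"].corr(xy_rows["y"])

# Listwise: dropping rows with any NaN leaves only two rows,
# which gives a different answer.
listwise = df.dropna().corr().loc["x", "y"]  # about 1.0

print(pairwise, by_hand, listwise)
```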

Pearson correlation coefficient
Kendall rank correlation coefficient
Spearman's rank correlation coefficient

Examples

>>> def histogram_intersection(a, b):
...     v = np.minimum(a, b).sum().round(decimals=1)
...     return v
>>> df = pd.DataFrame(
...     [(0.2, 0.3), (0.0, 0.6), (0.6, 0.0), (0.2, 0.1)],
...     columns=["dogs", "cats"],
... )
>>> df.corr(method=histogram_intersection)
      dogs  cats
dogs   1.0   0.3
cats   0.3   1.0

>>> df = pd.DataFrame(
...     [(1, 1), (2, np.nan), (np.nan, 3), (4, 4)], columns=["dogs", "cats"]
... )
>>> df.corr(min_periods=3)
      dogs  cats
dogs   1.0   NaN
cats   NaN   1.0
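
The examples above call DataFrame.corr directly. As a minimal sketch of the grouped form, assuming a hypothetical frame with a key column `group`, the following computes one correlation matrix per group. The result is expected to be a single DataFrame whose row index combines the group key with the column name.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b", "b"],
        "x": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
        "y": [2.0, 4.0, 6.0, 3.0, 2.0, 1.0],
    }
)

# One 2x2 correlation matrix per group, stacked under a two-level index.
result = df.groupby("group").corr()
print(result)

# Look up a single coefficient: correlation of x and y within group "a".
r_a = result.loc[("a", "x"), "y"]
r_b = result.loc[("b", "x"), "y"]
```

In this data x and y rise together in group "a" and move in opposite directions in group "b", so the two coefficients have opposite signs.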