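LinearDiscriminantAnalysis: Linear discriminant analysis for dimensionality reduction

Implementation of Linear Discriminant Analysis for dimensionality reduction

> from mlxtend.feature_extraction import LinearDiscriminantAnalysis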

Overview

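Linear Discriminant Analysis (LDA) is most commonly used as a dimensionality reduction technique in the pre-processing step for pattern-classification and machine learning applications. The goal is to project a dataset onto a lower-dimensional space with good class separability in order to avoid overfitting ("curse of dimensionality") and to reduce computational costs.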

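Ronald A. Fisher formulated the Linear Discriminant in 1936 (The Use of Multiple Measurements in Taxonomic Problems), and it also has some practical uses as a classifier. The original linear discriminant was described for a 2-class problem, and it was later generalized as "multi-class Linear Discriminant Analysis" or "Multiple Discriminant Analysis" by C. R. Rao in 1948 (The utilization of multiple measurements in problems of biological classification).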

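The general LDA approach is very similar to a Principal Component Analysis (PCA), but whereas PCA looks for the component axes that maximize the variance of the data, LDA looks for the axes that maximize the separation between multiple classes.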

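So, in a nutshell, the goal of an LDA is often to project a feature space (a dataset of $d$-dimensional samples) onto a smaller subspace of dimension $k$ (where $k \leq d-1$) while maintaining the class-discriminatory information.
In general, dimensionality reduction not only helps reduce the computational cost of a given classification task, but it can also help avoid overfitting by minimizing the error in parameter estimation ("curse of dimensionality").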
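
One common way to make "maintaining the class-discriminatory information" precise is the Fisher criterion. The block below is a sketch using standard textbook definitions rather than notation taken from this page: $C$ (number of classes), $D_c$ (samples of class $c$), $N_c$ (size of $D_c$), $\mathbf{m}_c$ (class mean), and $\mathbf{m}$ (overall mean) are all symbols introduced here for illustration.

```latex
\mathbf{S}_W = \sum_{c=1}^{C} \sum_{\mathbf{x} \in D_c}
    (\mathbf{x} - \mathbf{m}_c)(\mathbf{x} - \mathbf{m}_c)^T,
\qquad
\mathbf{S}_B = \sum_{c=1}^{C} N_c \,
    (\mathbf{m}_c - \mathbf{m})(\mathbf{m}_c - \mathbf{m})^T

J(\mathbf{W}) =
    \frac{\left| \mathbf{W}^T \mathbf{S}_B \mathbf{W} \right|}
         {\left| \mathbf{W}^T \mathbf{S}_W \mathbf{W} \right|}
```

Under these definitions, $J(\mathbf{W})$ is maximized by taking the columns of $\mathbf{W}$ to be the leading eigenvectors of $\mathbf{S}_W^{-1} \mathbf{S}_B$. Because $\mathbf{S}_B$ has rank at most $C-1$, at most $C-1$ of the corresponding eigenvalues are non-zero.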

Summarizing the LDA approach in 5 steps

Listed below are the 5 general steps for performing a linear discriminant analysis:

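1. Compute the $d$-dimensional mean vectors for the different classes from the dataset.
2. Compute the scatter matrices (between-class and within-class scatter matrix).
3. Compute the eigenvectors ($\mathbf{e_1}, \; \mathbf{e_2}, \; ..., \; \mathbf{e_d}$) and corresponding eigenvalues ($\mathbf{\lambda_1}, \; \mathbf{\lambda_2}, \; ..., \; \mathbf{\lambda_d}$) for the scatter matrices (in the standard formulation, of the product of the inverse within-class scatter matrix and the between-class scatter matrix).
4. Sort the eigenvectors by decreasing eigenvalues and choose the $k$ eigenvectors with the largest eigenvalues to form a $d \times k$-dimensional matrix $\mathbf{W}$ (where every column represents an eigenvector).
5. Use this $d \times k$ eigenvector matrix to transform the samples onto the new subspace. This can be summarized by the matrix multiplication $\mathbf{Y} = \mathbf{X} \times \mathbf{W}$ (where $\mathbf{X}$ is an $n \times d$-dimensional matrix representing the $n$ samples, and $\mathbf{Y}$ holds the transformed $n \times k$-dimensional samples in the new subspace).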
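
The five steps can be sketched in plain NumPy as follows. This is a minimal illustration, not mlxtend's implementation: the function name `lda_sketch` is hypothetical, the synthetic data is made up, and the sketch assumes the standard formulation in which the eigen-decomposition in step 3 is applied to the inverse within-class scatter matrix times the between-class scatter matrix.

```python
import numpy as np


def lda_sketch(X, y, k):
    """Project X (n x d) onto its k leading linear discriminants.

    Hypothetical helper for illustration; returns the projected samples,
    the d x k projection matrix, and the eigenvalues in decreasing order.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    d = X.shape[1]
    overall_mean = X.mean(axis=0)

    S_W = np.zeros((d, d))  # within-class scatter matrix
    S_B = np.zeros((d, d))  # between-class scatter matrix
    for label in np.unique(y):
        X_c = X[y == label]
        # Step 1: d-dimensional mean vector of this class
        mean_c = X_c.mean(axis=0)
        # Step 2: accumulate both scatter matrices
        centered = X_c - mean_c
        S_W += centered.T @ centered
        diff = (mean_c - overall_mean).reshape(d, 1)
        S_B += X_c.shape[0] * (diff @ diff.T)

    # Step 3: eigen-decomposition of inv(S_W) @ S_B (assumed formulation);
    # the eigenvalues are real in theory, so drop round-off imaginary parts
    e_vals, e_vecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    e_vals, e_vecs = e_vals.real, e_vecs.real

    # Step 4: sort by decreasing eigenvalue, keep k eigenvectors as columns
    order = np.argsort(e_vals)[::-1]
    W = e_vecs[:, order[:k]]  # shape (d, k)

    # Step 5: Y = X W, shape (n, k)
    return X @ W, W, e_vals[order]


# Made-up data: 3 classes, 4 features, 30 samples per class
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0, 0, 0], [3.0, 0, 0, 0], [0.0, 3, 0, 0]])
y_demo = np.repeat([0, 1, 2], 30)
X_demo = rng.normal(size=(90, 4)) + centers[y_demo]
Y_demo, W_demo, e_vals_demo = lda_sketch(X_demo, y_demo, k=2)
print(Y_demo.shape, W_demo.shape)
```

The sketch uses `np.linalg.solve` rather than an explicit matrix inverse, which is a common numerical preference; it still requires the within-class scatter matrix to be non-singular.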

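References

Fisher, Ronald A. "The use of multiple measurements in taxonomic problems." Annals of Eugenics 7.2 (1936): 179-188.
Rao, C. Radhakrishna. "The utilization of multiple measurements in problems of biological classification." Journal of the Royal Statistical Society. Series B (Methodological) 10.2 (1948): 159-203.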

Example 1 - LDA on Iris

from mlxtend.data import iris_data
from mlxtend.preprocessing import standardize
from mlxtend.feature_extraction import LinearDiscriminantAnalysis

X, y = iris_data()
X = standardize(X)

lda = LinearDiscriminantAnalysis(n_discriminants=2)
lda.fit(X, y)
X_lda = lda.transform(X)

import matplotlib.pyplot as plt

# This style was named 'seaborn-whitegrid' before matplotlib 3.6
with plt.style.context('seaborn-v0_8-whitegrid'):
    plt.figure(figsize=(6, 4))
    for lab, col in zip((0, 1, 2),
                        ('blue', 'red', 'green')):
        plt.scatter(X_lda[y == lab, 0],
                    X_lda[y == lab, 1],
                    label=lab,
                    c=col)
    plt.xlabel('Linear Discriminant 1')
    plt.ylabel('Linear Discriminant 2')
    plt.legend(loc='lower right')
    plt.tight_layout()
    plt.show()


Example 2 - Plotting the Between-Class Variance Explained Ratio

from mlxtend.data import iris_data
from mlxtend.preprocessing import standardize
from mlxtend.feature_extraction import LinearDiscriminantAnalysis

X, y = iris_data()
X = standardize(X)

lda = LinearDiscriminantAnalysis(n_discriminants=None)
lda.fit(X, y)
X_lda = lda.transform(X)

import numpy as np
import matplotlib.pyplot as plt

tot = sum(lda.e_vals_)
var_exp = [(i / tot)*100 for i in sorted(lda.e_vals_, reverse=True)]
cum_var_exp = np.cumsum(var_exp)

# This style was named 'seaborn-whitegrid' before matplotlib 3.6
with plt.style.context('seaborn-v0_8-whitegrid'):
    fig, ax = plt.subplots(figsize=(6, 4))
    plt.bar(range(4), var_exp, alpha=0.5, align='center',
            label='individual explained variance')
    plt.step(range(4), cum_var_exp, where='mid',
             label='cumulative explained variance')
    plt.ylabel('Explained variance ratio (%)')
    plt.xlabel('Linear discriminants')
    plt.xticks(range(4))
    ax.set_xticklabels(np.arange(1, X.shape[1] + 1))
    plt.legend(loc='best')
    plt.tight_layout()
    plt.show()

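
The percentages plotted in this example can also be inspected numerically. The following is a small sketch in plain NumPy: the helper name `explained_variance_ratio` and the eigenvalues passed to it are hypothetical and chosen only for illustration; in practice one would pass an array such as `lda.e_vals_`.

```python
import numpy as np


def explained_variance_ratio(e_vals):
    """Return per-discriminant and cumulative shares, both in percent."""
    e_vals = np.sort(np.asarray(e_vals, dtype=float))[::-1]  # decreasing order
    ratio = 100.0 * e_vals / e_vals.sum()
    return ratio, np.cumsum(ratio)


# Hypothetical eigenvalues, not taken from the Iris example
ratio, cumulative = explained_variance_ratio([0.5, 8.0, 1.5, 0.0])
print(ratio)       # shares of 80, 15, 5 and 0 percent
print(cumulative)  # running totals of 80, 95, 100 and 100 percent

# Smallest number of discriminants whose cumulative share reaches 90 percent
n_needed = int(np.searchsorted(cumulative, 90.0) + 1)
print(n_needed)    # 2
```

A cumulative threshold like this is one possible heuristic for choosing `n_discriminants`; the threshold value itself is an assumption and depends on the task.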

API

LinearDiscriminantAnalysis(n_discriminants=None)

Linear Discriminant Analysis Class

Parameters

n_discriminants : int (default: None)

The number of discriminants for transformation. Keeps the original dimensions of the dataset if None.

Attributes

w_ : array-like, shape=[n_features, n_discriminants]

Projection matrix

e_vals_ : array-like, shape=[n_features]

Eigenvalues in sorted order.

e_vecs_ : array-like, shape=[n_features]

Eigenvectors in sorted order.

Examples

For usage examples, please see https://rasbt.github.io/mlxtend/user_guide/feature_extraction/LinearDiscriminantAnalysis/

Methods

fit(X, y, n_classes=None)

Fit the LDA model with X.

Parameters

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Training vectors, where n_samples is the number of samples and n_features is the number of features.

y : array-like, shape = [n_samples]

Target values.

n_classes : int (default: None)

A positive integer to declare the number of class labels if not all class labels are present in a partial training set. Gets the number of class labels automatically if None.

Returns

self : object

transform(X)

Apply the linear transformation on X.

Parameters

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Training vectors, where n_samples is the number of samples and n_features is the number of features.

Returns

X_projected : np.ndarray, shape = [n_samples, n_discriminants]

Projected training vectors.