Chapter 20

Linear Discriminant Analysis (LDA)

Optimal linear discriminant

  • LDA maximizes the separation between the projected class means while ensuring that the within-class scatter of the projected points is small
  • \(s^2_i = \sum\limits_{x_j \in D_i}(a_j-m_i)^2 = n_i\sigma^2_i\) is the scatter of class \(i\) after projection, where \(a_j = w^Tx_j\) is the projection of point \(x_j\) and \(m_i = w^T\mu_i\) is the projected mean of class \(i\)
  • \(n_i = |D_i|\) is the size of class \(i\)
  • \(\max\limits_w J(w) = \frac{(m_1-m_2)^2}{s^2_1+s^2_2}\) is the Fisher LDA objective
  • the goal of LDA is to find the vector \(w\) that maximizes \(J(w)\): it maximizes the separation between the projected means \(m_1\) and \(m_2\) while minimizing the total scatter \(s^2_1+s^2_2\) of the two classes
  • \((m_1-m_2)^2 = w^TBw\)
  • \(B = (\mu_1-\mu_2)(\mu_1-\mu_2)^T\) is a \(d \times d\) rank-one matrix, the between-class scatter matrix
  • \(s^2_1 = w^TS_1w\)
  • \(s^2_2 = w^TS_2w\)
  • \(S_i = n_i\Sigma_i\) is the scatter matrix for class \(i\), and \(S = S_1 + S_2\) is the within-class scatter matrix
  • \(\max\limits_w J(w) = \frac{w^TBw}{w^TSw}\)
  • setting \(\nabla_w J(w) = 0\) yields the generalized eigenvalue problem \(Bw = \lambda Sw\) (derivation after this list)
  • equivalently \(S^{-1}Bw = \lambda w\), so the best \(w\) is the dominant eigenvector of \(S^{-1}B\); since \(B\) has rank one, this reduces to the closed form \(w \propto S^{-1}(\mu_1-\mu_2)\) (see the sketch after this list)
  • computing S takes \(O(nd^2)\) time
  • computing the dominant eigenvalue-eigenvector pair takes \(O(d^3)\) time
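The step from the ratio objective to the eigenproblem, spelled out: because \(J(w)\) is invariant to rescaling \(w\), we can set its gradient to zero directly (with \(B\) and \(S\) symmetric):

\[
\nabla_w J(w) = \frac{2Bw\,(w^TSw) - 2Sw\,(w^TBw)}{(w^TSw)^2} = 0
\quad\Longrightarrow\quad
Bw = \frac{w^TBw}{w^TSw}\,Sw = J(w)\,Sw,
\]

so the optimal \(w\) solves \(Bw = \lambda Sw\) with \(\lambda = J(w)\), and maximizing \(J\) means taking the dominant eigenvalue.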
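Below is a minimal NumPy sketch of the whole pipeline, assuming two data matrices `X1` and `X2` with one point per row; the function name `lda_direction` and the two-Gaussian demo are illustrative, not from the notes:

```python
import numpy as np

def lda_direction(X1, X2):
    """Fisher LDA: find w maximizing J(w) = (w^T B w) / (w^T S w)."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    diff = mu1 - mu2

    # Between-class scatter B = (mu1 - mu2)(mu1 - mu2)^T: d x d, rank one.
    B = np.outer(diff, diff)

    # Within-class scatter S = S_1 + S_2 with S_i = n_i * Sigma_i,
    # formed from centered cross-products in O(n d^2) time.
    Z1, Z2 = X1 - mu1, X2 - mu2
    S = Z1.T @ Z1 + Z2.T @ Z2

    # Solve S^{-1} B w = lambda w for the dominant eigenvector.
    M = np.linalg.solve(S, B)            # S^{-1} B without forming S^{-1}
    eigvals, eigvecs = np.linalg.eig(M)
    w = eigvecs[:, np.argmax(eigvals.real)].real

    # Rank-one shortcut: w is proportional to S^{-1}(mu1 - mu2),
    # so a single linear solve recovers the same direction.
    w_direct = np.linalg.solve(S, diff)
    cos = abs(w @ w_direct) / (np.linalg.norm(w) * np.linalg.norm(w_direct))
    assert np.isclose(cos, 1.0), "eigenvector and closed form should agree"

    J = (w @ B @ w) / (w @ S @ w)        # value of the Fisher criterion at w
    return w / np.linalg.norm(w), J

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X1 = rng.normal([0.0, 0.0], 1.0, size=(100, 2))  # class 1, synthetic data
    X2 = rng.normal([3.0, 1.0], 1.0, size=(100, 2))  # class 2, synthetic data
    w, J = lda_direction(X1, X2)
    a1, a2 = X1 @ w, X2 @ w              # projections a_j = w^T x_j
    print("w =", w, "J(w) =", J)
    print("projected means m1, m2:", a1.mean(), a2.mean())
```

Since \(B\) is rank one, an implementation can skip the eigendecomposition entirely and just solve \(Sw = \mu_1-\mu_2\); the sketch computes both to confirm they give the same direction.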