Background
- \(\det(A) = |A| = a_{11}a_{22} - a_{21}b_{21}\)
- Projection onto u: \(\frac{u^Tx}{u^Tu}u\)
Data, Attributes, Points
- Dimensionality
- feature vector
- Attribute type (numerical/categorical && nominal, ordinal, discrete, continuous)
Geometric View
- 1D,2D,3D
- correlation: \(\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x\sigma_y} = \frac{\sigma_{xy}}{\sqrt{\sigma^2_x \sigma^2_y}} = \frac{\frac{\sum(x_i-\mu_x)(y_i-\mu_y)}{n}}{\sqrt{\frac{\sum(x-\mu_x)^2}{n} + \frac{\sum(y-\mu_y)^2}{n}}} = \frac{x' \cdot y'}{||x'||||y'||} = \cos \theta\)
Eigenvectors and values
- eigenvalue = \(\lambda\)
- \(\)
Covariance/variance
- \(\sigma_{xy} = \frac{\sum(x_i-\mu_x)(y_i-\mu_y)}{n}\) is the covariance
- covariance matrix: \(\Sigma = \begin{pmatrix} \sigma^2_{x_1} && \sigma_{x_1x_2} && ... && \sigma_{x_1x_d} \\ && \sigma^2{x_2} \\ && && ... \\ && && && \sigma^2_{x_d}\end{pmatrix} = (x - \mu)^T(x-\mu)\)
- Total variance = \(|\Sigma|\)
Power rule
- \(\lambda = \frac{Ax \cdot x}{x \cdot x}\)
- \(B = A - \lambda v v^T\) is how compression works
- exercise
PCA
- \(\mu = \frac{1}{n}\sum^n_{i=1}x_i\) compute the mean
- \(Z = D-\mu^T\) center the data
- \(\Sigma = \frac{1}{n}(Z^TZ)\) compute the covariance matrix
- compute eigenvalues
- compute eigenvectors
- compute fraction of total variance so that the smallest \(f(r) \ge \alpha\) to choose dimensionality
- \(U_r\) is the reduced basis
- transformation matrix is \(P = U_rU_r^T\)
- \(PX = X'\)
LDA
Frequent pattern mining
Vocab
- tidset
- inverted index
Naive algorithm
Manually compute
Apriori algorithm
Eclat algorithm
support vector machines
- dual formation: \(\underset \alpha \max \sum \limits^n_{i=1}\alpha_i - \frac{1}{2} \sum^n_{i=1}\sum^n_{j=1}\alpha_i\alpha_jy_iy_j(\vec x^T_i\vec x_j)\)
>>>>>>> Stashed changes