Principal Component Analysis

Definitions

Principal components are a sequence of projections of the data, mutually uncorrelated (i.e., orthogonal) and ordered by decreasing variance. [1]

Equivalent definition:

The goal is to identify the most meaningful basis in which to re-express a data set, i.e., is there another basis, formed from linear combinations of the original basis vectors, that best re-expresses the data set? The data are then re-expressed as a linear combination of the new basis vectors. [2]
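
The change of basis can be written down as a short sketch. The notation here is an assumption and does not come from the references: X is the n x p data matrix with each column centered (rows are samples), \Sigma is its sample covariance, and W is a p x p orthogonal matrix whose columns are the new basis vectors.

```latex
\Sigma = \frac{1}{n-1} X^\top X, \qquad
Y = X W, \qquad W^\top W = I, \\
\frac{1}{n-1} Y^\top Y = W^\top \Sigma W
  = \operatorname{diag}(\lambda_1, \dots, \lambda_p),
\qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0
```

Under these assumptions, each column of Y is a linear combination of the original features, the columns are mutually uncorrelated, and \lambda_j is the variance of the j-th column. Taking the columns of W to be the eigenvectors of \Sigma is one choice that satisfies this.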

PCA as a black box

Input: Data with p features
Output: The data mapped to p new features, such that the covariance between any two distinct new features is 0.

Implementation
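
A minimal sketch of the black box described above, assuming NumPy and an eigendecomposition of the sample covariance matrix (an SVD of the centered data is a common alternative). The function name `pca`, the variable names, and the synthetic data are illustrative assumptions, not taken from the references.

```python
import numpy as np


def pca(X):
    """Map n samples with p features to p uncorrelated features.

    X has shape (n, p). Returns (scores, components, variances).
    Illustrative helper; the name and return values are assumptions.
    """
    # Center each feature so covariance is measured about the mean.
    X_centered = X - X.mean(axis=0)
    # Sample covariance of the original features, shape (p, p).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the first component carries the largest variance.
    order = np.argsort(eigvals)[::-1]
    variances = eigvals[order]
    components = eigvecs[:, order]
    # Project the centered data onto the new basis.
    scores = X_centered @ components
    return scores, components, variances


# Synthetic correlated data: 200 samples, p = 3 features.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.2, 0.1]])
X = rng.normal(size=(200, 3)) @ mixing

scores, components, variances = pca(X)

# Covariance of the new features; off-diagonal entries should be ~0.
new_cov = np.cov(scores, rowvar=False)
print(np.round(new_cov, 6))
```

The printed matrix should be diagonal up to floating-point error, with the diagonal entries equal to `variances` in decreasing order, which matches the input/output description above.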

References

[1]: Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning
[2]: A Tutorial on PCA