Challenge your understanding of advanced Principal Component Analysis concepts focused on eigenvalues and eigenvectors, including how they are calculated, interpreted, and applied in dimensionality reduction and explained-variance analysis. Ideal for anyone seeking to deepen their foundational knowledge of PCA mechanics and linear algebra’s role in machine learning.
In Principal Component Analysis, what does a higher eigenvalue indicate about a principal component when analyzing a data set’s covariance matrix?
Explanation: A higher eigenvalue in PCA indicates that its corresponding principal component accounts for more of the variance in the data. Variance explained is a key metric for feature selection and dimensionality-reduction tasks. Correlation with the original variables does not directly relate to the eigenvalue; this is a common misconception. The number of missing values is unrelated to eigenvalues, and principal components are derived from the dataset’s covariance structure and mean, so they are never 'unrelated' to them.
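As a rough illustration (a minimal NumPy sketch with synthetic data; all variable names are illustrative), the eigenvalues of a covariance matrix can be turned into explained-variance ratios, and the component with the largest eigenvalue explains the most variance:

```python
import numpy as np

# Synthetic data whose columns have very different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ np.diag([3.0, 1.0, 0.2])

cov = np.cov(X, rowvar=False)                    # 3x3 covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order

# Larger eigenvalue -> larger share of the total variance explained.
explained_ratio = eigenvalues / eigenvalues.sum()
print(np.round(explained_ratio[::-1], 3))        # descending order
```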
Which property best describes eigenvectors in the context of Principal Component Analysis applied to a standardized data set?
Explanation: Eigenvectors in PCA indicate the directions, or axes, along which variance is maximized, guiding the transformation of data into principal components. They do not represent the means but instead show the directions for principal components. While eigenvectors for a symmetric covariance matrix are orthogonal to each other, they are not necessarily orthogonal to the data axes. They are also not standardized values themselves.
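A quick check of this property (a sketch with synthetic data, assuming NumPy): the eigenvector matrix of a symmetric covariance matrix has orthonormal columns, so the principal directions are mutually orthogonal unit vectors rather than means or standardized values:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
cov = np.cov(X, rowvar=False)                # symmetric covariance matrix

_, eigenvectors = np.linalg.eigh(cov)        # columns are the eigenvectors

# V^T V = I: orthonormal directions of maximum variance.
print(np.allclose(eigenvectors.T @ eigenvectors, np.eye(4)))  # True
```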
After extracting the top two principal components from a data set, what is the correct process for projecting the data onto this new two-dimensional space?
Explanation: Projecting data onto principal components involves a matrix multiplication of the standardized data with the selected eigenvectors. Eigenvalues are not used directly for projection but for ranking components. Neither adding eigenvalues nor dividing by the number of observations is a relevant step in PCA’s dimensionality-reduction process.
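A minimal sketch of the projection step (synthetic data, illustrative names): standardize the data, take the eigenvectors with the two largest eigenvalues, and multiply:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 5))

X_std = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize the data
cov = np.cov(X_std, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

order = np.argsort(eigenvalues)[::-1]          # rank components by eigenvalue
top2 = eigenvectors[:, order[:2]]              # 5x2 matrix of the top eigenvectors

scores = X_std @ top2                          # matrix multiplication does the projection
print(scores.shape)                            # (150, 2)
```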
If the first principal component explains 65% of the variance and the second explains 20%, what percentage of the original data variation is preserved by using the first two principal components?
Explanation: Variance preserved by selected principal components is the sum of their individual explained variances, which is 65% plus 20%, totaling 85%. 45% and 65% underestimate the actual variance retained. 100% would only apply if all components were included.
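The same arithmetic as a one-liner (hypothetical explained-variance ratios mirroring the question’s figures):

```python
import numpy as np

explained = np.array([0.65, 0.20, 0.10, 0.05])
print(np.cumsum(explained)[:2])   # [0.65 0.85] -> the first two components keep 85%
```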
Why is the covariance matrix used in PCA rather than the correlation matrix when all variables have the same scale?
Explanation: Using the covariance matrix for variables measured on the same scale preserves their real variances and relationships, which would be discarded by the correlation matrix. The correlation matrix effectively standardizes the data, which is only necessary when variables are on different scales. The choice of matrix does not affect dimensionality and does not handle outliers.
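To see why (a sketch with synthetic data, assuming NumPy): the correlation matrix is just the covariance matrix of the standardized variables, so using it throws away the genuine differences in spread that PCA on same-scale data should exploit:

```python
import numpy as np

rng = np.random.default_rng(3)
# Three variables in the same units but with different spreads.
X = rng.normal(size=(300, 3)) * np.array([2.0, 1.0, 0.5])

cov = np.cov(X, rowvar=False)
corr = np.corrcoef(X, rowvar=False)

X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
print(np.allclose(corr, np.cov(X_std, rowvar=False)))  # True: correlation = covariance of standardized data
print(np.round(np.diag(cov), 2))                       # distinct real variances, kept only by the covariance matrix
```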
Upon performing eigenvalue decomposition of a covariance matrix in PCA, which output is directly interpreted as the importance of each principal component?
Explanation: Eigenvalues, once sorted, directly show the importance of each principal component—the higher the eigenvalue, the more variance explained. The product of the data and eigenvectors yields projected scores, not importance. Variable means and the number of rows do not measure component importance.
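A small sketch of the ranking step (hypothetical eigenvalues, as a decomposition routine might return them unordered):

```python
import numpy as np

eigenvalues = np.array([0.3, 2.7, 1.1, 0.9])   # unordered decomposition output
ranking = np.argsort(eigenvalues)[::-1]        # component indices, most to least important
print(ranking, eigenvalues[ranking])           # [1 2 3 0] [2.7 1.1 0.9 0.3]
```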
When transforming a data set using PCA, which mathematical object is applied to rotate the data axes to the directions of greatest variance?
Explanation: It’s the eigenvectors that define the directions of new axes (principal components) onto which the data is rotated for maximum variance. Eigenvalues rank the axes but do not perform the transformation. The identity matrix leaves the data unchanged, and cluster centers are unrelated to PCA.
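As an illustration (a sketch with synthetic 2-D data): applying the eigenvector matrix rotates the data onto the principal axes, after which the covariance matrix becomes diagonal with the eigenvalues on its diagonal:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.multivariate_normal([0.0, 0.0], [[3.0, 1.2], [1.2, 1.0]], size=500)

Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

rotated = Xc @ eigenvectors                        # rotate onto the principal axes
print(np.round(np.cov(rotated, rowvar=False), 2))  # ~diagonal; entries equal the eigenvalues
```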
Given a covariance matrix derived from four standardized variables, what does the sum of its eigenvalues represent in PCA?
Explanation: The sum of the eigenvalues of a covariance matrix equals the total variance in the data set (the trace of the matrix). For standardized variables this total happens to equal the number of variables, since each has variance one, but what the sum represents is total variance, not a count of components. Minimum and average values are not represented by eigenvalues.
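A quick numerical check (synthetic data, four standardized variables): the eigenvalues sum to the trace of the covariance matrix, i.e. the total variance:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(400, 4))
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize all four variables

cov = np.cov(X_std, rowvar=False)
eigenvalues = np.linalg.eigvalsh(cov)

print(np.isclose(eigenvalues.sum(), np.trace(cov)))    # True
print(round(eigenvalues.sum(), 3))                     # ~4.0: total variance of 4 standardized variables
```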
When selecting how many principal components to retain, which criterion is most directly informed by examining the sorted eigenvalues?
Explanation: The sorted eigenvalues reveal how much variance each principal component explains, guiding the choice based on cumulative variance. Negative eigenvalues are not possible for a positive semi-definite covariance matrix. Retaining as many components as variables defeats dimensionality reduction. Smallest eigenvalues explain the least variance.
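A minimal sketch of this criterion (hypothetical, already-sorted eigenvalues): keep the smallest number of components whose cumulative explained variance crosses a chosen threshold, e.g. 90%:

```python
import numpy as np

eigenvalues = np.array([4.0, 2.4, 1.0, 0.4, 0.2])    # sorted in descending order
cum_ratio = np.cumsum(eigenvalues) / eigenvalues.sum()

k = int(np.searchsorted(cum_ratio, 0.90) + 1)        # first index reaching the threshold
print(cum_ratio, "-> retain", k, "components")       # [0.5 0.8 0.925 0.975 1.] -> retain 3 components
```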
If only the first two principal components are used to reconstruct data originally in five dimensions, what does the sum of the remaining three eigenvalues represent?
Explanation: The sum of the discarded eigenvalues quantifies the variance not retained during dimensionality reduction, representing information loss. Means are unaffected by dimensionality reduction. The product of variances is not meaningful here, and PCA does not create new samples, only new representations.
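A sketch that makes the loss concrete (synthetic 5-D data, illustrative names): reconstructing from only the top two components leaves an average squared error equal to the sum of the three discarded eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(1000, 5)) @ rng.normal(size=(5, 5))   # correlated 5-D data

Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]

W = eigenvectors[:, order[:2]]                   # top-2 eigenvectors
X_hat = (Xc @ W) @ W.T                           # project, then reconstruct

err = np.sum((Xc - X_hat) ** 2) / (len(Xc) - 1)  # matches np.cov's N-1 convention
print(round(err, 6), round(eigenvalues[order[2:]].sum(), 6))  # equal: the variance not retained
```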