TY - JOUR
T1 - Assessing local influence in principal component analysis with application to haematology study data
AU - Fung, Wing K.
AU - Gu, Hong
AU - Xiang, Liming
AU - Yau, Kelvin K.W.
PY - 2007/6/15
Y1 - 2007/6/15
N2 - In many medical and health studies, high-dimensional data are often encountered. Principal component analysis (PCA) is a commonly used technique to reduce such data to a few components that include most of the information provided by the original data. However, PCA is known to be very sensitive to some abnormal observations. Therefore, it is essential to assess such sensitivity in PCA. In this paper, assessments of local influence based on the generalized influence function are developed under the case-weights and additive perturbation schemes, along with a discussion of the perturbation scheme and the generalized influence function approach. When perturbing different variables of the data, it is noted that the directions of the largest joint local influence for the eigenvalues are all the same. Moreover, these directions are completely determined by the score values of the observations, for which an approximate cut-off point is given. The proposed methods are applied to analyse a set of haematology study data for illustration. The results add new insights into finding influential observations in the studied data set. Copyright © 2006 John Wiley & Sons, Ltd.
AB - In many medical and health studies, high-dimensional data are often encountered. Principal component analysis (PCA) is a commonly used technique to reduce such data to a few components that include most of the information provided by the original data. However, PCA is known to be very sensitive to some abnormal observations. Therefore, it is essential to assess such sensitivity in PCA. In this paper, assessments of local influence based on the generalized influence function are developed under the case-weights and additive perturbation schemes, along with a discussion of the perturbation scheme and the generalized influence function approach. When perturbing different variables of the data, it is noted that the directions of the largest joint local influence for the eigenvalues are all the same. Moreover, these directions are completely determined by the score values of the observations, for which an approximate cut-off point is given. The proposed methods are applied to analyse a set of haematology study data for illustration. The results add new insights into finding influential observations in the studied data set. Copyright © 2006 John Wiley & Sons, Ltd.
KW - Influence function
KW - Local influence
KW - Perturbation
KW - Principal component analysis
UR - http://www.scopus.com/inward/record.url?scp=34249672694&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-34249672694&origin=recordpage
U2 - 10.1002/sim.2747
DO - 10.1002/sim.2747
M3 - RGC 21 - Publication in refereed journal
C2 - 17094070
SN - 0277-6715
VL - 26
SP - 2730
EP - 2744
JO - Statistics in Medicine
JF - Statistics in Medicine
IS - 13
ER -