High-rank Matrix Completion


Student thesis: Doctoral Thesis

Award date: 11 Sep 2018


Matrix completion aims to recover the missing entries of a partially observed matrix. It has been widely applied to problems such as collaborative filtering, image inpainting, and classification. Conventional matrix completion methods assume that the matrix is of low rank, so the missing entries can be recovered by matrix factorization or rank minimization. In practice, however, data matrices are often of high or even full rank, especially when the data are drawn from multiple subspaces and/or nonlinear latent variable models. Classical matrix factorization and rank minimization algorithms therefore cannot recover the missing entries of such high-rank and full-rank matrices.
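
For context, the conventional low-rank setting that the thesis moves beyond can be sketched in a few lines. The following is the standard SoftImpute-style iteration (iterative singular-value soft-thresholding), not an algorithm from the thesis; the shrinkage parameter and problem sizes are arbitrary illustrative choices:

```python
import numpy as np

def soft_impute(M, mask, shrink=0.1, n_iters=200):
    """Classical low-rank matrix completion: fill missing entries with the
    current estimate, then soft-threshold the singular values."""
    X = np.where(mask, M, 0.0)                     # missing entries start at 0
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        s = np.maximum(s - shrink, 0.0)            # shrink singular values
        Z = (U * s) @ Vt
        X = np.where(mask, M, Z)                   # keep observed entries fixed
    return X

# Toy example: an exactly rank-1 matrix with roughly 30% of entries hidden.
rng = np.random.default_rng(0)
A = np.outer(rng.standard_normal(20), rng.standard_normal(15))
mask = rng.random(A.shape) > 0.3
A_hat = soft_impute(A, mask)
err = np.linalg.norm((A_hat - A)[~mask]) / np.linalg.norm(A[~mask])
```

On genuinely low-rank data this works well; the point of the thesis is that it fails when the matrix rank is high or full.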

This thesis aims to solve the problem of high-rank or full-rank matrix completion. First, a new matrix completion framework based on self-representation (SRMC) is established. Specifically, least-squares, low-rank, and sparse self-representation based matrix completion algorithms are proposed for multiple-subspace data. It is proved that the self-representation models provide better approximations of the rank. In particular, the sparse self-representation (SSR) based algorithm is more effective than the other two because sparsity is more powerful in segmenting multiple subspaces.
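
A minimal sketch of the self-representation idea, assuming the least-squares (ridge) variant: each column is expressed as a combination of the other columns, X ≈ XC, and the missing entries are refilled from XC. The alternating scheme, the diagonal-zeroing shortcut, and all parameter values below are illustrative simplifications, not the thesis' SRMC algorithms:

```python
import numpy as np

def lsr_complete(M, mask, lam=0.1, n_iters=50):
    """Alternate between (1) a ridge self-representation C minimizing
    ||X - XC||_F^2 + lam*||C||_F^2 and (2) refilling missing entries
    from the reconstruction XC."""
    X = np.where(mask, M, 0.0)
    n = X.shape[1]
    for _ in range(n_iters):
        G = X.T @ X
        C = np.linalg.solve(G + lam * np.eye(n), G)   # ridge solution
        np.fill_diagonal(C, 0.0)     # crude: forbid each column using itself
        X = np.where(mask, M, X @ C)
    return X

# Columns drawn from a union of two 2-D subspaces in R^10 (matrix rank 4).
rng = np.random.default_rng(0)
B1, B2 = rng.standard_normal((10, 2)), rng.standard_normal((10, 2))
A = np.hstack([B1 @ rng.standard_normal((2, 25)),
               B2 @ rng.standard_normal((2, 25))])
mask = rng.random(A.shape) > 0.2
A_hat = lsr_complete(A, mask)
err0 = np.linalg.norm(A[~mask])                 # zero-fill baseline error
err = np.linalg.norm((A_hat - A)[~mask])
```

Replacing the Frobenius penalty on C with a nuclear-norm or l1 penalty gives the low-rank and sparse variants, respectively.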

Although sparse self-representation (SSR) based matrix completion achieves high recovery accuracy for multiple-subspace data, it is not efficient on large matrices. To address this, a sparse factorization based matrix completion (SFMC) method is proposed. To further improve computational efficiency, an accelerated proximal alternating linearized minimization algorithm is proposed to solve the SFMC optimization problem, and its convergence is proved theoretically.
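
The flavor of this approach can be sketched with a plain (non-accelerated) proximal alternating linearized minimization loop for a hypothetical sparse-factor objective min 0.5*||P_Ω(UV − M)||_F² + λ||V||₁. The thesis' actual model, acceleration scheme, and step-size rules may differ; everything below is an illustrative assumption:

```python
import numpy as np

def soft(x, t):
    """Elementwise soft-thresholding (prox operator of the l1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sfmc_palm(M, mask, rank=5, lam=0.01, n_iters=400, seed=0):
    """PALM: alternate a gradient step on U and a proximal-gradient step
    on the sparse factor V, with steps set by Lipschitz bounds."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = 0.1 * rng.standard_normal((m, rank))
    V = 0.1 * rng.standard_normal((rank, n))
    for _ in range(n_iters):
        R = mask * (U @ V - M)                        # observed-entry residual
        Lu = max(np.linalg.norm(V @ V.T, 2), 1e-8)    # Lipschitz bound, U-step
        U = U - (R @ V.T) / Lu
        R = mask * (U @ V - M)
        Lv = max(np.linalg.norm(U.T @ U, 2), 1e-8)    # Lipschitz bound, V-step
        V = soft(V - (U.T @ R) / Lv, lam / Lv)        # proximal step on V
    return U @ V

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 25))
mask = (rng.random(A.shape) > 0.2).astype(float)
A_hat = sfmc_palm(A, mask)
fit = np.linalg.norm(mask * (A_hat - A)) / np.linalg.norm(mask * A)
```

Because each inner step is a cheap matrix multiply rather than a full self-representation solve, factorization scales to much larger matrices than SSR.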

Third, a deep learning based matrix completion (DLMC) method is proposed. Specifically, an AutoEncoder and stacked AutoEncoders are utilized to recover the missing entries of the data; the method can also be regarded as an approach to deep learning on incomplete data. In training the neural networks, the missing entries and the network parameters are optimized simultaneously. An out-of-sample extension is also proposed for the deep learning based matrix completion.
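
A toy numpy stand-in for the joint optimization described above: a single-hidden-layer autoencoder whose weights and missing entries are updated together by full-batch gradient descent. The architecture, step sizes, and initialization are illustrative assumptions, not the thesis' stacked-AutoEncoder design:

```python
import numpy as np

def ae_complete(M, mask, hidden=4, lr=0.1, n_iters=3000, seed=0):
    """Minimize 0.5*||decode(encode(X)) - X||_F^2 jointly over the network
    weights and the missing entries of X (rows of M are samples)."""
    rng = np.random.default_rng(seed)
    n, d = M.shape
    X = np.where(mask, M, 0.0)                     # missing entries start at 0
    W1 = 0.5 * rng.standard_normal((d, hidden)); b1 = np.zeros(hidden)
    W2 = 0.5 * rng.standard_normal((hidden, d)); b2 = np.zeros(d)
    for _ in range(n_iters):
        H = np.tanh(X @ W1 + b1)                   # encoder
        Xhat = H @ W2 + b2                         # decoder
        E = Xhat - X                               # reconstruction residual
        dH = (E @ W2.T) * (1.0 - H ** 2)           # backprop through tanh
        gX = -E + dH @ W1.T                        # gradient of loss w.r.t. X
        W2 -= lr * H.T @ E / n;  b2 -= lr * E.mean(0)
        W1 -= lr * X.T @ dH / n; b1 -= lr * dH.mean(0)
        X = np.where(mask, M, X - lr * gX / n)     # update missing entries only
    return X

rng = np.random.default_rng(1)
A = (np.outer(rng.standard_normal(30), rng.standard_normal(8))
     + np.outer(rng.standard_normal(30), rng.standard_normal(8))) / 3.0
mask = rng.random(A.shape) > 0.25
A_hat = ae_complete(A, mask)
err0 = np.linalg.norm(A[~mask])                    # zero-fill baseline
err = np.linalg.norm((A_hat - A)[~mask])
```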

Fourth, a new nonlinear matrix completion (NLMC) method based on kernel representation is proposed for data drawn from nonlinear low-dimensional latent variable models. The method minimizes the rank of the matrix in the high-dimensional feature space induced by a kernel function, where the rank is approximated by the Schatten-p norm. Both the polynomial kernel and the RBF kernel are studied. The method is further extended to a nonlinear rank minimization framework and applied to nonlinear denoising problems.
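
The premise behind NLMC can be checked numerically: a matrix that is full-rank in the input space may have low rank in a kernel-induced feature space. The sketch below is an illustration of that fact, not the NLMC algorithm itself. Columns are built from a single latent variable t, so the degree-2 polynomial-kernel Gram matrix has rank at most 13 even though X is full-rank:

```python
import numpy as np

# Columns x(t) = (t, t^2, ..., t^6) sampled from a 1-D nonlinear latent model.
t = np.linspace(-1.0, 1.0, 50)
X = np.vstack([t ** k for k in range(1, 7)])    # shape (6, 50)

rank_X = np.linalg.matrix_rank(X)               # full rank: 6

# Gram matrix of a degree-2 polynomial kernel: K_ij = (x_i . x_j + 1)^2.
K = (X.T @ X + 1.0) ** 2
eigs = np.sort(np.linalg.eigvalsh(K))[::-1]
# Each K_ij is a polynomial of degree 12 in t_i, so rank(K) <= 13 << 50:
# eigenvalues beyond the 13th are numerically zero.
```

Minimizing a Schatten-p surrogate of this feature-space rank, rather than the rank of X, is what makes completion of such full-rank matrices possible.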

Finally, a robust kernel PCA (RKPCA) method is proposed to decompose a partially corrupted matrix into a sparse matrix plus a high-rank or full-rank matrix of low latent dimensionality, with theoretical guarantees. The optimization of RKPCA is challenging because the problem is simultaneously nonconvex and nondifferentiable. Two nonconvex optimization algorithms are therefore proposed for RKPCA: alternating direction method of multipliers plus backtracking line search (ADMM+BTLS) and proximal linearized minimization with adaptive step size (PLM+AdSS). The convergence of PLM+AdSS is proved theoretically.
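
The decomposition RKPCA generalizes can be illustrated by its linear special case: classical RPCA, M = L + S with L low-rank and S sparse, solved here by the well-known inexact-ALM/ADMM iteration. This is not the thesis' ADMM+BTLS or PLM+AdSS solver; RKPCA instead requires L to have low rank in a kernel feature space, and the parameters below follow common heuristics:

```python
import numpy as np

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def rpca(M, lam=None, n_iters=100):
    """Inexact-ALM for linear RPCA: min ||L||_* + lam*||S||_1, L + S = M."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))     # standard RPCA weight
    mu = 1.25 / np.linalg.norm(M, 2)       # initial penalty parameter
    Y = np.zeros_like(M)                   # Lagrange multipliers
    S = np.zeros_like(M)
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft(s, 1.0 / mu)) @ Vt   # singular-value thresholding
        S = soft(M - L + Y / mu, lam / mu) # elementwise shrinkage
        Y += mu * (M - L - S)
        mu = min(1.5 * mu, 1e7)
    return L, S

# Rank-3 matrix corrupted by ~5% large sparse outliers.
rng = np.random.default_rng(0)
L0 = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 30))
S0 = np.zeros((30, 30))
spots = rng.random((30, 30)) < 0.05
S0[spots] = 10.0 * rng.standard_normal(spots.sum())
Lh, Sh = rpca(L0 + S0)
err_L = np.linalg.norm(Lh - L0) / np.linalg.norm(L0)
```

When the clean component is high-rank but lies on a low-dimensional nonlinear manifold, this linear model fails, which is the gap RKPCA addresses.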

Compared with existing methods, the methods proposed in this thesis show significant improvements on various tasks such as image inpainting, noise removal, subspace clustering, collaborative filtering, and classification.