Representation learning with low-rankness and structural sparsity for multimedia understanding

A study of low-rank and structurally sparse representation learning for multimedia understanding

Student thesis: Doctoral Thesis


Author(s)

  • Liang TAO


Detail(s)

Award date: 16 Feb 2015

Abstract

In the era of big data, there has been a surge of interest in leveraging the tremendous amounts of data available on the Web to train a broader range of representation learning models. In line with this trend, this thesis is dedicated to developing and learning good representations, covering recent advances in learning low-rank and structurally sparse representations, which form the thread running through the whole thesis. The first part of the thesis consists of two topics: low-rank approximation with multiple graphs, and robust constrained low-rank matrix recovery and dimensionality reduction using semi-supervised side information.

1. Unsupervised Low-Rank Approximation with Sparse Integration of Multiple Manifolds: Our first approach explores the sparse integration of multiple manifolds to aid unsupervised dimensionality reduction. Manifold-regularized techniques have been extensively exploited in unsupervised learning such as matrix factorization, whose performance is heavily affected by the underlying graph regularization. However, there is no principled way to select reasonable graphs in the matrix decomposition setting, particularly with multiple heterogeneous graph sources. We address the problem of searching for the optimal linear combination of multiple graphs under a low-rank matrix approximation model. Specifically, an efficient projection onto the probability simplex is used to optimize the graph weights, yielding a sparse pattern of coefficients. This sparsity can be interpreted as a graph selection criterion: the most discriminative graphs receive nonzero weights, while noisy or uninformative graphs are removed with zero weights, which significantly boosts low-rank decomposition performance.

2. Semi-Supervised Low-Rank Matrix Recovery Using Side Information: Existing low-rank decomposition schemes lack robustness to occlusions, outliers, and missing entries, which are pervasive in real-life data sets. By exploiting the low-rank property of the input space, we propose a robust constrained low-rank matrix factorization that simultaneously recovers a clean input space and its compact low-dimensional representation using side information. Specifically, we leverage pairwise constraints across the low-rank approximation to improve the discriminative power of the embedded representations. In essence, these semi-supervised pairwise constraints guarantee that data from the same category are mapped to low-dimensional outputs with smaller distances, whereas distances between different categories become larger in the transformed space. We formulate the problem as the minimization of a convex objective under a non-convex column-orthogonality constraint. Based on the inexact augmented Lagrange multiplier method, we develop an efficient first-order proximal gradient algorithm to solve this formulation, especially for high-dimensional data. Empirical studies corroborate the new model as a practical paradigm for dimensionality reduction and data recovery even when a high percentage of the observations are grossly corrupted.
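To make the graph-weight optimization in topic 1 concrete, the following is a minimal, illustrative sketch (not the thesis implementation; the function and variable names are hypothetical) of the Euclidean projection onto the probability simplex. The projection keeps the weights nonnegative and summing to one, and it typically drives several weights exactly to zero, which is the graph selection behaviour described above.

    import numpy as np

    def project_to_simplex(v):
        # Euclidean projection of v onto {w : w >= 0, sum(w) = 1}.
        # Entries below the computed threshold become exactly zero,
        # which here corresponds to discarding a candidate graph.
        u = np.sort(v)[::-1]                       # sort in descending order
        css = np.cumsum(u)
        idx = np.arange(1, v.size + 1)
        rho = np.nonzero(u * idx > (css - 1.0))[0][-1]
        theta = (css[rho] - 1.0) / (rho + 1.0)
        return np.maximum(v - theta, 0.0)

    # Toy usage: four candidate graph weights before and after projection.
    raw_weights = np.array([0.9, 0.3, -0.5, 0.1])
    weights = project_to_simplex(raw_weights)      # -> [0.8, 0.2, 0. , 0. ]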
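The recovery model in topic 2 is solved with an inexact augmented Lagrange multiplier / first-order proximal gradient scheme. The workhorse of such solvers for any nuclear-norm regularized term is singular value thresholding; the sketch below is a generic illustration under assumed notation, not the exact update used in the thesis.

    import numpy as np

    def svt(M, tau):
        # Singular value thresholding: the proximal operator of
        # tau * nuclear norm. Each proximal-gradient or inexact-ALM
        # iteration applies this to the current estimate of the
        # low-rank component.
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        s = np.maximum(s - tau, 0.0)
        return (U * s) @ Vt

    # One generic proximal-gradient step on a smooth loss f with
    # nuclear-norm regularization (step size and grad_f assumed given):
    # L_new = svt(L - step * grad_f(L), step * lam)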
In the second part of the thesis, we focus on supervised and semi-supervised Canonical Correlation Analysis (CCA) feature learning models for Internet-inspired vision applications.

3. Supervised CCA with a Low-Rank Shared Subspace and Joint Sparsity: CCA has been widely employed for modelling Internet multimedia. However, classical CCA faces two major challenges. First, it frequently fails to remove noisy and irrelevant features. Second, it cannot effectively capture the correlation between semantic labels, which is especially beneficial for annotating web images. We propose a new framework that integrates structural sparsity and a low-rank shared subspace into the least-squares formulation of CCA. Under this framework, multiple label interactions are uncovered through the shared common structure of the input space, while a few highly discriminative features are selected via a structured sparsity-inducing norm. Owing to the non-smooth structured sparsity term, a new efficient iterative algorithm is derived with convergence guarantees. Extensive experiments on popular web image collections consistently demonstrate the effectiveness of the new formulation in comparison with competing algorithms.

4. Robust and Scalable Semi-Supervised CCA: CCA has been regarded as an attractive model for learning good representations. We study the problem of learning sparse and low-rank collaborated patterns of canonical loadings, particularly when the label space is large and noisy. Specifically, the underlying correlations among semantic labels are elucidated through a low-rank structure, while robust feature learning is performed using a structured sparsity-inducing norm. Further, considering that large numbers of unlabeled instances are readily available on the Web, we extend CCA, which is inherently supervised, to a semi-supervised learning paradigm in which the learning objective couples the classifier with a latent intermediate representation conducive to extracting the inherent relationships among canonical loadings. By employing simultaneous diagonalization, we develop a new efficient iterative alternating algorithm with fast and guaranteed convergence, especially for high-dimensional data, avoiding the large-scale eigen-decomposition that is known to be prohibitively expensive.
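The joint sparsity used in topics 3 and 4 is typically imposed through an l2,1 norm on the canonical loading matrix. A minimal sketch of its proximal operator is given below; it is an illustration of the structured sparsity mechanism, not the thesis code, and the names are hypothetical.

    import numpy as np

    def prox_l21(W, lam):
        # Row-wise shrinkage: the proximal operator of lam * ||W||_{2,1}.
        # Each row of W (one input feature, shared across all canonical
        # directions) is either shrunk or zeroed out as a whole, which is
        # the joint feature selection effect described above.
        norms = np.linalg.norm(W, axis=1, keepdims=True)
        scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
        return W * scale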
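Topics 3 and 4 build on the least-squares reformulation of CCA when one view is a label matrix. The following sketch shows one common form of that surrogate, regressing a whitened label target onto the features; it is a hypothetical illustration under assumed notation, and adding nuclear-norm or l2,1 regularizers to this objective gives models in the spirit of the frameworks described above.

    import numpy as np

    def ls_cca(X, Y, reg=1e-3):
        # Least-squares surrogate for CCA with a label view Y:
        # regress the whitened target T = Y (Y^T Y)^{-1/2} onto X.
        G = Y.T @ Y + reg * np.eye(Y.shape[1])
        evals, evecs = np.linalg.eigh(G)
        G_inv_sqrt = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T
        T = Y @ G_inv_sqrt
        d = X.shape[1]
        W = np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ T)
        return W   # columns play the role of canonical loadings for X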

Research areas

  • Multimedia systems, Computational intelligence, Knowledge representation (Information theory)