This thesis proposes novel representation and learning approaches for image understanding
tasks such as image classification and annotation. First, to
capture spatial context in the data, we propose a novel coding scheme called Robust
Regularized Coding (RRC), which fully exploits the geometrical information among
local descriptors to significantly boost the discriminating capability of the resultant
features. More specifically, both a locality constraint and a smoothness constraint
on the RRC codes are incorporated into the objective function to preserve the
local invariance of the codes, and the global similarity between local features is used
to constrain sparsity. To make RRC scale to larger databases, an online
codebook learning algorithm is also proposed, which processes a small subset of the
dataset at a time and loops through the whole dataset to update the codebook incrementally.
In addition, RRC can be readily combined with other machine learning
techniques that are widely used in image understanding.
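The mini-batch update pattern described above can be illustrated with a minimal sketch. It is not the RRC objective: as a stand-in, it assumes a plain vector-quantization codebook updated by running means (in the style of mini-batch k-means), and the function name `online_codebook` and its parameters are hypothetical.

```python
import numpy as np

def online_codebook(X, n_atoms, batch_size=32, n_epochs=5, seed=0):
    """Incrementally learn a codebook from local descriptors X of shape (n, d).

    Assumption: hard nearest-atom assignment with running-mean updates,
    used here only to show the mini-batch loop, not the RRC coding step.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Initialise the atoms with randomly chosen descriptors.
    codebook = X[rng.choice(len(X), size=n_atoms, replace=False)].copy()
    counts = np.zeros(n_atoms)
    for _ in range(n_epochs):
        order = rng.permutation(len(X))
        # Process a small subset at a time and loop through the whole dataset.
        for start in range(0, len(X), batch_size):
            batch = X[order[start:start + batch_size]]
            # Squared distances from each descriptor to each atom.
            d2 = ((batch[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            nearest = d2.argmin(axis=1)
            for x, k in zip(batch, nearest):
                counts[k] += 1
                # Running-mean update of the assigned atom.
                codebook[k] += (x - codebook[k]) / counts[k]
    return codebook

# Usage on synthetic descriptors.
descriptors = np.random.default_rng(0).normal(size=(200, 8))
atoms = online_codebook(descriptors, n_atoms=16)
```

Only one mini-batch is held in the distance computation at a time, which is what lets this kind of update scale to larger datasets.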
To address the problem that most graph-based semi-supervised learning approaches
do not exploit label dissimilarity knowledge, we also propose a novel graph-based
label propagation framework that effectively incorporates both similarity and dissimilarity
information into semi-supervised learning. Class mass normalization
is utilized to make the label decision rule match class priors. A function induction
algorithm is presented to predict the labels of test data. More importantly, by solving
a quadratic optimization problem, the approach yields closed-form solutions for the classification
functions of unlabeled data and out-of-sample data.
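To show what a closed-form solution of a quadratic graph objective looks like, here is a minimal sketch. It assumes the standard harmonic-function formulation of graph-based label propagation with class mass normalization, uses similarity information only, and does not model the dissimilarity term of the framework proposed here. The function name and arguments are hypothetical.

```python
import numpy as np

def harmonic_label_propagation(W, Y_l, class_priors=None):
    """Closed-form label propagation on a similarity graph.

    W   : (n, n) symmetric non-negative similarity matrix, labeled nodes first.
    Y_l : (l, c) one-hot labels of the first l nodes.
    Returns F_u, the (n - l, c) class scores of the unlabeled nodes.
    Assumption: every unlabeled node is connected, directly or indirectly,
    to a labeled node, so that the linear system is non-singular.
    """
    W = np.asarray(W, dtype=float)
    Y_l = np.asarray(Y_l, dtype=float)
    l = Y_l.shape[0]
    L = np.diag(W.sum(axis=1)) - W  # graph Laplacian
    # Minimising sum_ij W_ij * ||f_i - f_j||^2 with the labeled scores fixed
    # gives the linear system L_uu F_u = W_ul Y_l.
    F_u = np.linalg.solve(L[l:, l:], W[l:, :l] @ Y_l)
    if class_priors is not None:
        # Class mass normalization: rescale each class column so that its
        # total mass matches the given prior (assumes non-zero class mass).
        F_u = F_u * (np.asarray(class_priors, dtype=float) / F_u.sum(axis=0))
    return F_u

# Usage: two labeled nodes (classes 0 and 1) followed by two unlabeled nodes.
W = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0, 0.1],
              [0.0, 1.0, 0.1, 0.0]])
F_u = harmonic_label_propagation(W, np.eye(2), class_priors=[0.5, 0.5])
predicted = F_u.argmax(axis=1)
```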
Furthermore, we propose a novel algorithm, Iterative Semi-supervised Sparse Coding
(ISSC), which jointly exploits the advantages of both sparse coding and graph-based
semi-supervised learning in order to learn discriminative sparse codes as well
as an effective classification function. The ISSC algorithm fully exploits the initial labels
and the subsequently predicted labels for sparse code learning. At the same time,
during the graph-based semi-supervised learning stage, the similarity matrix is first adjusted
using the latest learned sparse codes and is then used to obtain a better
classification function. The ISSC approach also yields closed-form solutions
for the sparse codes and the classification function, respectively.
Finally, as traditional one-to-one similarity measurements are often
ineffective for annotating images that contain multiple objects or semantics,
we present an effective reconstruction-based image annotation (RBIA) algorithm to
propagate the labels of training images to a test image by multi-label linear embedding.
All approaches proposed in this thesis have been extensively evaluated on several
benchmark datasets. The experimental results demonstrate that each of our proposed
approaches achieves significant performance improvements over
the state of the art.
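The reconstruction-based annotation idea mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the RBIA algorithm itself: it assumes a ridge-regularized least-squares reconstruction of the test feature from the training features and then reuses the reconstruction weights to combine the training label vectors. The function name and parameters are hypothetical.

```python
import numpy as np

def annotate_by_reconstruction(X_train, Y_train, x_test, ridge=1e-3, top_k=2):
    """Propagate multi-label annotations through reconstruction weights.

    X_train : (n, d) training image features.
    Y_train : (n, m) binary label matrix (m candidate tags).
    x_test  : (d,) feature of the image to annotate.
    Assumption: ridge regression stands in for the reconstruction step.
    """
    X = np.asarray(X_train, dtype=float)
    Y = np.asarray(Y_train, dtype=float)
    x = np.asarray(x_test, dtype=float)
    n = X.shape[0]
    # Solve min_w ||x - X^T w||^2 + ridge * ||w||^2 in closed form.
    w = np.linalg.solve(X @ X.T + ridge * np.eye(n), X @ x)
    # Apply the same linear combination to the training label vectors.
    scores = w @ Y
    # Keep the indices of the top_k highest-scoring tags.
    ranked = np.argsort(-scores)[:top_k]
    return scores, ranked

# Usage: a test image that mixes the content of the first two training images.
X_train = np.eye(3)
Y_train = np.array([[1, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 1, 1]])
scores, tags = annotate_by_reconstruction(X_train, Y_train, [1.0, 1.0, 0.0])
```

Because the test image is explained by several training images at once, tags from all of them can receive high scores, which a one-to-one nearest-neighbor match cannot provide.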
| Date of Award | 3 Oct 2014 |
|---|---|
| Original language | English |
| Awarding Institution | City University of Hong Kong |
| Supervisor | Ho Shing Horace IP (Supervisor) |
- Supervised learning (Machine learning)
- Coding theory
- Digital techniques
- Image processing
Image understanding based on sparse coding and semi-supervised learning
ZHENG, H. (Author). 3 Oct 2014
Student thesis: Doctoral Thesis