Beyond Model Adaptation: Transforming a Complete Probability Distribution of Model Parameters across Different Domains in Transfer Learning

Project: Research


Description

The major challenge impeding the progress of machine learning is the lack of sufficient labeled data. Effectively addressing this problem is particularly critical for applications in which human lives are at stake, such as the development of an intelligent traffic management system by training a deep neural network. If the trained network is expected to detect pedestrians and oncoming vehicles, and to issue warnings when they are too close to each other, an inadequate supply of labeled training examples may lead to tragic consequences should the system fail to detect pedestrians in the path of approaching vehicles. To address this challenge, one possible approach is to adapt a model trained on a source domain, where plenty of labeled data is available, to a related target domain in which few labeled instances exist. However, the main problem with this transfer learning approach is the mismatch between the source and target data distributions, as in the case of adapting a model trained on synthetic images to analyze real images. If the mismatch is severe, no amount of adaptation based on re-training a single model can bridge this domain gap.

Adopting the Bayesian interpretation of a model as an instance sampled from a posterior probability distribution of model parameters given a training dataset, we introduce a new transfer learning framework in this project. Specifically, we re-formulate transfer learning as the transformation of a complete source-domain model distribution into a target-domain distribution, such that knowledge can be effectively transferred across the domains without extensive iterative training. The importance of this framework lies in its capability to bridge significant domain discrepancy by generating and combining a diverse class of models, which cannot possibly be achieved by training a single model. To model this distribution-to-distribution transformation, the generative adversarial network (GAN) model will be applied. The generator and discriminator of the GAN will compete with each other, such that the transformed source model parameter distribution can generate a target model indistinguishable from the actual one.

The proposed framework will question the necessity of a data-rich environment and extensive iterative training for effective machine learning, and will introduce a new approach of creating a hypothesis-rich learning environment, with the sparse dataset serving as constraints on the hypothesis generation process. This will have a significant impact on the entire field of machine learning and many of its related applications, including object detection, scene understanding, and face analysis/verification.
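To illustrate the distribution-to-distribution idea described above, the sketch below sets up a GAN whose generator maps flattened parameter vectors sampled from a source-domain model distribution to candidate target-domain parameter vectors, while the discriminator tries to tell them apart from parameters of models actually fitted to the scarce target data. This is only a minimal conceptual sketch, not the project's actual method: PyTorch is assumed, PARAM_DIM, the network sizes, and the way parameter samples are obtained (e.g. from a deep ensemble or Monte Carlo dropout) are all hypothetical choices.

```python
import torch
import torch.nn as nn

PARAM_DIM = 1024  # hypothetical length of a flattened model parameter vector

# Generator: transforms a source-domain parameter sample into a candidate
# target-domain parameter sample.
generator = nn.Sequential(
    nn.Linear(PARAM_DIM, 512), nn.ReLU(),
    nn.Linear(512, PARAM_DIM),
)

# Discriminator: outputs a logit scoring how "real" a target parameter vector looks.
discriminator = nn.Sequential(
    nn.Linear(PARAM_DIM, 512), nn.ReLU(),
    nn.Linear(512, 1),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def adversarial_step(source_params, target_params):
    """One adversarial update (hypothetical helper).

    source_params: (batch, PARAM_DIM) parameter vectors sampled from the
                   source-domain model distribution (assumed, e.g. an ensemble).
    target_params: (batch, PARAM_DIM) parameter vectors of models fitted to the
                   few labeled target examples (assumed).
    """
    real_labels = torch.ones(target_params.size(0), 1)
    fake_labels = torch.zeros(source_params.size(0), 1)

    # Discriminator update: real target parameters vs. transformed source parameters.
    fake_params = generator(source_params).detach()
    d_loss = bce(discriminator(target_params), real_labels) + \
             bce(discriminator(fake_params), fake_labels)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator update: make transformed parameters indistinguishable from real ones.
    fake_params = generator(source_params)
    g_loss = bce(discriminator(fake_params), torch.ones(source_params.size(0), 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```

In the framework described above, a trained generator of this kind would presumably be sampled repeatedly to instantiate a diverse class of target-domain models rather than a single network, with the sparse target dataset acting as the constraint that anchors the generated hypotheses.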

Detail(s)

Project number: 9042954
Grant type: GRF
Status: Active
Effective start/end date: 1/01/21 → …