A Unified Framework for Multivariate Gaussian Process Models for Computer Vision
Description
Current state-of-the-art computer vision solutions are largely based on the adoption and adaptation of machine learning algorithms, such as boosting, support vector machines, and probabilistic graphical models. The main advantage of using machine learning is that it allows the computer to discover an optimal model (e.g., an object detector) directly from the data, and such a model often performs better than one designed by hand by the researcher. In recent years, Gaussian processes (GPs), a non-parametric Bayesian approach to regression and classification, have been gaining popularity in computer vision. GPs have several properties that are desirable for solving computer vision tasks, such as robust learning on small training sets, probabilistic predictions, and the capability of using diverse image representations. Because of these advantages, GP regression and classification have been applied to many computer vision problems, such as object classification, human action recognition, crowd analysis, stereo vision, and anomaly detection.
However, despite their successes, many of these methods attempt to "shoe-horn" their computer vision task into the standard GP regression framework. Hence, heuristics are required to convert the real-valued GP prediction to a valid task-specific output, which is not optimal in the Bayesian setting. For example, in crowd counting, the real-valued GP prediction must be truncated and rounded to generate a proper count prediction. Furthermore, many methods regress functions with multivariate outputs (e.g., human pose angles, bounding boxes) by assuming statistical independence between the output dimensions. This assumption is contrary to actual computer vision data, whose output dimensions tend to have significant structure. Therefore, current computer vision methods based on GPs are not able to fully exploit the Bayesian framework for prediction or for modeling the underlying structure of the multivariate observation space.
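To make the crowd-counting heuristic concrete, the following is a minimal sketch of standard GP regression followed by truncation and rounding. It is an illustration under stated assumptions, not the project's method: the squared-exponential kernel, the hyperparameter values, the toy data, and the helper names `rbf_kernel`, `gp_posterior_mean`, and `to_count` are all hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior_mean(x_train, y_train, x_test, noise_var=0.1):
    """Posterior mean of zero-mean GP regression with Gaussian observation noise."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_star = rbf_kernel(x_test, x_train)
    # Solve K @ alpha = y_train through a Cholesky factorization for stability.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    return K_star @ alpha

def to_count(real_valued_prediction):
    """Heuristic post-processing: truncate at zero, then round to the nearest integer."""
    return np.rint(np.maximum(real_valued_prediction, 0.0)).astype(int)

# Hypothetical toy data: a scalar image feature and the observed crowd count.
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.array([0.0, 1.0, 3.0, 6.0, 10.0])
x_test = np.array([-3.0, 0.5, 2.5])

mean = gp_posterior_mean(x_train, y_train, x_test)  # real-valued, may be negative
counts = to_count(mean)                             # valid non-negative integer counts
```

The post-processing step in `to_count` sits outside the probabilistic model: the Gaussian predictive distribution is defined over real values, so the final count is produced by a rule applied to the posterior mean rather than by Bayesian inference over counts.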
The goal of this project is to develop new Gaussian process models that allow computer vision solutions to take full advantage of non-parametric Bayesian modeling. In particular, the project aims to develop a parameterized family of multivariate GP models that unifies existing GP models into a single framework. Using this unifying framework, algorithms (e.g., approximate inference) developed for specific GP models will be generalized to all other models, and new GP models can be easily created by changing the parameters of the family. The result is a non-parametric Bayesian multivariate regression framework that can be easily adapted to a diverse set of computer vision tasks (e.g., pose recognition, tracking, reflectance spectra estimation, crowd counting, and surveillance).
Effective start/end date: 1/01/13 → 2/06/17