Chinese Food Scanner: Modeling Ingredient Composition from Multimedia Perspective

Project: Research, GRF

This project aims to scale up food recognition through ingredient label prediction, which supports applications such as nutrition estimation and dietary assessment. While recent research has increasingly addressed the recognition of Western and Japanese food, little effort has been dedicated to Chinese food. Chinese dishes are rich in color, texture and shape. Unlike in other food domains, ingredients are often mixed loosely and freely to create novel dish presentations. The difficulty of recognition originates not just from the richness of visual appearance, but also from the “recreation” of recipes as they evolve along spatial (e.g., geographic region) and temporal (e.g., season) dimensions. Recognizing food categories without understanding ingredient composition is limited to standardized cooked food, such as that served in restaurants, and hardly scales up and out to dishes prepared “in the wild” in free-living environments. This project addresses two problems: fine-grained ingredient recognition and cooking recipe retrieval. The general idea is to model ingredient composition for recipe search, since a retrieved recipe can provide nutritional clues such as non-visible ingredients (e.g., salt, sugar) and ingredient quantities. The challenges are threefold. First, the appearance of an ingredient varies with how it is cut and cooked. Second, there is no direct mapping between ingredient composition and dish: the same set of ingredients may be combined into dishes with different presentations and names. Third, the preparation of a dish depends on taste preferences and the availability of ingredients, which vary across occasions and lead to different versions of a recipe. Finding the exact or best-match recipe for a query food picture, analogous to translating from the visual to the textual domain or vice versa, is therefore a non-trivial problem.
The project approaches these problems from the perspective of multimedia computing; more concretely, it integrates a variety of dish-relevant cues, ranging from ingredients and cooking procedures to food context, to disambiguate food recognition. Not only the ingredients, but also the process and outcome of their composition are computationally modeled. In addition, recipe search is posed as a zero-shot retrieval problem, so that recognition scales up and out by retrieving recipes for “out of vocabulary” dishes, i.e., dishes not previously known. The key contributions of this project are the modeling of ingredient composition and the extension of recognition beyond a pre-defined set of food categories. Both are new problems that have not been seriously explored in the literature, and both are essential for Chinese dish recognition.
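To make the retrieval idea concrete, the following is a minimal sketch, not the project's method. It assumes that ingredient labels have already been predicted from a food picture, and it ranks recipes by plain Jaccard overlap between ingredient sets, a simple stand-in for whatever learned model the project uses. The recipe entries, the function names (`jaccard`, `rank_recipes`, `non_visible`) and the predicted labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two ingredient sets, in the range [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_recipes(
    predicted: Set[str], recipes: Dict[str, Set[str]]
) -> List[Tuple[str, float]]:
    """Rank recipes by ingredient overlap with the predicted labels.

    No dish category is predicted: any recipe with an ingredient list
    can be retrieved, including dishes never seen as a category.
    """
    scored = [(name, jaccard(predicted, ings)) for name, ings in recipes.items()]
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def non_visible(predicted: Set[str], recipe_ingredients: Set[str]) -> Set[str]:
    """Ingredients listed in the recipe but absent from the predicted labels."""
    return recipe_ingredients - predicted


# Hypothetical toy recipe collection (illustrative only).
RECIPES: Dict[str, Set[str]] = {
    "tomato and egg stir-fry": {"tomato", "egg", "scallion", "salt", "sugar"},
    "egg fried rice": {"egg", "rice", "scallion", "salt"},
    "mapo tofu": {"tofu", "minced pork", "chili bean paste", "scallion"},
}

# Hypothetical output of an ingredient recognizer for one query picture.
predicted = {"tomato", "egg", "scallion"}

ranking = rank_recipes(predicted, RECIPES)
best_name, best_score = ranking[0]
hidden = non_visible(predicted, RECIPES[best_name])
```

In this toy setting the best-matching recipe supplies ingredients such as salt and sugar that would not be visible in the picture, which is the kind of nutritional clue the abstract refers to.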


Effective start/end date: 1/01/18 → …