Cross-domain Cross-modal Food Transfer
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review
Detail(s)
Original language | English |
---|---|
Title of host publication | MM '20 - Proceedings of the 28th ACM International Conference on Multimedia |
Place of Publication | New York |
Publisher | Association for Computing Machinery |
Pages | 3762-3770 |
ISBN (print) | 9781450379885 |
Publication status | Published - Oct 2020 |
Publication series
Name | MM - Proceedings of the ACM International Conference on Multimedia |
---|---|
Conference
Title | 28th ACM International Conference on Multimedia (MM 2020) |
---|---|
Location | Virtual |
Place | United States |
City | Seattle |
Period | 12 - 16 October 2020 |
Abstract
Recent work in cross-modal image-to-recipe retrieval paves a new way to scale up food recognition. By learning the joint space between food images and recipes, food recognition reduces to a retrieval problem in which the similarity of embedded features is evaluated. The major drawback, however, is the difficulty of applying an already-trained model to recognize dishes from cuisines unknown to the model. In general, adapting a model to the cooking styles of a new cuisine requires updating it with new training examples in the form of image-recipe pairs. In practice, however, acquiring a sufficient number of image-recipe pairs for model transfer can be time-consuming. This paper addresses the challenge of resource scarcity in the scenario where only partial data, rather than a complete view of the data, is accessible for model transfer. Partial data refers to image-recipe pairs with missing information, such as an absent image modality or absent cooking instructions. To cope with partial data, a novel generic model, equipped with various loss functions including cross-modal metric learning, recipe residual loss, semantic regularization and adversarial learning, is proposed for cross-domain transfer learning. Experiments are conducted on three different cuisines (Chuan, Yue and Washoku) to provide insights on scaling up food recognition across domains with limited training resources.
Research Area(s)
- Food recognition, Cross-modal food retrieval, Cross-domain transfer
Citation Format(s)
Cross-domain Cross-modal Food Transfer. / Zhu, Bin; Ngo, Chong-Wah; Chen, Jing-jing.
MM '20 - Proceedings of the 28th ACM International Conference on Multimedia. New York: Association for Computing Machinery, 2020. p. 3762-3770 (MM - Proceedings of the ACM International Conference on Multimedia).