Abstract
Food is rich in visible (e.g., colour, shape) and procedural (e.g., cutting, cooking) attributes. Proper leveraging of these attributes, particularly the interplay among ingredients, cutting and cooking methods, for health-related applications has not been previously explored. This paper investigates cross-modal retrieval of recipes, specifically the retrieval of a text-based recipe given a food picture as the query. As similar ingredient compositions can end up as wildly different dishes depending on the cooking and cutting procedures, the difficulty of retrieval originates from fine-grained recognition of rich attributes from pictures. With a multi-task deep learning model, this paper provides insights on the feasibility of predicting ingredient, cutting and cooking attributes for food recognition and recipe retrieval. In addition, localization of ingredient regions is also possible even when region-level training examples are not provided. Experimental results validate the merit of rich attributes when compared to the recently proposed ingredient-only retrieval techniques.
| Original language | English |
|---|---|
| Title of host publication | MM 2017 - Proceedings of the 2017 ACM Multimedia Conference |
| Publisher | Association for Computing Machinery |
| Pages | 1771-1779 |
| ISBN (Print) | 9781450349062 |
| Publication status | Published - 23 Oct 2017 |
| Event | 25th ACM International Conference on Multimedia (MM 2017), Mountain View, United States; Duration: 23 Oct 2017 → 27 Oct 2017 |
Conference
| Conference | 25th ACM International Conference on Multimedia (MM 2017) |
|---|---|
| Place | United States |
| City | Mountain View |
| Period | 23/10/17 → 27/10/17 |
Research Keywords
- Cooking and cutting recognition
- Cross-modal retrieval
- Ingredient recognition
- Recipe retrieval