CookGAN: Causality Based Text-to-Image Synthesis
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review
Detail(s)
Original language | English |
---|---|
Title of host publication | Proceedings - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020) |
Publisher | Institute of Electrical and Electronics Engineers, Inc. |
Pages | 5518-5526 |
ISBN (electronic) | 978-1-7281-7168-5 |
Publication status | Published - Jun 2020 |
Publication series
Name | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
---|---|
Publisher | IEEE Computer Society |
ISSN (Print) | 1063-6919 |
Conference
Title | 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020) |
---|---|
Location | Virtual |
Place | United States |
City | Seattle |
Period | 13 - 19 June 2020 |
Link(s)
Link to Scopus | https://www.scopus.com/record/display.uri?eid=2-s2.0-85094218835&origin=recordpage |
---|---|
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(63905128-ff09-49f7-bbd4-21e12b294178).html |
Abstract
This paper addresses the problem of text-to-image synthesis from a new perspective, i.e., the cause-and-effect chain in image generation. Causality is a common phenomenon in cooking: the appearance of a dish changes depending on the cooking actions and ingredients. The challenge of synthesis is that a generated image should depict the visual result of action-on-object. This paper presents a new network architecture, CookGAN, that mimics the visual effect in a causality chain, preserves fine-grained details, and progressively upsamples the image. In particular, a cooking simulator sub-network is proposed to incrementally make changes to food images based on the interaction between ingredients and cooking methods over a series of steps. Experiments on Recipe1M verify that CookGAN manages to generate food images with a reasonably impressive inception score. Furthermore, the images are semantically interpretable and manipulable.
Research Area(s)
- causality effect, text-to-image synthesis, recipe manipulation, image-to-recipe retrieval
Bibliographic Note
Research Unit(s) information for this publication is provided by the author(s) concerned.
Citation Format(s)
CookGAN: Causality Based Text-to-Image Synthesis. / Zhu, Bin; Ngo, Chong-Wah.
Proceedings - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020). Institute of Electrical and Electronics Engineers, Inc., 2020. p. 5518-5526 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).