CookGAN: Causality Based Text-to-Image Synthesis

Research output: RGC 32 — Refereed conference paper (with host publication), peer-reviewed

64 Citations (Scopus)

Abstract

This paper addresses text-to-image synthesis from a new perspective: the cause-and-effect chain in image generation. Causality is a common phenomenon in cooking; the appearance of a dish changes depending on the cooking actions and ingredients applied. The challenge of synthesis is that a generated image should depict the visual result of an action applied to an object. This paper presents a new network architecture, CookGAN, that mimics the visual effects of the causality chain, preserves fine-grained details, and progressively upsamples the image. In particular, a cooking-simulator sub-network is proposed to incrementally modify a food image based on the interaction between ingredients and cooking methods over a series of steps. Experiments on Recipe1M verify that CookGAN generates food images with a reasonably high Inception Score, and that the generated images are semantically interpretable and manipulable.
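The step-wise generation idea in the abstract can be sketched as follows. This is a minimal toy illustration under assumed names, not the paper's implementation: `apply_step` and `upsample` are hypothetical stand-ins for the cooking-simulator sub-network and a progressive-upsampling stage, and the "image" is a flat list of floats rather than a tensor.

```python
# Toy sketch of a causal, step-wise generation loop (hypothetical names;
# the real CookGAN uses learned neural sub-networks on image tensors).

def apply_step(state, action, ingredient):
    """Stand-in for the cooking-simulator sub-network: blends the current
    image state with an action-ingredient interaction term."""
    interaction = [a * i for a, i in zip(action, ingredient)]
    return [0.5 * s + 0.5 * x for s, x in zip(state, interaction)]

def upsample(state, factor=2):
    """Stand-in for one progressive-upsampling stage (nearest neighbour)."""
    return [v for v in state for _ in range(factor)]

# A toy "recipe" of two (action, ingredient) steps over a 4-dim state.
state = [1.0, 0.0, 0.0, 1.0]
steps = [([1.0] * 4, [0.2, 0.4, 0.6, 0.8]),
         ([0.5] * 4, [1.0, 1.0, 0.0, 0.0])]
for action, ingredient in steps:
    state = apply_step(state, action, ingredient)  # each step causally edits the state
state = upsample(state)  # 4 -> 8 "pixels"
print(len(state))
```

Each step edits the running state rather than regenerating it from scratch, which is the causal-chain intuition: later cooking actions operate on the visual result of earlier ones.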

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Publisher: IEEE Computer Society
ISSN (Print): 1063-6919

Conference

Conference: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020)
Abbreviated title: CVPR2020
Place: United States
City: Seattle
Period: 13/06/20 - 19/06/20

Bibliographical note

Research Unit(s) information for this publication is provided by the author(s) concerned.

Research Keywords

  • causality effect
  • text-to-image synthesis
  • recipe manipulation
  • image-to-recipe retrieval
