Computational Perception for Graphic Design and Elements Generation

Student thesis: Doctoral Thesis

Award date: 4 Sep 2020

Abstract

Graphic design plays an important role as a communication tool for conveying information in modern life. However, creating an aesthetically pleasing design that also achieves high-level subjective goals (e.g., theme and personality) to catch the audience's attention is challenging. The task requires considerable expertise and rich design experience, and can easily overwhelm novices faced with hundreds of thousands of options for design elements. In this thesis, we design computational models to understand the factors that determine human perception of graphic design (e.g., personality and style), and show how these models can benefit a broad range of design applications.

The thesis consists of two parts. In the first part, we focus on understanding a graphic design as a whole by answering the question of what characterizes its personality. We propose a deep ranking framework for exploring the effects of various design factors on the perceived personalities of graphic designs. Our framework learns a personality scoring network that estimates the personality scores of graphic designs by learning to rank web data. With the framework, we perform quantitative and qualitative analyses to investigate how various design factors (e.g., color, font, and layout) affect design personality across different scales (from pixels and regions to elements). We further present two novel applications enabled by the framework: element-level design suggestions and example-based personality transfer.
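To make the ranking idea concrete, the sketch below shows how such a scoring network could be trained with a pairwise margin ranking loss in PyTorch. The `PersonalityScorer` architecture, input sizes, and training pairs are all illustrative assumptions, not the thesis's actual model; only the idea of learning a scalar personality score from ranked pairs comes from the text.

```python
import torch
import torch.nn as nn

class PersonalityScorer(nn.Module):
    """Hypothetical CNN mapping a design image to a scalar personality score."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 1)  # one score per personality dimension

    def forward(self, x):
        return self.head(self.features(x)).squeeze(1)

scorer = PersonalityScorer()
loss_fn = nn.MarginRankingLoss(margin=1.0)

# A training pair: design_a is assumed to rank higher than design_b on the
# target personality (e.g., "playful"); the images here are random stand-ins.
design_a = torch.randn(8, 3, 224, 224)
design_b = torch.randn(8, 3, 224, 224)
target = torch.ones(8)  # +1 means score(design_a) should exceed score(design_b)

loss = loss_fn(scorer(design_a), scorer(design_b), target)
loss.backward()
```

How the ranked pairs are mined from web data is the part of the framework this sketch deliberately leaves out.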

In the second part of the thesis, we look deeper into two fundamental elements of graphic design: fonts and icons. We first aim to solve the problem of selecting fonts that fit the context of a design. This is a tedious task, as each font has many properties, such as font face, color, and size, resulting in a very large search space. We thus propose a novel multi-task deep neural network that jointly predicts the font face, color, and size of each text element on a design, by considering multi-scale visual features and semantic tags of the design. We demonstrate the effectiveness of our model on web designs. To train the model, we created our own dataset, CTXFont (Context Font), consisting of 1k professional web designs with labeled font properties. Experiments show that our model outperforms the baseline methods. We also conduct a user study to demonstrate the usability of our method in a font selection task.
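As a rough illustration of the joint prediction, the following PyTorch sketch shares one visual feature extractor across three property heads, one per font property. The layer sizes, `tag_dim`, class counts, and equal loss weighting are hypothetical placeholders; the thesis's multi-scale features and tag encoding are not reproduced here.

```python
import torch
import torch.nn as nn

class FontPredictor(nn.Module):
    """Minimal multi-task sketch: shared features, one head per font property."""
    def __init__(self, n_faces=100, n_sizes=20, feat_dim=128, tag_dim=32):
        super().__init__()
        self.visual = nn.Sequential(   # stand-in for multi-scale visual features
            nn.Conv2d(3, 32, kernel_size=5, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.face_head = nn.Linear(feat_dim + tag_dim, n_faces)   # font face class
        self.color_head = nn.Linear(feat_dim + tag_dim, 3)        # RGB color
        self.size_head = nn.Linear(feat_dim + tag_dim, n_sizes)   # discretized size

    def forward(self, image, tag_embedding):
        h = torch.cat([self.visual(image), tag_embedding], dim=1)
        return self.face_head(h), self.color_head(h), self.size_head(h)

model = FontPredictor()
image = torch.randn(4, 3, 256, 256)
tags = torch.randn(4, 32)                 # embedded semantic tags of the design
face_logits, color, size_logits = model(image, tags)

# Joint multi-task training signal: sum of per-property losses
# (random targets below stand in for CTXFont-style labels).
loss = (nn.functional.cross_entropy(face_logits, torch.randint(0, 100, (4,)))
        + nn.functional.mse_loss(color, torch.rand(4, 3))
        + nn.functional.cross_entropy(size_logits, torch.randint(0, 20, (4,))))
loss.backward()
```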

We then aim to solve the problem of automatically synthesizing novel compound icons from compound concepts (e.g., "no smoking" and "health insurance"). Designing compound icons requires experience and creativity, since the designer must efficiently navigate the semantics, spatial arrangement, and style of icons. To this end, we develop ICONATE, a novel system that automatically generates compound icons from textual input and allows users to explore and customize the generated icons. ICONATE works by finding commonly used icons for sub-concepts and arranging them according to inferred conventions. To enable our pipeline, we collected Compicon1k, a new dataset consisting of 1k compound icons annotated with semantic labels. We conduct several user studies to demonstrate the effectiveness of our tool for both professionals and novices.
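A minimal sketch of the retrieve-and-arrange idea is shown below in plain Python. The `ICON_INDEX`, `CONVENTIONS` table, and `generate_compound_icon` helper are hypothetical; ICONATE infers such conventions from annotated data rather than hard-coding them as done here.

```python
from dataclasses import dataclass

@dataclass
class Icon:
    concept: str
    path: str  # hypothetical path to an SVG asset

# Hypothetical icon index and layout conventions; the real system learns
# these from Compicon1k annotations instead of hard-coding them.
ICON_INDEX = {
    "no": Icon("no", "icons/prohibition_circle.svg"),
    "smoking": Icon("smoking", "icons/cigarette.svg"),
}
CONVENTIONS = {
    ("no", "smoking"): "overlay",  # prohibition sign drawn over the base icon
}

def generate_compound_icon(phrase: str):
    """Split a compound concept into sub-concepts, retrieve an icon for each,
    and pick an arrangement according to the (here: hard-coded) convention."""
    sub_concepts = tuple(phrase.lower().split())
    icons = [ICON_INDEX[c] for c in sub_concepts if c in ICON_INDEX]
    layout = CONVENTIONS.get(sub_concepts, "side_by_side")
    return icons, layout

icons, layout = generate_compound_icon("no smoking")
print([i.path for i in icons], layout)
# ['icons/prohibition_circle.svg', 'icons/cigarette.svg'] overlay
```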