Image captioning in Chinese for construction activity scene understanding using a pre-trained cross-modal language model

Research output: Chapters, Conference Papers, Creative and Literary Works > RGC 32 - Refereed conference paper (with host publication) > peer-review

Abstract

With the popularity of surveillance cameras, many vision-based artificial intelligence (AI) agents have been applied to construction projects, significantly improving management efficiency and workers' productivity. However, few works study scene understanding, which is one of the most challenging topics in intelligent monitoring. In addition, although China has a very large construction sector, little AI research in the construction field is based on the Chinese language, which seriously hinders the further development of China's construction industry. Therefore, this paper proposes a Vision-based BERT (V-BERT) model for construction activity scene understanding. A Chinese caption dataset named Images of Jobsite Daily Activity and Chinese Captions (IJDACC) is created to verify V-BERT's performance. Data augmentation operations are then used to further enlarge the training set. Two evaluation systems are established to evaluate V-BERT's comprehensive performance. The experimental results show that V-BERT achieves state-of-the-art performance in the construction domain, with an average performance improvement of 171.20%.
Original language: English
Title of host publication: EG-ICE 2021 Proceedings
Subtitle of host publication: Workshop on Intelligent Computing in Engineering
Editors: Jimmy Abualdenien, André Bormann, Lucian Constantin Ungureanu, Timo Hartmann
Publisher: Universitätsverlag der TU Berlin
Pages: 508-519
Number of pages: 12
ISBN (Electronic): 978-3-7983-3212-6
ISBN (Print): 978-3-7983-3211-9
Publication status: Published - 2021
Event: 28th International Workshop on Intelligent Computing in Engineering (EG-ICE 2021) - Fabrik23 & Virtual, Berlin, Germany
Duration: 30 Jun 2021 to 2 Jul 2021
https://berlin-2021.eg-ice.org/

Publication series

Name: EG-ICE Workshop on Intelligent Computing in Engineering, Proceedings

Conference

Conference: 28th International Workshop on Intelligent Computing in Engineering (EG-ICE 2021)
Place: Germany
City: Berlin
Period: 30/06/21 to 2/07/21
Bibliographical note

Research Unit(s) information for this publication is provided by the author(s) concerned.

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
