Abstract
With the spread of surveillance cameras, many vision-based artificial intelligence (AI) agents have been applied to construction projects, significantly improving management efficiency and worker productivity. However, only a few works study scene understanding, because it is one of the most challenging topics in intelligent monitoring. Moreover, although China is a major construction country, it lacks Chinese-language AI research in the construction field, which seriously hinders the further development of its construction industry. This paper therefore proposes a Vision-based BERT (V-BERT) model for construction activity scene understanding. A Chinese caption dataset named Images of Jobsite Daily Activity and Chinese Captions (IJDACC) is created to verify V-BERT's performance, and data augmentation operations are used to further enlarge the training set. Two evaluation systems are established to assess V-BERT's overall performance. The experimental results show that V-BERT achieves state-of-the-art performance in the construction domain, with an average performance improvement of 171.20%.
| Original language | English |
|---|---|
| Title of host publication | EG-ICE 2021 Proceedings |
| Subtitle of host publication | Workshop on Intelligent Computing in Engineering |
| Editors | Jimmy Abualdenien, André Bormann, Lucian Constantin Ungureanu, Timo Hartmann |
| Publisher | Universitätsverlag der TU Berlin |
| Pages | 508-519 |
| Number of pages | 12 |
| ISBN (Electronic) | 978-3-7983-3212-6 |
| ISBN (Print) | 978-3-7983-3211-9 |
| Publication status | Published - 2021 |
| Event | 28th International Workshop on Intelligent Computing in Engineering (EG-ICE 2021), Fabrik23 & Virtual, Berlin, Germany; Duration: 30 Jun 2021 → 2 Jul 2021; https://berlin-2021.eg-ice.org/ |
Publication series
| Name | EG-ICE Workshop on Intelligent Computing in Engineering, Proceedings |
|---|---|
Conference
| Conference | 28th International Workshop on Intelligent Computing in Engineering (EG-ICE 2021) |
|---|---|
| Place | Germany |
| City | Berlin |
| Period | 30/06/21 → 2/07/21 |
Bibliographical note
Research Unit(s) information for this publication is provided by the author(s) concerned.
Publisher's Copyright Statement
- This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
Publication title
Image captioning in Chinese for construction activity scene understanding using a pre-trained cross-modal language model