Abstract
Unsafe construction behavior, one of the leading factors of accidents and casualties, can be reduced by strengthening construction inspection. However, current methods rely on either manual inspection or inefficient cross-modal models built on multiple backbone networks. To alleviate these problems, a “rule-question” transformation and annotation system is formulated, and unsafe behavior detection is recast as a visual reasoning task: visual question answering (VQA). The VQA model is developed based on a vision-and-language Transformer, and unsafe behavior is identified from the output answers. A dataset containing 16 safety rules and 2386 related construction images is used to fine-tune and validate the VQA model. The results show that the developed VQA model achieves an average recall of 0.81 at a faster reasoning speed. Finally, an applet for safety report generation is implemented to demonstrate the feasibility and practicality of safety compliance checking based on VQA.
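The “rule-question” idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The two rules, the yes/no answer format, the `stub_vqa` function standing in for the fine-tuned model, and the way recall is computed are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass(frozen=True)
class SafetyRule:
    rule_id: str
    text: str         # the rule as it might appear in a safety code (hypothetical)
    question: str     # yes/no question derived from the rule
    safe_answer: str  # answer that indicates compliance


# Hypothetical rules; the paper's 16 rules are not listed in this record.
RULES = [
    SafetyRule("R1", "Workers must wear a helmet on site.",
               "Is the worker wearing a helmet?", "yes"),
    SafetyRule("R2", "Workers must not stand under a lifted load.",
               "Is the worker standing under a lifted load?", "no"),
]


def check_image(image_id: str,
                rules: List[SafetyRule],
                answer_fn: Callable[[str, str], str]) -> List[str]:
    """Return ids of rules whose VQA answer differs from the safe answer."""
    violations = []
    for rule in rules:
        answer = answer_fn(image_id, rule.question).strip().lower()
        if answer != rule.safe_answer:
            violations.append(rule.rule_id)
    return violations


def recall(predicted: Set[str], actual: Set[str]) -> float:
    """Fraction of actual violations that were detected (assumed definition)."""
    if not actual:
        return 1.0
    return len(predicted & actual) / len(actual)


# Stand-in for a VQA model: canned answers keyed by (image, question).
CANNED = {
    ("img_001", "Is the worker wearing a helmet?"): "no",
    ("img_001", "Is the worker standing under a lifted load?"): "no",
}


def stub_vqa(image_id: str, question: str) -> str:
    return CANNED[(image_id, question)]


violations = check_image("img_001", RULES, stub_vqa)
print(violations)
```

In a real pipeline, `stub_vqa` would be replaced by a call to the fine-tuned vision-and-language model, and the list of violated rule ids could feed a safety report.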
| Original language | English |
|---|---|
| Article number | 104580 |
| Journal | Automation in Construction |
| Volume | 144 |
| Online published | 7 Oct 2022 |
| Publication status | Published - Dec 2022 |
Funding
The Shenzhen Science and Technology Innovation Committee (Grant #JCYJ20180507181647320) and the Research Grants Council (Grant #11211622) jointly supported this work. The conclusions herein are those of the authors and do not necessarily reflect the views of the sponsoring agencies.
Research Keywords
- Construction safety management
- Cross-modal model
- Safety compliance checking
- Vision-and-language Transformer
- Visual question answering
- Visual reasoning
Publisher's Copyright Statement
- COPYRIGHT TERMS OF DEPOSITED POSTPRINT FILE: © 2022. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/.
RGC Funding Information
- RGC-funded
Fingerprint
Dive into the research topics of 'Safety compliance checking of construction behaviors using visual question answering'. Together they form a unique fingerprint.
Projects
- 1 Active
- GRF: Automatic Detection of Safety Violations using Vision and Knowledge
  LUO, X. (Principal Investigator / Project Coordinator) & SONG, L. (Co-Investigator)
  1/09/22 → …
  Project: Research