YOLO-OCR: End-to-end Compound Figure Separation and Label Recognition of Images in Scientific Publications

Shuo Meng, Xinshuo Liang, Shuai Zhang, Leqi Lei, Hanbai Wu, Saira Iqbal, Jinlian Hu*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

Scientific publications, especially biomedical publications, contain a large number of compound figures, each composed of multiple graphs, plots, and drawings. With the growing interest in data mining, scientific image understanding, and retrieval, compound figure separation and label recognition have become vital steps for various downstream tasks. However, existing methods are difficult to apply to increasingly complex scenarios, and they usually treat these two tasks separately. In this work, we propose a new model called YOLO-OCR that performs compound figure separation and label recognition simultaneously. YOLO-OCR carries out object detection, text detection, and text recognition together in a unified, end-to-end trainable network. Because these tasks share convolutional features, the model has lower computational cost and higher performance. To reduce annotation cost, we train the model on a synthesized compound figure dataset and then fine-tune it on real compound figure datasets using an active learning strategy. The results show that the proposed method achieves new state-of-the-art performance on the ImageCLEF 2016 dataset and on our own dataset. In addition, we developed an online system based on the proposed model to help researchers conveniently separate compound figures. The project is publicly available at https://www.chatfigures.com/figure-separation.

Keywords: Compound figure separation, Label recognition, Information retrieval, Object detection, Text recognition.

Copyright © 2024 by SIAM.

Original language: English
Title of host publication: Proceedings of the 2024 SIAM International Conference on Data Mining (SDM)
Editors: Shashi Shekhar, Vagelis Papalexakis, Jing Gao, Zhe Jiang, Matteo Riondato
Publisher: Society for Industrial and Applied Mathematics
Pages: 118-126
ISBN (Electronic): 9781611978032
Publication status: Published - Apr 2024
Event: 2024 SIAM International Conference on Data Mining (SDM24) - Houston, United States
Duration: 18 Apr 2024 to 20 Apr 2024
https://www.siam.org/conferences/cm/conference/sdm24
https://www.siam.org/conferences-events/past-event-archive/sdm24/

Publication series

Name: Proceedings of the SIAM International Conference on Data Mining, SDM

Conference

Conference: 2024 SIAM International Conference on Data Mining (SDM24)
Country/Territory: United States
City: Houston
Period: 18/04/24 to 20/04/24

Research Keywords

  • Compound figure separation
  • Label recognition
  • Information retrieval
  • Object detection
  • Text recognition
