Document page segmentation and layout analysis using soft ordering

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) > 22_Publication in policy or professional journal

9 Scopus Citations

Detail(s)

Original language: English
Pages (from-to): 458-461
Journal / Publication: Proceedings - International Conference on Pattern Recognition
Volume: 15
Issue number: 1
Publication status: Published - 2000

Abstract

This paper presents a novel algorithm for layout analysis of document images. A major component of this algorithm is the independent segmentation algorithm that identifies text and graphics regions. The segmentation algorithm first locates document patterns and then performs classification using run-length characteristics, spread analysis and adjacency relations. A key feature of the layout analysis algorithm is soft ordering, which provides a means of ordering regions in a more logical way and allows for some overlapping between separate regions. This is very useful for processing documents that are slightly skewed or irregular in layout. The algorithm has been tested on many different documents, and can successfully recognise single and multicolumn documents, even when the column format varies several times on one page. Furthermore, it can process documents with text tightly wrapped around graphics and documents that are slightly skewed. © 2000 IEEE.
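
The abstract does not spell out how soft ordering is computed, so the following is only a minimal sketch of the general idea of ordering regions while tolerating a small overlap between them. It is not the paper's algorithm. The names `Region`, `overlap`, `reading_order` and the pixel `tolerance` parameter are hypothetical, and the sample coordinates are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical bounding box of a segmented region, in pixels."""
    name: str
    left: int
    top: int
    right: int
    bottom: int


def overlap(a_lo, a_hi, b_lo, b_hi):
    """Length of the overlap between two 1-D intervals (0 if disjoint)."""
    return max(0, min(a_hi, b_hi) - max(a_lo, b_lo))


def reading_order(regions, tolerance=5):
    """Order regions column by column, top to bottom within a column.

    Assumption (not from the paper): a region joins an existing column
    only if it overlaps that column's horizontal extent by MORE than
    `tolerance` pixels, so a slight overlap caused by skew is ignored.
    """
    columns = []  # each column is a list of regions
    for region in sorted(regions, key=lambda r: (r.left, r.top)):
        for column in columns:
            col_left = min(r.left for r in column)
            col_right = max(r.right for r in column)
            if overlap(region.left, region.right, col_left, col_right) > tolerance:
                column.append(region)
                break
        else:
            # No column overlapped enough: start a new one.
            columns.append([region])
    columns.sort(key=lambda col: min(r.left for r in col))
    return [r.name for col in columns for r in sorted(col, key=lambda r: r.top)]


# Made-up two-column page, slightly skewed so the columns overlap by 3 px.
page = [
    Region("A", left=10, top=10, right=103, bottom=50),
    Region("B", left=12, top=48, right=101, bottom=90),
    Region("C", left=100, top=12, right=190, bottom=60),
    Region("D", left=102, top=58, right=192, bottom=95),
]

print(reading_order(page, tolerance=5))  # columns kept apart
print(reading_order(page, tolerance=0))  # strict test merges the columns
```

With a tolerance of 5 pixels the 3-pixel overlap between the two columns is ignored and the regions are read column by column. With a tolerance of 0 the same overlap merges both columns into one, which interleaves the regions. This sketch only handles a fixed column structure; it does not attempt the varying column formats or text wrapped around graphics that the abstract reports.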