Abstract
We present an algorithm that can determine the layout of an arbitrary document with great flexibility. The bottom-up approach of pattern extraction and classification provides good segmentation and is insensitive to skew. Soft ordering is a feature that improves segmentation by allowing distinct regions to physically overlap. It is also used to determine the correct order of the document regions. The algorithm can extract and place all the distinct document regions into a logical layout and column structure. © 2002 Society of Photo-Optical Instrumentation Engineers.
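The abstract does not spell out how soft ordering works, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes each region is an axis-aligned bounding box, groups regions into columns by horizontal overlap, and orders each column by vertical centre, so that boxes which overlap slightly (for example because of skew) are still ordered consistently. The names `Region`, `horizontal_overlap`, `group_into_columns`, `reading_order` and the `min_overlap` threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned bounding box of a document region (hypothetical type)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float


def horizontal_overlap(a: Region, b: Region) -> float:
    """Fraction of the narrower region's width that both regions share."""
    shared = min(a.right, b.right) - max(a.left, b.left)
    narrower = min(a.right - a.left, b.right - b.left)
    return max(0.0, shared) / narrower if narrower > 0 else 0.0


def group_into_columns(regions, min_overlap=0.5):
    """Put each region in the first column it overlaps horizontally."""
    columns = []
    for region in sorted(regions, key=lambda r: r.left):
        for column in columns:
            if any(horizontal_overlap(region, m) >= min_overlap for m in column):
                column.append(region)
                break
        else:
            # No existing column overlaps enough: start a new one.
            columns.append([region])
    return columns


def reading_order(regions, min_overlap=0.5):
    """Order columns left to right, then regions top to bottom by centre."""
    columns = group_into_columns(regions, min_overlap)
    columns.sort(key=lambda col: min(r.left for r in col))
    ordered = []
    for column in columns:
        # Using the vertical centre tolerates boxes that overlap at the edges.
        ordered.extend(sorted(column, key=lambda r: (r.top + r.bottom) / 2))
    return [r.name for r in ordered]
```

Comparing centres rather than strict box edges is one simple way to tolerate physically overlapping regions. This sketch does not handle regions that span several columns, such as a full-width title.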
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 2831-2843 |
| Journal | Optical Engineering |
| Volume | 41 |
| Issue number | 11 |
| Publication status | Published - Nov 2002 |
Research Keywords
- Document layout analysis
- Page segmentation
- Pattern classification
- Soft ordering