Compression of Chinese document images based on morphologic analysis and pattern matching

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) / 21_Publication in refereed journal / peer-review

3 Scopus Citations



Original language: English
Article number: 107001
Journal / Publication: Optical Engineering
Issue number: 10
Publication status: Published - Oct 2006


We propose a highly efficient content-lossless compression scheme for Chinese document images. The scheme combines morphologic analysis with pattern matching to cluster character patterns. To obtain error maps with as few errors as possible, morphologic analysis is used to decompose and recompose the Chinese character patterns. The pattern-matching criteria are adapted to the characteristics of Chinese characters. Because small components can sometimes be inserted into the blank spaces of large components, the pattern library images can be kept small. Arithmetic coding is applied for the final compression. Our method achieves much better compression performance than most alternative methods and ensures content-lossless reconstruction. © 2006 Society of Photo-Optical Instrumentation Engineers.
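
The general idea of clustering patterns and storing error maps can be illustrated with a minimal sketch. It is not the authors' method: the greedy clustering, the `max_errors` threshold, the same-size restriction, and all function names are assumptions made for illustration. The morphologic decomposition, the Chinese-specific matching criteria, the packing of small components into blank spaces, and the arithmetic coder are not reproduced. Because this sketch stores complete error maps, its reconstruction is pixel-exact, which is stricter than the content-lossless criterion named in the abstract.

```python
from typing import List, Tuple

Bitmap = List[List[int]]  # rows of 0/1 pixels


def shape(b: Bitmap) -> Tuple[int, int]:
    """Return (height, width) of a bitmap."""
    return (len(b), len(b[0]) if b else 0)


def error_map(a: Bitmap, b: Bitmap) -> Bitmap:
    """Pixel-wise XOR of two same-size bitmaps: 1 where they disagree."""
    return [[pa ^ pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def error_count(a: Bitmap, b: Bitmap) -> int:
    """Number of differing pixels between two same-size bitmaps."""
    return sum(sum(row) for row in error_map(a, b))


def cluster(patterns: List[Bitmap], max_errors: int):
    """Greedy clustering (an illustrative assumption, not the paper's rule).

    Each pattern is assigned to the closest existing library entry of the
    same size if it differs in at most max_errors pixels; otherwise it
    becomes a new library entry. Returns the library and, per pattern,
    a (library index, error map) pair.
    """
    library: List[Bitmap] = []
    assignments: List[Tuple[int, Bitmap]] = []
    for p in patterns:
        best_idx, best_err = -1, max_errors + 1
        for idx, rep in enumerate(library):
            if shape(rep) != shape(p):
                continue
            e = error_count(p, rep)
            if e < best_err:
                best_idx, best_err = idx, e
        if best_idx < 0:
            library.append(p)
            best_idx = len(library) - 1
        assignments.append((best_idx, error_map(p, library[best_idx])))
    return library, assignments


def reconstruct(library: List[Bitmap], assignments) -> List[Bitmap]:
    """Recover every pattern as (library entry XOR error map)."""
    return [error_map(library[idx], err) for idx, err in assignments]
```

In a complete coder, the library bitmaps, the indices, and the error maps would then be passed to an entropy coder such as the arithmetic coder mentioned in the abstract. Sparser error maps generally compress better, which is why the abstract stresses minimizing the number of errors.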

Research Area(s)

  • Document image compression, Morphology, Pattern matching