Pattern-based content lossless compression of Chinese document images

Maggie M. K. Tsui, Alan Wee-Chung Liew, Hong Yan

    Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

    Abstract

    Compression of scanned text document images is important in modern document management, communication, and retrieval systems. However, most existing compression techniques have been studied extensively only for documents in English or similar alphabet-based languages. In this paper, we propose a content-lossless scheme for the compression of Chinese text documents. The method exploits radical structure, which is unique to Chinese characters, to minimize the size of the compressed documents. It consists of two main parts: the first develops a radical pattern library, and the second uses that library to match character patterns in a document. The technique has been tested on many Chinese text document images with good results.
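To make the two-part idea concrete, the following is a minimal sketch of generic pattern matching and substitution against a pattern library. It is not the authors' algorithm, which the abstract does not specify. The names `mismatch`, `best_match`, and `encode`, the tuple-based bitmap representation, the token format, and the `max_mismatch` threshold are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Assumed representation: a binary image as rows of 0/1 pixels.
Bitmap = Tuple[Tuple[int, ...], ...]
# Assumed input: a segmented component with its page position (x, y).
Component = Tuple[int, int, Bitmap]


def mismatch(a: Bitmap, b: Bitmap) -> Optional[int]:
    """Count differing pixels, or return None if the shapes differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return None
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def best_match(component: Bitmap, library: List[Bitmap],
               max_mismatch: int) -> Optional[int]:
    """Return the index of the closest library pattern within the threshold."""
    best_index, best_score = None, max_mismatch + 1
    for index, pattern in enumerate(library):
        score = mismatch(component, pattern)
        if score is not None and score < best_score:
            best_index, best_score = index, score
    return best_index


def encode(components: List[Component], library: List[Bitmap],
           max_mismatch: int = 1) -> list:
    """Replace matched components with library references; keep the rest raw."""
    tokens = []
    for x, y, bitmap in components:
        index = best_match(bitmap, library, max_mismatch)
        if index is None:
            tokens.append(("raw", x, y, bitmap))  # no usable pattern: store pixels
        else:
            tokens.append(("ref", x, y, index))   # store only position and pattern id
    return tokens
```

In this sketch a component that matches a library entry costs only a position and an index, while unmatched components fall back to their raw bitmap. How the library is built from radicals, and how much pixel mismatch a content-lossless scheme may tolerate, are left open here because the abstract does not say.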
    Original language: English
    Title of host publication: 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004
    Pages: 607-610
    Publication status: Published - 2004
    Event: 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004 - Hong Kong, China
    Duration: 20 Oct 2004 to 22 Oct 2004

    Conference

    Conference: 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004
    Place: Hong Kong, China
    City: Hong Kong, China
    Period: 20/10/04 to 22/10/04
