A novel dual wing harmonium model aided by 2-D wavelet transform subbands for document data mining

Haijun Zhang, Tommy W.S. Chow, M. K M Rahman

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

3 Citations (Scopus)

Abstract

A novel dual wing harmonium model that integrates multiple features, including term frequency features and 2-D wavelet transform features, into a low-dimensional semantic space is proposed for document classification and retrieval. Terms are extracted from the graph representation of a document using a weighted feature extraction method. Because the graph is sparse, a 2-D wavelet transform is used to compress it while preserving the basic document structure. After the transform, the low-pass subbands are stacked to represent the term associations in a document. We then develop a new dual wing harmonium model that projects these multiple features into low-dimensional latent topics under different probability distribution assumptions. The contrastive divergence algorithm is used for efficient learning and inference. We perform extensive experimental verification in document classification and retrieval, and comparative results suggest that the proposed method delivers better performance than other methods. © 2009 Elsevier Ltd. All rights reserved.
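To make the compression step concrete, the following is a minimal sketch of extracting and stacking a low-pass subband from a sparse term-association matrix. It is not the authors' implementation: the abstract does not name the wavelet family or the number of decomposition levels, so the Haar wavelet, the single level, the function names `haar_lowpass_2d` and `stack_subband`, and the 4x4 example matrix are all assumptions made for illustration.

```python
def haar_lowpass_2d(matrix):
    """One level of the 2-D Haar transform, keeping only the low-pass (LL) subband.

    Assumption: the input is a rectangular list of lists with an even
    number of rows and columns.
    """
    rows, cols = len(matrix), len(matrix[0])
    if rows % 2 or cols % 2:
        raise ValueError("matrix dimensions must be even")
    ll = []
    for i in range(0, rows, 2):
        out_row = []
        for j in range(0, cols, 2):
            # Sum each 2x2 block; dividing by 2 is the orthonormal Haar scaling
            # (a 1/sqrt(2) factor applied once along rows and once along columns).
            block_sum = (matrix[i][j] + matrix[i][j + 1]
                         + matrix[i + 1][j] + matrix[i + 1][j + 1])
            out_row.append(block_sum / 2.0)
        ll.append(out_row)
    return ll


def stack_subband(subband):
    """Flatten a subband row by row into a single feature vector."""
    return [value for row in subband for value in row]


# Hypothetical sparse 4x4 term-association matrix (illustrative values only).
associations = [
    [2, 0, 0, 0],
    [0, 2, 0, 0],
    [0, 0, 0, 4],
    [0, 0, 0, 0],
]
ll = haar_lowpass_2d(associations)
features = stack_subband(ll)
```

Each low-pass coefficient summarizes a 2x2 block of the original matrix, so the 16-entry matrix is reduced to a 4-entry vector while the coarse layout of the nonzero associations is retained. In a multi-level scheme the same function could be applied again to `ll`.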
Original language: English
Pages (from-to): 4403-4412
Journal: Expert Systems with Applications
Volume: 37
Issue number: 6
Publication status: Published - Jun 2010

Research Keywords

  • 2-D wavelet
  • Document data
  • Dual wing harmonium
  • Graph representation
  • Multiple features
  • Term association
