Parallel web text clustering with a modular self-organizing map system

Research output: Journal Publications and Reviews (RGC: 21, 22, 62); 21_Publication in refereed journal; peer-review

Author(s)

  • Lean Yu
  • Shouyang Wang
  • Kin Keung Lai

Detail(s)

Original language: English
Pages (from-to): 909-916
Journal / Publication: Journal of Computational Information Systems
Volume: 3
Issue number: 3
Publication status: Published - Mar 2007

Abstract

A multistage modular self-organizing map (SOM) model is proposed for parallel web text clustering. In the first stage, the large textual dataset is partitioned into small disjoint subsets (i.e., task decomposition). In the second stage, each subset is fed to a separate unitary SOM model, which produces a word clustering map (i.e., modularization learning). In the third stage, a further SOM model integrates the word clustering results output by the SOM modules of the previous stage into a text category map (i.e., module fusion). Because the word clustering maps are embedded in the text category map, the proposed model forms a hierarchically modular SOM. For illustration and verification, a practical text clustering experiment is performed.
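The three stages described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not specify the word representation, the map sizes, or the fusion encoding, so everything below is an assumption. In particular, the helper names (train_som, bmu, word_map, encode, cluster_texts), the 1-D map topology, the use of per-subset document-occurrence vectors for words, and the concatenated unit histograms used for fusion are all hypothetical choices made for this example. Threads are used only to show the structure of the parallel stage; a process pool would be needed for real CPU parallelism in Python.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def bmu(weights, x):
    """Index of the best-matching unit (smallest squared distance to x)."""
    return min(range(len(weights)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(weights[j], x)))


def train_som(data, n_units, epochs=20, seed=0):
    """Train a small 1-D SOM (assumed topology); returns the unit weights."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays from 1 towards 0
        lr = 0.5 * frac                      # learning rate
        radius = max(1.0, n_units / 2 * frac)  # neighbourhood width
        for x in data:
            b = bmu(weights, x)
            for j, w in enumerate(weights):
                h = math.exp(-((j - b) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights


def word_map(docs, n_units):
    """Stage 2: one SOM module clusters the words of one subset.

    Assumption: a word is represented by its counts in the subset's documents.
    Returns a dict mapping each word to its unit on this module's map.
    """
    vocab = sorted({w for d in docs for w in d})
    vectors = [[float(d.count(w)) for d in docs] for w in vocab]
    weights = train_som(vectors, n_units)
    return {w: bmu(weights, v) for w, v in zip(vocab, vectors)}


def encode(doc, word_maps, n_units):
    """Encode a document as concatenated, normalised word-unit histograms."""
    hist = [0.0] * (len(word_maps) * n_units)
    for w in doc:
        for k, wm in enumerate(word_maps):
            if w in wm:
                hist[k * n_units + wm[w]] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def cluster_texts(docs, n_modules=2, word_units=3, text_units=2):
    # Stage 1 (task decomposition): split into disjoint subsets.
    subsets = [docs[i::n_modules] for i in range(n_modules)]
    subsets = [s for s in subsets if s]
    # Stage 2 (modularization learning): train the modules concurrently.
    with ThreadPoolExecutor() as pool:
        word_maps = list(pool.map(lambda s: word_map(s, word_units), subsets))
    # Stage 3 (module fusion): a final SOM over the fused document codes.
    codes = [encode(d, word_maps, word_units) for d in docs]
    text_som = train_som(codes, text_units)
    return [bmu(text_som, c) for c in codes]


sample_docs = [
    ["stock", "market", "price"],
    ["market", "price", "trade"],
    ["soccer", "goal", "team"],
    ["team", "goal", "match"],
]
labels = cluster_texts(sample_docs)  # one text-map unit index per document
```

Each document here is a list of tokens, and the returned list gives the unit on the text category map to which each document is assigned. The fixed random seed makes repeated runs reproducible; the quality of the resulting clusters on such a tiny sample is not meaningful.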

Research Area(s)

  • Modularization design, Parallel computing, Self-organizing map, Web text clustering

Citation Format(s)

Parallel web text clustering with a modular self-organizing map system. / Yu, Lean; Wang, Shouyang; Lai, Kin Keung.

In: Journal of Computational Information Systems, Vol. 3, No. 3, 03.2007, p. 909-916.
