Parallel web text clustering with a modular self-organizing map system
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Author(s)
Yu, Lean; Wang, Shouyang; Lai, Kin Keung
Detail(s)
Original language | English |
---|---|
Pages (from-to) | 909-916 |
Journal / Publication | Journal of Computational Information Systems |
Volume | 3 |
Issue number | 3 |
Publication status | Published - Mar 2007 |
Abstract
A multistage modular self-organizing map (SOM) model is proposed for parallel web text clustering. In the first stage, a large text dataset is divided into several small disjoint subsets (task decomposition). In the second stage, each subset is fed to a separate unitary SOM model, which learns a word clustering map (modular learning). In the third stage, another SOM model integrates the word clustering results output by the SOM modules of the previous stage to form a text category map (module fusion). Because the word clustering maps are embedded in the text category map, the result is a hierarchically modular SOM model. A practical text clustering experiment is performed to illustrate and verify the model.
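The three stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives no details on features, map sizes, or the fusion rule, so the fixed-length word vectors (`word_vecs`), the round-robin split, the concatenated-histogram document encoding, the grid sizes, and the helper names `train_som`, `histogram`, and `modular_som` are all hypothetical. A thread pool stands in for the parallel execution of the modules.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    n, dim = data.shape
    weights = rng.random((rows * cols, dim))
    # Grid coordinates of every unit, used by the Gaussian neighborhood.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    sigma0 = max(rows, cols) / 2.0
    total, step = epochs * n, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # linearly shrinking neighborhood
            bmu = np.argmin(((weights - data[i]) ** 2).sum(axis=1))
            h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (data[i] - weights)
            step += 1
    return weights

def histogram(weights, vectors):
    """Fraction of the given vectors whose best-matching unit is each map unit."""
    counts = np.zeros(len(weights))
    for v in vectors:
        counts[np.argmin(((weights - v) ** 2).sum(axis=1))] += 1
    return counts / max(len(vectors), 1)

def modular_som(docs, word_vecs, n_modules=2, word_grid=(3, 3), doc_grid=(2, 2)):
    # Stage 1 (task decomposition): split documents into disjoint subsets.
    # Round-robin splitting is an assumption made for this sketch.
    parts = [docs[i::n_modules] for i in range(n_modules)]

    # Stage 2 (modular learning): one word-clustering SOM per subset.
    def fit_module(k):
        vocab = sorted({w for d in parts[k] for w in d})
        data = np.array([word_vecs[w] for w in vocab])
        return train_som(data, *word_grid, seed=k)

    # Threads are a stand-in here; a process pool or cluster would be
    # needed for real CPU-level parallelism.
    with ThreadPoolExecutor(max_workers=n_modules) as pool:
        word_maps = list(pool.map(fit_module, range(n_modules)))

    # Stage 3 (module fusion): encode each document by its word-cluster
    # histograms on every module's map (assumed encoding), then train a
    # second-level SOM on those features as the text category map.
    features = np.array([
        np.concatenate([histogram(m, [word_vecs[w] for w in d]) for m in word_maps])
        for d in docs
    ])
    doc_map = train_som(features, *doc_grid, seed=99)
    labels = [int(np.argmin(((doc_map - f) ** 2).sum(axis=1))) for f in features]
    return word_maps, doc_map, labels

# Toy usage with made-up documents and random word vectors (illustrative only).
docs = [["web", "page", "link"], ["stock", "market", "price"],
        ["web", "link", "site"], ["market", "price", "trade"]]
rng = np.random.default_rng(0)
word_vecs = {w: rng.random(8) for w in sorted({w for d in docs for w in d})}
word_maps, doc_map, labels = modular_som(docs, word_vecs)
```

Because the word vectors in this toy run are random, the resulting categories carry no semantic meaning; the sketch only shows how the word-level maps feed the document-level map.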
Research Area(s)
- Modularization design, Parallel computing, Self-organizing map, Web text clustering
Citation Format(s)
Parallel web text clustering with a modular self-organizing map system. / Yu, Lean; Wang, Shouyang; Lai, Kin Keung.
In: Journal of Computational Information Systems, Vol. 3, No. 3, 03.2007, p. 909-916.