TY - JOUR
T1 - Parallel web text clustering with a modular self-organizing map system
AU - Yu, Lean
AU - Wang, Shouyang
AU - Lai, Kin Keung
PY - 2007/3
Y1 - 2007/3
N2 - A multistage modular self-organizing map (SOM) model is proposed for parallel web text clustering. In the first stage, the large textual datasets are divided into several small disjoint datasets (i.e., task decomposition). In the second stage, each small dataset is input into a separate unitary SOM model to produce a word clustering map (i.e., modularization learning). In the third stage, based on the outputs of the SOM modules from the previous stage, another SOM model integrates the different word clustering results to form a text category map (i.e., module fusion). In the proposed model, the word clustering maps are embedded into the text category map, so that a hierarchically modular SOM model is formed. For illustration and verification purposes, a practical text clustering experiment is performed.
AB - A multistage modular self-organizing map (SOM) model is proposed for parallel web text clustering. In the first stage, the large textual datasets are divided into several small disjoint datasets (i.e., task decomposition). In the second stage, each small dataset is input into a separate unitary SOM model to produce a word clustering map (i.e., modularization learning). In the third stage, based on the outputs of the SOM modules from the previous stage, another SOM model integrates the different word clustering results to form a text category map (i.e., module fusion). In the proposed model, the word clustering maps are embedded into the text category map, so that a hierarchically modular SOM model is formed. For illustration and verification purposes, a practical text clustering experiment is performed.
KW - Modularization design
KW - Parallel computing
KW - Self-organizing map
KW - Web text clustering
UR - http://www.scopus.com/inward/record.url?scp=34250734662&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-34250734662&origin=recordpage
M3 - RGC 21 - Publication in refereed journal
SN - 1553-9105
VL - 3
SP - 909
EP - 916
JO - Journal of Computational Information Systems
JF - Journal of Computational Information Systems
IS - 3
ER -