Parallel web text clustering with a modular self-organizing map system

Lean Yu, Shouyang Wang, Kin Keung Lai

    Research output: Journal Publications and Reviews, RGC 21 - Publication in refereed journal, peer-review

    Abstract

    A multistage modular self-organizing map (SOM) model is proposed for parallel web text clustering. In the first stage, a large textual dataset is divided into several small disjoint datasets (i.e., task decomposition). In the second stage, each small dataset is fed into a separate unitary SOM model to produce a word clustering map (i.e., modularization learning). In the third stage, another SOM model integrates the word clustering results output by the SOM modules of the previous stage to form a text category map (i.e., module fusion). Because the word clustering maps are embedded in the text category map, the proposed model is a hierarchically modular SOM. For illustration and verification purposes, a practical text clustering experiment is performed.
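
The three stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes documents are already encoded as numeric vectors (for example term-frequency vectors), and the function names `train_som`, `unit_distances`, and `modular_som` are hypothetical. The fusion step used here (concatenating each sample's distances to the units of every module SOM and training a final SOM on that representation) is one plausible reading of "module fusion"; the abstract does not specify the exact mechanism. The module SOMs are trained sequentially for brevity, although they are independent and could run in parallel.

```python
import numpy as np


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns a codebook of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    n_units = rows * cols
    # Initialise codebook vectors from randomly chosen samples.
    weights = data[rng.integers(0, len(data), n_units)].astype(float)
    # Grid coordinates of each unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3    # shrinking neighbourhood radius
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))    # Gaussian neighbourhood weights
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def unit_distances(data, weights):
    """Euclidean distance from every sample to every SOM unit: (n_samples, n_units)."""
    return np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)


def modular_som(data, n_modules=3, module_shape=(3, 3), fusion_shape=(2, 2), seed=0):
    rng = np.random.default_rng(seed)
    # Stage 1 (task decomposition): split sample indices into disjoint subsets.
    parts = np.array_split(rng.permutation(len(data)), n_modules)
    # Stage 2 (modular learning): one SOM per subset; these runs are independent.
    modules = [
        train_som(data[idx], *module_shape, seed=seed + i + 1)
        for i, idx in enumerate(parts)
    ]
    # Stage 3 (module fusion, assumed scheme): describe each sample by its
    # distances to all module units, then train a final SOM on that description.
    features = np.hstack([unit_distances(data, w) for w in modules])
    fusion = train_som(features, *fusion_shape, seed=seed)
    labels = unit_distances(features, fusion).argmin(axis=1)
    return parts, modules, fusion, labels


# Toy usage with random vectors standing in for document features.
docs = np.random.default_rng(42).random((60, 8))
parts, modules, fusion, labels = modular_som(docs)
print(labels.shape, fusion.shape)
```

Each entry of `labels` is the index of the fusion-map unit a sample is assigned to, which plays the role of a text category in this sketch. The map sizes and decay schedules are arbitrary illustrative choices.
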
    Original language: English
    Pages (from-to): 909-916
    Journal: Journal of Computational Information Systems
    Volume: 3
    Issue number: 3
    Publication status: Published - Mar 2007

    Research Keywords

    • Modularization design
    • Parallel computing
    • Self-organizing map
    • Web text clustering
