一种基于文本相似度矩阵运算的非结构化海量投诉数据分类算法

Translated title of the contribution: A Text Similarity Matrix Operation-Based Classification Algorithm for Large-scale Unstructured Complaint Data

李青*, 陳陽, 謝浩然, 蒙聖光

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Abstract

With the rapid development of the Internet and information technology, the volume of unstructured data is growing exponentially, and the prevalence of Web 2.0 online communities further accelerates this growth. How to manage and organize large-scale unstructured data effectively, so as to facilitate end-user access to information, has therefore become an urgent and important research topic. In this paper, building on text modeling of unstructured data and on text similarity, existing classification algorithms for large-scale unstructured data are surveyed and discussed, and they are applied to a classification system for China Mobile user complaint data. In that system, the effectiveness of complaint data processing is shown to be much improved, which verifies the practical use of the proposed classification algorithm and system architecture.
Original language: Chinese (Simplified)
Pages (from-to): 103-107
Journal: 计算机工程与科学
Volume: 34
Issue number: 1 (cumulative issue No. 205)
Publication status: Published - Jan 2012

Research Keywords

  • 文本相似度
  • 非结构化数据
  • 投诉数据分类系统
  • Text similarity
  • Unstructured data
  • Complaint data classification system
