Abstract
With the rapid development of the Internet and information technology, the volume of unstructured data is growing exponentially, and the prevalence of Web 2.0 online communities further accelerates this growth. How to manage and organize large-scale unstructured data effectively, so as to facilitate end-user information access, has therefore become an urgent and important research topic. Based on text modeling of unstructured data and on text similarity, this paper surveys and discusses existing classification algorithms for large-scale unstructured data and applies them to a China Mobile user complaint data classification system. In that system, the effectiveness of complaint data processing is shown to be much improved, and the practicality of the proposed classification algorithm and system architecture is verified.
Translated title of the contribution | A Text Similarity Matrix Operation-Based Classification Algorithm for Large-scale Unstructured Complaint Data |
---|---|
Original language | Chinese (Simplified) |
Pages (from-to) | 103-107 |
Journal | 计算机工程与科学 (Computer Engineering & Science) |
Volume | 34 |
Issue number | 1 (cumulative issue no. 205) |
Publication status | Published - Jan 2012 |
Research Keywords
- 文本相似度
- 非结构化数据
- 投诉数据分类系统
- Text similarity
- Unstructured data
- Complaint data classification system