A Scalable Framework for Stylometric Analysis Query Processing

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45); 32_Refereed conference paper (with ISBN/ISSN); peer-review

7 Scopus Citations

Author(s)

Nutanong, Sarana; Yu, Chenyun; Sarwar, Raheem; Xu, Peter; Chow, Dickson

Detail(s)

Original language: English
Title of host publication: Proceedings : 16th IEEE International Conference on Data Mining
Editors: Francesco Bonchi, Josep Domingo-Ferrer, Ricardo Baeza-Yates, Zhi-Hua Zhou, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1125-1130
ISBN (Electronic): 978-1-5090-5473-2
ISBN (Print): 978-1-5090-5472-5
Publication status: Published - Dec 2016

Publication series

ISSN (Print): 1550-4786
ISSN (Electronic): 2374-8486

Conference

Title: 16th IEEE International Conference on Data Mining (ICDM 2016)
Location: World Trade Center
Place: Spain
City: Barcelona, Catalonia
Period: 12 - 15 December 2016

Abstract

Stylometry is the statistical analysis of variations in an author's literary style. The technique has been used in many linguistic analysis applications, such as author profiling, authorship identification, and authorship verification. Over the past two decades, authorship identification has been extensively studied by researchers in the area of natural language processing. However, these studies are generally limited to (i) a small number of candidate authors and (ii) documents of similar lengths. In this paper, we propose a novel solution that overcomes these two limitations by modeling authorship attribution as a set similarity problem. We conducted extensive experiments on a real dataset collected from an online book archive, Project Gutenberg. Experimental results show that, in comparison to existing stylometry studies, our proposed solution can handle a larger number of documents of different lengths written by a larger pool of candidate authors with high accuracy.

Citation Format(s)

A Scalable Framework for Stylometric Analysis Query Processing. / Nutanong, Sarana; Yu, Chenyun; Sarwar, Raheem; Xu, Peter; Chow, Dickson.

Proceedings : 16th IEEE International Conference on Data Mining. ed. / Francesco Bonchi; Josep Domingo-Ferrer; Ricardo Baeza-Yates; Zhi-Hua Zhou; Xindong Wu. Institute of Electrical and Electronics Engineers Inc., 2016. p. 1125-1130 7837960.
