A Scalable Framework for Stylometric Analysis of Multi-author Documents

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) > 32_Refereed conference paper (with ISBN/ISSN) > peer-review

8 Scopus Citations

Author(s)

  • Raheem Sarwar
  • Chenyun Yu
  • Sarana Nutanong
  • Norawit Urailertprasert
  • Nattapol Vannaboot
  • Thanawin Rakthanmanon

Detail(s)

Original language: English
Title of host publication: Database Systems for Advanced Applications
Subtitle of host publication: 23rd International Conference, DASFAA 2018, Proceedings
Editors: Jian Pei, Yannis Manolopoulos, Shazia Sadiq, Jianxin Li
Publisher: Springer, Cham
Pages: 813-829
Volume: 1
ISBN (Electronic): 9783319914527
ISBN (Print): 9783319914510
Publication status: Published - 2018

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer, Cham
Volume: LNCS 10827
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Title: 23rd International Conference on Database Systems for Advanced Applications (DASFAA 2018)
Place: Australia
City: Gold Coast
Period: 21 - 24 May 2018

Abstract

Stylometry is a statistical technique used to analyze variations in authors’ writing styles and is typically applied to authorship attribution problems. In this investigation, we apply stylometry to the authorship identification of multi-author documents (AIMD) task. We propose an AIMD technique called Co-Authorship Graph (CAG), which can be used to collaboratively attribute different portions of documents to different authors belonging to the same community. Based on CAG, we propose a novel AIMD solution which (i) significantly outperforms the existing state-of-the-art solution; (ii) can effectively handle a larger number of co-authors; and (iii) is capable of handling the case in which some of the listed co-authors have not contributed to the document as writers. We conducted an extensive experimental study comparing the proposed solution with the best existing AIMD method on real and synthetic datasets. The results show that the proposed solution significantly outperforms the existing state-of-the-art method.

Research Area(s)

  • Authorship identification, Co-Authorship Graph, Multi-author documents, Stylometry

Citation Format(s)

A Scalable Framework for Stylometric Analysis of Multi-author Documents. / Sarwar, Raheem; Yu, Chenyun; Nutanong, Sarana; Urailertprasert, Norawit; Vannaboot, Nattapol; Rakthanmanon, Thanawin.

Database Systems for Advanced Applications: 23rd International Conference, DASFAA 2018, Proceedings. ed. / Jian Pei; Yannis Manolopoulos; Shazia Sadiq; Jianxin Li. Vol. 1 Springer, Cham, 2018. p. 813-829 (Lecture Notes in Computer Science; Vol. LNCS 10827).
