Finding similar questions in collaborative question answering archives: Toward bootstrapping-based equivalent pattern learning

Tianyong Hao, Eugene Agichtein

    Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

    21 Citations (Scopus)

    Abstract

    Many questions submitted to Collaborative Question Answering (CQA) sites are similar to questions that have already been answered. We propose a precise approach for automatically finding answers to such questions by identifying "equivalent" questions that were submitted and answered in the past. Our method automatically generates equivalent question patterns by grouping together questions that have previously obtained the same answers. The generated patterns serve as seed patterns: a new bootstrapping-based learning method matches them against more questions to extract a large number of additional equivalent patterns. The resulting patterns can be applied to match a new question to an equivalent one that has already been answered, and thus to suggest potential answers automatically. We experimented with this approach over a large collection of more than 200,000 real questions drawn from the Yahoo! Answers archive, automatically acquiring over 16,991 groups of equivalent question patterns. These patterns allow our method to obtain over 57% recall and over 54% precision in automatically suggesting answers to new questions, significantly improving over baseline methods. © 2012 Springer Science+Business Media, LLC.
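
The pipeline outlined in the abstract (group questions by shared answer, derive seed patterns, bootstrap further patterns, then match new questions) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how patterns are represented or scored, so the single-slot `<X>` pattern format, the longest-shared-token-run heuristic, the `min_support` threshold, and all function names below are assumptions made purely for illustration.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

SLOT = "<X>"  # hypothetical single-slot placeholder; the paper's pattern format may differ


def tokens(question):
    """Lowercase a question and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", question.lower())


def match_pattern(pattern, question):
    """Return the text filling the slot if the question fits the pattern, else None."""
    left, _, right = pattern.partition(SLOT)
    regex = re.escape(left) + r"(.+)" + re.escape(right)
    m = re.fullmatch(regex, " ".join(tokens(question)))
    return m.group(1) if m else None


def seed_patterns(qa_pairs):
    """Group questions that received the same answer, then turn each pair's
    longest shared token run into a slot (an assumed abstraction heuristic)."""
    by_answer = defaultdict(list)
    for question, answer in qa_pairs:
        by_answer[answer.strip().lower()].append(tokens(question))
    groups = []
    for group_questions in by_answer.values():
        base, group = group_questions[0], set()
        for other in group_questions[1:]:
            m = SequenceMatcher(None, base, other, autojunk=False).find_longest_match(
                0, len(base), 0, len(other))
            if m.size == 0:
                continue
            for toks, start in ((base, m.a), (other, m.b)):
                pattern = " ".join(toks[:start] + [SLOT] + toks[start + m.size:])
                if pattern != SLOT:  # a bare slot would match every question
                    group.add(pattern)
        if len(group) >= 2:
            groups.append(group)
    return groups


def bootstrap(groups, questions, rounds=2, min_support=2):
    """Grow each pattern group: collect slot fillers matched by known patterns,
    then accept new patterns that occur with at least min_support of those fillers."""
    groups = [set(g) for g in groups]  # copy so the seed groups stay untouched
    normalized = [" ".join(tokens(q)) for q in questions]
    for _ in range(rounds):
        for group in groups:
            fillers = set()
            for pattern in group:
                for q in normalized:
                    filler = match_pattern(pattern, q)
                    if filler:
                        fillers.add(filler)
            support = defaultdict(set)
            for q in normalized:
                padded = f" {q} "
                for filler in fillers:
                    if f" {filler} " in padded:
                        candidate = padded.replace(f" {filler} ", f" {SLOT} ", 1).strip()
                        if candidate != SLOT and candidate not in group:
                            support[candidate].add(filler)
            group.update(c for c, seen in support.items() if len(seen) >= min_support)
    return groups


def suggest_answer(new_question, groups, qa_pairs):
    """Return the answer of an archived question that is equivalent to the new one
    (same pattern group, same slot filler), or None if there is no such question."""
    for group in groups:
        for pattern in group:
            filler = match_pattern(pattern, new_question)
            if filler is None:
                continue
            for question, answer in qa_pairs:
                if any(match_pattern(p, question) == filler for p in group):
                    return answer
    return None
```

In this sketch, two archived questions such as "How do I reset my router?" and "What is the best way to reset my router?" that share an answer would seed one pattern group, and unanswered questions that reuse the same slot fillers under a different phrasing would extend that group. Requiring support from several distinct fillers is one simple guard against accepting patterns that merely share a topic; the actual paper may use a different acceptance criterion.
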
    Original language: English
    Pages (from-to): 332-353
    Journal: Information Retrieval
    Volume: 15
    Issue number: 3-4
    Publication status: Published - Jun 2012

    Research Keywords

    • Bootstrapping
    • Collaborative question answering
    • Equivalent pattern
    • Pattern extension
