A Corpus-based Study of Assessments in Daily Conversations

基於語料庫的日常會話評價功能研究

Student thesis: Doctoral Thesis

Author(s)

  • Yanjiao LI

Detail(s)

Awarding Institution
Supervisor(s)
Award date: 17 Jun 2016

Abstract

This thesis describes a corpus-based investigation into one dialogue act (DA), assessment, in daily conversations. The study draws on a large collection of naturally occurring telephone conversations, compiled in the Switchboard Dialogue Act (SwDA) Corpus, to account for how native speakers use assessments in interactive speech. Four main goals are addressed: 1) to specify the lexical items used in expressing assessments, 2) to identify the syntactic structures of assessment utterances, 3) to explore the relationship between linguistic forms and communicative functions, and 4) to differentiate assessments from other functions by three parameters: particular expressions, turn positions, and preceding contexts. The work enriches our understanding of assessments in several ways. First, it presents a comprehensive empirical overview of assessment utterances in daily conversations using a large-scale spoken corpus. Second, it describes in detail the syntactic structures used to express assessments in interactive speech. Finally, it offers a first attempt to verify and test assessment expressions and structures against a large corpus of naturally occurring data. The dissertation examines the relationship between linguistic forms and functions in depth, providing empirical information on the multi-functionality and ambiguity of assessment utterances. In addition, it identifies subtle similarities and distinctions between assessments and other functions, which support an enriched understanding of DA taxonomy. These issues are particularly important because the taxonomy is considered an essential part of human-machine dialogue systems.
In the first stage of this dissertation, 1,126 tagged conversations are examined to extract utterances identified as assessment (coded as “ba”). These utterances are divided into a range of categories according to POS tags and syntactic constructions, including clauses, interjections, nominal phrases, adjective phrases, prepositional phrases and adverb phrases. Clauses, which make up the largest proportion of the utterances, are further divided into subcategories based on their structural characteristics. In this way, a formal classification of assessment utterances is developed based on corpus evidence.
In the second part of the dissertation, 18 different syntactic structures are identified, corresponding to four major syntactic classes: adjective phrases, nominal phrases, non-copular verb clauses, and reduced-form clauses. Five structures (Structures I-V) characterize predicative adjective phrases, which signal assessment meaning in 1,925 utterances (42% of the dataset). Five structures (Structures VI-X) characterize nominal phrases, which signal assessment meaning in 481 utterances (11%). Non-copular verb utterances and reduced-form utterances convey assessments in 496 and 75 utterances respectively (11% and 2%); they are generalized into five structures (Structures XI-XV) and three structures (Structures XVI-XVIII). No structures can be generalized for the other categories, including interjections, prepositional phrases, adverb phrases, wh-interrogatives and imperatives.
In the third part of the dissertation, these structures are applied to the tagged conversations in the corpus to analyze the relationships between forms and functions. Utterances in the same form can be associated with a range of communicative functions; moreover, the more frequent a form is, the more functions it can serve. Of the 18 structures, Structure II-a (3,299 instances) is the most frequent and also serves the largest number of communicative functions (28 different functions). By contrast, Structures VIII, X, XIV, XV and XVIII each have a small number of utterances, which convey three or fewer functions. Regarding the proportions of assessments, utterances in Structures XV and X-B are consistently used for assessments and never serve other functions. Assessments are also robustly represented in Structures III and VIII, accounting for 95% and 93% of utterances, respectively. It is concluded that these four structures can be treated as indicators of assessments in interaction, since they are strongly associated with this function in the corpus.
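The form-function profile described above can be illustrated with a minimal sketch. It assumes each utterance has already been reduced to a pair of a structure label and a DA tag; the function name `profile_structures` and the toy data are hypothetical and are not taken from the thesis or from the SwDA corpus.

```python
from collections import Counter, defaultdict


def profile_structures(pairs, target="ba"):
    """Map each structure label to (total, distinct functions, share tagged `target`)."""
    tags_by_structure = defaultdict(Counter)
    for structure, da_tag in pairs:
        tags_by_structure[structure][da_tag] += 1

    profile = {}
    for structure, tags in tags_by_structure.items():
        total = sum(tags.values())
        # len(tags) is the number of distinct DA tags seen for this structure.
        profile[structure] = (total, len(tags), tags[target] / total)
    return profile


# Invented toy data for illustration only; these are NOT corpus counts.
toy_pairs = [
    ("III", "ba"), ("III", "ba"), ("III", "sv"),
    ("XV", "ba"),
    ("II-a", "ba"), ("II-a", "sd"), ("II-a", "sv"), ("II-a", "sd"),
]

for structure, (total, n_functions, share) in sorted(profile_structures(toy_pairs).items()):
    print(f"{structure}: n={total}, functions={n_functions}, assessment share={share:.2f}")
```

A structure whose assessment share is close to 1 would, under this reading, be a candidate indicator of assessments, while a frequent structure with many distinct tags would be functionally ambiguous.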
Finally, the distinction between assessments and other functions is investigated along three parameters: particular expressions, position in a turn, and immediately preceding contexts. Fisher's exact test is employed to test for statistical significance. Results show that yes-no interrogatives (i.e. Structures IV and IX) and reduced-form utterances (i.e. Structures XVI, XVII and XVIII) cannot be used to characterize assessments, since utterances in these structures either tend to serve non-assessment functions or show no preference for any function (noted as “functionally ambiguous cases”). For the other structures, sets of expressions are obtained that are more likely to occur in assessments than in non-assessment functions. Moreover, assessment utterances are found to prefer turn-initial position (“Utt1” or “Utt2”), and they are more likely to respond to a statement-non-opinion (“sd”) in the preceding context. The empirical evidence presented in this study strengthens our ability to automatically predict assessments based on their lexical and syntactic features.
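As an illustration of the significance testing mentioned above, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 contingency table, written with the Python standard library only. The table layout (an expression present or absent, in assessment versus non-assessment utterances) is an assumption about how such a test might be set up, not a description of the thesis's actual procedure, and the counts in the usage line are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables that are no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    tolerance = 1e-12
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + tolerance)
    )
    return min(p_value, 1.0)


# Hypothetical counts: rows = expression present / absent,
# columns = assessment ("ba") / any other function.
p = fisher_exact_two_sided(12, 3, 20, 45)
print(f"p = {p:.4g}")
```

A small p-value here would suggest that the expression's distribution across assessment and non-assessment utterances is unlikely under independence; the exact test is a natural choice when some cell counts are small.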