Abstract
Virtual agents are increasingly used for delivering health information in general, and mental health assistance in particular. This paper presents a corpus designed for training a virtual counsellor in Cantonese, a variety of Chinese. The corpus consists of a domain-independent subcorpus that supports small talk for rapport building with users, and a domain-specific subcorpus that provides material for a particular area of counselling. The former consists of ELIZA-style responses, chitchat expressions, and a dataset of general dialog, all of which are reusable across counselling domains. The latter consists of example user inputs and appropriate chatbot replies relevant to the specific domain. In a case study, we created a chatbot with a domain-specific subcorpus that addressed 25 issues in test anxiety, with 436 inputs solicited from native speakers of Cantonese and 150 chatbot replies harvested from mental health websites. Preliminary evaluations show that Word Mover’s Distance achieved 56% accuracy in identifying the issue in user input, outperforming a number of baselines.
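The issue-identification step mentioned in the abstract can be pictured as nearest-neighbour retrieval: the user input is compared with each example input in the domain-specific subcorpus, and the issue of the closest example is returned. The sketch below is only an illustration under stated assumptions, not the authors' implementation. It uses relaxed Word Mover's Distance (a cheaper lower bound on the full distance, swapped in here so the example needs no optimal-transport solver), and the word vectors, issue labels, and function names (`TOY_VECTORS`, `relaxed_wmd`, `identify_issue`) are all hypothetical. English tokens stand in for segmented Cantonese text.

```python
import math
from collections import Counter

# Hypothetical 2-d word vectors, for illustration only. A real system would
# load pretrained embeddings for word-segmented Cantonese text.
TOY_VECTORS = {
    "exam": (1.0, 0.0),
    "test": (0.9, 0.1),
    "nervous": (0.0, 1.0),
    "anxious": (0.1, 0.9),
    "sleep": (-1.0, 0.0),
    "insomnia": (-0.9, 0.1),
    "cannot": (0.0, -1.0),
}


def _one_way_cost(src, dst, vectors):
    """Move each source word's mass entirely to its nearest target word."""
    weights = Counter(src)
    total = sum(weights.values())
    targets = set(dst)
    cost = 0.0
    for word, count in weights.items():
        nearest = min(math.dist(vectors[word], vectors[t]) for t in targets)
        cost += (count / total) * nearest
    return cost


def relaxed_wmd(doc_a, doc_b, vectors):
    """Relaxed WMD: the larger of the two one-way costs (a lower bound on WMD)."""
    a = [w for w in doc_a if w in vectors]  # drop out-of-vocabulary tokens
    b = [w for w in doc_b if w in vectors]
    if not a or not b:
        return math.inf  # nothing comparable, treat as maximally distant
    return max(_one_way_cost(a, b, vectors), _one_way_cost(b, a, vectors))


def identify_issue(user_tokens, examples, vectors):
    """Return the issue label of the example input closest to the user input.

    `examples` is a list of (issue_label, tokens) pairs.
    """
    best = min(examples, key=lambda ex: relaxed_wmd(user_tokens, ex[1], vectors))
    return best[0]


# Hypothetical example inputs, one per issue.
EXAMPLES = [
    ("exam_nerves", ["nervous", "exam"]),
    ("sleep_problems", ["cannot", "sleep"]),
]

if __name__ == "__main__":
    print(identify_issue(["anxious", "test"], EXAMPLES, TOY_VECTORS))
```

Once an issue is identified, a chatbot of this kind could look up one of the replies associated with that issue. Full Word Mover's Distance replaces the nearest-word shortcut with an optimal transport problem over all word pairs.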
Original language | English |
---|---|
Title of host publication | Proceedings of the LREC 2020 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020) |
Editors | Dorothee Beermann, Laurent Besacier, Sakriani Sakti |
Publisher | European Language Resources Association (ELRA) |
Pages | 358-361 |
ISBN (Electronic) | 9791095546351 |
Publication status | Published - May 2020 |
Event | 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020), Marseille, France. Duration: 11 May 2020 → 16 May 2020 |
Conference
Conference | 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020) |
---|---|
Abbreviated title | SLTU-CCURL 2020 |
Country/Territory | France |
City | Marseille |
Period | 11/05/20 → 16/05/20 |
Research Keywords
- Cantonese
- chatbot
- counselling
- test anxiety
Publisher's Copyright Statement
- European Language Resources Association (ELRA), licensed under CC-BY-NC.