Designing a Voice User Interface for Illiterate and Semiliterate Users: A Cognitive Approach
為文盲和半文盲人士設計的語音用戶界面:基於認知科學方法
Student thesis: Doctoral Thesis
Author(s)
Detail(s)
Awarding Institution | |
---|---|
Supervisors/Advisors | |
Award date | 8 Jan 2016 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(90aeef5a-4c37-4bb5-a9d8-8214660de5ee).html |
---|---|
Abstract
According to the UNESCO Institute for Statistics 2014 survey, nearly 15.7% of the world's adult population is still illiterate due to poverty and social inequality (UIS, 2014). Information and communication technologies (ICT) can contribute significantly to the socio-economic development of underdeveloped communities. There have been attempts to provide ICT access to illiterate and semiliterate members of these communities, but their implementation still faces many challenges. Since literacy plays an important role in the development of one's cognitive skills, effective interfaces for illiterate and semiliterate users should require little or no text instruction and must be adjusted to the users' cognitive skills. This research was conducted to respond to the challenge of providing ICT access to illiterate and semiliterate people. It comprised five experiments that investigated the best way to deliver verbal instructions with regard to cognitive processes, specifically memory and abstract thinking.
The first experiment examined the effect of the length of a verbal instruction and its context on users' ability to recite it correctly. The performance of illiterate and semiliterate participants in this experiment differed significantly from that of literate participants. This result supports the hypothesis that literacy skills have an impact on cognitive abilities, specifically memory capacity. The highest averages for the total number of instructions recited by illiterate and semiliterate participants were 2.00 and 2.25 respectively. This shows that the memory capacity of illiterate and semiliterate participants was significantly lower than George Miller's 7±2 limit on short-term memory. Based on this finding, it is not recommended to deliver instructions of more than two sentences at a time. Moreover, the results of this experiment showed that participants' ability to recite instructions was also influenced by the context. Participants' scores were significantly lower on instructions related to interacting with ICT because these instructions were unfamiliar to them. It can be inferred that participants' comprehension of the instructions affected their ability to recite them correctly.
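The two-sentence recommendation above can be illustrated with a minimal sketch. It assumes that a voice prompt arrives as plain text and that sentences end in `.`, `!` or `?`; the function names and the naive sentence splitter are hypothetical illustrations and are not taken from the thesis.

```python
import re

# Upper bound taken from the recommendation above: at most two sentences per prompt.
MAX_SENTENCES_PER_PROMPT = 2


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_instructions(text, max_sentences=MAX_SENTENCES_PER_PROMPT):
    """Group the sentences of `text` into prompts of at most `max_sentences` sentences."""
    sentences = split_sentences(text)
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]


if __name__ == "__main__":
    for prompt in chunk_instructions(
        "Press one. Wait for the tone. Say your name. Hang up."
    ):
        print(prompt)
```

Each returned string would be played as one prompt, with the system waiting for the user before moving on to the next one.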
The presentation rate of a verbal instruction influences not only speech intelligibility but may also determine participants' ability to recite it. The second experiment thus investigated the effective presentation rate for verbal instructions. The speech rates used in this experiment were based on news presenters' speaking rate, the normal speaking rate, and the rate of existing speech interface systems. The results revealed that the relationship between speech rate and participants' ability to recite the instructions correctly was non-linear. Instructions in the voice interface should be delivered at a speech rate slower than the normal speaking rate to enable users to hear and comprehend them. Furthermore, subjective evaluations revealed that most participants preferred a slower presentation rate. However, if the instructions were delivered at a rate that was much too slow, participants' performance declined. This could be explained by decay theory. In this experiment, the best performance was obtained at 251 syllables/min. As in the first experiment, literate participants outperformed semiliterate and illiterate participants.
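As a rough illustration of what the 251 syllables/min figure means in practice, the sketch below converts it into a prompt duration and into a rate multiplier relative to a speech engine's default rate. The function names and the idea of an engine default rate are assumptions made for illustration; the abstract reports this rate only as the best one observed in this experiment, not as a general constant.

```python
# Best-performing rate reported for the second experiment (syllables per minute).
BEST_RATE_SYLLABLES_PER_MIN = 251


def prompt_duration_seconds(syllable_count, rate=BEST_RATE_SYLLABLES_PER_MIN):
    """Time needed to speak `syllable_count` syllables at `rate` syllables per minute."""
    return syllable_count * 60 / rate


def rate_multiplier(engine_default_rate, target_rate=BEST_RATE_SYLLABLES_PER_MIN):
    """Factor by which a (hypothetical) engine default rate must be scaled to hit the target."""
    return target_rate / engine_default_rate


if __name__ == "__main__":
    # A 20-syllable instruction takes a little under five seconds at this rate.
    print(round(prompt_duration_seconds(20), 2))
```

The syllable count of a prompt would have to be supplied by the designer or by a language-specific counter, which is outside the scope of this sketch.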
The third experiment explored the effectiveness of redundant presentations in verbal instructions. The purpose of these redundant presentations was to help participants understand the interface terminology, which might be unfamiliar to them. The results showed that the use of redundant presentations significantly increased the performance of illiterate and semiliterate participants. Graphics helped participants identify the interface components, while animated graphs helped them understand what was happening behind the abstraction of the interface. This experiment also found that providing graphics and animated graphs as cues helped participants recall instructions correctly.
The fourth experiment examined the effect of verbal instruction delivery methods on learnability, information retention and the transfer process. The delivery methods applied were step-by-step and whole-step instructions. Overall, participants receiving step-by-step instructions had higher performance averages in the retention process and learning effect. On the other hand, participants receiving whole-step instructions had higher performance averages for the transfer process. Further observation showed that whole-step instructions had a consistent effect on literate participants: they improved performance by reducing the number of steps and the time needed to complete a task. The same effect could not be found in illiterate and semiliterate participants. This might be due to their inability to construct a mental model. The step-by-step instructions (one instruction at a time) may have enabled illiterate and semiliterate users to follow instructions correctly in completing a task. However, step-by-step instructions may not increase their ability to transfer their knowledge to solve different tasks.
The last experiment investigated the effectiveness of voice instruction navigation structures for illiterate and semiliterate users. Participants were asked to complete a task by navigating through a voice interface system using linear, semilinear and hierarchical structures. The experiment demonstrated that the choice of navigation structure had a significant effect on the number of steps taken and the time needed to complete a task. With the semilinear structure, participants had difficulty constructing the menu in their minds because they did not know all the choices available. Yet presenting all the available choices, as in the hierarchical structure, would result in longer instructions and may therefore exceed participants' memory capacity. Illiterate participants were also found to have difficulty understanding how the choices were categorized in the hierarchical structure. In this study, illiterate and semiliterate participants performed better on the hierarchical structure than on the semilinear structure. The experiment also demonstrated that literacy had a significant effect on participants' performance for the semilinear and hierarchical structures. On most tasks, the performance of illiterate participants was far below that of semiliterate and literate participants. Illiterate participants seemed to have challenges with abstract concepts, possibly because they had less structured knowledge.
This study showed that illiterate and semiliterate users lacked cognitive abilities needed for interacting with a voice-based interface. Its deliverables provide interface designers with recommendations for developing effective voice-based interfaces for illiterate and semiliterate users.