Advanced Multimodal Interface Design for Dual-Task Environments: Effects of Redundant Stimulus Presentation and Stimulus Distraction

雙任務環境下多模態人機介面交互設計:冗餘刺激信號呈現以及干擾信號分散注意力的效用研究

Student thesis: Doctoral Thesis

Author(s)

Detail(s)

Awarding Institution
Supervisors/Advisors
Award date: 6 Jun 2017

Abstract

In the context of human–machine interaction, the spatial compatibility of signal and response arrays is one of the most important determinants in interface design for effective information processing and improved system performance. This study investigates human performance and response preferences for processing multisensory signals in four-choice spatial stimulus-response (S-R) tasks. Many past spatial compatibility studies were limited to a single-task paradigm in which the demand for attentional resources was restricted to one (mostly visual) modality and responses were made with the hands. No studies have been conducted on dual-task processing, such as giving discrete responses with the hands to a multi-choice spatial compatibility task while visual attentional resources are being drawn to a concurrent primary task, for example, continuous manual tracking. Increasingly complex control systems now require simultaneous handling of a growing number and variety of signal modalities and control devices. Advanced and sophisticated interfaces permit more complex display and control configurations, but they also create challenges for designers of multimodal systems. The increasingly complex human–machine interfaces commonly used in control rooms today have greatly added to the urgency of solving the problems of attentional resource competition and operator capacity limitations in multimodal information processing.

In the first experiment, a discrete S-R task with visual, auditory, or vibrotactile signals as inputs was performed concurrently with a manual tracking task, resulting in intra-modal (visual–visual, VV) and cross-modal (visual–auditory, VA; visual–vibrotactile, VT) configurations for testing. The duration of conflicting demands, which can cause keen competition for resources, influenced the interruption between the main tracking task and the secondary S-R task. Moreover, the continuous flow of information needed for the discrete task was smallest under the both-compatible (BC) S-R mapping. The discrete response task involved differences in transverse and longitudinal compatibilities. No right–left prevalence effect was found in this study, suggesting that, compared with two hand effectors, two finger effectors of one hand were very likely insufficient to provide a salient frame of reference in the horizontal right–left dimension. However, longitudinal compatibility brought more benefits [e.g., shorter reaction times (RTs), smaller root mean square tracking errors] than transverse compatibility. The cross-modal (VA and VT) configurations were compared with the intra-modal (VV) one. Results suggested that when the distance and position of the two visual tasks in the intra-modal configuration were controlled carefully, such that focal and ambient vision could be used simultaneously, the intra-modal configuration surpassed the cross-modal ones. The differences between the intra-modal and cross-modal configurations were very robust, at least for the dual-task prototype that involved continuous tracking and discrete spatial compatibility choice tasks.
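As an illustration of the tracking measure mentioned above, the following is a minimal sketch of how a root mean square tracking error could be computed from sampled target and cursor positions. The function name, the one-dimensional positions, and the sample values are assumptions made for this example; the abstract does not describe the thesis's actual computation or sampling scheme.

```python
import math


def rms_tracking_error(target, cursor):
    """Root mean square of the per-sample target-minus-cursor difference.

    Hypothetical helper: `target` and `cursor` are equal-length sequences
    of positions sampled at the same instants (units are arbitrary).
    """
    if len(target) != len(cursor) or len(target) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(t - c) ** 2 for t, c in zip(target, cursor)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up samples: the cursor deviates by 0.5 units on two of four samples.
target = [0.0, 1.0, 2.0, 3.0]
cursor = [0.0, 1.5, 1.5, 3.0]
print(rms_tracking_error(target, cursor))
```

A smaller value means the cursor stayed closer to the target over the trial, which is the sense in which the abstract reports "smaller" tracking errors as a benefit.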

The second experiment examined how varying the method of signal presentation can improve performance for the same dual-task paradigm. Results indicated that presenting identical information redundantly in several modalities, rather than in a single modality, can enhance task performance. The benefit of redundant displays for stimulus presentation is attributed to the possible mitigation of the competition for a scarce sensory resource by using a redundant modality to process part of the information. In previous studies, redundant presentation was mostly used in discrete communication tasks and limited to auditory and visual modality combinations. However, no systematic studies have examined redundant presentation in the commonly encountered dual-task prototype, for example, with primary tracking and discrete spatial S-R compatibility tasks. One aim of this study was to examine the effects of redundant presentation on spatial information processing using a spatial compatibility task, which is by nature different from a communication task. The different combinations of the three modalities, namely, redundancy across visual–auditory, visual–tactile, auditory–tactile, and visual–auditory–tactile, were explored in this experiment. The experimental setup was almost identical to that of Experiment 1, except that the stimulus in the spatial compatibility task was presented in more than one modality. The results of Experiment 2 demonstrated resource competition and marked spatial S-R compatibility effects under dual-task working environments of the type used. A reverse in the front-rear position of the stimulus–key relation was the most confusing to the participants, and additional translation time was required for them to recognize the correct responses. Therefore, if an incompatibility condition exists, then a reverse in the left-right position should be a better choice than one in the front-rear position. In addition, redundant modality presentation resulted in shorter RT and less EP than single modality presentation. Among all redundant modality combinations, the visual–tactile display method may be as effective as the more conventional auditory display in a redundant setting, although the vibrotactile settings should be chosen carefully to ensure that people can recognize the directional information conveyed by the tactile modality. Responses to stimuli in the left visual field [front-left (fl) and rear-left (rl) stimuli] were significantly faster, indicating a left-field advantage in perceiving laterally presented stimuli, which in turn resulted in a left-field advantage for responses. Compared with rear keys [rear-left (RL) and rear-right (RR)], front keys [front-left (FL) and front-right (FR)] had shorter RTs. Therefore, the best operation area should be in the front-left position.
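The stimulus–key relations discussed above can be sketched in code. The example below is only an illustration under stated assumptions: it uses the stimulus labels (fl, fr, rl, rr) and key labels (FL, FR, RL, RR) from the abstract, but the function `correct_key`, its two flags, and the names of the three non-BC mappings are hypothetical and are not taken from the thesis.

```python
STIMULI = ("fl", "fr", "rl", "rr")  # front/rear x left/right stimulus positions


def correct_key(stimulus, reverse_left_right=False, reverse_front_rear=False):
    """Return the key label that is correct for a stimulus under a mapping.

    Hypothetical helper: with both flags False this is the both-compatible
    (BC) mapping, in which each stimulus maps to the key at the same position.
    """
    if stimulus not in STIMULI:
        raise ValueError(f"unknown stimulus: {stimulus!r}")
    depth, side = stimulus[0], stimulus[1]  # 'f'/'r' and 'l'/'r'
    if reverse_front_rear:
        depth = "r" if depth == "f" else "f"
    if reverse_left_right:
        side = "r" if side == "l" else "l"
    return (depth + side).upper()


# Assumed names for the four mapping conditions; only "BC" appears in the source.
MAPPINGS = {
    "both compatible (BC)": (False, False),
    "left-right reversed": (True, False),
    "front-rear reversed": (False, True),
    "both reversed": (True, True),
}

for name, (lr, fr) in MAPPINGS.items():
    print(name, {s: correct_key(s, lr, fr) for s in STIMULI})
```

In this sketch, the front-rear reversal (for example, fl mapped to RL) corresponds to the relation the abstract reports as most confusing, and the left-right reversal (fl mapped to FR) to the one it recommends when an incompatible mapping cannot be avoided.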

In Experiment 3, the same dual-task paradigm as that in Experiments 1 and 2 was used, except that participants attended to a particular signal and ignored the others, to investigate dual-task performance in different mapping conditions. Among the four types of mapping conditions, more compatible conditions contributed to better performance. Participants were likewise able to align the cursor with the target more precisely and accurately under the relatively low coding demands of the spatial compatibility task with the BC mapping condition. Three tracking speeds were tested to examine how selective attention works under different workloads. Under the over-demanding workload of attending to one signal with distractions, the most appropriate tracking speed should be slow to minimize the RT for encoding. Tracking performance under a single signal in dual-task environments was always superior to that under redundant signals and a signal with distraction. In addition, an optimal level of tracking speed for arousing an operator's attention was assumed. Responses were more accurate and faster when targets were presented simultaneously in both modalities than when the target was presented alone. Attending to one signal with distraction imposed more mental workload on participants than single and redundant signal presentations. Among the three conditions (attend–visual, attend–auditory, and attend–tactile), the highest efficiency was obtained for the visual signal with auditory and tactile distractor signals. This result implies that vision is more suitable for conveying spatial location, whereas the auditory modality seems better suited to representing temporally structured events than spatial information.

The outcomes of this research provide crucial and helpful ergonomics design implications and consequent recommendations for multimodal information processing to facilitate human–machine/computer system design and improve overall system performance. The ergonomics recommendations generated from this study may be summarized as follows:

1) Spatial compatibility of the discrete choice response task accounts for the opportunity for enhanced time-sharing for dual-task processing. Among the four spatial S-R compatibility conditions, the participants in all conditions (i.e., single, redundant, and attending to one signal with distractions; slow, medium, and fast tracking speeds) were able to align the cursor with the target more precisely and accurately under the relatively low coding demands of the spatial compatibility task with the BC mapping condition. Therefore, incompatible spatial mappings between displays and controls must be avoided to facilitate fast RTs and low response errors. If a spatially incompatible condition is inevitable for a single-hand response with two fingers, then longitudinal compatibility must be retained rather than transverse compatibility.

2) Minimal evidence was obtained in this experiment that the horizontal (transverse) dimension dominates the vertical (longitudinal) dimension. Responses must be made using two hands instead of two fingers of one hand whenever possible to establish a salient frame of reference in the horizontal right–left dimension. A reverse in the front-rear position of the stimulus–key relation was the most confusing condition to the participants, and additional translation time was required for them to recognize the correct responses. Thus, if an incompatibility condition exists, then a reverse in the left-right position must be preferred over one in the front-rear position.

3) The dual tasks must be closely spaced whenever possible, such that the focal and ambient vision can be utilized simultaneously. The dual-task performance of the intra-modal configurations in this scenario can surpass that of the cross-modal configurations.

4) The performance relationship between primary tracking and secondary spatial S-R compatibility tasks is largely determined by the mental load that the primary task requires. The tracking performance under a single signal in dual-task environments was generally superior to that under redundant signals and signals with distraction. Given the over-demanding workload of attending to one signal with distractions, the most appropriate tracking speed must be slow, which could minimize both the RT for encoding and the contest for the same resource. The moving speed of the tracking target (i.e., tracking difficulty) must be determined carefully to attain the optimal level of arousal of the operator's attention and optimize his/her activity on the discrete task.

5) Redundant modality presentation results in shorter RT and less EP than single modality presentation. Among all the redundant modality combinations, the visual–tactile interface can be as suitable as the more traditional auditory interface for presenting information in a redundant environment. However, the vibrotactile settings must be carefully controlled to ensure that people can sense the location conveyed by the tactile channel.

6) Stimuli in the left visual field (i.e., fl and rl) were responded to significantly faster, indicating a left-field advantage in perceiving laterally presented stimuli (Nobre et al. 1997, Siman-Tov et al. 2007), which in turn resulted in a left-field advantage for the responses. The front keys (FL and FR) had shorter RTs than the rear keys (RL and RR). Thus, the best operation area must be in the front-left position.

7) Attending to one signal with distractions imposed more mental workload on the participants than the single and redundant signal presentations. The responses were more accurate and faster when the targets were presented simultaneously in both modalities than when the target was presented alone.

8) Among the three conditions (i.e., attend–visual, attend–auditory, and attend–tactile), the highest efficiency was obtained for the visual signal with auditory and tactile distractor signals. This result implies that vision is much more suitable for sensing spatial location, whereas the auditory modality is better suited to representing temporally structured events than spatial information.