Enhancing Spatio-Temporal Auditory Attention Decoding with ST-AADNet

Ruofan Yan, Shu Peng, Zhige Chen, Zhi-an Huang, Rui Liu, Kay Chen Tan, Jibin Wu*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

Abstract

Individuals with hearing impairments often struggle to isolate and focus on a single speaker in multi-speaker environments. Neuroscience research has uncovered distinct patterns of brain activity associated with auditory attention that can be detected with electroencephalography (EEG). However, existing deep learning methods for auditory attention decoding (AAD) struggle to extract effective spatial-temporal features and to cope with data scarcity. To address these issues, this work proposes a novel neural network architecture, ST-AADNet, which integrates a convolutional neural network and a long short-term memory network to extract useful spatial-temporal features from EEG signals. Furthermore, we introduce a series of data augmentation methods tailored to enhance the model's generalization capacity. Experiments on both audio-only and audio-video datasets demonstrate the superior performance of the proposed methods, which secured second place in The First Chinese Auditory Attention Decoding Challenge. © 2024 IEEE.
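The abstract describes a two-stage pipeline: a convolutional stage that mixes EEG channels into spatial features, followed by a recurrent stage that summarizes those features over time before classifying the attended speaker. The sketch below illustrates that general shape only; the channel count, window length, filter sizes, and the use of a plain recurrent cell in place of the paper's LSTM are all assumptions, not details from ST-AADNet.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (assumptions, not from the paper):
# 64 EEG channels, a 128-sample decision window, 8 spatial filters.
n_channels, n_samples, n_filters, hidden = 64, 128, 8, 16

x = rng.standard_normal((n_channels, n_samples))  # one EEG decision window

# Spatial stage: each filter is a learned linear combination of channels,
# i.e. a 1x1 convolution across the channel axis.
W_spatial = rng.standard_normal((n_filters, n_channels)) * 0.1
spatial_feats = np.tanh(W_spatial @ x)            # shape (n_filters, n_samples)

# Temporal stage: a plain recurrent cell standing in for the LSTM,
# folding the filtered sequence into a single hidden state.
W_in = rng.standard_normal((hidden, n_filters)) * 0.1
W_rec = rng.standard_normal((hidden, hidden)) * 0.1
h = np.zeros(hidden)
for t in range(n_samples):
    h = np.tanh(W_in @ spatial_feats[:, t] + W_rec @ h)

# Classification head: logits over the two competing speakers.
W_out = rng.standard_normal((2, hidden)) * 0.1
logits = W_out @ h
attended = int(np.argmax(logits))  # index of the decoded (attended) speaker
```

In a trained model the weight matrices would be learned from labeled EEG windows rather than drawn at random; the sketch only shows how spatial filtering and temporal recurrence compose into a per-window decision.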
Original language: English
Title of host publication: 2024 14th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Editors: Yanmin Qian, Qin Jin, Zhijian Ou, Zhenhua Ling, Zhiyong Wu, Ya Li, Lei Xie, Jianhua Tao
Publisher: IEEE
Pages: 334-338
Number of pages: 5
ISBN (Electronic): 9798331516826
ISBN (Print): 979-8-3315-1683-3
DOIs
Publication status: Published - 2024
Event: 14th International Symposium on Chinese Spoken Language Processing (ISCSLP 2024), Beijing Conference Center, Beijing, China
Duration: 7 Nov 2024 – 10 Nov 2024
http://www.iscslp2024.com/home

Publication series

Name: International Symposium on Chinese Spoken Language Processing, ISCSLP

Conference

Conference: 14th International Symposium on Chinese Spoken Language Processing (ISCSLP 2024)
Place: China
City: Beijing
Period: 7/11/24 – 10/11/24
Internet address: http://www.iscslp2024.com/home

Funding

This work was supported by the Research Grants Council of the Hong Kong SAR (Grant No. PolyU25216423, PolyU11211521, PolyU15218622, PolyU15215623, and C5052-23G), The Hong Kong Polytechnic University (Project IDs: P0043563, P0046094), and the National Natural Science Foundation of China (Grant No. 62306259 and U21A20512).

Research Keywords

  • Auditory Attention Decoding
  • EEG Signal Processing
  • Human-computer Interface

RGC Funding Information

  • RGC-funded
