Enhancing Spatio-Temporal Auditory Attention Decoding with ST-AADNet
Abstract
Individuals with hearing impairments often struggle to isolate and focus on a single speaker in multi-speaker environments. Neuroscience research has uncovered distinct patterns of brain activity associated with auditory attention that can be detected using electroencephalography (EEG) measurements. However, existing deep learning methods for auditory attention decoding (AAD) struggle to extract effective spatial-temporal features and to cope with data scarcity. To address these issues, this work proposes a novel neural network architecture named ST-AADNet, which integrates a convolutional neural network and a long short-term memory network to extract useful spatial-temporal features from EEG signals. Furthermore, we introduce a series of data augmentation methods tailored to enhance the model’s generalization capacity. Experimental studies on both audio-only and audio-visual datasets demonstrate the superior performance of the proposed methods, which secured second place in the First Chinese Auditory Attention Decoding Challenge. © 2024 IEEE.
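The abstract does not specify which augmentation methods ST-AADNet uses. Purely as an illustration, the sketch below shows common EEG augmentation strategies in this problem setting (additive Gaussian noise, channel dropout, and time masking) applied to a `(channels, time)` signal; the function name and all parameters are hypothetical, not taken from the paper.

```python
import numpy as np

def augment_eeg(eeg, rng, noise_std=0.01, drop_prob=0.1, mask_len=16):
    """Generic EEG augmentations for a (channels, time) array.

    Illustrative only: not the augmentations proposed in the paper.
    """
    out = eeg.copy()
    # Additive Gaussian noise on every sample
    out += rng.normal(0.0, noise_std, size=out.shape)
    # Channel dropout: zero out randomly selected electrodes
    drop = rng.random(out.shape[0]) < drop_prob
    out[drop, :] = 0.0
    # Time masking: zero a short random window of samples
    start = int(rng.integers(0, out.shape[1] - mask_len))
    out[:, start:start + mask_len] = 0.0
    return out

# Example: augment a dummy 64-channel, 128-sample EEG segment
rng = np.random.default_rng(0)
segment = np.ones((64, 128))
augmented = augment_eeg(segment, rng)
```

Applying such transforms on the fly during training effectively enlarges a small EEG dataset, which is the general motivation behind augmentation for AAD.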
| Original language | English |
|---|---|
| Title of host publication | 2024 14th International Symposium on Chinese Spoken Language Processing (ISCSLP) |
| Editors | Yanmin Qian, Qin Jin, Zhijian Ou, Zhenhua Ling, Zhiyong Wu, Ya Li, Lei Xie, Jianhua Tao |
| Publisher | IEEE |
| Pages | 334-338 |
| Number of pages | 5 |
| ISBN (Electronic) | 9798331516826 |
| ISBN (Print) | 979-8-3315-1683-3 |
| DOIs | |
| Publication status | Published - 2024 |
| Event | 14th International Symposium on Chinese Spoken Language Processing (ISCSLP 2024) - Beijing Conference Center, Beijing, China Duration: 7 Nov 2024 → 10 Nov 2024 http://www.iscslp2024.com/home |
Publication series
| Name | International Symposium on Chinese Spoken Language Processing, ISCSLP |
|---|
Conference
| Conference | 14th International Symposium on Chinese Spoken Language Processing (ISCSLP 2024) |
|---|---|
| Place | China |
| City | Beijing |
| Period | 7/11/24 → 10/11/24 |
| Internet address | http://www.iscslp2024.com/home |
Funding
This work was supported by the Research Grants Council of the Hong Kong SAR (Grant No. PolyU25216423, PolyU11211521, PolyU15218622, PolyU15215623, and C5052-23G), The Hong Kong Polytechnic University (Project IDs: P0043563, P0046094), and the National Natural Science Foundation of China (Grant No. 62306259 and U21A20512).
Research Keywords
- Auditory Attention Decoding
- EEG Signal Processing
- Human-computer Interface
RGC Funding Information
- RGC-funded