Abstract
We propose AuralNet, a novel 3D multi-source binaural sound source localization approach that localizes overlapping sources in both azimuth and elevation without prior knowledge of the number of sources. AuralNet employs a gated coarse-to-fine architecture, combining a coarse classification stage with a fine-grained regression stage, allowing for flexible spatial resolution through sector partitioning. The model incorporates a multi-head self-attention mechanism to capture spatial cues in binaural signals, enhancing robustness in noisy-reverberant environments. A masked multi-task loss function is designed to jointly optimize sound detection, azimuth, and elevation estimation. Extensive experiments in noisy-reverberant conditions demonstrate the superiority of AuralNet over recent methods. © 2025 International Speech Communication Association. All rights reserved.
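The abstract names three ideas without giving formulas: a detection gate, sector-wise coarse-to-fine estimation, and a masked multi-task loss. The following is a minimal illustrative sketch of how they could fit together, not the authors' implementation. Everything specific here is an assumption: equal-width azimuth sectors, per-sector azimuth offsets, binary cross-entropy for detection, squared error for the angles, the weights `w_az` and `w_el`, and the 0.5 threshold. All function names are hypothetical.

```python
import math

def sector_centers(num_sectors):
    """Centers (degrees) of equal-width azimuth sectors covering [-180, 180)."""
    width = 360.0 / num_sectors
    return [-180.0 + width * (i + 0.5) for i in range(num_sectors)]

def decode(probs, az_offsets, elevations, threshold=0.5):
    """Coarse-to-fine decoding (assumed scheme): keep sectors whose detection
    probability passes the gate, then refine each kept sector's center with
    its regressed azimuth offset."""
    centers = sector_centers(len(probs))
    sources = []
    for p, c, off, el in zip(probs, centers, az_offsets, elevations):
        if p >= threshold:  # gating means the source count is not fixed in advance
            sources.append((c + off, el))
    return sources

def masked_multitask_loss(probs, az_pred, el_pred, active, az_true, el_true,
                          w_az=1.0, w_el=1.0, eps=1e-7):
    """Detection loss over all sectors plus angle losses over active sectors only."""
    n = len(probs)
    # Binary cross-entropy for detection, averaged over every sector.
    bce = 0.0
    for p, a in zip(probs, active):
        p = min(max(p, eps), 1.0 - eps)
        bce -= a * math.log(p) + (1 - a) * math.log(1.0 - p)
    bce /= n
    # The mask (active is 0 or 1) removes angle errors in sectors with no source.
    n_active = sum(active)
    if n_active == 0:
        return bce
    az = sum(a * (p - t) ** 2 for a, p, t in zip(active, az_pred, az_true)) / n_active
    el = sum(a * (p - t) ** 2 for a, p, t in zip(active, el_pred, el_true)) / n_active
    return bce + w_az * az + w_el * el

# Example with 4 sectors (90 degrees wide): two sectors pass the gate.
found = decode([0.9, 0.1, 0.8, 0.2], [10.0, 0.0, -5.0, 0.0], [20.0, 0.0, -10.0, 0.0])
```

In this sketch the number of sectors sets the coarse resolution, and the regression head only has to resolve angles within a sector. The plain squared error ignores angular wrap-around, which a real system would need to handle.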
| Original language | English |
|---|---|
| Title of host publication | Proceedings of Interspeech 2025 |
| Pages | 938-942 |
| Number of pages | 5 |
| Publication status | Published - Aug 2025 |
| Event | 26th Interspeech Conference 2025, Rotterdam, Netherlands. Duration: 17 Aug 2025 → 21 Aug 2025. https://www.interspeech2025.org/ |
Conference
| Conference | 26th Interspeech Conference 2025 |
|---|---|
| Place | Netherlands |
| City | Rotterdam |
| Period | 17/08/25 → 21/08/25 |
Funding
This work was supported by the Science, Technology, and Innovation Commission of Shenzhen Municipality, China (Grant No. ZDSYS20220330161800001), the Shenzhen Science and Technology Program (Grant No. KQTD20221101093557010), and the Guangdong Science and Technology Program (Grant No. 2024B1212010002).
Research Keywords
- 3D localization
- binaural sound source localization
- coarse-to-fine architecture
- overlapping sources
- self-attention
Fingerprint
Dive into the research topics of 'AuralNet: Hierarchical Attention-based 3D Binaural Localization of Overlapping Speakers'. Together they form a unique fingerprint.