Abstract
Speaker verification aims to recognize target speakers with very few enrollment utterances. Conventional approaches learn a representation model to extract speaker embeddings for verification. Recently, several meta-learning approaches have been proposed that learn a shared metric space. Among these, prototypical networks learn a non-linear mapping from the input space to an embedding space with a predefined distance metric. In this paper, we investigate the use of prototypical networks in a small-footprint text-independent speaker verification task. Our work is evaluated on the SRE10 evaluation set. Experiments show that prototypical networks outperform the conventional method when the amount of data per training speaker is limited.
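The idea of scoring in an embedding space with a predefined distance metric can be illustrated with a minimal sketch. It is not the paper's implementation: the embedding network is omitted (the arrays stand in for its outputs), squared Euclidean distance is assumed as the metric, and the names `speaker_prototypes`, `verify` and the `threshold` value are hypothetical.

```python
import numpy as np

def speaker_prototypes(embeddings, labels):
    """Return each speaker's prototype: the mean of that speaker's embeddings."""
    speakers = np.unique(labels)
    protos = np.stack([embeddings[labels == s].mean(axis=0) for s in speakers])
    return speakers, protos

def verify(query, prototype, threshold):
    """Accept the trial if the squared Euclidean distance is within the threshold.

    Squared Euclidean distance is an assumption here; the abstract only
    states that the metric is predefined.
    """
    dist = float(np.sum((query - prototype) ** 2))
    return dist <= threshold, dist

# Toy 2-D "embeddings" standing in for the output of a trained network.
enroll = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]])
labels = np.array([0, 0, 1, 1])
speakers, protos = speaker_prototypes(enroll, labels)

test_embedding = np.array([1.0, 1.0])
accepted, dist = verify(test_embedding, protos[0], threshold=4.0)   # hypothetical threshold
rejected, far_dist = verify(test_embedding, protos[1], threshold=4.0)
```

In this toy setup the test embedding lies close to the first speaker's prototype and far from the second, so only the first trial is accepted. In practice the threshold would be tuned on held-out trials.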
Original language | English |
---|---|
Title of host publication | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing |
Subtitle of host publication | Proceedings |
Publisher | IEEE |
Pages | 6804-6808 |
Volume | 2020-May |
ISBN (Electronic) | 978-1-5090-6631-5 |
ISBN (Print) | 978-1-5090-6632-2 |
Publication status | Published - May 2020 |
Event | 45th International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 (virtual), 4 May 2020 → 8 May 2020, https://2020.ieeeicassp.org/ |
Publication series
Name | International Conference on Acoustics, Speech, and Signal Processing (ICASSP) |
---|---|
Publisher | IEEE |
ISSN (Print) | 1520-6149 |
ISSN (Electronic) | 2379-190X |
Conference
Conference | 45th International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 |
---|---|
Period | 4/05/20 → 8/05/20 |
Research Keywords
- Meta learning
- Prototypical networks
- Speaker verification