January 16, 2019 at 2pm
- Md Sahidullah (Inria – LORIA, Multispeech team)
Date: Wednesday, 16th January 2019 at 2pm
Place: LORIA, room A008
Speaker: Md Sahidullah (Inria – LORIA, Multispeech team)
Title: Speaker embeddings: from i-vector to x-vector and beyond
Speaker recognition is the task of identifying a person from their voice. State-of-the-art speaker recognition technology uses a speaker embedding method to represent a speech utterance of arbitrary length as a fixed-dimensional vector. Recent advances in deep neural network (DNN) research have enabled the development of robust and efficient speaker embedding techniques. In this talk, I will first give a brief overview of speaker recognition basics, followed by a description of the conventional speaker embedding method popularly known as the i-vector. I will then present various attempts to develop speech signal representations with DNN-based discriminative training, and explain the recently introduced x-vector embedding, which has shown promising speaker recognition performance. The talk will end with a discussion of potential future directions in speaker embedding research, including our ongoing work.
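
As a rough illustration of what "fixed-dimensional" means here, the sketch below pools a variable number of frame-level feature vectors into a single vector by concatenating their per-dimension mean and standard deviation. This is a toy example and not the speaker's implementation: x-vector systems apply this kind of statistics pooling to the hidden activations of a trained DNN, whereas here it is applied directly to made-up two-dimensional frames. The function names `stats_pooling` and `cosine_similarity` and all the numbers are assumptions chosen for illustration.

```python
import math


def stats_pooling(frames):
    """Map a variable-length list of frame vectors to one fixed-size vector.

    Concatenates the per-dimension mean and (population) standard deviation
    over time, so the output length is 2 * feature_dim for any len(frames).
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return means + stds


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical utterances: 2 frames and 4 frames of 2-dimensional features.
short_utt = [[1.0, 2.0], [3.0, 4.0]]
long_utt = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [2.0, 3.0]]

emb_short = stats_pooling(short_utt)
emb_long = stats_pooling(long_utt)

# Both embeddings have the same length despite different utterance lengths.
print(len(emb_short), len(emb_long))
print(cosine_similarity(emb_short, emb_long))
```

Because both utterances map to vectors of the same size, they can be compared directly, for instance with cosine similarity as above; practical systems typically use a trained scoring back-end instead.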