Paul Magron

Date: Wednesday, 17th October 2018 at 2pm
Place: LORIA, room A008
Speaker: Paul Magron (Tampere University of Technology)

Title: Probabilistic modeling of the phase for audio source separation

Abstract:
Many audio source separation techniques act on a time-frequency representation of the data, such as the short-time Fourier transform (STFT), since it reveals the underlying structure of sounds. These methods usually discard the phase information and process spectrogram-like quantities only. The sources are finally retrieved by means of a Wiener-like filter, which assigns the phase of the original mixture to each isolated source. However, this introduces interference and artifacts in the estimates, which highlights the need for more sophisticated phase recovery techniques. In this talk, we will present our recent work on phase-aware probabilistic models for audio source separation. Firstly, we will model the phase as a non-uniform random variable based on the von Mises distribution. This allows us to incorporate some prior knowledge about the phase, e.g., knowledge that arises from a signal model (sums of sinusoids). In particular, we will show that the traditional uniform model and the von Mises model are not contradictory, but rather rely on different assumptions about the phase. Secondly, we will present mixture models based on the anisotropic Gaussian distribution, from which we can derive phase-aware estimators of the sources in the STFT domain. This results in an anisotropic Wiener filter, which preserves some of the interesting statistical properties of the Wiener filter, while enabling one to account for a phase model. Finally, we will propose techniques for jointly inferring the magnitude and the phase based on this framework. Indeed, by structuring the variance parameters of these models through, e.g., nonnegative matrix factorization or deep neural networks, we can derive complete and phase-aware source separation systems.

Emmanuel Dupoux

Date: Wednesday, 21st November 2018 at 2pm
Place: LORIA, room A008
Speaker: Emmanuel Dupoux (EHESS, Laboratoire de Sciences Cognitives et Psycholinguistique)

Title: Towards developmental AI

Abstract:
Even though current machine learning techniques yield systems that achieve parity with humans on several high-level tasks, the learning algorithms themselves are orders of magnitude less data-efficient than those used by humans, as evidenced by the speed and resilience with which infants learn language and common sense. I review some of our recent attempts to reverse-engineer such abilities in the area of unsupervised or weakly supervised learning of speech representations and speech terms, and the learning of the laws of intuitive physics by observation of videos. I argue that a triple effort in data collection, algorithm development and fine-grained human/machine comparisons is needed to uncover these developmental algorithms.