Victor Bisot

Date: Wednesday, 8th November 2017 at 2pm
Place: LORIA, room A008
Speaker: Victor Bisot (Télécom Paristech)

Title: Learning nonnegative representations for environmental sound analysis

Abstract: The growing interest in environmental sound analysis tasks has been accompanied by a sharp increase in deep learning-based approaches. As most focus on finding suitable neural network architectures or algorithms, few works question the choice of the input representation. In this talk, we will discuss how nonnegative matrix factorization (NMF)-based feature learning techniques are suited for such tasks, especially when used as input to deep neural networks (DNNs). We start by highlighting the usefulness of unsupervised NMF variants for feature learning in multi-source environments. Next, we introduce a supervised variant of NMF known as TNMF (Task-driven NMF). The TNMF model aims at improving the quality of the decomposition by learning the nonnegative dictionaries that minimize the target classification loss. In the second part, we will exhibit similarities between NMF, TNMF and standard NN layers, justifying the potential of NMF-based features as input to various DNN models. The proposed models are evaluated on acoustic scene classification and overlapping sound event detection datasets and show improvements over standard CNN approaches for such tasks. Finally, we will discuss how NMF dictionaries and neural network parameters can be trained jointly in a common optimization problem by relying on the TNMF framework.


Aurélien Bellet

Date: Wednesday, 25th October 2017 at 2pm
Place: LORIA, room C005
Speaker: Aurélien Bellet (Inria Lille – Nord Europe)

Title: Private algorithms for decentralized collaborative machine learning

Abstract: With the advent of connected devices with computation and storage capabilities, it becomes possible to run machine learning on-device to provide personalized services. However, the currently dominant approach is to centralize data from all users on an external server for batch processing, sometimes without explicit consent from users and with little oversight. This centralization poses important privacy issues in applications involving sensitive data such as speech, medical records or geolocation logs.

In this talk, I will discuss an alternative setting where many agents with local datasets collaborate to learn models by engaging in a fully decentralized peer-to-peer network. We introduce and analyze asynchronous algorithms in the gossip and broadcast communication model that allow agents to improve upon their locally trained model by exchanging information with other agents that have similar objectives. Our first approach aims to smooth pre-trained local models over the network while accounting for the confidence that each agent has in its initial model. In our second approach, agents jointly learn and propagate their model by making iterative updates based on both their local dataset and the behavior of their neighbors. I will also describe how to make such algorithms differentially private to avoid leaking information about the local datasets, and analyze the resulting privacy-utility trade-off.


Andreas Vlachos

Date: Wednesday, January 11 at 2 pm
Place: LORIA, room A008
Speaker: Andreas Vlachos (University of Sheffield, UK)

Title: Imitation learning for structure prediction in natural language processing

Abstract: Imitation learning is a learning paradigm originally developed to learn robotic controllers from demonstrations by humans, e.g. autonomous helicopters from pilots' demonstrations. Recently, algorithms for structure prediction were proposed under this paradigm and have been applied successfully to a number of tasks such as dependency parsing, information extraction, coreference resolution and semantic parsing. Key advantages are the ability to handle large output search spaces and to learn with non-decomposable loss functions. In this talk, I will give a detailed overview of imitation learning and some recent applications, including its use in training recurrent neural networks.


Enrique Alfonseca

Date: Wednesday, February 8 at 2 pm
Place: LORIA, room A008
Speaker: Enrique Alfonseca (Google, Zurich)

Title: Sentence Compression for Conversational Search


Abstract: In this talk, I will discuss some of the challenges that conversational search faces in providing information to users. Some of them stem from the fact that many questions are answered with snippets from the web, which may not be the most appropriate to insert into conversations. I will then describe in more detail the sentence compression and summarization approaches that are running in production in Google Home to improve the user experience around question answering.

Enrique Alfonseca is a Staff Research Scientist at Google Research Europe, where he manages a team working on language understanding. Over the past ten years, he has worked at Google in ads quality, search quality and natural language processing, with a current focus on conversational search.


Marie-Francine Moens

Date: Wednesday, March 8 at 2 pm
Place: LORIA, room A008
Speaker: Marie-Francine Moens (KU Leuven)

Title: Acquiring Knowledge from Multimodal Sources to Aid Language Understanding

Abstract: Human language understanding (HLU) by a machine is of great economic and social value. In this lecture we consider language understanding of written text. First, we give an overview of the latest methods for HLU that map language to a formal knowledge representation which facilitates other automated tasks. Most current HLU systems are trained on manually annotated texts, which are often lacking in open-domain applications. In addition, much content is left implicit in a text, and human readers infer it by relying on their world and common-sense knowledge. We go deeper into the field of representation learning, which is now widely studied in computational linguistics. This field investigates methods for representing language as statistical concepts or as vectors, allowing straightforward methods of compositionality. The methods often use deep learning and its underlying neural network technologies to learn concepts from large text collections in an unsupervised way (i.e., without the need for manual annotations). We show how these methods can help, but also demonstrate that they are still insufficient to automatically acquire the necessary background knowledge, and more specifically the world and common-sense knowledge needed for language understanding. We then look more closely at how we can learn knowledge jointly from textual and visual data to help language understanding, which will be illustrated with the first results obtained in the MUSTER CHIST-ERA project.


Michel Vacher

Date: Wednesday, 15th March 2017 at 2pm
Place: LORIA, room A008
Speaker: Michel Vacher (CNRS – LIG)

Title: Automatic recognition of atypical speech in smart homes: application to AAL

Abstract: About one third of the French population will be over 65 by the year 2050. Due to the lack of space in dedicated institutions for the elderly, home care services are a major concern and would benefit from technological assistance to relieve the work of the caregivers. This is the goal pursued by intelligent housing, which consists in providing houses equipped with computer technology to assist their inhabitants in the various situations of domestic life as well as in terms of comfort and security. Automatic Speech Recognition (ASR) could be an essential input in critical or abnormal situations, that is, when a surveillance system is most useful.

An ASR system intended for this use must be adapted to the vocal characteristics of these persons. This implies meeting two requirements: on the one hand, recognizing the voices of elderly and even aged persons and, on the other hand, recognizing calls for help made by a person in distress.

After introducing the context of intelligent housing and our methodology, we will present our studies on automatic recognition of aged voices and expressive voices in this context. Given the lack of available data, we recorded suitable corpora in both cases. After highlighting the drop in performance observed with a generic system, we will show some solutions using MLLR-type adaptation.

Finally, experiments on voice command for home automation in intelligent houses involving potential users will highlight the challenges that still need to be addressed in order to allow the use of this technology by individuals living alone at home.


Martin Heckmann

Date: Wednesday, 29th March 2017 at 2pm
Place: LORIA, room A008
Speaker: Martin Heckmann (Honda Research Institute Europe)

Title: Personalized speech interfaces

Abstract: In this presentation, I will highlight recent results obtained at the Honda Research Institute Europe GmbH in the context of personalization of speech-based human-machine interfaces. I will first talk about the detection of word prominence. In particular, I will discuss the performance of prominence detection from noisy audio signals, the contribution of additional visual information on the speaker’s face and head movements, as well as different strategies to fuse the two modalities. After that, I will present a method to adapt the prominence detection to an individual speaker. The method is inspired by fMLLR, a well-known method in GMM/HMM-based speech recognition systems, and adapted to the SVM-based prominence detection. Next, I will talk about an advanced driver assistance system (ADAS) that we are currently developing to support the driver in inner-city driving and which is controlled via speech. This system will allow the driver to flexibly formulate his requests for assistance while the situation develops. In particular, when facing a left turn at an intersection, the driver can delegate the task of observing the right-side traffic to the system as he would do to a co-driver. The system will then inform him when there is an appropriate gap in the traffic to make the turn. Results of a user study we performed show that drivers largely prefer our proposed system to an alternative visual system or driving without any assistance. In this context, I will show results on the estimation of the individual driver’s left-turning behavior. Based on these driver models, the interaction with the driver can be personalized to further improve the usefulness of the system.
