Past seminars in 2017

Andreas Vlachos on Wednesday, January 11

Date: Wednesday, January 11 at 2 pm
Place: LORIA, room A008
Speaker: Andreas Vlachos (University of Sheffield, UK)

Title: Imitation learning for structure prediction in natural language processing

Abstract: Imitation learning is a learning paradigm originally developed to learn robotic controllers from demonstrations by humans, e.g. flying autonomous helicopters from a pilot's demonstrations. Recently, algorithms for structure prediction were proposed under this paradigm and have been applied successfully to a number of tasks such as dependency parsing, information extraction, coreference resolution and semantic parsing. Key advantages are the ability to handle large output search spaces and to learn with non-decomposable loss functions. In this talk I will give a detailed overview of imitation learning and some recent applications, including its use in training recurrent neural networks.
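
The abstract does not name a specific algorithm, but a widely used member of this family is DAgger (dataset aggregation). The Python sketch below illustrates the general training loop under that assumption only; the helpers expert_action, extract_features and rollout are hypothetical placeholders, not taken from the talk.

    # Illustrative DAgger-style imitation learning loop -- a sketch, not
    # necessarily the algorithm presented in the talk. The helpers passed in
    # (expert_action, extract_features, rollout) are assumed, not given here.
    from sklearn.linear_model import LogisticRegression

    def dagger(train_sentences, expert_action, extract_features, rollout, n_iters=5):
        """Iteratively aggregate expert-labelled states and retrain a classifier policy."""
        X, y = [], []
        policy = None
        for _ in range(n_iters):
            # Policy used for roll-outs: the expert at iteration 0,
            # afterwards the classifier trained so far.
            def act(state):
                if policy is None:
                    return expert_action(state)
                return policy.predict([extract_features(state)])[0]
            for sentence in train_sentences:
                # Visit the states the current policy actually reaches ...
                for state in rollout(sentence, act):
                    # ... and ask the expert for the correct action in each of them.
                    X.append(extract_features(state))
                    y.append(expert_action(state))
            # Retrain on the aggregated dataset of all states visited so far.
            policy = LogisticRegression(max_iter=1000).fit(X, y)
        return policy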


Enrique Alfonseca on Wednesday, February 8

Date: Wednesday, February 8 at 2 pm
Place: LORIA, room A008
Speaker: Enrique Alfonseca (Google, Zurich)

Title: Sentence Compression for Conversational Search

Abstract:

In this talk, I will discuss some of the challenges that conversational search faces in providing information to users. Some of these stem from the fact that many questions are answered with snippets from the web, which may not be the most appropriate content to insert into a conversation. I will then describe in more detail the sentence compression and summarization approaches that are running in production in Google Home to improve the user experience around question answering.

Bio:
Enrique Alfonseca is a Staff Research Scientist at Google Research Europe, where he manages a team working on language understanding. Over the past ten years, he has worked at Google in ads quality, search quality and natural language processing, with a current focus on conversational search.

Marie-Francine Moens on Wednesday, March 8

Date: Wednesday, March 8 at 2 pm
Place: LORIA, room A008
Speaker: Marie-Francine Moens (KU Leuven)

Title: Acquiring Knowledge from Multimodal Sources to Aid Language Understanding

Abstract: Human language understanding (HLU) by a machine is of great economic and social value. In this lecture we consider language understanding of written text. First, we give an overview of the latest methods for HLU that map language to a formal knowledge representation which facilitates other automated tasks. Most current HLU systems are trained on manually annotated texts, which are often lacking in open-domain applications. In addition, much content is left implicit in a text; when humans read, they infer it by relying on their world and common-sense knowledge. We then go deeper into the field of representation learning, which is nowadays widely studied in computational linguistics. This field investigates methods for representing language as statistical concepts or as vectors, allowing straightforward methods of compositionality. The methods often use deep learning and its underlying neural network technologies to learn concepts from large text collections in an unsupervised way (i.e., without the need for manual annotations). We show how these methods can help, but also demonstrate that they are still insufficient to automatically acquire the necessary background knowledge, and more specifically the world and common-sense knowledge, needed for language understanding. Finally, we discuss how knowledge can be learned jointly from textual and visual data to aid language understanding, illustrated with the first results obtained in the MUSTER CHIST-ERA project.


Michel Vacher on Wednesday, March 15

Date: Wednesday, 15th March 2017 at 2pm
Place: LORIA, room A008
Speaker: Michel Vacher (CNRS – LIG)

Title: Automatic recognition of atypical speech in smart homes: application to AAL

Abstract: About one third of the French population will be over 65 by the year 2050. Due to the lack of space in dedicated institutions for the elderly, home care services are a major concern and would benefit from technological assistance to relieve the work of caregivers. This is the goal pursued by intelligent housing, which consists in equipping houses with computer technology to assist their inhabitants in the various situations of domestic life, as well as in terms of comfort and security. Automatic Speech Recognition (ASR) could be an essential input in critical or abnormal situations, that is, precisely when a surveillance system is most useful.

An ASR system suited to this use must be adapted to the vocal characteristics of its users. This implies meeting two requirements: on the one hand, recognizing the voices of elderly and even very old persons, and on the other, recognizing calls for help made by a person in distress.

After introducing the context of intelligent housing and our methodology, we will present our studies on the automatic recognition of aged and expressive voices in this context. Given the lack of available data, we recorded suitable corpora in both cases. After highlighting the drop in performance observed with a generic system, we will show some solutions based on MLLR-type adaptation.

Finally, experiments on voice commands for home automation, carried out in intelligent houses with potential users, will highlight the challenges that still need to be addressed to allow the use of this technology by individuals living alone at home.


Martin Heckmann on Wednesday, March 29

Date: Wednesday, 29th March 2017 at 2pm
Place: LORIA, room A008
Speaker: Martin Heckmann (Honda Research Institute Europe)

Title: Personalized speech interfaces

Abstract: In this presentation I will highlight recent results obtained at the Honda Research Institute Europe GmbH in the context of personalizing speech-based human-machine interfaces. I will first talk about the detection of word prominence. In particular, I will discuss the performance of prominence detection from noisy audio signals, the contribution of additional visual information from the speaker's face and head movements, as well as different strategies to fuse the two modalities. After that I will present a method to adapt the prominence detection to an individual speaker. The method is inspired by fMLLR, a well-known technique in GMM/HMM-based speech recognition systems, and adapted to SVM-based prominence detection. Next, I will talk about an advanced driver assistance system (ADAS) which we are currently developing to support the driver in inner-city driving and which is controlled via speech. This system will allow the driver to flexibly formulate his requests for assistance while the situation develops. For example, when facing a left turn at an intersection, the driver can delegate the task of observing the traffic from the right to the system, as he would to a co-driver. The system will then inform him when there is an appropriate gap in the traffic to make the turn. Results of a user study we performed show that drivers largely prefer the proposed system to an alternative visual system or to driving without any assistance. In this context I will show results on estimating the individual driver's left-turning behavior. Based on such driver models, the interaction with the driver can be personalized to further improve the usefulness of the system.


Aurélien Bellet on Wednesday, October 25

Date: Wednesday, 25th October 2017 at 2pm
Place: LORIA, room C005
Speaker: Aurélien Bellet (Inria Lille – Nord Europe)

Title: Private algorithms for decentralized collaborative machine learning

Abstract: With the advent of connected devices with computation and storage capabilities, it becomes possible to run machine learning on-device to provide personalized services. However, the currently dominant approach is to centralize data from all users on an external server for batch processing, sometimes without explicit consent from users and with little oversight. This centralization poses important privacy issues in applications involving sensitive data such as speech, medical records or geolocation logs.

In this talk, I will discuss an alternative setting where many agents with local datasets collaborate to learn models by engaging in a fully decentralized peer-to-peer network. We introduce and analyze asynchronous algorithms in the gossip and broadcast communication model that allow agents to improve upon their locally trained model by exchanging information with other agents that have similar objectives. Our first approach aims to smooth pre-trained local models over the network while accounting for the confidence that each agent has in its initial model. In our second approach, agents jointly learn and propagate their model by making iterative updates based on both their local dataset and the behavior of their neighbors. I will also describe how to make such algorithms differentially private to avoid leaking information about the local datasets, and analyze the resulting privacy-utility trade-off.
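
As a rough illustration of the gossip communication model mentioned above (and only of that, not of the talk's confidence-weighted or differentially private algorithms), the following Python sketch lets randomly chosen pairs of neighbouring agents repeatedly average their local model parameters over a peer-to-peer graph; the toy graph and parameter vectors are placeholders.

    # Toy pairwise gossip averaging over a peer-to-peer graph -- an illustrative
    # sketch, not the algorithms of the talk (no confidence weights, no privacy).
    import random
    import numpy as np

    def gossip_averaging(models, neighbors, n_steps=1000, seed=0):
        """models: agent id -> parameter vector; neighbors: agent id -> list of peers."""
        rng = random.Random(seed)
        agents = list(models)
        for _ in range(n_steps):
            # A random agent wakes up and contacts a random neighbour
            # (the asynchronous, pairwise gossip communication model).
            i = rng.choice(agents)
            j = rng.choice(neighbors[i])
            # Both agents replace their parameters with the pairwise average.
            avg = (models[i] + models[j]) / 2.0
            models[i], models[j] = avg.copy(), avg.copy()
        return models

    # Three agents on a line graph converge towards the average of their models.
    models = {0: np.array([1.0]), 1: np.array([2.0]), 2: np.array([6.0])}
    neighbors = {0: [1], 1: [0, 2], 2: [1]}
    print(gossip_averaging(models, neighbors, n_steps=200))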


Victor Bisot on Wednesday, November 8

Date: Wednesday, 8th November 2017 at 2pm
Place: LORIA, room A008
Speaker: Victor Bisot (Télécom Paristech)

Title: Learning nonnegative representations for environmental sound analysis

Abstract: The growing interest in environmental sound analysis tasks has been accompanied by a sharp increase in deep learning-based approaches. While most works focus on finding suitable neural network architectures or algorithms, few question the choice of the input representation. In this talk, we will discuss how nonnegative matrix factorization (NMF)-based feature learning techniques are suited to such tasks, especially when used as input to deep neural networks. We start by highlighting the usefulness of unsupervised NMF variants for feature learning in multi-source environments. Next, we introduce a supervised variant of NMF known as TNMF (task-driven NMF). The TNMF model aims at improving the quality of the decomposition by learning the nonnegative dictionaries that minimize the target classification loss. In the second part, we will exhibit similarities between NMF, TNMF and standard neural network layers, justifying the potential of NMF-based features as input to various DNN models. The proposed models are evaluated on acoustic scene classification and overlapping sound event detection datasets and show improvements over standard CNN approaches on such tasks. Finally, we will discuss how NMF dictionaries and neural network parameters can be trained jointly in a common optimization problem by relying on the TNMF framework.
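
As a minimal sketch of unsupervised NMF feature learning for audio (the talk's supervised TNMF additionally ties the dictionary to the classification loss), the code below factorizes a magnitude spectrogram V into a nonnegative dictionary W and activations H, then pools the activations into a clip-level feature; the file name and hyperparameter values are placeholders, not from the talk.

    # Minimal unsupervised NMF feature extraction for one audio clip -- an
    # illustrative sketch; file name and hyperparameters are placeholders.
    import numpy as np
    import librosa
    from sklearn.decomposition import NMF

    # Nonnegative magnitude spectrogram V of shape (frequency bins, time frames).
    signal, sr = librosa.load("scene.wav", sr=22050)
    V = np.abs(librosa.stft(signal, n_fft=1024, hop_length=512))

    # Factorize V ~ W @ H with 16 nonnegative basis spectra (columns of W).
    nmf = NMF(n_components=16, init="nndsvda", max_iter=500)
    W = nmf.fit_transform(V)   # (frequency bins, 16): learned spectral dictionary
    H = nmf.components_        # (16, time frames): activations used as features

    # Average the activations over time to get a fixed-size clip-level feature
    # vector, e.g. as input to a classifier or a deep neural network.
    clip_feature = H.mean(axis=1)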


Yves Lepage on Friday, December 1

Date: Friday, 1st December 2017 at 10am
Place: LORIA, room B013
Speaker: Yves Lepage (Waseda University, Japan)

Title: Analogy for natural language processing (NLP) and machine translation (MT)

Abstract: In this talk, we introduce the notion of analogy and show its application to language data and to various NLP tasks.

We start from an algebraic definition of analogy between vector representations and apply it to pixel images representing Chinese characters. We then present a fast algorithm for structuring data into analogical clusters.
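
As a minimal sketch of this algebraic (parallelogram) view of analogy on vector representations: A : B :: C : D holds when B - A = D - C, so the missing fourth term can be computed as D = C + B - A. The toy vectors below are placeholders, not data from the talk.

    # Solving a vector analogy A : B :: C : ? under the parallelogram rule
    # D = C + B - A (an illustrative sketch with toy 2-dimensional vectors).
    import numpy as np

    def solve_vector_analogy(a, b, c):
        """Return the vector d such that b - a = d - c."""
        return c + b - a

    a = np.array([0.0, 0.0])
    b = np.array([1.0, 0.0])
    c = np.array([0.0, 1.0])
    print(solve_vector_analogy(a, b, c))   # -> [1. 1.]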

By adding the notion of edit distance, we show how to capture analogies between strings of symbols and generalise analogical clusters to analogical grids, a structure similar to paradigm tables in morphology. Such a structure can be used to predict or explain word forms. In particular, we report work on explaining unseen words in Indonesian.

Since solving analogical equations between strings of symbols is a key problem in addressing tasks like machine translation, we report results on this topic obtained using both standard techniques and neural networks.

We conclude by presenting an analogy-based machine translation system and sketch future research directions for it.


Yannick Parmentier on Wednesday, December 6

Date: Wednesday, 6th December 2017 at 2pm
Place: LORIA, room A008
Speaker: Yannick Parmentier (LORIA, Synalp team)

Title: From Description Languages to the Description of Description Languages

Abstract:

Having a fine-grained description of natural language is a prerequisite for many NLP applications such as dialogue systems or machine translation. Such descriptions were originally hand-crafted, and thus costly to build and extend. In the late 1990s, some attempts were made to semi-automatically generate these linguistic descriptions. Among them, one may cite the metagrammar approach, which consists in specifying a linguistic description by means of a formal language. The XMG language and its compiler were designed in this context. XMG offers means to abstract over redundant lexicalised linguistic descriptions. While limited to tree-based linguistic descriptions, it opened up new perspectives in linguistic resource engineering.
In this talk, we introduce XMG2, a modular and extensible tool for various linguistic description tasks. Based on the notion of meta-compilation (that is, compilation of compilers), XMG2 reuses the main concepts underlying XMG, namely logic programming and constraint satisfaction, to generate on-demand XMG-like compilers by assembling elementary units called language bricks. This brick-based definition of compilers permits users to design description languages in a highly flexible way. In particular, it makes it possible to support several levels of linguistic description (e.g. syntax, morphology) within a single description language.

Jonathan Berant on Wednesday, December 13

Date: Wednesday, 13th December 2017 at 2pm
Place: LORIA, room A008
Speaker: Jonathan Berant (Tel Aviv University, Israel)

Title: Talking to your Virtual Assistant about anything

Abstract: Conversational interfaces and virtual assistants are now part of our lives due to services such as Amazon Alexa, Google Voice, Microsoft Cortana, etc. Thus, translating natural language queries and commands into an executable form, also known as semantic parsing, is nowadays one of the prime challenges in natural language understanding. In this talk I would like to highlight the main challenges and limitations in the field of semantic parsing, and to describe ongoing work that addresses those challenges. First, semantic parsers require information to be stored in a knowledge base, which substantially limits their coverage and applicability. Conversely, the web has huge coverage, but search engines that access the web do not handle language compositionality well. We propose to treat the web as a KB and compute answers to complex questions in broad domains by decomposing the question into a sequence of simple questions, extracting answers with a search engine, and recomposing the answers to obtain a final result. Second, deploying virtual assistants in many domains (cars, homes, calendar, etc.) requires the ability to quickly develop semantic parsers. However, most past work trains semantic parsers from scratch for each domain, disregarding training data from other domains. We propose a zero-shot approach to semantic parsing, where we decouple the structure of language from the contents of the domain and learn a domain-independent semantic parser. Last, one of the most popular setups for training semantic parsers is to use denotations as supervision. However, training from denotations results in a difficult search problem as well as a spuriousness issue, where incorrect programs evaluate to the correct denotation. I will describe some recent work in which we try to address the challenges of search and spuriousness.