- Lina Maria Rojas Barahona (Cambridge University)
- Fabien Ringeval (Université Grenoble Alpes)
- Hans van Ditmarsch (LORIA)
- Denis Paperno (LORIA)
Date: Wednesday, 28th September 2016 at 2pm
Place: LORIA, room C005
Speaker: Lina Maria Rojas Barahona (Cambridge University)
Title: Exploiting Sentence and Context Representations in Deep Neural Models for Spoken Language Understanding
This paper presents a deep learning architecture for the semantic decoder component of a Statistical Spoken Dialogue System. In a slot-filling dialogue, the semantic decoder predicts the dialogue act and a set of slot-value pairs from a set of n-best hypotheses returned by the automatic speech recognition (ASR) system. Most current models for spoken language understanding assume (i) word-aligned semantic annotations, as in sequence taggers, and (ii) delexicalisation, i.e. a mapping of input words to domain-specific concepts using heuristics that try to capture morphological variation but that scale neither to other domains nor to language variation (e.g., morphology, synonyms, paraphrasing). In this work, the semantic decoder is trained using unaligned semantic annotations and uses distributed semantic representation learning to overcome the limitations of explicit delexicalisation. The proposed architecture uses a convolutional neural network for the sentence representation and a long short-term memory network for the context representation. Results are presented for the publicly available DSTC2 corpus and for an In-car corpus, which is similar to DSTC2 but has a significantly higher word error rate (WER).
Date: Wednesday, October 5 at 2 pm
Place: LORIA, room C005
Speaker: Fabien Ringeval (Université Grenoble Alpes)
Title: Affective computing from speech: towards robust recognition of emotions in ecologically valid situations
Technologies for the automatic recognition of emotion from speech have attracted significantly increasing attention over the last decade, from both academia and industry, as they have found many applications in domains as varied as health care, education, serious games, brand reputation, advertising, and robotics. Whereas good performance has been reported in the literature for acted emotions, the automatic recognition of spontaneous emotions, as expressed in ecologically valid situations, remains an open challenge: such emotions are subtle, their expression and meaning depend on the speaker, the language and the culture, and they might be produced in noisy environments, which complicates the extraction of relevant cues from the speech signal. In this talk, I will present the most recent advances in the field and will show that deep learning based methods such as long short-term memory recurrent neural networks (LSTM-RNNs) can help to contextualise relevant cues and tackle asynchrony issues for the "time- and value-continuous" prediction of emotion, and can also enhance both the acoustic waveform and low-level descriptors captured in noisy conditions. Finally, I will show that, even though end-to-end learning with convolutional networks and LSTM-RNNs can provide promising results, it does not yet spell the end of signal processing for hand-engineered feature extraction: such features, combined with non-context-aware predictors, can generalise even better than those learned by end-to-end methods, provided that they are carefully designed.
Date: Wednesday, October 19 at 2 pm
Place: LORIA, room C005
Speaker: Hans van Ditmarsch (LORIA, Cello team)
Title: Epistemic Gossip Protocols
A well-studied problem in network theory since the 1970s is that of finding optimal schedules to distribute information by one-to-one communication between nodes. One can take these communicative actions to be telephone calls, and protocols to spread information this way are known as gossip protocols or epidemic protocols. Statistical approaches to gossip have flourished since then; see, for example, the survey "Epidemic Information Dissemination in Distributed Systems" by Eugster et al. (IEEE Computer, 2004). It is typical to assume a global scheduler that executes a possibly non-deterministic or randomized protocol. A departure from this methodology is to investigate epistemic gossip protocols, where an agent (node) calls another agent not because it is so instructed by a scheduler, but based on its knowledge or ignorance of the distribution of secrets over the network and of other agents' knowledge or ignorance of that distribution. Such protocols are distributed and do not need a central scheduler. This comes at a cost: they may take longer to terminate than non-epistemic, globally scheduled protocols. A number of works have appeared in recent years (Apt et al., Attamah et al., van Ditmarsch et al., van Eijck & Gattinger, Herzig & Maffre), of which we present a survey, including open problems yet to be solved by the community.
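The contrast between a globally scheduled protocol and an epistemic one can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a protocol taken from the talk: the function names are hypothetical, the fixed schedule is the classical one using 2n - 4 calls for n >= 4 agents, and the epistemic rule used here ("call an agent only if you do not yet know its secret") is one simple knowledge-based condition chosen for illustration. The random choice among enabled calls merely simulates the order in which calls happen to occur.

```python
import random

def merge(secrets, x, y):
    """A call between x and y: both end up knowing the union of their secrets."""
    union = secrets[x] | secrets[y]
    secrets[x] = set(union)
    secrets[y] = set(union)

def all_experts(secrets):
    """True if every agent knows every secret."""
    n = len(secrets)
    return all(len(known) == n for known in secrets)

def scheduled_run(n):
    """Globally scheduled protocol for n >= 4 agents, using 2n - 4 calls."""
    if n < 4:
        raise ValueError("this schedule assumes n >= 4")
    secrets = [{i} for i in range(n)]
    others = list(range(4, n))
    calls = [(i, 0) for i in others]            # the others report to agent 0
    calls += [(0, 1), (2, 3), (0, 2), (1, 3)]   # agents 0..3 become experts
    calls += [(i, 0) for i in others]           # the others call agent 0 again
    for x, y in calls:
        merge(secrets, x, y)
    return len(calls), secrets

def epistemic_run(n, rng):
    """Illustrative epistemic rule: x may call y only if x lacks y's secret.

    No scheduler prescribes the calls; rng only picks which enabled call
    happens next in the simulation.
    """
    secrets = [{i} for i in range(n)]
    calls = 0
    while True:
        enabled = [(x, y) for x in range(n) for y in range(n)
                   if y not in secrets[x]]
        if not enabled:          # nobody has anything left to learn
            return calls, secrets
        x, y = rng.choice(enabled)
        merge(secrets, x, y)
        calls += 1

if __name__ == "__main__":
    n = 6
    print("scheduled calls:", scheduled_run(n)[0])
    for seed in range(5):
        print("epistemic calls:", epistemic_run(n, random.Random(seed))[0])
```

Under this rule the epistemic run always ends with every agent knowing every secret, but depending on the order of calls it may need more calls than the fixed schedule, which matches the cost mentioned in the abstract.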