Multimodal emotion recognition in HCI environments


Research on computational models of emotion and emotion recognition has been at the forefront of interest for more than a decade. The abundance of non-intrusive sensors (mainly cameras and microphones), of data, and of ubiquitous computing power makes real-time results possible in areas where this was deemed impossible only a few years ago. As a result, emotion recognition and peripheral or related problems (body and hand gesture and gait analysis, speech recognition, eye gaze and head pose estimation related to attention in multi-person environments, etc.) can now benefit from the available resources, as well as from the interest shown in these applications by major authorities in psychology such as K. Scherer and P. Ekman.

In addition, several research initiatives in the EU (TMR, FP5, 6 and 7: ICT, e-Health, Technology-Enhanced Learning and, recently, Digital Content and Libraries) promote research in this field and encourage researchers to establish strong connections with theoreticians via Networks of Excellence (Humaine, Similar, SSPNET), as well as to deliver tangible applications that benefit from this technology (IP Callas, STREP Feelix-Growing, STREP Agent-Dysl, etc.). Another indication of the interest in emotion-related research is the fact that papers on affective computing appear in more than 90 conferences across disciplines, and almost 30 special issues in high-impact journals have been published or are in preparation; the momentum is such that more than 500 researchers participate in the Humaine Association, a follow-up initiative of the Humaine Network of Excellence, which also plans to produce a journal on related topics in association with the IEEE.


The tutorial will be divided into three axes, each corresponding to one word of the title:

  • Multimodal
    • What is multimodality?
    • The characteristics and intricacies of multimodal interaction
    • Available modalities in HCI
    • Fusing modalities
    • Handling uncertainty/noise
  • Emotion
    • Emotion, mood, personality - terminology clarification
    • Psychological theories of emotion related to HCI
    • Emotion in interaction - do we need it, can we benefit from it, how is it defined?
    • Computational models of emotion
    • From signals to signs of emotion
    • Embodying emotion: robots, interfaces, ECAs
  • Recognition
    • Databases of natural expressivity
    • Application scenarios
  • Unimodal recognition of features from visual and prosodic information
    • Results from processing natural databases
    • Natural interaction vs. robust recognition
    • Fallback to less detailed recognition in the presence of noise
    • From emotional episodes to understanding behavior
    • What else can we recognize? What can't we?
    • Open issues in emotion-related research
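The fusion, uncertainty, and fallback topics in the outline above can be illustrated with a minimal decision-level (late) fusion sketch: each modality produces a class-probability distribution and a confidence score, the distributions are combined by confidence-weighted averaging, and the system falls back to a less detailed answer when the fused result is weak. Function names, the weighting scheme, and the threshold are illustrative assumptions, not part of the tutorial material.

```python
# Illustrative sketch of decision-level fusion for multimodal emotion
# recognition. All names and the confidence-weighting scheme are
# assumptions for demonstration purposes.

def fuse_modalities(predictions, confidences):
    """Combine per-modality class-probability dicts into one distribution,
    weighting each modality by its confidence score."""
    total = sum(confidences.values())
    if total == 0:
        raise ValueError("at least one modality must have nonzero confidence")
    fused = {}
    for modality, probs in predictions.items():
        weight = confidences[modality] / total
        for label, p in probs.items():
            fused[label] = fused.get(label, 0.0) + weight * p
    return fused

def recognize(predictions, confidences, threshold=0.5):
    """Pick the top emotion label; fall back to a coarser 'uncertain'
    answer when the fused evidence is too weak (noise handling)."""
    fused = fuse_modalities(predictions, confidences)
    label = max(fused, key=fused.get)
    if fused[label] < threshold:
        return "uncertain"  # fallback to less detailed recognition
    return label

# Example: the facial channel is trusted more (confidence 0.8) than the
# noisy prosody channel (confidence 0.4).
face = {"joy": 0.7, "anger": 0.1, "neutral": 0.2}
voice = {"joy": 0.5, "anger": 0.3, "neutral": 0.2}
result = recognize({"face": face, "voice": voice},
                   {"face": 0.8, "voice": 0.4})
```

Here both channels agree on "joy" and the fused probability exceeds the threshold, so the detailed label is returned; if the channels disagreed or both confidences were low, the sketch would degrade gracefully to "uncertain" rather than commit to a wrong label.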

Intended Audience

Researchers in the fields of:

  • Image processing/computer vision
  • Speech processing
  • Machine learning
  • Neural networks
  • Human factors
  • Human-computer interaction
  • Assistive computing
  • Robotics


Dr. Kostas Karpouzis
Image, Video and Multimedia Systems Lab
National Technical University of Athens
Email: kkarpou[AT]cs[dot]ntua[dot]gr

Organizing Institutions