Modeling Human Interaction Dynamics
Multimodal Computing and Interaction


Face-to-face communication is a highly dynamic process in which participants mutually exchange and interpret linguistic and gestural signals. Even when only one person speaks at a time, the other participants continuously exchange information with each other and with the speaker through gesture, gaze, posture, and facial expressions. To interpret these high-level communicative signals correctly, an observer must jointly integrate the spoken words, subtle prosodic changes, and simultaneous gestures of all participants. This project will create a new generation of tools for computationally modeling the interdependence between linguistic symbols and nonverbal signals during social interactions. Our Social-Symbols-Signals (S3) framework not only builds on recent advances in machine learning, natural language processing, and computer vision, but also defines a computational framework specifically designed to jointly analyze the language, gestures, and social signals expressed by all participants. This unified approach to S3 paves the way for robust and efficient computational perception algorithms that recognize high-level communicative behaviors, and will enable new computational tools for researchers in the behavioral sciences.