Humans and robots are increasingly working together at home and in the workplace, and successful interaction depends on the robot picking up on the human's cues and communicating according to social norms. The associated challenges are discussed in this talk, "Multimodal Human–Robot Interaction in Situated Task Descriptions", by Stephanie Gross of the Austrian Research Institute for Artificial Intelligence (OFAI). The talk is part of OFAI's 2022 Lecture Series.
Members of the public are cordially invited to attend the talk on Wednesday, 2 November at 18:30 CET (UTC+1). You may attend in person at OFAI Headquarters (Freyung 6/6/7, 1010 Vienna); note that wearing an FFP2 mask is recommended while on the premises. Alternatively, you may attend online via Zoom:
Meeting ID: 842 8244 2460
Talk abstract: Application areas in which robots and humans work together are rapidly growing, for example in private households or in industry. In these areas, a major communication context is situated task descriptions, where humans naturally use verbal as well as non-verbal channels to transmit information to their interlocutor. Therefore, to interact successfully with humans, robots need to (1) share representations of concepts with their communication partner, (2) identify human communicative cues and extract and merge information transmitted via different channels, and (3) generate multimodal communicative behaviour that is understandable to humans and complies with social norms. In this talk, I will discuss several challenges along the way, including the type of data used for modelling multimodal HRI, generating non-verbal social signals, and multimodal reference resolution in situated tasks.
Speaker biography: Stephanie Gross is a research scientist at the Austrian Research Institute for Artificial Intelligence (OFAI). She has been involved as PI and co-PI in various national and international research projects. Her main research interests lie in the field of task-based human-human and human-robot interaction, including the identification and interpretation of relevant human multimodal signals in situated interaction, the generation of relevant multimodal signals on robots, and modelling how robots learn language through interaction with humans.