Arbeitsbereich WSV, Department of Informatics, Universität Hamburg
18.203 Lecture: CINACS Lecture Series: Cross-modal interaction in natural and artificial cognitive systems
Summer Semester 2007
Instructors
Christopher Habel, Wolfgang Menzel, Jianwei Zhang
Time/Location
Mon 12-14, Room F-534
KVV entry
Content
Natural cognitive systems, such as humans, benefit from combining the input of their different sensory systems, not only because each modality provides information about different aspects of the world, but also because the different senses can jointly encode particular aspects of an event, e.g. its location or meaning. However, the gains of cross-modal integration come at a cost: since each modality uses highly specific representations, information has to be transferred into a code that allows the different senses to interact. Corresponding problems arise in human communication when information about one topic is expressed in a combination of formats, such as written or spoken language and graphics.
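As a minimal illustration of why such integration pays off, the following Python sketch implements the textbook maximum-likelihood model of cue combination: two noisy Gaussian estimates of the same event property, one auditory and one visual, are fused by precision weighting, and the fused estimate has lower variance than either cue alone. The numbers are invented, and the model is a generic illustration rather than material from the lecture.

```python
def fuse(mu_a, var_a, mu_v, var_v):
    """Precision-weighted fusion of an auditory and a visual estimate.

    Each cue is modelled as a Gaussian estimate of the same quantity
    (e.g. the location of an event). The fused mean is the
    precision-weighted average; the fused variance is smaller than
    either input variance, which is the formal gain of integration.
    """
    w_a = (1 / var_a) / (1 / var_a + 1 / var_v)  # weight of audition
    w_v = 1 - w_a                                # weight of vision
    mu = w_a * mu_a + w_v * mu_v
    var = 1 / (1 / var_a + 1 / var_v)
    return mu, var

# Toy example: vision is more reliable (smaller variance), so it
# dominates the fused estimate of the event location.
mu, var = fuse(mu_a=12.0, var_a=4.0, mu_v=10.0, var_v=1.0)
print(mu, var)  # 10.4 0.8 -- fused variance is below both inputs
```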
In this lecture, we focus on models and methods suitable for realizing the processes and representations needed for cross-modal interaction in artificial cognitive systems, i.e. computational systems. After an introduction to the core phenomena of cross-modal interaction, we exemplify its mono-modal basis and the current development of informatics-oriented research in this field with three topics:
  • Cross-modal information fusion for a range of non-sensory, i.e. categorial, data in the area of speech and language processing, where visual stimuli have to be merged with the available acoustic evidence. Among the language-related information sources, lip reading certainly provides one of the major contributions of additional evidence, but more recently eyebrow movement and its relationship to suprasegmental features of human speech has attracted considerable attention as well (a fusion sketch follows this list).
  • The interaction of representational modalities, such as language, diagrams, and maps, and their interdependence with sensory modalities, in particular vision, auditory perception, and haptics. The computational analysis of multi-modal documents or dialogues is a prerequisite for advanced intelligent information systems as well as for human-computer interaction, in particular human-robot interaction. Furthermore, such computational devices can be used in assistive systems for impaired people, e.g. blind, visually impaired, or deaf users.
  • Multimodal memory plays an important role for the next generation of mobile robots and service robots. By using grounded memories of robot actions, i.e. real-world visual, audio, and tactile data collected by the robot, instead of solely a sensorimotor controller, the robot's memory can be enriched, and thus the robustness of both the representations and the retrieval processes of autonomous agents increases (a minimal memory sketch is given after this list).
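To make the first topic concrete, the sketch below shows decision-level ("late") fusion, a common baseline in audio-visual speech recognition: per-word scores from an acoustic recognizer and a lip-reading model are combined log-linearly. The word list, scores, and stream weight are invented for illustration and do not describe the specific systems discussed in the lecture.

```python
# Hypothetical per-word log-likelihoods from two independent models;
# all values are invented for illustration.
acoustic = {"bat": -2.3, "that": -2.1}  # noisy audio: a near-tie
visual   = {"bat": -0.9, "that": -3.0}  # lip-reading scores

def late_fusion(acoustic, visual, stream_weight=0.7):
    """Log-linear combination of acoustic and visual word scores.

    stream_weight balances the two streams (1.0 = audio only); in
    practice it would be tuned on held-out data, e.g. lowered in
    acoustically noisy conditions where lip reading helps most.
    """
    fused = {w: stream_weight * acoustic[w] + (1 - stream_weight) * visual[w]
             for w in acoustic}
    return max(fused, key=fused.get), fused

best, scores = late_fusion(acoustic, visual)
print(best, scores)
# 'bat' wins: the visually salient bilabial closure of /b/ overturns
# the narrow acoustic preference for 'that'.
```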
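For the third topic, one minimal way to realize a grounded multimodal memory is an episodic store whose entries keep per-modality feature vectors and are retrieved by a weighted combination of similarities. The class layout, feature dimensions, and weights below are assumptions made for this sketch, not the architecture presented in the lecture; the fixed weights stand in for what a real system would learn as per-modality reliability estimates.

```python
from dataclasses import dataclass
from math import sqrt

@dataclass
class Episode:
    """One grounded memory entry: an action label plus feature
    vectors of the sensory data recorded while the action ran."""
    action: str
    visual: list   # e.g. appearance features of the scene
    audio: list    # e.g. spectral features of contact sounds
    tactile: list  # e.g. fingertip pressure profile

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(memory, query, weights=(0.5, 0.2, 0.3)):
    """Return the stored episode most similar to the current percept,
    scoring each modality separately and combining the similarities.
    Because every modality contributes, retrieval degrades gracefully
    when one sensory channel is noisy or missing."""
    wv, wa, wt = weights
    def score(ep):
        return (wv * cosine(ep.visual, query.visual)
                + wa * cosine(ep.audio, query.audio)
                + wt * cosine(ep.tactile, query.tactile))
    return max(memory, key=score)

memory = [
    Episode("grasp-cup", [0.9, 0.1], [0.2, 0.7], [0.8, 0.3]),
    Episode("push-box",  [0.1, 0.9], [0.6, 0.1], [0.2, 0.9]),
]
percept = Episode("?", [0.8, 0.2], [0.1, 0.8], [0.7, 0.4])
print(retrieve(memory, percept).action)  # grasp-cup
```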
Literature
Literature will be announced in the lecture.
Slides