Multimodal language processing using NLP approaches

Fernando Ferri; Patrizia Grifoni
2007

Abstract

People usually communicate through multimodal dialogue. Multimodal interaction is flexible and natural because it can draw on all five senses in parallel. For this reason, the definition and processing of multimodal languages should adopt the techniques and approaches of Natural Language Processing (NLP). We describe the characteristics of a multimodal language in NLP terms, treating speech as the predominant mode because it appears to be the most complete. Users communicate and interact by referring to a set of key concepts, which can be expressed in different modes and by more than one mode simultaneously. Defining a multimodal language therefore requires extracting these key concepts. They are then processed using a natural language approach: any concept expressed in any mode can be "translated" into natural language, so speech acts as a "ground layer" to which all other modes refer. We propose a tool for defining multimodal languages that lets users define, in their own way, how the concepts of a particular domain are expressed in the different modes.
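The idea of extracting key concepts and "translating" them into natural language can be illustrated with a minimal sketch. It is not the authors' tool: the lexicon, the concept names, the `Event` type and both functions are hypothetical, chosen only to show how inputs from different modes might be reduced to one natural-language sentence.

```python
from dataclasses import dataclass

# Hypothetical lexicon (an assumption, not from the chapter):
# (mode, raw signal) -> key concept of the domain.
LEXICON = {
    ("speech", "delete"): "DELETE",
    ("gesture", "cross_out"): "DELETE",
    ("speech", "this file"): "FILE",
    ("gesture", "point_at_file"): "FILE",
}

# Hypothetical natural-language rendering of each key concept,
# standing in for the speech "ground layer".
CONCEPT_TO_WORDS = {"DELETE": "delete", "FILE": "the file"}


@dataclass(frozen=True)
class Event:
    mode: str    # e.g. "speech" or "gesture"
    signal: str  # recognized token or gesture label


def extract_concepts(events):
    """Map events to key concepts, skipping unknown signals and merging
    consecutive repeats (one concept expressed in several modes)."""
    concepts = []
    for event in events:
        concept = LEXICON.get((event.mode, event.signal))
        if concept is None:
            continue
        if not concepts or concepts[-1] != concept:
            concepts.append(concept)
    return concepts


def to_natural_language(events):
    """Render the extracted concepts as a natural-language string."""
    return " ".join(CONCEPT_TO_WORDS[c] for c in extract_concepts(events))


if __name__ == "__main__":
    mixed = [Event("speech", "delete"), Event("gesture", "point_at_file")]
    print(to_natural_language(mixed))
```

In this sketch a spoken "delete" combined with a pointing gesture yields the same sentence as a fully spoken command, which is the sense in which every mode refers back to natural language. A real system would also need temporal alignment of the modes and ambiguity handling, which the sketch leaves out.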
Istituto di Ricerche sulla Popolazione e le Politiche Sociali - IRPPS
ISBN: 978-1-59904-929-8
Keywords: multimodality; multimodal language; fusion; ambiguity
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/111265