Semantic knowledge representation within thesauri across cultures

Mazzocchi F
2010

Abstract

The information retrieval (IR) world has changed immensely in recent years, with an enormous increase in the availability of searchable full text and of powerful engines for searching it. In this new framework a semantic tool such as the thesaurus continues to be of value and can still facilitate precise IR. Traditionally, a thesaurus is defined as a controlled vocabulary in which semantic relationships between terms (hierarchy, association and equivalence) are made explicit and employed to improve recall and precision in information retrieval. When information access is considered on a planetary scale, however, a thesaurus is also, and primarily, a cultural product that reflects the peculiarities of the environment in which it was built and in which it is culturally embedded. In building a multilingual thesaurus, one represents knowledge and establishes links not only between different languages but also between different cultures and worldviews, so that relationships between terms may not be universally valid. Apparently exact equivalences are often closer to a relation of similarity. Words taken from a language are, in fact, verbal expressions of concepts rooted in the culture of that language, and they do not necessarily have an exactly corresponding concept and term in any other language or culture. Against this background, this paper focuses on how to organize knowledge structurally for operational purposes without betraying any of the languages involved. Within a thesaurus, this means choosing how to plan the semantic structure, and it can lead to a multilingual thesaurus in which the semantic structures of all languages are managed either symmetrically or non-symmetrically.
In the symmetrical case, identical versions are established for each language: each descriptor has an equivalent in every language version of the tool, and all terms are connected through the same relationships, which presupposes a high level of compatibility between all the languages represented. In the non-symmetrical case, the semantic structure of each language version reflects the conceptual organization of the culture expressed through that language: every language has a separately developed semantic structure, which is compared with those of the other languages and cultures in the thesaurus only at a second stage, in order to identify equivalences where they exist rather than presuppose them, apparently allowing a more faithful cultural representation. Both approaches are examined with the help of examples, with the aim of highlighting their respective peculiarities and the arguments for and against their adoption.
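
The distinction between the two designs can be sketched in a few lines of Python. This is only an illustrative model, not the data structure used in the paper: the class names, the `is_symmetric` helper and the river-related sample terms are all assumptions introduced here. Each language version stores its descriptors with hierarchical (broader term) and associative (related term) links, and two versions count as symmetric when a one-to-one equivalence mapping carries every relationship of one version onto the other.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """One descriptor in one language version (hypothetical model)."""
    term: str
    broader: set[str] = field(default_factory=set)  # hierarchical links (BT)
    related: set[str] = field(default_factory=set)  # associative links (RT)


@dataclass
class LanguageVersion:
    """Monolingual semantic structure: descriptors keyed by term."""
    lang: str
    concepts: dict[str, Concept] = field(default_factory=dict)

    def add(self, term, broader=(), related=()):
        self.concepts[term] = Concept(term, set(broader), set(related))


def is_symmetric(a, b, equivalents):
    """Return True if `equivalents` (term in a -> term in b) is one-to-one
    and maps every broader/related link of `a` onto the same link in `b`."""
    if len(a.concepts) != len(b.concepts):
        return False
    if set(equivalents) != set(a.concepts):
        return False
    if set(equivalents.values()) != set(b.concepts):
        return False
    for term, concept in a.concepts.items():
        other = b.concepts[equivalents[term]]
        if {equivalents.get(t) for t in concept.broader} != other.broader:
            return False
        if {equivalents.get(t) for t in concept.related} != other.related:
            return False
    return True


# Illustrative sample data (not taken from the paper).
en = LanguageVersion("en")
en.add("watercourse")
en.add("river", broader=["watercourse"])

it = LanguageVersion("it")
it.add("corso d'acqua")
it.add("fiume", broader=["corso d'acqua"])

# French distinguishes "fleuve" from "rivière", so its structure is
# developed separately and has one more descriptor than the English one.
fr = LanguageVersion("fr")
fr.add("cours d'eau")
fr.add("fleuve", broader=["cours d'eau"])
fr.add("rivière", broader=["cours d'eau"])

en_it = is_symmetric(en, it, {"watercourse": "corso d'acqua", "river": "fiume"})
en_fr = is_symmetric(en, fr, {"watercourse": "cours d'eau", "river": "fleuve"})
print(en_it, en_fr)
```

In a non-symmetrical design the French version would simply keep its own structure, and links such as river to fleuve would be recorded afterwards as partial rather than exact equivalences.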
2010
Istituto dei Sistemi Complessi - ISC
978-0-934068-17-8
knowledge representation
information management
multilingual thesauri
globalization
cultural diversity

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/435067