CNR Institutional Research Information System

Corpora and Computational Lexica: Integration of Different Methodologies of Lexical Knowledge Acquisition

Remo Bindi; Nicoletta Calzolari; Monica Monachini; Vito Pirrelli; Antonio Zampolli

1994

Abstract

This paper illustrates an attempt to integrate different techniques and perspectives on lexical knowledge acquisition from text corpora. In this program, we use three distinct methodologies to handle text data: (1) simple, traditional stochastic techniques working on pairs of words; (2) a lexicographic approach guided by the techniques mentioned in Section 1, aiming at a formal description of sense disambiguation in terms of rules; (3) more complex and sophisticated statistical methods working on sets of words (possibly belonging to the same semantic field), which allow us to gain a new perspective on the problem of sense disambiguation. The three approaches are complementary and can be used together. The overall objective of our work is to integrate data and information coming from different sources (machine-readable dictionaries, text corpora, and linguists' or lexicographers' knowledge) within a computational lexicon. We stress the need for convergence of (1) lexical and textual projects, (2) computational and traditional lexicography, and (3) statistical and rule-based approaches.
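The abstract does not say which stochastic measure is applied to word pairs. As a rough illustration only, the sketch below scores ordered word pairs with pointwise mutual information (PMI), a common association measure for this kind of task. The choice of PMI, the function name `pair_association`, and the `window` parameter are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def pair_association(sentences, window=2):
    """Score ordered word pairs by pointwise mutual information (PMI).

    `sentences` is an iterable of token lists. A pair (left, right) is
    counted whenever `right` occurs within `window` tokens after `left`.
    This is an illustrative measure, not the one used in the paper.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            # Pair `left` with each of the next `window` tokens.
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[(left, right)] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    scores = {}
    for (left, right), n in pair_counts.items():
        p_pair = n / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        # PMI = log2( P(left, right) / (P(left) * P(right)) )
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores
```

Pairs with high scores co-occur more often than their individual frequencies would predict, which is the kind of evidence a lexicographer could then inspect when drafting disambiguation rules as in approach (2).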

Full record (DC)

DC field	Value	Language
dc.authority.ancejournal	LITERARY & LINGUISTIC COMPUTING	-
dc.authority.people	Remo Bindi	it
dc.authority.people	Nicoletta Calzolari	it
dc.authority.people	Monica Monachini	it
dc.authority.people	Vito Pirrelli	it
dc.authority.people	Antonio Zampolli	it
dc.collection.id.s	b3f88f24-048a-4e43-8ab1-6697b90e068e	*
dc.collection.name	01.01 Articolo in rivista	*
dc.contributor.appartenenza	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	*
dc.contributor.appartenenza.mi	918	*
dc.date.accessioned	2024/02/18 11:05:10	-
dc.date.available	2024/02/18 11:05:10	-
dc.date.issued	1994	-
dc.description.abstracteng	This paper illustrates an attempt to integrate different techniques and perspectives on lexical knowledge acquisition from text corpora. In this program, we use three distinct methodologies to handle text data: (1) simple, traditional stochastic techniques working on pairs of words; (2) a lexicographic approach guided by the techniques mentioned in Section 1, aiming at a formal description of sense disambiguation in terms of rules; (3) more complex and sophisticated statistical methods working on sets of words (possibly belonging to the same semantic field), which allow us to gain a new perspective on the problem of sense disambiguation. The three approaches are complementary and can be used together. The overall objective of our work is to integrate data and information coming from different sources (machine-readable dictionaries, text corpora, and linguists' or lexicographers' knowledge) within a computational lexicon. We stress the need for convergence of (1) lexical and textual projects, (2) computational and traditional lexicography, and (3) statistical and rule-based approaches.	-
dc.description.affiliations	Istituto di Linguistica Computazionale "Antonio Zampolli", CNR - Pisa	-
dc.description.allpeople	Bindi, Remo; Calzolari, Nicoletta; Monachini, Monica; Pirrelli, Vito; Zampolli, Antonio	-
dc.description.allpeopleoriginal	Remo Bindi, Nicoletta Calzolari, Monica Monachini, Vito Pirrelli and Antonio Zampolli	-
dc.description.fulltext	none	en
dc.description.numberofauthors	5	-
dc.identifier.doi	10.1093/llc/9.1.29	-
dc.identifier.uri	https://hdl.handle.net/20.500.14243/115891	-
dc.language.iso	eng	-
dc.relation.firstpage	29	-
dc.relation.lastpage	46	-
dc.relation.volume	9(1)	-
dc.title	Corpora and Computational Lexica: Integration of Different Methodologies of Lexical Knowledge Acquisition	en
dc.type.driver	info:eu-repo/semantics/article	-
dc.type.full	01 Contributo su Rivista::01.01 Articolo in rivista	it
dc.type.miur	262	-
dc.ugov.descaux1	225548	-
iris.orcid.lastModifiedDate	2024/04/04 23:06:15	*
iris.orcid.lastModifiedMillisecond	1712264775517	*
iris.scopus.extIssued	1994	-
iris.scopus.extTitle	Corpora and computational lexica: Integration of different methodologies of lexical knowledge acquisition	-
iris.sitodocente.maxattempts	1	-
iris.unpaywall.doi	10.1093/llc/9.1.29	*
iris.unpaywall.isoa	false	*
iris.unpaywall.journalisindoaj	false	*
iris.unpaywall.metadataCallLastModified	23/12/2025 04:02:53	-
iris.unpaywall.metadataCallLastModifiedMillisecond	1766458973502	-
iris.unpaywall.oastatus	closed	*
Appears in types:	01.01 Articolo in rivista (journal article)

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/115891
