The hidden dimension: a paradigmatic view of data-driven NLP.
Vito Pirrelli; François Yvon
1999
Abstract
Many tasks in language analysis are described as the maximally economic mapping of one level of linguistic representation onto another such level. Over the past decade, many different machine-learning strategies have been developed to automatically induce such mappings directly from data. In this paper, we contend that the way most learning algorithms have been applied to problems of language analysis reflects a strong bias towards a compositional (or biunique) model of inter-level mapping. Although this is justified in some cases, we contend that biunique inter-level mapping is not a jack of all trades. A model of analogical learning, based on a paradigmatic reanalysis of memorized data, is presented here. The methodological pros and cons of this approach are discussed in relation to a number of germane linguistic issues and illustrated in the context of three case studies: word pronunciation, word analysis, and word sense disambiguation. The evidence produced here seems to suggest that the brain is not designed to carry out the logically simplest and maximally economic way of relating form and function in language. Rather, we propose a radical shift of emphasis in language learning from syntagmatic inter-level mapping to paradigmatically-constrained intra-level mapping.

| DC Field | Value | Language |
|---|---|---|
| dc.authority.ancejournal | JOURNAL OF EXPERIMENTAL AND THEORETICAL ARTIFICIAL INTELLIGENCE ONLINE | - |
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | - |
| dc.authority.people | Vito Pirrelli | it |
| dc.authority.people | François Yvon | it |
| dc.collection.id.s | b3f88f24-048a-4e43-8ab1-6697b90e068e | * |
| dc.collection.name | 01.01 Articolo in rivista | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/20 09:30:20 | - |
| dc.date.available | 2024/02/20 09:30:20 | - |
| dc.date.issued | 1999 | - |
| dc.description.abstracteng | Many tasks in language analysis are described as the maximally economic mapping of one level of linguistic representation onto another such level. Over the past decade, many different machine-learning strategies have been developed to automatically induce such mappings directly from data. In this paper, we contend that the way most learning algorithms have been applied to problems of language analysis reflects a strong bias towards a compositional (or biunique) model of interlevel mapping. Although this is justified in some cases, we contend that biunique inter-level mapping is not a jack of all trades. A model of analogical learning, based on a paradigmatic reanalysis of memorized data, is presented here. The methodological pros and cons of this approach are discussed in relation to a number of germane linguistic issues and illustrated in the context of three case studies: word pronunciation, word analysis, and word sense disambiguation. The evidence produced here seems to suggest that the brain is not designed to carry out the logically simplest and maximally economic way of relating form and function in language. Rather we propose a radical shift of emphasis in language learning from syntagmatic inter-level mapping to paradigmatically-constrained intra-level mapping. | - |
| dc.description.affiliations | Istituto di Linguistica Computazionale "A. Zampolli", CNR, Pisa; ENST, Department of Computer Science and CNRS, Paris | - |
| dc.description.allpeople | Vito Pirrelli; François Yvon | - |
| dc.description.allpeopleoriginal | Vito Pirrelli, François Yvon | - |
| dc.description.fulltext | none | en |
| dc.description.numberofauthors | 1 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/264349 | - |
| dc.language.iso | eng | - |
| dc.relation.firstpage | 391 | - |
| dc.relation.lastpage | 408 | - |
| dc.relation.volume | 11 | - |
| dc.subject.keywords | data-driven NLP | - |
| dc.subject.keywords | memory-based machine learning | - |
| dc.subject.keywords | analogical language learning | - |
| dc.subject.singlekeyword | data-driven NLP | * |
| dc.subject.singlekeyword | memory-based machine learning | * |
| dc.subject.singlekeyword | analogical language learning | * |
| dc.title | The hidden dimension: a paradigmatic view of data-driven NLP. | en |
| dc.type.driver | info:eu-repo/semantics/article | - |
| dc.type.full | 01 Contributo su Rivista::01.01 Articolo in rivista | it |
| dc.type.miur | 262 | - |
| dc.type.referee | Yes, but type not specified | - |
| dc.ugov.descaux1 | 273631 | - |
| iris.orcid.lastModifiedDate | 2024/03/01 17:47:52 | * |
| iris.orcid.lastModifiedMillisecond | 1709311672049 | * |
| iris.scopus.extIssued | 1999 | - |
| iris.scopus.extTitle | The hidden dimension: A paradigmatic view of data-driven NLP | - |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in collections: | 01.01 Articolo in rivista | - |
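The record's keywords mention memory-based machine learning, and the abstract describes classification by analogy to memorized data. As a purely illustrative sketch (not the paper's actual model), a minimal memory-based learner stores labelled training instances verbatim and labels a query by its most similar stored neighbour under a feature-overlap metric. The grapheme-to-phoneme example below (deciding whether "c" is pronounced /k/ or /s/ from its letter context, echoing the word-pronunciation case study) uses entirely hypothetical names and data:

```python
def overlap(a, b):
    """Count matching positions between two equal-length feature tuples."""
    return sum(x == y for x, y in zip(a, b))


class MemoryBasedClassifier:
    """Label a query by analogy to the most similar memorized example
    (1-nearest-neighbour with a simple feature-overlap metric)."""

    def __init__(self):
        self.memory = []  # (features, label) pairs stored verbatim, no abstraction

    def train(self, examples):
        # "Training" is pure memorization of the labelled data.
        self.memory.extend(examples)

    def classify(self, features):
        # Return the label of the stored example with the greatest overlap;
        # ties are broken by storage order.
        _, best_label = max(self.memory, key=lambda ex: overlap(features, ex[0]))
        return best_label


# Hypothetical data: pronounce 'c' given (previous letter, next letter,
# second-next letter) as its context window.
clf = MemoryBasedClassifier()
clf.train([
    (('s', 'a', 'n'), 'k'),  # "scan"  -> /k/
    (('s', 'e', 'n'), 's'),  # "scene" -> /s/
    (('a', 'i', 'd'), 's'),  # "acid"  -> /s/
])
print(clf.classify(('s', 'a', 't')))  # "scat": closest analogue is "scan", prints k
```

The design choice worth noting is that generalization happens entirely at classification time, by comparison with stored exemplars, rather than at training time by inducing an abstract inter-level rule.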
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


