The hidden dimension: a paradigmatic view of data-driven NLP.
Vito Pirrelli; François Yvon
1999
Abstract
Many tasks in language analysis are described as the maximally economic mapping of one level of linguistic representation onto another such level. Over the past decade, many different machine-learning strategies have been developed to automatically induce such mappings directly from data. In this paper, we contend that the way most learning algorithms have been applied to problems of language analysis reflects a strong bias towards a compositional (or biunique) model of inter-level mapping. Although this is justified in some cases, we contend that biunique inter-level mapping is not a jack of all trades. A model of analogical learning, based on a paradigmatic reanalysis of memorized data, is presented here. The methodological pros and cons of this approach are discussed in relation to a number of germane linguistic issues and illustrated in the context of three case studies: word pronunciation, word analysis, and word sense disambiguation. The evidence produced here seems to suggest that the brain is not designed to carry out the logically simplest and maximally economic way of relating form and function in language. Rather, we propose a radical shift of emphasis in language learning from syntagmatic inter-level mapping to paradigmatically-constrained intra-level mapping.

| DC Field | Value | Language |
|---|---|---|
| dc.authority.ancejournal | JOURNAL OF EXPERIMENTAL AND THEORETICAL ARTIFICIAL INTELLIGENCE ONLINE | - |
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | - |
| dc.authority.people | Vito Pirrelli | it |
| dc.authority.people | François Yvon | it |
| dc.collection.id.s | b3f88f24-048a-4e43-8ab1-6697b90e068e | * |
| dc.collection.name | 01.01 Articolo in rivista | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/20 09:30:20 | - |
| dc.date.available | 2024/02/20 09:30:20 | - |
| dc.date.issued | 1999 | - |
| dc.description.abstracteng | Many tasks in language analysis are described as the maximally economic mapping of one level of linguistic representation onto another such level. Over the past decade, many different machine-learning strategies have been developed to automatically induce such mappings directly from data. In this paper, we contend that the way most learning algorithms have been applied to problems of language analysis reflects a strong bias towards a compositional (or biunique) model of interlevel mapping. Although this is justified in some cases, we contend that biunique inter-level mapping is not a jack of all trades. A model of analogical learning, based on a paradigmatic reanalysis of memorized data, is presented here. The methodological pros and cons of this approach are discussed in relation to a number of germane linguistic issues and illustrated in the context of three case studies: word pronunciation, word analysis, and word sense disambiguation. The evidence produced here seems to suggest that the brain is not designed to carry out the logically simplest and maximally economic way of relating form and function in language. Rather we propose a radical shift of emphasis in language learning from syntagmatic inter-level mapping to paradigmatically-constrained intra-level mapping. | - |
| dc.description.affiliations | Istituto di Linguistica Computazionale "A. Zampolli", CNR, Pisa; ENST, Department of Computer Science and CNRS, Paris | - |
| dc.description.allpeople | Vito Pirrelli; François Yvon | - |
| dc.description.allpeopleoriginal | Vito Pirrelli, François Yvon | - |
| dc.description.fulltext | none | en |
| dc.description.numberofauthors | 1 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/264349 | - |
| dc.language.iso | eng | - |
| dc.relation.firstpage | 391 | - |
| dc.relation.lastpage | 408 | - |
| dc.relation.volume | 11 | - |
| dc.subject.keywords | data-driven NLP | - |
| dc.subject.keywords | memory-based machine learning | - |
| dc.subject.keywords | analogical language learning | - |
| dc.subject.singlekeyword | data-driven NLP | * |
| dc.subject.singlekeyword | memory-based machine learning | * |
| dc.subject.singlekeyword | analogical language learning | * |
| dc.title | The hidden dimension: a paradigmatic view of data-driven NLP. | en |
| dc.type.driver | info:eu-repo/semantics/article | - |
| dc.type.full | 01 Contributo su Rivista::01.01 Articolo in rivista | it |
| dc.type.miur | 262 | - |
| dc.type.referee | Yes, but type not specified | - |
| dc.ugov.descaux1 | 273631 | - |
| iris.orcid.lastModifiedDate | 2024/03/01 17:47:52 | * |
| iris.orcid.lastModifiedMillisecond | 1709311672049 | * |
| iris.scopus.extIssued | 1999 | - |
| iris.scopus.extTitle | The hidden dimension: A paradigmatic view of data-driven NLP | - |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in collections: | 01.01 Articolo in rivista | - |
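The record's keywords mention memory-based machine learning, and the abstract describes classification by analogy to memorized data. As a purely illustrative sketch (not the paper's actual model), a minimal memory-based learner stores labelled training instances verbatim and labels a query by its most similar stored neighbour under a feature-overlap metric. The grapheme-to-phoneme example below (deciding whether "c" is pronounced /k/ or /s/ from its letter context, echoing the word-pronunciation case study) uses entirely hypothetical names and data:

```python
def overlap(a, b):
    """Count matching positions between two equal-length feature tuples."""
    return sum(x == y for x, y in zip(a, b))


class MemoryBasedClassifier:
    """Label a query by analogy to the most similar memorized example
    (1-nearest-neighbour with a simple feature-overlap metric)."""

    def __init__(self):
        self.memory = []  # (features, label) pairs stored verbatim, no abstraction

    def train(self, examples):
        # "Training" is pure memorization of the labelled data.
        self.memory.extend(examples)

    def classify(self, features):
        # Return the label of the stored example with the greatest overlap;
        # ties are broken by storage order.
        _, best_label = max(self.memory, key=lambda ex: overlap(features, ex[0]))
        return best_label


# Hypothetical data: pronounce 'c' given (previous letter, next letter,
# second-next letter) as its context window.
clf = MemoryBasedClassifier()
clf.train([
    (('s', 'a', 'n'), 'k'),  # "scan"  -> /k/
    (('s', 'e', 'n'), 's'),  # "scene" -> /s/
    (('a', 'i', 'd'), 's'),  # "acid"  -> /s/
])
print(clf.classify(('s', 'a', 't')))  # "scat": closest analogue is "scan", prints k
```

The design choice worth noting is that generalization happens entirely at classification time, by comparison with stored exemplars, rather than at training time by inducing an abstract inter-level rule.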
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


