Machine learning offers two basic strategies for morphology induction: lexical segmentation and surface word relation. The first one assumes that words can be segmented into morphemes. Inducing a novel inflected form requires identification of morphemic constituents and a strategy for their recombination. The second approach dispenses with segmentation: lexical representations form part of a network of associatively related inflected forms. Production of a novel form consists in filling in one empty node in the network. Here, we present the results of a recurrent LSTM network that learns to fill in paradigm cells of incomplete verb paradigms. Although the process is not based on morpheme segmentation, the model shows sensitivity to stem selection and stem-ending boundaries.

The literature offers two basic strategies for morphology induction. The first presupposes the segmentation of lexical forms into morphemes and generates new words by recombining known morphemes; the second relies on the relations between a form and the other forms of its paradigm, and generates an unknown word by filling in an empty paradigm cell. In this paper, we present the results of a recurrent LSTM network that learns to generate new verb forms from known, unsegmented forms. Nevertheless, the network acquires implicit knowledge of the verb stem and of its boundary with the inflectional ending.

How "deep" is learning word inflection?

Cardillo Franco Alberto; Ferro Marcello; Marzi Claudia; Pirrelli Vito
2017

DC field Value Language
dc.authority.anceserie CEUR WORKSHOP PROCEEDINGS en
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Cardillo Franco Alberto en
dc.authority.people Ferro Marcello en
dc.authority.people Marzi Claudia en
dc.authority.people Pirrelli Vito en
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/19 08:03:18 -
dc.date.available 2024/02/19 08:03:18 -
dc.date.firstsubmission 2024/09/26 16:59:35 *
dc.date.issued 2017 -
dc.date.submission 2024/09/26 16:59:35 *
dc.description.affiliations Istituto di Linguistica Computazionale ILC-CNR, Pisa, Italy -
dc.description.allpeople Cardillo, Franco Alberto; Ferro, Marcello; Marzi, Claudia; Pirrelli, Vito -
dc.description.allpeopleoriginal Cardillo, Franco Alberto; Ferro, Marcello; Marzi, Claudia; Pirrelli, Vito en
dc.description.fulltext open en
dc.description.note ISSN 1613-0073 en
dc.description.numberofauthors 4 -
dc.identifier.doi 10.4000/books.aaccademia.2314 en
dc.identifier.isbn 978-88-99982-76-8 en
dc.identifier.scopus 2-s2.0-85037368972 en
dc.identifier.uri https://hdl.handle.net/20.500.14243/326587 -
dc.identifier.url http://www.scopus.com/record/display.url?eid=2-s2.0-85037368972&origin=inward en
dc.language.iso eng en
dc.miur.last.status.update 2024-09-26T14:59:43Z *
dc.publisher.country DEU en
dc.publisher.name Accademia University Press en
dc.publisher.place Torino en
dc.relation.alleditors R. Basili; M. Nissim; G. Satta en
dc.relation.conferencedate 11-13/12/2017 en
dc.relation.conferencename Fourth Italian Conference on Computational Linguistics en
dc.relation.conferenceplace Roma en
dc.relation.firstpage 77 en
dc.relation.ispartofbook Proceedings of the Fourth Italian Conference on Computational Linguistics (CLiC-it 2017) en
dc.relation.lastpage 82 en
dc.relation.medium ELETTRONICO en
dc.relation.numberofpages 6 en
dc.relation.volume 2006 en
dc.subject.keywordseng LSTM -
dc.subject.keywordseng Morphology induction -
dc.subject.keywordseng Cognitive modelling -
dc.subject.singlekeyword LSTM *
dc.subject.singlekeyword Morphology induction *
dc.subject.singlekeyword Cognitive modelling *
dc.title How "deep" is learning word inflection? en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.invited contributo en
dc.type.miur 273 -
dc.type.referee Esperti anonimi en
dc.ugov.descaux1 381090 -
iris.mediafilter.data 2025/04/16 03:55:53 *
iris.orcid.lastModifiedDate 2024/12/06 16:07:29 *
iris.orcid.lastModifiedMillisecond 1733497649114 *
iris.scopus.extIssued 2017 -
iris.scopus.extTitle How “deep” is learning word inflection? -
iris.sitodocente.maxattempts 1 -
iris.unpaywall.bestoahost publisher *
iris.unpaywall.bestoaversion publishedVersion *
iris.unpaywall.doi 10.4000/books.aaccademia.2314 *
iris.unpaywall.hosttype publisher *
iris.unpaywall.isoa true *
iris.unpaywall.journalisindoaj false *
iris.unpaywall.landingpage https://doi.org/10.4000/books.aaccademia.2314 *
iris.unpaywall.license cc-by-nc-nd *
iris.unpaywall.metadataCallLastModified 10/12/2025 03:56:22 -
iris.unpaywall.metadataCallLastModifiedMillisecond 1765335382079 -
iris.unpaywall.oastatus hybrid *
scopus.authority.anceserie CEUR WORKSHOP PROCEEDINGS###1613-0073 *
scopus.category 1700 *
scopus.contributor.affiliation Istituto di Linguistica Computazionale ILC-CNR -
scopus.contributor.affiliation Istituto di Linguistica Computazionale ILC-CNR -
scopus.contributor.affiliation Istituto di Linguistica Computazionale ILC-CNR -
scopus.contributor.affiliation Istituto di Linguistica Computazionale ILC-CNR -
scopus.contributor.afid 60008941 -
scopus.contributor.afid 60008941 -
scopus.contributor.afid 60008941 -
scopus.contributor.afid 60008941 -
scopus.contributor.auid 57191090133 -
scopus.contributor.auid 15759406100 -
scopus.contributor.auid 36621334800 -
scopus.contributor.auid 14833305800 -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.name Franco Alberto -
scopus.contributor.name Marcello -
scopus.contributor.name Claudia -
scopus.contributor.name Vito -
scopus.contributor.surname Cardillo -
scopus.contributor.surname Ferro -
scopus.contributor.surname Marzi -
scopus.contributor.surname Pirrelli -
scopus.date.issued 2017 *
scopus.description.allpeopleoriginal Cardillo F.A.; Ferro M.; Marzi C.; Pirrelli V. *
scopus.differences scopus.relation.conferencename *
scopus.differences scopus.publisher.name *
scopus.differences scopus.relation.conferencedate *
scopus.differences scopus.description.allpeopleoriginal *
scopus.differences scopus.identifier.doi *
scopus.differences scopus.relation.conferenceplace *
scopus.differences scopus.title *
scopus.document.type cp *
scopus.document.types cp *
scopus.identifier.doi 10.4000/books.aaccademia.2372 *
scopus.identifier.pui 619585991 *
scopus.identifier.scopus 2-s2.0-85037368972 *
scopus.journal.sourceid 21100218356 *
scopus.language.iso eng *
scopus.publisher.name CEUR-WS *
scopus.relation.conferencedate 2017 *
scopus.relation.conferencename 4th Italian Conference on Computational Linguistics, CLiC-it 2017 *
scopus.relation.conferenceplace ita *
scopus.relation.volume 2006 *
scopus.title How “deep” is learning word inflection? *
scopus.titleeng How “deep” is learning word inflection? *
Appears in types: 04.01 Contributo in Atti di convegno
Files in this item:
File: prod_381090-doc_129229.pdf (open access)
Description: How "deep" is learning word inflection?
Type: Publisher's version (PDF)
License: Public domain
Size: 812.89 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/326587