Converting and structuring a digital historical dictionary of Italian: a case study
Eva Sassolini; Anas Fahad Khan; Marco Biffi; Monica Monachini; Simonetta Montemagni
2019
Abstract
The paper describes ongoing work on the digitization of an authoritative historical Italian dictionary, namely Il Grande Dizionario della Lingua Italiana (GDLI), with a specific view to creating the prerequisites for advanced human-oriented querying. After discussing the general approach taken to extract and structure the GDLI contents, we report the encouraging results of a case study carried out on two volumes, selected for the different conversion issues they raise. Dictionary content extraction and structuring is carried out through an iterative process based on hand-coded patterns: starting from the recognition of the entry headword, a series of truth conditions is tested, which allows the whole lexical entry to be built and progressively structured in successive steps. We have also started to design the representation of extracted and structured entries in a standard format, encoded in TEI. An outline of an example entry is provided and illustrated to show what the end result will look like.

| DC field | Value | Language |
|---|---|---|
| dc.authority.people | Eva Sassolini | it |
| dc.authority.people | Anas Fahad Khan | it |
| dc.authority.people | Marco Biffi | it |
| dc.authority.people | Monica Monachini | it |
| dc.authority.people | Simonetta Montemagni | it |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contribution in conference proceedings | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/19 20:57:45 | - |
| dc.date.available | 2024/02/19 20:57:45 | - |
| dc.date.issued | 2019 | - |
| dc.description.affiliations | Istituto di Linguistica Computazionale "A. Zampolli" - CNR (Pisa, Italy), Accademia della Crusca (Firenze, Italy), Università degli Studi di Firenze (Italy) | - |
| dc.description.allpeople | Sassolini, Eva; Fahad Khan, Anas; Biffi, Marco; Monachini, Monica; Montemagni, Simonetta | - |
| dc.description.allpeopleoriginal | Eva Sassolini, Anas Fahad Khan, Marco Biffi, Monica Monachini and Simonetta Montemagni | - |
| dc.description.fulltext | none | en |
| dc.description.numberofauthors | 5 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/389211 | - |
| dc.identifier.url | https://elex.link/elex2019/wp-content/uploads/2019/09/eLex_2019_35.pdf | - |
| dc.language.iso | eng | - |
| dc.relation.conferencedate | 1-3/10/2019 | - |
| dc.relation.conferencename | Electronic lexicography in the 21st century (eLex 2019): Smart Lexicography. | - |
| dc.subject.keywords | historical dictionaries; automatic acquisition; TEI representation | - |
| dc.subject.singlekeyword | historical dictionaries | * |
| dc.subject.singlekeyword | automatic acquisition | * |
| dc.subject.singlekeyword | TEI representation | * |
| dc.title | Converting and structuring a digital historical dictionary of Italian: a case study | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Conference contribution::04.01 Contribution in conference proceedings | it |
| dc.type.miur | 273 | - |
| dc.ugov.descaux1 | 410154 | - |
| iris.orcid.lastModifiedDate | 2024/04/04 12:03:41 | * |
| iris.orcid.lastModifiedMillisecond | 1712225021649 | * |
| iris.scopus.extIssued | 2019 | - |
| iris.scopus.extTitle | Converting and structuring a digital historical dictionary of Italian: A case study | - |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in types: | 04.01 Contribution in conference proceedings | |
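
The abstract describes extraction as an iterative, pattern-driven process that starts from the headword and ends in a TEI-encoded entry. The following is a minimal sketch of that general idea, not the authors' implementation: the entry layout (`Lemma, pos. 1. ... 2. ...`), the regular expressions, and the function names `parse_entry` and `to_tei` are all assumptions made for illustration, and the actual GDLI layout is considerably richer. The output uses the TEI dictionary elements `entry`, `form`, `orth`, `gramGrp`, `pos`, `sense` and `def`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical patterns for a simplified entry layout (not the real GDLI rules).
# Step 1 condition: the line opens with a headword, a comma and a POS abbreviation.
HEADWORD = re.compile(r"^(?P<lemma>[^\W\d_]+),\s+(?P<pos>sm|sf|agg)\.\s+(?P<body>.+)$")
# Step 2 condition: the body contains numbered senses such as "1. ... 2. ...".
SENSE = re.compile(r"(\d+)\.\s+(.*?)(?=\s+\d+\.\s|$)")


def parse_entry(line):
    """Build an entry step by step; return None if the headword test fails."""
    match = HEADWORD.match(line.strip())
    if match is None:
        return None  # not recognised as the start of an entry
    entry = {"lemma": match.group("lemma"), "pos": match.group("pos")}
    body = match.group("body")
    senses = SENSE.findall(body)
    # Fall back to a single unnumbered sense when no numbering is found.
    entry["senses"] = senses if senses else [("1", body)]
    return entry


def to_tei(entry):
    """Serialise the structured entry as a TEI-style <entry> element."""
    root = ET.Element("entry")
    form = ET.SubElement(root, "form", type="lemma")
    ET.SubElement(form, "orth").text = entry["lemma"]
    gram = ET.SubElement(root, "gramGrp")
    ET.SubElement(gram, "pos").text = entry["pos"]
    for number, definition in entry["senses"]:
        sense = ET.SubElement(root, "sense", n=number)
        ET.SubElement(sense, "def").text = definition
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "Casa, sf. 1. Edificio destinato ad abitazione. 2. Famiglia, stirpe."
    parsed = parse_entry(sample)
    if parsed is not None:
        print(to_tei(parsed))
```

Each step only runs once the previous condition holds, which mirrors the progressive structuring described in the abstract; a real pipeline would add further steps for etymology, citations and sub-senses.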


