Creation of a bottom-up corpus-based ontology for Italian Linguistics
Elisa Bianchi; Mirko Tavosanis; Emiliano Giovannetti
2012
Abstract
This paper describes the construction of a shallow lexical ontology of Italian Linguistics, written in Italian and intended for use by a meta-search engine for query refinement. The ontology was built with Protégé 4.0.2 and encoded in OWL, following the steps of the well-known Ontology Learning From Text (OLFT) layer cake. The starting point was automatic term extraction from a corpus of web documents on the domain of interest (304,000 words). Regarding corpus construction, we describe the main criteria for selecting web documents and the critical points of that selection, which concern the definition of the user profile and of degrees of specialisation. We then describe term validation and the construction of a glossary of Italian Linguistics terms. Afterwards, we outline the identification of synonymic chains and the main criteria of ontology design: the top classes of the ontology are Concept (containing the taxonomy of concepts) and Term (containing the glossary terms as instances), while concepts are linked through part-whole and involved-role relations, both borrowed from WordNet. Finally, we show some examples of how the ontology is applied to query refinement.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | - |
| dc.authority.people | Elisa Bianchi | it |
| dc.authority.people | Mirko Tavosanis | it |
| dc.authority.people | Emiliano Giovannetti | it |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Conference proceedings contribution | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/18 10:07:00 | - |
| dc.date.available | 2024/02/18 10:07:00 | - |
| dc.date.issued | 2012 | - |
| dc.description.abstracteng | This paper describes the steps of construction of a shallow lexical ontology of Italian Linguistics in Italian, set to be used by a meta-search engine for query refinement. The ontology was constructed with the software Protege 4.0.2 and encoded in OWL format; its construction has been carried out following the steps described in the well-known Ontology Learning From Text (OLFT) layer cake. The starting point was the automatic term extraction from a corpus of web documents concerning the domain of interest (304,000 words); as regards corpus construction, we describe the main criteria of the web documents selection and its critical points, concerning the definition of user profile and of degrees of specialisation. We then describe the process of term validation and construction of a glossary of terms of Italian Linguistics; afterwards, we outline the identification of synonymic chains and the main criteria of ontology design: top classes of ontology are Concept (containing taxonomy of concepts) and Term (containing terms of the glossary as instances), while concepts are linked through part-whole and involved-role relation, both borrowed from Wordnet. Finally, we show some examples of the application of the ontology for query refinement. | - |
| dc.description.affiliations | Università di Pisa, Istituto di Linguistica Computazionale "A. Zampolli" - CNR | - |
| dc.description.allpeople | Elisa Bianchi; Mirko Tavosanis; Emiliano Giovannetti | - |
| dc.description.allpeopleoriginal | Elisa Bianchi, Mirko Tavosanis, Emiliano Giovannetti | - |
| dc.description.fulltext | none | en |
| dc.description.numberofauthors | 1 | - |
| dc.identifier.isi | WOS:000323927702118 | - |
| dc.identifier.scopus | 2-s2.0-85037333686 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/264623 | - |
| dc.language.iso | eng | - |
| dc.publisher.country | FRA | - |
| dc.publisher.name | European Language Resources Association ELRA | - |
| dc.publisher.place | Paris | - |
| dc.relation.conferencedate | 23-25 May 2012 | - |
| dc.relation.conferencename | LREC 2012 - Eighth International Conference on Language Resources and Evaluation | - |
| dc.relation.conferenceplace | Istanbul | - |
| dc.relation.firstpage | 2641 | - |
| dc.relation.ispartofbook | Language Resources and Evaluation | - |
| dc.relation.lastpage | 2647 | - |
| dc.relation.numberofpages | 7 | - |
| dc.subject.keywords | Ontologies | - |
| dc.subject.keywords | Italian Linguistics | - |
| dc.subject.keywords | Query refinement | - |
| dc.subject.singlekeyword | Ontologies | * |
| dc.subject.singlekeyword | Italian Linguistics | * |
| dc.subject.singlekeyword | Query refinement | * |
| dc.title | Creation of a bottom-up corpus-based ontology for Italian Linguistics | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Conference contribution::04.01 Conference proceedings contribution | it |
| dc.type.miur | 273 | - |
| dc.type.referee | Yes, but type not specified | - |
| dc.ugov.descaux1 | 282573 | - |
| iris.isi.extIssued | 2012 | - |
| iris.isi.extTitle | Creation of a bottom-up corpus-based ontology for Italian Linguistics | - |
| iris.isi.metadataErrorDescription | 0 | - |
| iris.isi.metadataErrorType | ERROR_NO_MATCH | - |
| iris.isi.metadataStatus | ERROR | - |
| iris.orcid.lastModifiedDate | 2024/06/12 10:17:43 | * |
| iris.orcid.lastModifiedMillisecond | 1718180263824 | * |
| iris.scopus.extIssued | 2012 | - |
| iris.scopus.extTitle | Creation of a bottom-up corpus-based ontology for Italian Linguistics | - |
| iris.scopus.ideLinkStatusDate | 2024/06/12 10:17:43 | * |
| iris.scopus.ideLinkStatusMillisecond | 1718180263831 | * |
| iris.sitodocente.maxattempts | 3 | - |
| isi.category | OT | - |
| isi.category | OY | - |
| isi.contributor.affiliation | University of Pisa | - |
| isi.contributor.affiliation | University of Pisa | - |
| isi.contributor.affiliation | Consiglio Nazionale delle Ricerche (CNR) | - |
| isi.contributor.country | Italy | - |
| isi.contributor.country | Italy | - |
| isi.contributor.country | Italy | - |
| isi.contributor.name | Elisa | - |
| isi.contributor.name | Mirko | - |
| isi.contributor.name | Emiliano | - |
| isi.contributor.researcherId | - | |
| isi.contributor.researcherId | - | |
| isi.contributor.researcherId | - | |
| isi.contributor.subaffiliation | Dipartimento Studi Italianist | - |
| isi.contributor.subaffiliation | Dipartimento Studi Italianist | - |
| isi.contributor.subaffiliation | Ist Linguist Computaz Antonio Zampolli ILC | - |
| isi.contributor.surname | Bianchi | - |
| isi.contributor.surname | Tavosanis | - |
| isi.contributor.surname | Giovannetti | - |
| isi.date.issued | 2012 | - |
| isi.description.abstract | This paper describes the steps of construction of a shallow lexical ontology of Italian Linguistics in Italian, set to be used by a meta-search engine for query refinement. The ontology was constructed with the software Protege 4.0.2 and encoded in OWL format; its construction has been carried out following the steps described in the well-known Ontology Learning From Text (OLFT) layer cake. The starting point was the automatic term extraction from a corpus of web documents concerning the domain of interest (304,000 words); as regards corpus construction, we describe the main criteria of the web documents selection and its critical points, concerning the definition of user profile and of degrees of specialisation. We then describe the process of term validation and construction of a glossary of terms of Italian Linguistics; afterwards, we outline the identification of synonymic chains and the main criteria of ontology design: top classes of ontology are Concept (containing taxonomy of concepts) and Term (containing terms of the glossary as instances), while concepts are linked through part-whole and involved-role relation, both borrowed from Wordnet. Finally, we show some examples of the application of the ontology for query refinement. | - |
| isi.description.allpeopleoriginal | Bianchi, E; Tavosanis, M; Giovannetti, E; | - |
| isi.document.sourcetype | WOS.ISSHP | - |
| isi.document.type | Meeting | - |
| isi.document.types | Meeting | - |
| isi.identifier.isi | WOS:000323927702118 | - |
| isi.journal.journaltitle | LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | - |
| isi.language.original | English | - |
| isi.publisher.place | 55-57, RUE BRILLAT-SAVARIN, PARIS, 75013, FRANCE | - |
| isi.relation.firstpage | 2641 | - |
| isi.relation.lastpage | 2647 | - |
| isi.title | Creation of a bottom-up corpus-based ontology for Italian Linguistics | - |
| scopus.category | 1203 | * |
| scopus.category | 3304 | * |
| scopus.category | 3310 | * |
| scopus.category | 3309 | * |
| scopus.contributor.affiliation | University of Pisa | - |
| scopus.contributor.affiliation | University of Pisa | - |
| scopus.contributor.affiliation | Istituto di Linguistica Computazionale Antonio Zampolli (ILC-CNR) | - |
| scopus.contributor.afid | 60028868 | - |
| scopus.contributor.afid | 60028868 | - |
| scopus.contributor.afid | 60008941 | - |
| scopus.contributor.auid | 57198883193 | - |
| scopus.contributor.auid | 14066687700 | - |
| scopus.contributor.auid | 55604835100 | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.dptid | 104460015 | - |
| scopus.contributor.dptid | 104460015 | - |
| scopus.contributor.dptid | - | |
| scopus.contributor.name | Elisa | - |
| scopus.contributor.name | Mirko | - |
| scopus.contributor.name | Emiliano | - |
| scopus.contributor.subaffiliation | Dipartimento di Studi Italianistici; | - |
| scopus.contributor.subaffiliation | Dipartimento di Studi Italianistici; | - |
| scopus.contributor.subaffiliation | - | |
| scopus.contributor.surname | Bianchi | - |
| scopus.contributor.surname | Tavosanis | - |
| scopus.contributor.surname | Giovannetti | - |
| scopus.date.issued | 2012 | * |
| scopus.description.abstracteng | This paper describes the steps of construction of a shallow lexical ontology of Italian Linguistics in Italian, set to be used by a metasearch engine for query refinement. The ontology was constructed with the software Protégé 4.0.2 and encoded in OWL format; its construction has been carried out following the steps described in the well-known Ontology Learning From Text (OLFT) layer cake. The starting point was the automatic term extraction from a corpus of web documents concerning the domain of interest (304,000 words); as regards corpus construction, we describe the main criteria of the web documents selection and its critical points, concerning the definition of user profile and of degrees of specialisation. We then describe the process of term validation and construction of a glossary of terms of Italian Linguistics; afterwards, we outline the identification of synonymic chains and the main criteria of ontology design: top classes of ontology are Concept (containing taxonomy of concepts) and Term (containing terms of the glossary as instances), while concepts are linked through part-whole and involved-role relation, both borrowed from Wordnet. Finally, we show some examples of the application of the ontology for query refinement. | * |
| scopus.description.allpeopleoriginal | Bianchi E.; Tavosanis M.; Giovannetti E. | * |
| scopus.differences | scopus.relation.conferencename | * |
| scopus.differences | scopus.publisher.name | * |
| scopus.differences | scopus.subject.keywords | * |
| scopus.differences | scopus.relation.conferencedate | * |
| scopus.differences | scopus.identifier.isbn | * |
| scopus.differences | scopus.description.allpeopleoriginal | * |
| scopus.differences | scopus.description.abstracteng | * |
| scopus.differences | scopus.relation.conferenceplace | * |
| scopus.document.type | cp | * |
| scopus.document.types | cp | * |
| scopus.funding.funders | 501100003407 - Ministero dell’Istruzione, dell’Università e della Ricerca; | * |
| scopus.funding.ids | RBNE07C4R9; | * |
| scopus.identifier.isbn | 9782951740877 | * |
| scopus.identifier.pui | 619589141 | * |
| scopus.identifier.scopus | 2-s2.0-85037333686 | * |
| scopus.journal.sourceid | 21100842151 | * |
| scopus.language.iso | eng | * |
| scopus.publisher.name | European Language Resources Association (ELRA) | * |
| scopus.relation.conferencedate | 2012 | * |
| scopus.relation.conferencename | 8th International Conference on Language Resources and Evaluation, LREC 2012 | * |
| scopus.relation.conferenceplace | Istanbul Lütfi Kırdar Convention and Exhibition Centre, tur | * |
| scopus.relation.firstpage | 2641 | * |
| scopus.relation.lastpage | 2647 | * |
| scopus.subject.keywords | Italian Linguistics; Ontologies; Query refinement; | * |
| scopus.title | Creation of a bottom-up corpus-based ontology for Italian Linguistics | * |
| scopus.titleeng | Creation of a bottom-up corpus-based ontology for Italian Linguistics | * |
| Appears in types: | 04.01 Conference proceedings contribution | |
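
The ontology design summarised in the abstract (concepts arranged in a taxonomy, glossary terms attached to concepts as synonymic chains, and part-whole links between concepts) can be illustrated with a minimal Python sketch of query refinement. This is only an illustration under stated assumptions: the class `Concept`, the function `refine_query`, and the sample entries (`prestito`, `forestierismo`, `sillaba` and so on) are hypothetical and are not taken from the paper or from its OWL file. The involved-role relation is omitted for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept node; all field names here are illustrative assumptions."""
    name: str
    terms: list[str] = field(default_factory=list)      # synonymic chain of glossary terms
    parent: Concept | None = None                        # taxonomy (is-a) link
    parts: list[Concept] = field(default_factory=list)  # part-whole links


def refine_query(query: str, concepts: list[Concept]) -> dict | None:
    """Look up a query term and return candidate refinements, or None if unknown."""
    q = query.lower()
    for concept in concepts:
        if q in (t.lower() for t in concept.terms):
            return {
                "synonyms": [t for t in concept.terms if t.lower() != q],
                "broader": concept.parent.name if concept.parent else None,
                "parts": [p.name for p in concept.parts],
            }
    return None


# Invented sample data, used only to exercise the sketch.
lessico = Concept("Lessico", ["lessico"])
prestito = Concept("Prestito", ["prestito", "forestierismo"], parent=lessico)
sillaba = Concept(
    "Sillaba",
    ["sillaba"],
    parts=[
        Concept("Attacco", ["attacco"]),
        Concept("Nucleo", ["nucleo"]),
        Concept("Coda", ["coda"]),
    ],
)
ontology = [lessico, prestito, sillaba]

suggestions = refine_query("forestierismo", ontology)
print(suggestions)
```

In a setting like the one the abstract describes, a meta-search engine could offer the returned synonyms, broader concept, and parts as alternative or additional query terms; how the actual system ranks or presents them is not stated in this record.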


