Creation of a bottom-up corpus-based ontology for Italian Linguistics

Elisa Bianchi, Mirko Tavosanis, Emiliano Giovannetti
2012

Abstract

This paper describes the construction of a shallow lexical ontology of Italian Linguistics, written in Italian and intended for use by a meta-search engine for query refinement. The ontology was built with Protégé 4.0.2 and encoded in OWL, following the steps of the well-known Ontology Learning From Text (OLFT) layer cake. The starting point was automatic term extraction from a 304,000-word corpus of web documents on the domain of interest. For corpus construction, we describe the main criteria for selecting web documents and the critical points of that selection, which concern the definition of the user profile and of the degrees of specialisation. We then describe term validation and the construction of a glossary of Italian Linguistics terms. Afterwards, we outline the identification of synonymic chains and the main criteria of ontology design: the top classes of the ontology are Concept (containing the taxonomy of concepts) and Term (containing the glossary terms as instances), while concepts are linked through part-whole and involved-role relations, both borrowed from WordNet. Finally, we show some examples of how the ontology is applied to query refinement.
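
The design outlined in the abstract (concepts arranged in a taxonomy, glossary terms attached to concepts as synonymic chains, part-whole links between concepts) can be illustrated with a minimal Python sketch. This is not the authors' OWL model or their query-refinement procedure: the class name Concept, the functions build_toy_ontology and refine_query, and the sample Italian terms are all hypothetical and chosen only for illustration, and the involved-role relation is left out for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """A node of the concept taxonomy (hypothetical structure, not the paper's OWL)."""
    name: str
    terms: list[str]                                     # synonymic chain of glossary terms
    parent: Concept | None = None                        # taxonomic (is-a) link
    parts: list[Concept] = field(default_factory=list)   # part-whole links


def build_toy_ontology() -> list[Concept]:
    # Illustrative entries only; they are not taken from the paper's glossary.
    unit = Concept("LinguisticUnit", ["unità linguistica"])
    phoneme = Concept("Phoneme", ["fonema"], parent=unit)
    syllable = Concept("Syllable", ["sillaba"], parent=unit, parts=[phoneme])
    loanword = Concept("Loanword", ["prestito", "forestierismo"])
    return [unit, phoneme, syllable, loanword]


def refine_query(query: str, concepts: list[Concept]) -> dict[str, list[str]]:
    """Suggest refinements for a query term: synonyms, broader terms, and parts."""
    wanted = query.strip().lower()
    for concept in concepts:
        if wanted in (t.lower() for t in concept.terms):
            return {
                "synonyms": [t for t in concept.terms if t.lower() != wanted],
                "broader": list(concept.parent.terms) if concept.parent else [],
                "parts": [t for part in concept.parts for t in part.terms],
            }
    # Unknown term: nothing to suggest.
    return {"synonyms": [], "broader": [], "parts": []}
```

Under these assumptions, calling refine_query("sillaba", build_toy_ontology()) would suggest the broader term of the matching concept and the terms of its parts, which a meta-search engine could offer to the user as alternative or narrower queries.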
DC field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC -
dc.authority.people Elisa Bianchi it
dc.authority.people Mirko Tavosanis it
dc.authority.people Emiliano Giovannetti it
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/18 10:07:00 -
dc.date.available 2024/02/18 10:07:00 -
dc.date.issued 2012 -
dc.description.affiliations Università di Pisa, Istituto di Linguistica Computazionale "A. Zampolli" - CNR -
dc.description.allpeople Elisa Bianchi; Mirko Tavosanis; Emiliano Giovannetti -
dc.description.allpeopleoriginal Elisa Bianchi, Mirko Tavosanis, Emiliano Giovannetti -
dc.description.fulltext none en
dc.description.numberofauthors 1 -
dc.identifier.isi WOS:000323927702118 -
dc.identifier.scopus 2-s2.0-85037333686 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/264623 -
dc.language.iso eng -
dc.publisher.country FRA -
dc.publisher.name European Language Resources Association ELRA -
dc.publisher.place Paris -
dc.relation.conferencedate 23-25 May 2012 -
dc.relation.conferencename LREC 2012 - Eighth International Conference on Language Resources and Evaluation -
dc.relation.conferenceplace Istanbul -
dc.relation.firstpage 2641 -
dc.relation.ispartofbook Language Resources and Evaluation -
dc.relation.lastpage 2647 -
dc.relation.numberofpages 7 -
dc.subject.keywords Ontologies -
dc.subject.keywords Italian Linguistics -
dc.subject.keywords Query refinement -
dc.subject.singlekeyword Ontologies *
dc.subject.singlekeyword Italian Linguistics *
dc.subject.singlekeyword Query refinement *
dc.title Creation of a bottom-up corpus-based ontology for Italian Linguistics en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
dc.type.referee Yes, but type not specified -
dc.ugov.descaux1 282573 -
iris.isi.extIssued 2012 -
iris.isi.extTitle Creation of a bottom-up corpus-based ontology for Italian Linguistics -
iris.isi.metadataErrorDescription 0 -
iris.isi.metadataErrorType ERROR_NO_MATCH -
iris.isi.metadataStatus ERROR -
iris.orcid.lastModifiedDate 2024/06/12 10:17:43 *
iris.orcid.lastModifiedMillisecond 1718180263824 *
iris.scopus.extIssued 2012 -
iris.scopus.extTitle Creation of a bottom-up corpus-based ontology for Italian Linguistics -
iris.scopus.ideLinkStatusDate 2024/06/12 10:17:43 *
iris.scopus.ideLinkStatusMillisecond 1718180263831 *
iris.sitodocente.maxattempts 3 -
isi.category OT -
isi.category OY -
isi.contributor.affiliation University of Pisa -
isi.contributor.affiliation University of Pisa -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.name Elisa -
isi.contributor.name Mirko -
isi.contributor.name Emiliano -
isi.contributor.researcherId -
isi.contributor.researcherId -
isi.contributor.researcherId -
isi.contributor.subaffiliation Dipartimento Studi Italianist -
isi.contributor.subaffiliation Dipartimento Studi Italianist -
isi.contributor.subaffiliation Ist Linguist Computaz Antonio Zampolli ILC -
isi.contributor.surname Bianchi -
isi.contributor.surname Tavosanis -
isi.contributor.surname Giovannetti -
isi.date.issued 2012 -
isi.description.allpeopleoriginal Bianchi, E; Tavosanis, M; Giovannetti, E; -
isi.document.sourcetype WOS.ISSHP -
isi.document.type Meeting -
isi.document.types Meeting -
isi.identifier.isi WOS:000323927702118 -
isi.journal.journaltitle LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION -
isi.language.original English -
isi.publisher.place 55-57, RUE BRILLAT-SAVARIN, PARIS, 75013, FRANCE -
isi.relation.firstpage 2641 -
isi.relation.lastpage 2647 -
isi.title Creation of a bottom-up corpus-based ontology for Italian Linguistics -
scopus.category 1203 *
scopus.category 3304 *
scopus.category 3310 *
scopus.category 3309 *
scopus.contributor.affiliation University of Pisa -
scopus.contributor.affiliation University of Pisa -
scopus.contributor.affiliation Istituto di Linguistica Computazionale Antonio Zampolli (ILC-CNR) -
scopus.contributor.afid 60028868 -
scopus.contributor.afid 60028868 -
scopus.contributor.afid 60008941 -
scopus.contributor.auid 57198883193 -
scopus.contributor.auid 14066687700 -
scopus.contributor.auid 55604835100 -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.dptid 104460015 -
scopus.contributor.dptid 104460015 -
scopus.contributor.dptid -
scopus.contributor.name Elisa -
scopus.contributor.name Mirko -
scopus.contributor.name Emiliano -
scopus.contributor.subaffiliation Dipartimento di Studi Italianistici; -
scopus.contributor.subaffiliation Dipartimento di Studi Italianistici; -
scopus.contributor.subaffiliation -
scopus.contributor.surname Bianchi -
scopus.contributor.surname Tavosanis -
scopus.contributor.surname Giovannetti -
scopus.date.issued 2012 *
scopus.description.allpeopleoriginal Bianchi E.; Tavosanis M.; Giovannetti E. *
scopus.differences scopus.relation.conferencename *
scopus.differences scopus.publisher.name *
scopus.differences scopus.subject.keywords *
scopus.differences scopus.relation.conferencedate *
scopus.differences scopus.identifier.isbn *
scopus.differences scopus.description.allpeopleoriginal *
scopus.differences scopus.description.abstracteng *
scopus.differences scopus.relation.conferenceplace *
scopus.document.type cp *
scopus.document.types cp *
scopus.funding.funders 501100003407 - Ministero dell’Istruzione, dell’Università e della Ricerca; *
scopus.funding.ids RBNE07C4R9; *
scopus.identifier.isbn 9782951740877 *
scopus.identifier.pui 619589141 *
scopus.identifier.scopus 2-s2.0-85037333686 *
scopus.journal.sourceid 21100842151 *
scopus.language.iso eng *
scopus.publisher.name European Language Resources Association (ELRA) *
scopus.relation.conferencedate 2012 *
scopus.relation.conferencename 8th International Conference on Language Resources and Evaluation, LREC 2012 *
scopus.relation.conferenceplace Istanbul Lufti Kirdar Convention and Exhibition Centre, tur *
scopus.relation.firstpage 2641 *
scopus.relation.lastpage 2647 *
scopus.subject.keywords Italian Linguistics; Ontologies; Query refinement; *
scopus.title Creation of a bottom-up corpus-based ontology for Italian Linguistics *
scopus.titleeng Creation of a bottom-up corpus-based ontology for Italian Linguistics *
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/264623