Grey Literature for Natural Language Processing: a Terminological and Statistical Approach

Cignoni L; Pardelli G; Sassi M
2008

Abstract

This paper presents the results of a study on grey literature (GL) in the field of Natural Language Processing (NLP). Our data were collected in a corpus of ca. 13,000 records corresponding to the titles of papers presented at international conferences from 1950 to June 2008. A statistical representation of the most significant terms relating to GL in NLP and other interrelated disciplines associates old and new words, highlighting the terminological changes that have taken place over time. The aim of our study is to contribute to the creation of language resources for the extraction of GL from the Web, in order to help prevent the disappearance of documents containing NLP words that have undergone rapid development over the last decades. The paper is organised as follows: after a general introduction to our work, Section 2 provides a historical overview of NLP; Sections 3 and 4 give an account of the most relevant terms used by specialists in different periods, which are indicative of the changes that have taken place; Section 5 describes our methodology and also contains information on our GL database and a graphical representation of the data. Finally, the conclusions stress the need to integrate pre-existing or obsolete words and expressions by creating NLP synonym relations.
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC -
dc.authority.people Cignoni L it
dc.authority.people Pardelli G it
dc.authority.people Sassi M it
dc.collection.id.s 69aaa6b3-f0f0-47c1-b9a1-040bae867ec3 *
dc.collection.name 04.02 Abstract in Atti di convegno *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/21 05:47:52 -
dc.date.available 2024/02/21 05:47:52 -
dc.date.issued 2008 -
dc.description.affiliations CNR-ILC -
dc.description.allpeople Cignoni L.; Pardelli G.; Sassi M. -
dc.description.allpeopleoriginal Cignoni L.; Pardelli G.; Sassi M. -
dc.description.fulltext none en
dc.description.numberofauthors 3 -
dc.identifier.isbn 978-90-77484-12-8 -
dc.identifier.isi WOS:000264705400010 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/435679 -
dc.identifier.url http://hdl.handle.net/10068/697993 -
dc.language.iso eng -
dc.relation.alleditors D.G. Farace, J. Frantzen -
dc.relation.conferencedate December 8-9 2008 -
dc.relation.conferencename Tenth International Conference on Grey Literature: Designing the Grey Grid for Information Society -
dc.relation.conferenceplace Amsterdam -
dc.relation.firstpage 116 -
dc.relation.lastpage 120 -
dc.relation.numberofpages 5 -
dc.subject.keywords Computational Linguistics -
dc.subject.keywords Grey Literature -
dc.subject.singlekeyword Computational Linguistics *
dc.subject.singlekeyword Grey Literature *
dc.title Grey Literature for Natural Language Processing: a Terminological and Statistical Approach en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.02 Abstract in Atti di convegno it
dc.type.miur 274 -
dc.type.referee Yes, but type not specified -
dc.ugov.descaux1 112937 -
iris.isi.extIssued 2008 -
iris.isi.extTitle Grey Literature for Natural Language Processing: A Terminological and Statistical Approach -
iris.orcid.lastModifiedDate 2025/03/21 01:12:54 *
iris.orcid.lastModifiedMillisecond 1742515974104 *
iris.sitodocente.maxattempts 3 -
isi.authority.ancejournal THE GL-CONFERENCE SERIES. CONFERENCE PROCEEDINGS###1386-2316 *
isi.category NU *
isi.category WU *
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.name Laura -
isi.contributor.name Gabriella -
isi.contributor.name Manuela -
isi.contributor.researcherId EQJ-9935-2022 -
isi.contributor.researcherId Z-4698-2019 -
isi.contributor.researcherId GGQ-1037-2022 -
isi.contributor.subaffiliation Ist Linguist Computaz -
isi.contributor.subaffiliation Ist Linguist Computaz -
isi.contributor.subaffiliation Ist Linguist Computaz -
isi.contributor.surname Cignoni -
isi.contributor.surname Pardelli -
isi.contributor.surname Sassi -
isi.date.issued 2008 *
isi.description.allpeopleoriginal Cignoni, L; Pardelli, G; Sassi, M; *
isi.document.sourcetype WOS.ISSHP *
isi.document.type Proceedings Paper *
isi.document.types Proceedings Paper *
isi.identifier.isi WOS:000264705400010 *
isi.journal.journaltitle DESIGNING THE GREY GRID FOR INFORMATION SOCIETY *
isi.journal.journaltitleabbrev GL CONFERENCE SER *
isi.language.original English *
isi.publisher.place GL PROGRAM & CONFERENCE BUREAU, JAVASTRAAT 194-HS, AMSTERDAM, 1095 CP, NETHERLANDS *
isi.relation.firstpage 93 *
isi.relation.issue 10 *
isi.relation.lastpage 100 *
isi.title Grey Literature for Natural Language Processing: A Terminological and Statistical Approach *
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/435679