A Digital Archive of Research Papers in Computer Science

Sassi M; Pardelli G; Biagioni S; Carlesi C; Goggi S
2010

Abstract

This paper presents the results of a terminological study conducted by the authors on a network of digital archives of the Italian National Research Council (CNR) in the field of Computer Science. In particular, the research analyses the use of certain Computer Science terms in order to trace how they have changed over time, with the aim of retrieving from the network the very essence of the documentation. Its main source is a reference corpus of 13,500 documents that collects the scientific production of three CNR research institutes: ISTI (Institute of Information Science and Technologies), IIT (Institute of Informatics and Telematics) and ILC (Institute of Computational Linguistics). All three originated from the "Centro Studi sulle Calcolatrici Elettroniche (CSCE)" and now belong to the CNR Department of Information & Communication Technologies and Cultural Identity. The study is divided into three sections: 1) an introductory section dedicated to the data extracted from the scientific documentation, which share a number of terms proper to the Computer Science lexicon although these terms belong to different branches (Linguistics, Informatics and Telematics); 2) a second section devoted to describing the contents managed by the PUMA (Publication Management System) system; 3) a third section containing a statistical representation of the terms extracted from the archive, with tables comparing the occurrences of the most used terms in the scientific documentation produced by the three institutes and diagrams showing the percentages of the most frequently used terms. Lastly, indexes and concordances make it possible to reflect on the use of certain terms in this field and offer possible keys for accessing the extraction of knowledge in the digital era.
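The abstract mentions comparison tables of term occurrences across the three institutes, percentage breakdowns, and concordances. As a rough illustration only, the following Python sketch shows one way such figures could be computed. The toy corpus, the tokenizer, and the function names (`term_counts`, `comparison_table`, `concordance`) are all assumptions made for this example; they are not taken from the paper or from PUMA.

```python
from collections import Counter
import re

# Hypothetical toy corpus: institute -> list of document texts (not real PUMA data).
CORPUS = {
    "ISTI": ["Digital library services for information retrieval.",
             "A network protocol for the digital library."],
    "IIT": ["Network security and network protocols."],
    "ILC": ["A lexicon for information extraction from text."],
}


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(docs):
    """Count token occurrences over all documents of one institute."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


def comparison_table(corpus, terms):
    """For each term, give occurrences per institute and each institute's percentage share."""
    per_inst = {inst: term_counts(docs) for inst, docs in corpus.items()}
    table = {}
    for term in terms:
        occ = {inst: per_inst[inst][term] for inst in corpus}
        total = sum(occ.values())
        share = {inst: (100.0 * n / total if total else 0.0) for inst, n in occ.items()}
        table[term] = {"occurrences": occ, "percent": share}
    return table


def concordance(docs, term, width=2):
    """Return keyword-in-context lines with `width` tokens on each side of every hit."""
    lines = []
    for doc in docs:
        toks = tokenize(doc)
        for i, tok in enumerate(toks):
            if tok == term:
                left = " ".join(toks[max(0, i - width):i])
                right = " ".join(toks[i + 1:i + 1 + width])
                lines.append(f"{left} [{tok}] {right}".strip())
    return lines


if __name__ == "__main__":
    for term, row in comparison_table(CORPUS, ["network", "digital"]).items():
        print(term, row["occurrences"], row["percent"])
    for line in concordance(CORPUS["IIT"], "network"):
        print(line)
```

A real analysis would presumably need lemmatization, multi-word term handling, and per-year grouping to follow changes over time; none of that is attempted here.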
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC -
dc.authority.orgunit Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI -
dc.authority.people Sassi M it
dc.authority.people Pardelli G it
dc.authority.people Biagioni S it
dc.authority.people Carlesi C it
dc.authority.people Goggi S it
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.appartenenza.mi 973 *
dc.date.accessioned 2024/02/18 14:22:15 -
dc.date.available 2024/02/18 14:22:15 -
dc.date.issued 2010 -
dc.description.affiliations CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ILC, Pisa, Italy -
dc.description.allpeople Sassi M.; Pardelli G.; Biagioni S.; Carlesi C.; Goggi S. -
dc.description.allpeopleoriginal Sassi M.; Pardelli G.; Biagioni S.; Carlesi C.; Goggi S. -
dc.description.fulltext reserved en
dc.description.note Indexed by: DBLP (Code: LREC 2010: Valletta, Malta); http://dblp.uni-trier.de/db/conf/lrec/lrec2010.html#SassiPBCG10; Google Scholar (Code: lrec-conf.org); PuMa (Code: cnr.isti/cnr.isti/2010-A2-014). In: LREC'10 - Seventh International Conference on Language Resources and Evaluation (Valletta, Malta, 17-23 May 2010). Proceedings, pp. 1245 - 1248. Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias (eds.). European Language Resources Association ELRA, 2010. URL: http://www.lrec-conf.org/proceedings/lrec2010/summaries/945.html. -
dc.description.numberofauthors 5 -
dc.identifier.isbn 2-9517408-6-7 -
dc.identifier.isi WOS:000356879506002 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/147512 -
dc.identifier.url http://www.lrec-conf.org/proceedings/lrec2010/summaries/945.html -
dc.language.iso eng -
dc.publisher.country FRA -
dc.publisher.name European Language Resources Association (ELRA) - Evaluations and Language resources Distribution Agency (ELDA) -
dc.publisher.place Paris -
dc.relation.alleditors Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias -
dc.relation.conferencedate 17-23 May 2010 -
dc.relation.conferencename Seventh International Conference on Language Resources and Evaluation -
dc.relation.conferenceplace Valletta, Malta -
dc.relation.firstpage 1245 -
dc.relation.ispartofbook Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10) -
dc.relation.lastpage 1248 -
dc.relation.numberofpages 4 -
dc.subject.keywords Digital libraries -
dc.subject.keywords Document Classification -
dc.subject.keywords Text categorisation -
dc.subject.keywords Text mining -
dc.subject.keywords Natural Language Processing. Text analysis -
dc.subject.singlekeyword Digital libraries *
dc.subject.singlekeyword Document Classification *
dc.subject.singlekeyword Text categorisation *
dc.subject.singlekeyword Text mining *
dc.subject.singlekeyword Natural Language Processing. Text analysis *
dc.title A Digital Archive of Research Papers in Computer Science en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
dc.type.referee Yes, but type not specified -
dc.ugov.descaux1 171547 -
iris.isi.extIssued 2010 -
iris.isi.extTitle A Digital Archive of Research Papers in Computer Science -
iris.mediafilter.data 2025/04/08 04:43:13 *
iris.orcid.lastModifiedDate 2024/03/02 00:31:20 *
iris.orcid.lastModifiedMillisecond 1709335880573 *
iris.scopus.extIssued 2010 -
iris.scopus.extTitle A digital archive of research papers in computer science -
iris.sitodocente.maxattempts 2 -
isi.category OY *
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.name Manuela -
isi.contributor.name Gabriella -
isi.contributor.name Stefania -
isi.contributor.name Carlo -
isi.contributor.name Sara -
isi.contributor.researcherId GGQ-1037-2022 -
isi.contributor.researcherId Z-4698-2019 -
isi.contributor.researcherId FZF-8568-2022 -
isi.contributor.researcherId CHS-7904-2022 -
isi.contributor.researcherId EZY-9908-2022 -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.surname Sassi -
isi.contributor.surname Pardelli -
isi.contributor.surname Biagioni -
isi.contributor.surname Carlesi -
isi.contributor.surname Goggi -
isi.date.issued 2010 *
isi.description.allpeopleoriginal Sassi, M; Pardelli, G; Biagioni, S; Carlesi, C; Goggi, S; *
isi.document.sourcetype WOS.ISSHP *
isi.document.type Proceedings Paper *
isi.document.types Proceedings Paper *
isi.identifier.isi WOS:000356879506002 *
isi.journal.journaltitle LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION *
isi.language.original English *
isi.publisher.place 55-57, RUE BRILLAT-SAVARIN, PARIS, 75013, FRANCE *
isi.relation.firstpage 1245 *
isi.relation.lastpage 1248 *
isi.title A Digital Archive of Research Papers in Computer Science *