CNR Institutional Research Information System

A Digital Archive of Research Papers in Computer Science

Sassi M; Pardelli G; Biagioni S; Carlesi C; Goggi S

2010

Abstract

This paper presents the results of a terminological study conducted by the authors on a Digital Archives Net of the Italian National Research Council (CNR) in the field of Computer Science. In particular, the research analyses the use of certain Computer Science terms in order to verify how they have changed over time, with the aim of retrieving from the net the very essence of the documentation. Its main source is a reference corpus of 13,500 documents which collects the scientific production of three CNR research institutes: ISTI (Institute of Information Science and Technologies), IIT (Institute of Informatics and Telematics) and ILC (Institute of Computational Linguistics), all of them born from the "Centro Studi sulle Calcolatrici Elettroniche (CSCE)" and now belonging to the CNR Department of Information & Communication Technologies and Cultural Identity. The study is divided into three sections: 1) an introductory section dedicated to the data extracted from the scientific documentation: the data share the use of some terms proper to the Computer Science lexicon, although these terms belong to different branches (Linguistics, Informatics and Telematics); 2) a second section devoted to the description of the contents managed by PUMA (Publication Management System); 3) a third section containing a statistical representation of the terms extracted from the archive: comparison tables of the occurrences of the most used terms in the scientific documentation produced by the three institutes, and diagrams showing the percentages of the most frequently used terms. Lastly, indexes and concordances make it possible to reflect on the use of certain terms in this field and offer possible keys for accessing the extraction of knowledge in the digital era.
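
The kind of statistics the abstract mentions (term occurrences compared across institutes, percentages, and concordances) can be illustrated with a minimal Python sketch. This is not the authors' tooling: the toy corpus, the tokenizer, and the function names `term_counts`, `comparison_table`, and `concordance` are all assumptions made for illustration, and percentages are computed here as a term's share of all tokens in an institute's documents, which may differ from the measure used in the paper.

```python
from collections import Counter
import re

# Hypothetical toy corpus: institute -> document texts (not the real archive data).
corpus = {
    "ISTI": ["Digital library systems support information retrieval.",
             "A grid architecture for information retrieval."],
    "IIT": ["Network security and network protocols."],
    "ILC": ["A lexicon for the computational analysis of text.",
            "Text corpora and the computational lexicon."],
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def term_counts(docs):
    """Count the occurrences of every token across a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts

def comparison_table(corpus, terms):
    """Return {term: {institute: (occurrences, share of institute tokens in %)}}."""
    per_institute = {name: term_counts(docs) for name, docs in corpus.items()}
    table = {}
    for term in terms:
        row = {}
        for name, counts in per_institute.items():
            total = sum(counts.values())
            share = 100.0 * counts[term] / total if total else 0.0
            row[name] = (counts[term], round(share, 1))
        table[term] = row
    return table

def concordance(docs, term, width=2):
    """Return keyword-in-context lines with `width` tokens on each side of a hit."""
    lines = []
    for doc in docs:
        tokens = tokenize(doc)
        for i, tok in enumerate(tokens):
            if tok == term:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append(f"{left} [{tok}] {right}".strip())
    return lines

table = comparison_table(corpus, ["information", "network", "lexicon"])
kwic = concordance(corpus["ILC"], "lexicon")
```

Each row of `table` corresponds to one line of a comparison table, and `kwic` lists every occurrence of a term with its immediate context, which is the basic form of a concordance.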

Full record (DC)

DC field	Value	Language
dc.authority.orgunit	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	-
dc.authority.orgunit	Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI	-
dc.authority.people	Sassi M	it
dc.authority.people	Pardelli G	it
dc.authority.people	Biagioni S	it
dc.authority.people	Carlesi C	it
dc.authority.people	Goggi S	it
dc.collection.id.s	71c7200a-7c5f-4e83-8d57-d3d2ba88f40d	*
dc.collection.name	04.01 Contributo in Atti di convegno	*
dc.contributor.appartenenza	Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI	*
dc.contributor.appartenenza	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	*
dc.contributor.appartenenza.mi	918	*
dc.contributor.appartenenza.mi	973	*
dc.date.accessioned	2024/02/18 14:22:15	-
dc.date.available	2024/02/18 14:22:15	-
dc.date.issued	2010	-
dc.description.abstracteng	This paper presents the results of a terminological work conducted by the authors on a Digital Archives Net of the Italian National Research Council (CNR) in the field of Computer Science. In particular, the research tends to analyse the use of certain terms in Computer Science in order to verify their change over the time with the aim of retrieving from the net the very essence of documentation. Its main source is a reference corpus made up of 13,500 documents which collects the scientific productions of three CNR research Institutes. They are ISTI (Institute of Information Science and Technologies), IIT (Institute of Informatics and Telematics) and ILC (Institute of Computational Linguistics), all of them born from the "Centro Studi sulle Calcolatrici Elettroniche (CSCE)" and now belonging to the CNR Department of Information & Communication Technologies and Cultural Identity. This study is divided in three sections: 1) an introductory one dedicated to the data extracted from the scientific documentation: the data have in common the use of some terms proper of the Computer Science lexicon although these term belong to different branches (Linguistics, Informatics and Telematics); 2) the second section is devoted to the description of the contents managed by the PUMA (Publication Management System) system; 3) the third part contains a statistical representation of terms extracted from archive: some comparison tables between the occurrences of the most used terms in the scientific documentation produced by the three Institutes will be created and diagrams with percentages about the most frequently used terms will be displayed too. Lastly, indexes and concordances will allow to reflect on the use of certain terms in this field and give possible keys for having access to the extraction of knowledge in the digital era.	-
dc.description.affiliations	CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ILC, Pisa, Italy	-
dc.description.allpeople	Sassi M.; Pardelli G.; Biagioni S.; Carlesi C.; Goggi S.	-
dc.description.allpeopleoriginal	Sassi M.; Pardelli G.; Biagioni S.; Carlesi C.; Goggi S.	-
dc.description.fulltext	reserved	en
dc.description.note	Indexed by: DBLP (Code: LREC 2010: Valletta, Malta); http://dblp.uni-trier.de/db/conf/lrec/lrec2010.html#SassiPBCG10; Google Scholar (Code: lrec-conf.org); PuMa (Code: cnr.isti/cnr.isti/2010-A2-014). In: LREC'10 - Seventh International Conference on Language Resources and Evaluation (Valletta, Malta, 17-23 May 2010). Proceedings, pp. 1245-1248. Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias (eds.). European Language Resources Association ELRA, 2010. URL: http://www.lrec-conf.org/proceedings/lrec2010/summaries/945.html.	-
dc.description.numberofauthors	5	-
dc.identifier.isbn	2-9517408-6-7	-
dc.identifier.isi	WOS:000356879506002	-
dc.identifier.uri	https://hdl.handle.net/20.500.14243/147512	-
dc.identifier.url	http://www.lrec-conf.org/proceedings/lrec2010/summaries/945.html	-
dc.language.iso	eng	-
dc.publisher.country	FRA	-
dc.publisher.name	European Language Resources Association (ELRA) - Evaluations and Language resources Distribution Agency (ELDA)	-
dc.publisher.place	Paris	-
dc.relation.alleditors	Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias	-
dc.relation.conferencedate	17-23 May 2010	-
dc.relation.conferencename	Seventh International Conference on Language Resources and Evaluation	-
dc.relation.conferenceplace	Valletta, Malta	-
dc.relation.firstpage	1245	-
dc.relation.ispartofbook	Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)	-
dc.relation.lastpage	1248	-
dc.relation.numberofpages	4	-
dc.subject.keywords	Digital libraries	-
dc.subject.keywords	Document Classification	-
dc.subject.keywords	Text categorisation	-
dc.subject.keywords	Text mining	-
dc.subject.keywords	Natural Language Processing. Text analysis	-
dc.subject.singlekeyword	Digital libraries	*
dc.subject.singlekeyword	Document Classification	*
dc.subject.singlekeyword	Text categorisation	*
dc.subject.singlekeyword	Text mining	*
dc.subject.singlekeyword	Natural Language Processing. Text analysis	*
dc.title	A Digital Archive of Research Papers in Computer Science	en
dc.type.driver	info:eu-repo/semantics/conferenceObject	-
dc.type.full	04 Contributo in convegno::04.01 Contributo in Atti di convegno	it
dc.type.miur	273	-
dc.type.referee	Yes, but type not specified	-
dc.ugov.descaux1	171547	-
iris.isi.extIssued	2010	-
iris.isi.extTitle	A Digital Archive of Research Papers in Computer Science	-
iris.mediafilter.data	2025/04/08 04:43:13	*
iris.orcid.lastModifiedDate	2024/03/02 00:31:20	*
iris.orcid.lastModifiedMillisecond	1709335880573	*
iris.scopus.extIssued	2010	-
iris.scopus.extTitle	A digital archive of research papers in computer science	-
iris.sitodocente.maxattempts	2	-
isi.category	OY	*
isi.contributor.affiliation	Consiglio Nazionale delle Ricerche (CNR)	-
isi.contributor.affiliation	Consiglio Nazionale delle Ricerche (CNR)	-
isi.contributor.affiliation	Consiglio Nazionale delle Ricerche (CNR)	-
isi.contributor.affiliation	Consiglio Nazionale delle Ricerche (CNR)	-
isi.contributor.affiliation	Consiglio Nazionale delle Ricerche (CNR)	-
isi.contributor.country	Italy	-
isi.contributor.country	Italy	-
isi.contributor.country	Italy	-
isi.contributor.country	Italy	-
isi.contributor.country	Italy	-
isi.contributor.name	Manuela	-
isi.contributor.name	Gabriella	-
isi.contributor.name	Stefania	-
isi.contributor.name	Carlo	-
isi.contributor.name	Sara	-
isi.contributor.researcherId	GGQ-1037-2022	-
isi.contributor.researcherId	Z-4698-2019	-
isi.contributor.researcherId	FZF-8568-2022	-
isi.contributor.researcherId	CHS-7904-2022	-
isi.contributor.researcherId	EZY-9908-2022	-
isi.contributor.subaffiliation		-
isi.contributor.subaffiliation		-
isi.contributor.subaffiliation		-
isi.contributor.subaffiliation		-
isi.contributor.subaffiliation		-
isi.contributor.surname	Sassi	-
isi.contributor.surname	Pardelli	-
isi.contributor.surname	Biagioni	-
isi.contributor.surname	Carlesi	-
isi.contributor.surname	Goggi	-
isi.date.issued	2010	*
isi.description.abstracteng	This paper presents the results of a terminological work conducted by the authors on a Digital Archives Net of the Italian National Research Council (CNR) in the field of Computer Science. In particular, the research tends to analyse the use of certain terms in Computer Science in order to verify their change over the time with the aim of retrieving from the net the very essence of documentation. Its main source is a reference corpus made up of 13,500 documents which collects the scientific productions of three CNR research Institutes. They are ISTI (Institute of Information Science and Technologies), IIT (Institute of Informatics and Telematics) and ILC (Institute of Computational Linguistics), all of them born from the "Centro Studi sulle Calcolatrici Elettroniche (CSCE)" and now belonging to the CNR Department of Information & Communication Technologies and Cultural Identity. This study is divided in three sections: 1) an introductory one dedicated to the data extracted from the scientific documentation: the data have in common the use of some terms proper of the Computer Science lexicon although these term belong to different branches (Linguistics, Informatics and Telematics); 2) the second section is devoted to the description of the contents managed by the PUMA (Publication Management System) system; 3) the third part contains a statistical representation of terms extracted from archive: some comparison tables between the occurrences of the most used terms in the scientific documentation produced by the three Institutes will be created and diagrams with percentages about the most frequently used terms will be displayed too. Lastly, indexes and concordances will allow to reflect on the use of certain terms in this field and give possible keys for having access to the extraction of knowledge in the digital era.	*
isi.description.allpeopleoriginal	Sassi, M; Pardelli, G; Biagioni, S; Carlesi, C; Goggi, S;	*
isi.document.sourcetype	WOS.ISSHP	*
isi.document.type	Proceedings Paper	*
isi.document.types	Proceedings Paper	*
isi.identifier.isi	WOS:000356879506002	*
isi.journal.journaltitle	LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION	*
isi.language.original	English	*
isi.publisher.place	55-57, RUE BRILLAT-SAVARIN, PARIS, 75013, FRANCE	*
isi.relation.firstpage	1245	*
isi.relation.lastpage	1248	*
isi.title	A Digital Archive of Research Papers in Computer Science	*
Appears in types:	04.01 Contributo in Atti di convegno

Files in this item:

File	Size	Format
prod_171547-doc_33764.pdf (not available; Description: Conference contribution, LREC 2010; Type: Publisher's version (PDF))	235.68 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/147512
