A Digital Archive of Research Papers in Computer Science
Sassi M; Pardelli G; Biagioni S; Carlesi C; Goggi S
2010
Abstract
This paper presents the results of terminological work conducted by the authors on a Digital Archives Net of the Italian National Research Council (CNR) in the field of Computer Science. In particular, the research analyses the use of certain Computer Science terms in order to verify how they change over time, with the aim of retrieving from the net the very essence of documentation. Its main source is a reference corpus of 13,500 documents that collects the scientific production of three CNR research Institutes: ISTI (Institute of Information Science and Technologies), IIT (Institute of Informatics and Telematics) and ILC (Institute of Computational Linguistics), all of which originated from the "Centro Studi sulle Calcolatrici Elettroniche (CSCE)" and now belong to the CNR Department of Information & Communication Technologies and Cultural Identity. The study is divided into three sections: 1) an introductory section dedicated to the data extracted from the scientific documentation, which have in common the use of some terms proper to the Computer Science lexicon, although these terms belong to different branches (Linguistics, Informatics and Telematics); 2) a second section devoted to the description of the contents managed by PUMA (Publication Management System); 3) a third section containing a statistical representation of the terms extracted from the archive, with comparison tables of the occurrences of the most used terms in the scientific documentation produced by the three Institutes and diagrams showing the percentages of the most frequently used terms. Lastly, indexes and concordances make it possible to reflect on the use of certain terms in this field and offer possible keys for accessing the extraction of knowledge in the digital era.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | - |
| dc.authority.orgunit | Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI | - |
| dc.authority.people | Sassi M | it |
| dc.authority.people | Pardelli G | it |
| dc.authority.people | Biagioni S | it |
| dc.authority.people | Carlesi C | it |
| dc.authority.people | Goggi S | it |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contributo in Atti di convegno | * |
| dc.contributor.appartenenza | Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.contributor.appartenenza.mi | 973 | * |
| dc.date.accessioned | 2024/02/18 14:22:15 | - |
| dc.date.available | 2024/02/18 14:22:15 | - |
| dc.date.issued | 2010 | - |
| dc.description.abstracteng | This paper presents the results of terminological work conducted by the authors on a Digital Archives Net of the Italian National Research Council (CNR) in the field of Computer Science. In particular, the research analyses the use of certain Computer Science terms in order to verify how they change over time, with the aim of retrieving from the net the very essence of documentation. Its main source is a reference corpus of 13,500 documents that collects the scientific production of three CNR research Institutes: ISTI (Institute of Information Science and Technologies), IIT (Institute of Informatics and Telematics) and ILC (Institute of Computational Linguistics), all of which originated from the "Centro Studi sulle Calcolatrici Elettroniche (CSCE)" and now belong to the CNR Department of Information & Communication Technologies and Cultural Identity. The study is divided into three sections: 1) an introductory section dedicated to the data extracted from the scientific documentation, which have in common the use of some terms proper to the Computer Science lexicon, although these terms belong to different branches (Linguistics, Informatics and Telematics); 2) a second section devoted to the description of the contents managed by PUMA (Publication Management System); 3) a third section containing a statistical representation of the terms extracted from the archive, with comparison tables of the occurrences of the most used terms in the scientific documentation produced by the three Institutes and diagrams showing the percentages of the most frequently used terms. Lastly, indexes and concordances make it possible to reflect on the use of certain terms in this field and offer possible keys for accessing the extraction of knowledge in the digital era. | - |
| dc.description.affiliations | CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ILC, Pisa, Italy | - |
| dc.description.allpeople | Sassi M.; Pardelli G.; Biagioni S.; Carlesi C.; Goggi S. | - |
| dc.description.allpeopleoriginal | Sassi M.; Pardelli G.; Biagioni S.; Carlesi C.; Goggi S. | - |
| dc.description.fulltext | reserved | en |
| dc.description.note | Indexed by: DBLP (Code: LREC 2010: Valletta, Malta); http://dblp.uni-trier.de/db/conf/lrec/lrec2010.html#SassiPBCG10; Google Scholar (Code: lrec-conf.org); PuMa (Code: cnr.isti/cnr.isti/2010-A2-014). In: LREC'10 - Seventh International Conference on Language Resources and Evaluation (Valletta, Malta, 17-23 May 2010). Proceedings, pp. 1245 - 1248. Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias (eds.). European Language Resources Association ELRA, 2010. URL: http://www.lrec-conf.org/proceedings/lrec2010/summaries/945.html. | - |
| dc.description.numberofauthors | 5 | - |
| dc.identifier.isbn | 2-9517408-6-7 | - |
| dc.identifier.isi | WOS:000356879506002 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/147512 | - |
| dc.identifier.url | http://www.lrec-conf.org/proceedings/lrec2010/summaries/945.html | - |
| dc.language.iso | eng | - |
| dc.publisher.country | FRA | - |
| dc.publisher.name | European Language Resources Association (ELRA) - Evaluations and Language resources Distribution Agency (ELDA) | - |
| dc.publisher.place | Paris | - |
| dc.relation.alleditors | Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias | - |
| dc.relation.conferencedate | 17-23 May 2010 | - |
| dc.relation.conferencename | Seventh International Conference on Language Resources and Evaluation | - |
| dc.relation.conferenceplace | Valletta, Malta | - |
| dc.relation.firstpage | 1245 | - |
| dc.relation.ispartofbook | Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10) | - |
| dc.relation.lastpage | 1248 | - |
| dc.relation.numberofpages | 4 | - |
| dc.subject.keywords | Digital libraries | - |
| dc.subject.keywords | Document Classification | - |
| dc.subject.keywords | Text categorisation | - |
| dc.subject.keywords | Text mining | - |
| dc.subject.keywords | Natural Language Processing. Text analysis | - |
| dc.subject.singlekeyword | Digital libraries | * |
| dc.subject.singlekeyword | Document Classification | * |
| dc.subject.singlekeyword | Text categorisation | * |
| dc.subject.singlekeyword | Text mining | * |
| dc.subject.singlekeyword | Natural Language Processing. Text analysis | * |
| dc.title | A Digital Archive of Research Papers in Computer Science | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.miur | 273 | - |
| dc.type.referee | Yes, but type not specified | - |
| dc.ugov.descaux1 | 171547 | - |
| iris.isi.extIssued | 2010 | - |
| iris.isi.extTitle | A Digital Archive of Research Papers in Computer Science | - |
| iris.mediafilter.data | 2025/04/08 04:43:13 | * |
| iris.orcid.lastModifiedDate | 2024/03/02 00:31:20 | * |
| iris.orcid.lastModifiedMillisecond | 1709335880573 | * |
| iris.scopus.extIssued | 2010 | - |
| iris.scopus.extTitle | A digital archive of research papers in computer science | - |
| iris.sitodocente.maxattempts | 2 | - |
| isi.category | OY | * |
| isi.contributor.affiliation | Consiglio Nazionale delle Ricerche (CNR) | - |
| isi.contributor.affiliation | Consiglio Nazionale delle Ricerche (CNR) | - |
| isi.contributor.affiliation | Consiglio Nazionale delle Ricerche (CNR) | - |
| isi.contributor.affiliation | Consiglio Nazionale delle Ricerche (CNR) | - |
| isi.contributor.affiliation | Consiglio Nazionale delle Ricerche (CNR) | - |
| isi.contributor.country | Italy | - |
| isi.contributor.country | Italy | - |
| isi.contributor.country | Italy | - |
| isi.contributor.country | Italy | - |
| isi.contributor.country | Italy | - |
| isi.contributor.name | Manuela | - |
| isi.contributor.name | Gabriella | - |
| isi.contributor.name | Stefania | - |
| isi.contributor.name | Carlo | - |
| isi.contributor.name | Sara | - |
| isi.contributor.researcherId | GGQ-1037-2022 | - |
| isi.contributor.researcherId | Z-4698-2019 | - |
| isi.contributor.researcherId | FZF-8568-2022 | - |
| isi.contributor.researcherId | CHS-7904-2022 | - |
| isi.contributor.researcherId | EZY-9908-2022 | - |
| isi.contributor.subaffiliation | - | |
| isi.contributor.subaffiliation | - | |
| isi.contributor.subaffiliation | - | |
| isi.contributor.subaffiliation | - | |
| isi.contributor.subaffiliation | - | |
| isi.contributor.surname | Sassi | - |
| isi.contributor.surname | Pardelli | - |
| isi.contributor.surname | Biagioni | - |
| isi.contributor.surname | Carlesi | - |
| isi.contributor.surname | Goggi | - |
| isi.date.issued | 2010 | * |
| isi.description.abstracteng | This paper presents the results of terminological work conducted by the authors on a Digital Archives Net of the Italian National Research Council (CNR) in the field of Computer Science. In particular, the research analyses the use of certain Computer Science terms in order to verify how they change over time, with the aim of retrieving from the net the very essence of documentation. Its main source is a reference corpus of 13,500 documents that collects the scientific production of three CNR research Institutes: ISTI (Institute of Information Science and Technologies), IIT (Institute of Informatics and Telematics) and ILC (Institute of Computational Linguistics), all of which originated from the "Centro Studi sulle Calcolatrici Elettroniche (CSCE)" and now belong to the CNR Department of Information & Communication Technologies and Cultural Identity. The study is divided into three sections: 1) an introductory section dedicated to the data extracted from the scientific documentation, which have in common the use of some terms proper to the Computer Science lexicon, although these terms belong to different branches (Linguistics, Informatics and Telematics); 2) a second section devoted to the description of the contents managed by PUMA (Publication Management System); 3) a third section containing a statistical representation of the terms extracted from the archive, with comparison tables of the occurrences of the most used terms in the scientific documentation produced by the three Institutes and diagrams showing the percentages of the most frequently used terms. Lastly, indexes and concordances make it possible to reflect on the use of certain terms in this field and offer possible keys for accessing the extraction of knowledge in the digital era. | * |
| isi.description.allpeopleoriginal | Sassi, M; Pardelli, G; Biagioni, S; Carlesi, C; Goggi, S; | * |
| isi.document.sourcetype | WOS.ISSHP | * |
| isi.document.type | Proceedings Paper | * |
| isi.document.types | Proceedings Paper | * |
| isi.identifier.isi | WOS:000356879506002 | * |
| isi.journal.journaltitle | LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | * |
| isi.language.original | English | * |
| isi.publisher.place | 55-57, RUE BRILLAT-SAVARIN, PARIS, 75013, FRANCE | * |
| isi.relation.firstpage | 1245 | * |
| isi.relation.lastpage | 1248 | * |
| isi.title | A Digital Archive of Research Papers in Computer Science | * |
| Appears in types: | 04.01 Contributo in Atti di convegno | |

| File | Size | Format |
|---|---|---|
| prod_171547-doc_33764.pdf (not available). Description: Contributo a convegno-lrec-2010. Type: Publisher's version (PDF) | 235.68 kB | Adobe PDF |

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


