Extending the "Facets" concept by applying NLP tools to catalog records of scientific literature
Picchi E; Sassi M; Biagioni S; Giannini S
2010
Abstract
We present the prototype of an "intelligent" navigation system implemented on the contents of PUMA (http://puma.isti.cnr.it), a digital library of scientific literature. The system integrates our core textual search engine, DBT, with the TextPower (TP) technology. TP is based on NLP techniques and linguistic resources and provides tools specialized for the evaluation, analysis, classification and browsing of scientific literature. TP extends the facet concept by extracting "field + content" pairs not only from structured fields but also from free text, e.g. abstracts, using a linguistic-statistical approach to annotate relevant terminology, named entities, etc. The enriched text can be queried, analysed, and classified using a new version of the DBT system known as "DBT&Facets". DBT&Facets has been implemented on the full bibliographic records of the documents archived in the PUMA digital library of the Italian National Research Council (CNR). PUMA is a user-focused, service-oriented infrastructure that manages 30 CNR institutional repositories containing about 25,000 published or open access documents in a wide variety of disciplines. In an open domain like scientific documentation, our approach, based on the criterion of "semantic similarity", is useful, and perhaps more objective than one based on hierarchical elements, because it makes it possible to link different types of information, across domains if necessary. DBT&Facets is an advanced search tool that lets users query the collection, refine their results, and identify particular relations between them. The aim of the project has been to structure a knowledge system of domain-specific information that assists users by suggesting possible directions for their search.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | - |
| dc.authority.orgunit | Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI | - |
| dc.authority.people | Picchi E | it |
| dc.authority.people | Sassi M | it |
| dc.authority.people | Biagioni S | it |
| dc.authority.people | Giannini S | it |
| dc.collection.id.s | 69aaa6b3-f0f0-47c1-b9a1-040bae867ec3 | * |
| dc.collection.name | 04.02 Abstract in Atti di convegno | * |
| dc.contributor.appartenenza | Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.contributor.appartenenza.mi | 973 | * |
| dc.date.accessioned | 2024/02/19 20:57:40 | - |
| dc.date.available | 2024/02/19 20:57:40 | - |
| dc.date.issued | 2010 | - |
| dc.description.abstract | The prototype of an "intelligent" navigation system, which has been implemented on the contents of PUMA (http://puma.isti.cnr.it), a digital library of scientific literature, is presented. The system has been implemented by integrating our core textual search engine (known as DBT) with the TextPower (TP) technology. TP is based on NLP techniques and linguistic resources and provides tools specialized for the evaluation, analysis, classification and browsing of scientific literature. TP extends the facet concept by extracting "field + content" pairs not only from structured fields but also from free text, eg. abstracts, using a linguistic-statistical approach to annotate relevant terminology, named entities, etc. The enriched text can be queried, analysed, and classified using a new version of the DBT System known as "DBT&Facets". DBT&Facets has been implemented on the full bibliographic records of the documents archived in the PUMA digital library of the Italian National Research Council (CNR). PUMA is a user-focused, service-oriented infrastructure which manages 30 CNR institutional repositories containing about 25,000 published or open access documents in a wide variety of disciplines. In an open domain like scientific documentation, our approach based on the criteria of "semantic similarity" is useful - and perhaps more objective than one based on hierarchical elements - as it makes it possible to link different types of information, also across domains if necessary. DBT&Facets is an advanced search tool that permits the user to query and refine their results, and to identify particular relations between them. The aim of the project has been to structure a knowledge system of domain-specific information which assists the user by suggesting possible directions for their search. | - |
| dc.description.affiliations | CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy | - |
| dc.description.allpeople | Picchi, E; Sassi, M; Biagioni, S; Giannini, S | - |
| dc.description.allpeopleoriginal | Picchi E.; Sassi M.; Biagioni S.; Giannini S. | - |
| dc.description.fulltext | restricted | en |
| dc.description.numberofauthors | 4 | - |
| dc.identifier.isbn | 978-90-77484-15-9 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/86042 | - |
| dc.language.iso | eng | - |
| dc.relation.alleditors | D.J. Farace, J. Frantzen, GreyNet | - |
| dc.relation.conferencedate | 6-7 December 2010 | - |
| dc.relation.conferencename | Twelfth International Conference on Grey Literature | - |
| dc.relation.conferenceplace | Prague | - |
| dc.relation.firstpage | 82 | - |
| dc.relation.lastpage | 87 | - |
| dc.subject.keywords | NLP tools | - |
| dc.subject.keywords | Digital libraries | - |
| dc.subject.singlekeyword | NLP tools | * |
| dc.subject.singlekeyword | Digital libraries | * |
| dc.title | Extending the "Facets" concept by applying NLP tools to catalog records of scientific literature | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.02 Abstract in Atti di convegno | it |
| dc.type.miur | 274 | - |
| dc.type.referee | Yes, but type not specified | - |
| dc.ugov.descaux1 | 120718 | - |
| iris.mediafilter.data | 2025/04/20 03:06:19 | * |
| iris.orcid.lastModifiedDate | 2024/04/04 15:15:29 | * |
| iris.orcid.lastModifiedMillisecond | 1712236529470 | * |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in types: | 04.02 Abstract in Atti di convegno | |

| File | Size | Format |
|---|---|---|
| prod_120718-doc_84425.pdf (authorized users only). Description: paper. Type: Publisher's version (PDF) | 148.4 kB | Adobe PDF |
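
The abstract describes facets as "field + content" pairs drawn both from structured fields and from free text, which users can then combine to refine a search. The following is a minimal sketch of that idea, not the DBT&Facets or TextPower implementation. Everything in it is an assumption made for illustration: the record layout, the two sample records, the function names, and the `TERMINOLOGY` list, which stands in for TP's linguistic-statistical annotation with a plain term lookup.

```python
import re
from collections import defaultdict

# Hypothetical term list; a stand-in for the terminology and named entities
# that a linguistic-statistical annotator would produce.
TERMINOLOGY = ["named entities", "open access"]

# Hypothetical sample records: structured fields hold lists, the abstract is free text.
RECORDS = {
    "a": {
        "keywords": ["NLP tools", "Digital libraries"],
        "year": ["2010"],
        "abstract": "TP annotates relevant terminology and named entities in free text.",
    },
    "b": {
        "keywords": ["Digital libraries"],
        "year": ["2009"],
        "abstract": "A repository infrastructure for open access documents.",
    },
}


def structured_facets(record):
    """Yield (field, content) pairs from every structured field of a record."""
    for field, values in record.items():
        if field == "abstract":
            continue
        for value in values:
            yield (field, value)


def free_text_facets(text, terminology):
    """Yield ("term", content) pairs for each listed term found in the text."""
    lowered = text.lower()
    for term in terminology:
        if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered):
            yield ("term", term)


def build_index(records, terminology):
    """Map each (field, content) pair to the set of document ids carrying it."""
    index = defaultdict(set)
    for doc_id, record in records.items():
        pairs = list(structured_facets(record))
        pairs += free_text_facets(record.get("abstract", ""), terminology)
        for pair in pairs:
            index[pair].add(doc_id)
    return index


def refine(index, *pairs):
    """Return the ids of documents that match all of the given pairs."""
    sets = [index.get(pair, set()) for pair in pairs]
    return set.intersection(*sets) if sets else set()


index = build_index(RECORDS, TERMINOLOGY)
print(sorted(refine(index, ("keywords", "Digital libraries"))))
print(sorted(refine(index, ("keywords", "Digital libraries"), ("term", "named entities"))))
```

The first query matches both sample records through a structured keyword; adding the free-text pair narrows the result to the one record whose abstract mentions the term. A set-based inverted index keeps refinement a simple intersection, which is why it was chosen for this sketch.
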


