CNR Institutional Research Information System

Topical Clustering of Biomedical Abstracts by Self-organizing Maps

M. Fattore; P. Arrigo

2004

Abstract

One of the major challenges of the post-genomic era is speeding up the identification of molecules involved in a specific disease (molecular targets). Although experimental procedures have greatly enhanced analytical capability, textual data analysis still plays a central role in designing experimental activity or in collecting data. Extracting useful information from published papers still depends strongly on human expertise in selecting and retrieving the relevant papers. Searching for abstracts in the MEDLINE or PubMed databases is a common activity for researchers. Navigating textual databases is often not simple, and in many cases the user retrieves only a list of abstracts, with no additional information about how closely each abstract's content relates to the submitted query. In the last decade, natural language processing tools have gained relevance in the field of bioinformatics. Retrieving and organizing textual information by specific topic allows the user to select and analyze only a reduced set of papers. In this work, we present the application of a document clustering system based on self-organizing maps to reorganize hierarchically the set of abstracts retrieved by a PubMed query. The system is available at http://www.biocomp.ge.ismac.cnr.it.


Year: 2004
Organizational units: Istituto per lo Studio delle Macromolecole - ISMAC - Sede Milano
ISBN: 978-1-4020-7735-7
Keywords: text mining; self-organizing maps; conceptual clustering
Appears in types: 04.01 Conference proceedings contribution



Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/112857
