Clustering Algorithms and Bayesian Networks for Data Mining and Knowledge Discovery Challenges in Biopharmaceutics and Therapeutic

Diego Liberati
2018

Abstract

Precision and personalized medicine has for some time been one of the stimulating international projects bringing together scientists with different expertise, who join efforts in sharing detection tools, data, methods and coworkers in order to make a quality jump in the sector. In particular, given the opportunity to collect many data on many patients under several investigative measurements, a typical goal is to classify records on the basis of a hopefully small, meaningful subset of the measured variables. The complexity of the problem makes it worthwhile to resort to automatic procedures for classification. The question then arises of reconstructing a synthetic mathematical model that captures the most important relations between variables, in order both to discriminate pathologies from one another and from physiology, and possibly also to infer rules of their interaction that could help identify the pathway of each disease. These interrelated aspects are the focus of the present contribution. Four main challenging general-purpose approaches, also useful in the bioinformatics context and likely to be quite useful in biopharmaceutics and therapeutics, are briefly discussed in the present paper, underlining the cost-effectiveness of each one. To reduce the dimensionality of the problem, thus simplifying both the computation and the subsequent understanding of the solution, the critical problem of selecting the most salient variables must be solved.
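As a rough illustration of the variable-selection and unsupervised classification step described in the abstract, the sketch below performs a single principal direction divisive partitioning split using the singular value decomposition. It is a minimal example under stated assumptions, not the author's implementation: the function name `pddp_split`, the parameter `n_salient`, and the toy data are all hypothetical, and ranking variables by their absolute loading on the principal direction is only one simple way to flag candidate salient variables.

```python
import numpy as np

def pddp_split(X, n_salient=2):
    """One principal-direction divisive partitioning step (illustrative only)."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)           # remove the per-variable mean
    # Rows of vt are right singular vectors; vt[0] is the principal direction.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    projection = centered @ direction        # score of each record along it
    labels = (projection > 0).astype(int)    # split the records by sign
    # Variables with the largest absolute loadings dominate the split.
    salient = np.argsort(-np.abs(direction))[:n_salient]
    return labels, salient

# Hypothetical toy data: 6 records x 3 variables; only variable 0 separates the groups.
X = np.array([
    [0.0, 1.0, 5.0],
    [0.1, 1.1, 5.0],
    [0.2, 0.9, 5.1],
    [9.8, 1.0, 5.0],
    [10.0, 1.1, 5.1],
    [10.1, 0.9, 5.0],
])
labels, salient = pddp_split(X, n_salient=1)
```

Because the sign of a singular vector is arbitrary, which group receives label 0 or 1 may differ between runs or platforms; only the partition itself is meaningful. In a fuller pipeline the two groups could be split again recursively, or used to initialise k-means restricted to the selected variables.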
Istituto di Elettronica e di Ingegneria dell'Informazione e delle Telecomunicazioni - IEIIT
Hamming clustering
hybrid systems
k-means
model identification
principal direction divisive partitioning
rule inference
salient variables
singular value decomposition
unsupervised clustering

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/372162