An interactive clustering procedure for selection of high dimension patterns

Varanini Maurizio; Taddei Alessandro
1996

Abstract

Patient data collection is a routine procedure in many hospitals and clinical departments, and typically covers demographic, administrative and clinical data. These data differ in precision and accuracy and come in mixed formats, including quantitative, qualitative and structured representations. This paper presents a system for easy and reliable classification of mixed data as collected in a cardiology department. The method consists of the following steps: mixed data are mapped into a non-Euclidean space according to the generalized Minkowski metric; a cost function weights the transformed data to compensate for their unequal precision; a clustering approach determines the number of clusters in a given data set; and the clusters are interpreted through a graphical representation of the results, obtained by reducing the features with Principal Component Analysis. The system is user-oriented: the user is free to run an automated procedure on all patients or to choose a subset of patients and/or features to drive the system at each step. The system is under evaluation on a database of 7000 patients, characterized by 126 mixed features.
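
The abstract does not give the exact form of the metric or of the cost function, so the following is only a minimal sketch of how a weighted Minkowski-type distance over mixed features might look. Everything in it is an assumption made for illustration: the per-feature dissimilarities (range-normalized absolute difference for quantitative features, simple mismatch for qualitative ones), the function names `feature_dissimilarity` and `mixed_minkowski`, the example features, and the weights standing in for the cost function.

```python
from typing import Sequence, Union

Value = Union[float, str]


def feature_dissimilarity(a: Value, b: Value, kind: str, value_range: float) -> float:
    """Per-feature dissimilarity in [0, 1] (illustrative choice, not the paper's)."""
    if kind == "quantitative":
        # Absolute difference scaled by the feature's observed range.
        return abs(a - b) / value_range if value_range > 0 else 0.0
    # Qualitative feature: 0 if the categories match, 1 otherwise.
    return 0.0 if a == b else 1.0


def mixed_minkowski(
    x: Sequence[Value],
    y: Sequence[Value],
    kinds: Sequence[str],
    ranges: Sequence[float],
    weights: Sequence[float],
    p: float = 2.0,
) -> float:
    """Weighted Minkowski-type distance of order p between two mixed records.

    The weights are a hypothetical stand-in for a cost function that
    compensates for unequal feature precision.
    """
    total = 0.0
    for a, b, kind, r, w in zip(x, y, kinds, ranges, weights):
        total += (w * feature_dissimilarity(a, b, kind, r)) ** p
    return total ** (1.0 / p)


# Hypothetical records: (age in years, sex, ejection fraction in %).
kinds = ["quantitative", "qualitative", "quantitative"]
ranges = [80.0, 1.0, 60.0]   # assumed observed ranges; unused for qualitative
weights = [1.0, 0.5, 1.0]    # assumed weights, e.g. trusting "sex" less
patient_a = (60.0, "M", 55.0)
patient_b = (20.0, "F", 25.0)

print(mixed_minkowski(patient_a, patient_b, kinds, ranges, weights, p=1.0))
```

A pairwise distance matrix built this way could then be passed to a clustering routine; how the number of clusters is chosen and how the features are reduced for display are not specified here and are left out of the sketch.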
Istituto di Fisiologia Clinica - IFC
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/253071