Cluster validation for electronic nose data
Pardo M; Sberveglieri G
2007
Abstract
In the gas sensor field, principal component analysis (PCA) is still the most widely used technique for exploratory data analysis, although the classification results then often depend on human judgment of the PCA plots. In this paper, we propose a new approach based on cluster analysis (CA) in combination with cluster validity (CLV) that can be used to infer and assess the data structure objectively. Hierarchical and k-means clustering methods are implemented together with four CLV indices, with the aim of estimating the best partition of the data. We first demonstrate the approach on simulated data sets and then apply it to experimental data from an electronic nose. For the e-nose data, the unsupervised classification results are consistent with those obtained by visual inspection of the data through PCA plots. In addition, CLV gives quantitative information about the classification strength of the data categories; this makes it possible to understand which of the different categories best describes the data set. The proposed method, owing to its reliability and objectivity, can be used for an automated and standardized evaluation of data in gas sensor devices. (c) 2007 Elsevier B.V. All rights reserved.
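The idea of pairing a clustering method with a validity index to choose the partition can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the abstract does not name its four CLV indices, so the silhouette index is used as a stand-in; hierarchical clustering is omitted; the farthest-point initialisation of k-means, the helper names (`kmeans`, `silhouette`) and the simulated three-group data set are all hypothetical choices.

```python
import math
import random

def kmeans(points, k, iters=100):
    """Plain k-means; returns one cluster label per point."""
    # Deterministic farthest-point initialisation (illustrative assumption).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette width, used here as a stand-in validity index."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0  # index is undefined for a single cluster; 0 by convention here
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

# Simulated data (assumption): three well-separated 2-D Gaussian groups.
rng = random.Random(0)
true_centers = [(0.0, 0.0), (8.0, 0.0), (4.0, 7.0)]
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in true_centers for _ in range(30)]

# Score each candidate number of clusters and keep the best-scoring partition.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print("silhouette per k:", {k: round(s, 3) for k, s in scores.items()})
print("selected number of clusters:", best_k)
```

On clearly separated simulated groups such as these, the index should peak at the simulated number of groups. With real sensor data the peak may be less distinct, which is one reason to compare several indices and clustering methods rather than rely on a single one.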