Basic Exploratory Proteins Analysis with Statistical Methods Applied on Structural Features
Eugenio Del Prete;Serena Dotolo;Angelo Facchiano
2015
Abstract
Exploratory Data Analysis (EDA) is an approach for summarizing and visualizing the main characteristics of a data set. It is used to perform a preliminary screening of the data and to display multivariate data graphically, making them easier to interpret. It can also reveal aspects that simple evaluations leave hidden. In particular, EDA is suitable for datasets with comparable variables, such as structural and geometrical protein features. In this work, we analyzed a set of proteins belonging to ten different architectural families. After the retrieval, feature selection and normalization stages, the dataset was processed by means of simple correlation, partial correlation and principal component analysis (PCA), highlighting family-independent or family-specific relationships, as well as possible outliers in the dataset itself. The results can be useful for connecting these features to functional protein properties.
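The processing steps named in the abstract (normalization, simple correlation, partial correlation, PCA) can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the z-score normalization, the derivation of partial correlations from the inverse correlation matrix, the SVD-based PCA, the function names and the synthetic `features` matrix are all assumptions made for illustration, and the random data stand in for real structural features.

```python
import numpy as np


def zscore(X):
    """Standardize each column to zero mean and unit sample variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def partial_correlation(X):
    """Partial correlation of every pair of columns given all other columns.

    Computed from the inverse of the correlation matrix (precision matrix).
    """
    precision = np.linalg.inv(np.corrcoef(X, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


def pca(X, n_components=2):
    """PCA via SVD of the column-centered data.

    Returns the scores, the loadings (one component per row) and the
    fraction of variance explained by each retained component.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = U[:, :n_components] * s[:n_components]
    return scores, Vt[:n_components], explained[:n_components]


# Synthetic stand-in for a (proteins x features) matrix; NOT real data.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 4))
features[:, 1] += 0.8 * features[:, 0]  # make two columns correlated

Z = zscore(features)                   # normalization
corr = np.corrcoef(Z, rowvar=False)    # simple (Pearson) correlation
pcorr = partial_correlation(Z)         # partial correlation
scores, loadings, explained = pca(Z)   # principal component analysis
```

In such a sketch, plotting the first two columns of `scores` colored by family would be one way to look for family-specific groupings and outliers, while comparing `corr` with `pcorr` shows which pairwise relationships persist once the other features are accounted for.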