Combined analysis of chromosomal instabilities and gene expression for colon cancer progression inference
Claudia Cava; Isabella Castiglioni
2014
Abstract
BACKGROUND: Copy number alterations (CNAs) represent an important component of genetic variation. Such alterations are associated with certain types of cancer, including those of the pancreas, colon, and breast, among others. CNAs have been used as biomarkers for cancer prognosis in multiple studies, but few works report on the relation between CNAs and disease progression. Moreover, most studies do not consider the following two important issues. (I) The identification of CNAs in genes that are responsible for expression regulation is fundamental for defining the genetic events leading to malignant transformation and progression. (II) Most real domains are best described by structured data, where instances of multiple types are related to each other in complex ways.

RESULTS: Our main interest is to check whether colorectal cancer (CRC) progression inference benefits from considering both (I) the expression levels of genes with CNAs, and (II) relationships (i.e. dissimilarities) between patients due to expression level differences of the altered genes. We first evaluate the accuracy of a state-of-the-art inference method (support vector machine) when subjects are represented only through sets of available attribute values (i.e. gene expression levels). Then we check whether the inference accuracy improves when the information mentioned above is explicitly exploited. Our results suggest that CRC progression inference improves when the combined data (i.e. CNA and expression level) and the considered dissimilarity measures are applied.

CONCLUSIONS: Through our approach, classification is intuitively appealing and can be conveniently obtained in the resulting dissimilarity spaces. Different public datasets from Gene Expression Omnibus (GEO) were used to validate the results.
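
The idea of representing patients in a dissimilarity space can be illustrated with a minimal sketch. The abstract does not specify the dissimilarity measures or the data layout, so everything below is an assumption made for illustration: the Euclidean distance, the toy expression values, the "early"/"late" labels, and the helper names `euclidean` and `to_dissimilarity_space` are all hypothetical. Each patient is re-described by a vector of dissimilarities to a set of reference patients (prototypes).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def to_dissimilarity_space(samples, prototypes, dissim=euclidean):
    """Describe each sample by its dissimilarities to every prototype."""
    return [[dissim(s, p) for p in prototypes] for s in samples]


# Hypothetical toy data: rows are patients, columns are expression
# levels of genes assumed to carry CNAs. Values and labels are made up.
train = [
    [1.0, 2.0, 0.5],
    [0.9, 2.1, 0.4],
    [3.0, 0.2, 2.5],
    [3.2, 0.1, 2.4],
]
labels = ["early", "early", "late", "late"]

# Training patients serve as their own prototypes, so D_train is square.
D_train = to_dissimilarity_space(train, train)

# A new patient is mapped into the same space via the same prototypes.
D_new = to_dissimilarity_space([[2.9, 0.3, 2.6]], train)

# Index of the least dissimilar training patient, for inspection only.
closest = min(range(len(train)), key=lambda i: D_new[0][i])
```

The rows of `D_train` could then be passed to any standard classifier in place of the raw expression values; the abstract names a support vector machine as the baseline method, but which classifier and measures were actually used in the dissimilarity spaces is not stated here.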