Modelling and assessing differential gene expression using the alpha stable distribution
Kuruoglu E E;
2009
Abstract
After normalization, the distributions of gene expression for very different organisms have a similar shape: they usually exhibit heavier tails than a Gaussian distribution and a certain degree of asymmetry. This distribution has therefore been modeled in the literature using different parametric families, such as the asymmetric Laplace or the Cauchy distribution. Moreover, it is known that the tails of spot-intensity distributions are described by a power law and that the variance of a given array increases with the number of genes. These features strongly suggest that the alpha-stable distribution is suitable for modeling gene expression. In this work, we model the error distribution for gene expression data using the alpha-stable distribution. The model is tested successfully on four different datasets. Kullback-Leibler, chi-square and Hellinger tests are performed to compare how well the alpha-stable, asymmetric Laplace and Gaussian distributions fit the spot-intensity distribution. The alpha-stable distribution is shown to perform much better for every array in every dataset considered. Furthermore, using an alpha-stable mixture model, a Bayesian log-posterior odds is calculated, allowing us to decide whether a gene is differentially expressed or not. This statistic is based on the scale mixture of normals property and other well-known properties of the alpha-stable distribution. The proposed methodology is illustrated using simulated data, and the results are compared with those of an existing statistical approach.

File | Size | Format
---|---|---
prod_44294-doc_130766.pdf (open access; publisher's version, PDF). Description: Modelling and assessing differential gene expression using the alpha stable distribution | 806.74 kB | Adobe PDF
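
The abstract compares candidate distributions by how closely each fits the observed intensities, using Kullback-Leibler and Hellinger measures among others. The following is a minimal sketch of how such a comparison can be set up. It is not the authors' procedure or data. It assumes simulated Cauchy samples as a stand-in for heavy-tailed data (the Cauchy distribution is the alpha-stable member with alpha = 1 and beta = 0, and has a closed-form CDF), a simple binned comparison with two open-ended tail bins, and quantile-based and moment-based fits. All function names are hypothetical.

```python
import math
import numpy as np


def empirical_bin_probabilities(x, inner_edges):
    """Fraction of samples per bin; the first and last bins are open-ended tails."""
    idx = np.searchsorted(inner_edges, x)
    counts = np.bincount(idx, minlength=len(inner_edges) + 1)
    return counts / counts.sum()


def model_bin_probabilities(cdf, inner_edges):
    """Model probability of the same bins, obtained from CDF differences."""
    c = np.array([cdf(e) for e in inner_edges])
    probs = np.diff(np.concatenate(([0.0], c, [1.0])))
    return np.clip(probs, 0.0, None)  # guard against tiny negative rounding errors


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return math.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q); q is floored at eps (an assumption) to avoid log(0)."""
    q = np.clip(q, eps, None)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


# Simulated heavy-tailed data (assumption: Cauchy stands in for real array data).
rng = np.random.default_rng(0)
x = rng.standard_cauchy(20_000)

inner_edges = np.linspace(-10.0, 10.0, 41)
p_emp = empirical_bin_probabilities(x, inner_edges)

# Gaussian fit by sample mean and standard deviation.
mu, sigma = x.mean(), x.std()
p_gauss = model_bin_probabilities(
    lambda t: 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0)))),
    inner_edges,
)

# Cauchy fit by median (location) and half the interquartile range (scale).
q1, med, q3 = np.percentile(x, [25, 50, 75])
loc, scale = med, 0.5 * (q3 - q1)
p_cauchy = model_bin_probabilities(
    lambda t: 0.5 + math.atan((t - loc) / scale) / math.pi,
    inner_edges,
)

for name, q in [("Gaussian", p_gauss), ("Cauchy", p_cauchy)]:
    print(f"{name}: Hellinger={hellinger(p_emp, q):.4f}, KL={kl_divergence(p_emp, q):.4f}")
```

Smaller values indicate a closer fit. For a general alpha-stable fit the density has no closed form, so the model bin probabilities would have to come from a numerical CDF rather than the analytic expressions used here.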
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.