Strong short-range correlations and dichotomic codon classes in coding DNA sequences
Gonzalez Diego Luis;Rosa Rodolfo
2008
Abstract
The study of correlation structures in DNA sequences is of great interest because it yields structural and functional information about the underlying genetic mechanisms. In this paper we study the correlation structure of protein-coding DNA sequences using a recently developed mathematical representation of the genetic code. A fundamental consequence of this representation is that each codon can be assigned a parity class (odd or even). The parity is obtained by a nonlinear algorithm acting on the chemical character of the codon bases. In the same setting, Rumer's class can be described naturally, and a new dichotomic class, the hidden class, can be defined. Moreover, we show that the set of DNA base transformations associated with the three dichotomic classes can be put in a compact group-theoretic framework. We use the dichotomic classes as a coding scheme for DNA sequences and study the mutual dependence between these classes. The same analysis is also carried out on the chemical dichotomies of DNA bases. In both cases, the statistical analysis uses an entropy-based dependence metric with many desirable properties. We obtain meaningful tests for mutual dependence by using suitable resampling techniques. We find strong short-range correlations between certain combinations of dichotomic codon classes. These results support our previous hypothesis that codon classes might play an active role in the organization of genetic information.
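As a rough illustration of the kind of analysis the abstract describes, the sketch below encodes a DNA sequence with one of the standard chemical dichotomies of the bases (purine/pyrimidine, strong/weak, amino/keto), estimates the dependence between the binary sequence and a lagged copy of itself, and tests it by resampling. Several parts are assumptions rather than the authors' method: plain mutual information stands in for the entropy-based dependence metric, which the abstract does not name; random shuffling is only one possible resampling scheme; and all function names are hypothetical. The codon-level classes (parity, Rumer, hidden) are not implemented, because the abstract does not give the algorithm that defines them.

```python
import math
import random
from collections import Counter

# Standard chemical dichotomies of DNA bases (1 = first-named class).
DICHOTOMIES = {
    "purine_pyrimidine": {"A": 1, "G": 1, "C": 0, "T": 0},
    "strong_weak": {"C": 1, "G": 1, "A": 0, "T": 0},
    "amino_keto": {"A": 1, "C": 1, "G": 0, "T": 0},
}


def encode(seq, dichotomy):
    """Map a DNA string to a list of 0/1 labels under one dichotomy."""
    table = DICHOTOMIES[dichotomy]
    return [table[base] for base in seq.upper()]


def entropy(symbols):
    """Plug-in Shannon entropy (in bits) of a list of hashable symbols."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y); an assumed stand-in metric."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def lagged_mi(bits, lag):
    """Mutual information between the sequence and itself shifted by lag >= 1."""
    return mutual_information(bits[:-lag], bits[lag:])


def permutation_p_value(bits, lag, n_perm=200, seed=0):
    """Permutation test: shuffling destroys order but keeps composition."""
    rng = random.Random(seed)
    observed = lagged_mi(bits, lag)
    shuffled = list(bits)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if lagged_mi(shuffled, lag) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up toy sequence, for illustration only (not data from the paper).
    toy = "ATGGCGTACGTTAGCATCGATCGGCTAAGCTTACG" * 4
    bits = encode(toy, "purine_pyrimidine")
    for lag in (1, 2, 3):
        mi, p = permutation_p_value(bits, lag)
        print(f"lag={lag}  MI={mi:.4f} bits  p~{p:.3f}")
```

Shuffling preserves the base composition while removing any ordering, so the null distribution reflects sequences with the same class frequencies and no positional dependence. A real analysis would work at the codon level and would need a bias-aware estimator for short sequences.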