About the gene families size distribution in a recent model of genome evolution

E Regazzini
2010

Abstract

We analyze a class of probability distributions for family sizes in a duplication, loss and change (DLC) model of genome evolution, recently introduced by Tiuryn, Wójtowicz and Rudnicki. After providing expressions for the generating functions of the density p and of the right tail Q of these distributions, we obtain closed forms for p and Q in terms of Gauss hypergeometric functions. Then, drawing on the literature on special functions and their approximations, we provide an asymptotic expression for Q, which depends on parameters connected with the strengths of duplication and change. This shows that the DLC model yields a rich statistical model for the size distribution, whose elements are characterized by the composition of a power component with a negative exponential one. We also study the limiting distributions obtained as the parameters approach points on the boundary of their natural domain. In addition to the geometric distribution and the unit mass at 1, the limiting class contains the distributions with the 'longest' tails in the DLC model. A characterization of these probability laws is given.
Istituto di Matematica Applicata e Tecnologie Informatiche (IMATI)
Keywords: duplication-loss-change model; gene families size distribution; genome evolution
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/2261