A statistical approach to infer 3D chromatin structure

Caudai C; Salerno E; Tonazzini A
2014

Abstract

Within the Italian Flagship Project InterOmics, our goal is to reconstruct a set of plausible chromatin configurations from Chromosome Conformation Capture data. To this end, we rely on a simulated annealing algorithm that samples the solution space defined by a data-fit function and a multiscale chromatin model. The data-fit function accounts only for the largest, most reliable contact frequencies, in order to avoid deriving distances inconsistent with Euclidean geometry. At each scale, the chromatin model consists of a chain of partially penetrable beads whose properties (bead sizes, elasticity, curvature, etc.) can be constrained through biochemical and biological knowledge. During the annealing process, the model configuration is evolved through quaternions rather than the usual Euler matrices, as this offers a number of advantages in composing successive perturbations and in automatically satisfying the constraints. The output of the annealing scheme is not unique, owing to the degrees of freedom left by the geometrical constraints. This allows us to obtain multiple configurations compatible with both the data and the prior knowledge. We are validating our method by applying it to real Hi-C data from the long arm of human chromosome 1. The mean-square Euclidean distances computed from our results, as functions of genomic distance, support previous experimental results indicating that highly expressed genomic regions are less compact than poorly transcribed regions.
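As an illustration of why quaternions are convenient for evolving a bead chain, the following minimal sketch rotates the tail of a chain about one of its beads. It is not the authors' implementation: the pivot-style move, the function names (`quat_from_axis_angle`, `quat_multiply`, `rotate`, `perturb_chain`, `bond_lengths`) and the toy chain are all assumptions made for this example. It shows two properties mentioned in the abstract: successive perturbations compose through a single quaternion product, and a rigid rotation leaves the distances between consecutive beads unchanged, so a fixed-bond-length constraint holds without any extra correction step.

```python
import math


def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), ax * s, ay * s, az * s)


def quat_multiply(q, p):
    """Hamilton product q * p (as a rotation: apply p first, then q)."""
    qw, qx, qy, qz = q
    pw, px, py, pz = p
    return (
        qw * pw - qx * px - qy * py - qz * pz,
        qw * px + qx * pw + qy * pz - qz * py,
        qw * py - qx * pz + qy * pw + qz * px,
        qw * pz + qx * py - qy * px + qz * pw,
    )


def rotate(q, v):
    """Rotate the 3D vector v by the unit quaternion q via q * (0, v) * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = quat_multiply(quat_multiply(q, (0.0, v[0], v[1], v[2])), conj)
    return (rx, ry, rz)


def perturb_chain(beads, pivot, q):
    """Rotate every bead after index `pivot` about bead `pivot` (hypothetical move)."""
    px, py, pz = beads[pivot]
    moved = list(beads[: pivot + 1])
    for x, y, z in beads[pivot + 1:]:
        rx, ry, rz = rotate(q, (x - px, y - py, z - pz))
        moved.append((rx + px, ry + py, rz + pz))
    return moved


def bond_lengths(beads):
    """Distances between consecutive bead centres."""
    return [math.dist(a, b) for a, b in zip(beads, beads[1:])]


# Toy chain: five beads with unit spacing along the x axis (illustrative only).
chain = [(float(i), 0.0, 0.0) for i in range(5)]
q1 = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 3)
q2 = quat_from_axis_angle((0.0, 1.0, 0.0), -math.pi / 5)

# Two successive perturbations about the same pivot collapse into one quaternion.
combined = quat_multiply(q2, q1)
bent = perturb_chain(chain, 1, combined)
```

In an annealing loop, a move of this kind would be proposed at each step and accepted or rejected according to the data-fit function; that part is omitted here because the abstract does not specify it.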
Istituto di Fisiologia Clinica - IFC
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Computational biology
Chromatin structure
Chromosome conformation capture
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/269606