Investigating large datasets of biological information by automatic procedures may offer chances of progress in knowledge. Recently, tremendous improvements in structural biology have allowed the number of structures in the Protein Data Bank (PDB) archive to increase rapidly, in particular those for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-associated proteins. However, their automatic analysis can be hampered by the nonuniform descriptors used by authors in some records of the PDB and PDBx/mmCIF files. In this opinion article we highlight the difficulties encountered in automating the analysis of hundreds of structures, suggesting that further standardization of the description of these molecular entities and of their attributes, generalized to the macromolecular structures contained in the PDB, might generate files more suitable for automatized analyses of a large number of structures.

Standardizing macromolecular structure files: further efforts are needed

Giordano D;Facchiano A;
2023

Abstract

Investigating large datasets of biological information by automatic procedures may offer chances of progress in knowledge. Recently, tremendous improvements in structural biology have allowed the number of structures in the Protein Data Bank (PDB) archive to increase rapidly, in particular those for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-associated proteins. However, their automatic analysis can be hampered by the nonuniform descriptors used by authors in some records of the PDB and PDBx/mmCIF files. In this opinion article we highlight the difficulties encountered in automating the analysis of hundreds of structures, suggesting that further standardization of the description of these molecular entities and of their attributes, generalized to the macromolecular structures contained in the PDB, might generate files more suitable for automatized analyses of a large number of structures.
2023
Istituto di Scienze dell'Alimentazione - ISA
PDB archive
automatic data analysis
antibody
spike protein
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/439145
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? ND
social impact