Overview of Standard Graph File Formats

A Messina
2018

Abstract

Graphs are stored in many different file formats, ranging from simple adjacency lists or coordinate lists to complex formats that can store arbitrary data. The result is a large number of different, mostly incompatible formats, so exchanging graphs between programs is painful and sometimes impossible. The obvious answer is a common file format, but two concerns stand in its way. The first is that exchange formats often do not support all product- and platform-specific features. This is inevitable, but it should not prevent the exchange of the platform-independent parts, possibly with a less efficient, portable replacement for product-specific features. The second concern is efficiency. A universal format cannot be expected to be more efficient than one designed for a specific purpose, but there is no reason for a common file format to be so inefficient that it cannot be used. Many graph file formats are designed for ease of use rather than efficiency, so the overhead should be small. Furthermore, nothing prevents the use of both an optimized native format and a second interchange format. In general, a common graph file format should have the following features:
1. It must be platform independent and easy to implement.
2. It must be able to represent arbitrary data structures, since advanced programs need to attach their own data to nodes and edges.
3. It should be flexible enough that no specific order of declarations is required and any non-essential data may be omitted.
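The three features can be illustrated with a small sketch. It assumes a hypothetical comma-separated row layout (`node,id,key=value,...` and `edge,source,target,key=value,...`) that is not one of the formats surveyed here; the names `read_graph`, `parse_attrs`, and `SAMPLE` are likewise invented for illustration. The reader uses only the Python standard library, accepts free-form attributes on nodes and edges, and does not depend on declaration order or on optional data being present.

```python
import csv
import io


def parse_attrs(items):
    """Turn ['key=value', ...] into a dict; values are kept as strings."""
    return dict(item.split("=", 1) for item in items if item)


def read_graph(text):
    """Parse the hypothetical row layout described above.

    Returns (nodes, edges): nodes maps a node id to its attribute dict,
    edges is a list of (source, target, attribute dict) tuples.
    """
    nodes = {}
    edges = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        kind = row[0].strip()
        fields = [field.strip() for field in row[1:]]
        if kind == "node":
            # A node row may come before or after the edges that use it.
            nodes.setdefault(fields[0], {}).update(parse_attrs(fields[1:]))
        elif kind == "edge":
            source, target = fields[0], fields[1]
            # Endpoints that are never declared are created on demand.
            nodes.setdefault(source, {})
            nodes.setdefault(target, {})
            edges.append((source, target, parse_attrs(fields[2:])))
        else:
            raise ValueError(f"unknown row kind: {kind!r}")
    return nodes, edges


# Edges appear before their nodes, node "c" is never declared,
# and most attributes are omitted.
SAMPLE = """\
edge,a,b,weight=2.5
node,b
node,a,label=Start,color=red
edge,b,c
"""

nodes, edges = read_graph(SAMPLE)
```

Keeping attribute values as plain strings leaves their interpretation to the importing program, which is one simple way to carry program-specific data without the reader having to understand it.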
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
graph database
NoSQL
file format
csv
gml
graphml
gexf
gdf

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/348750