A methodology for evaluating algorithms for table understanding in PDF documents

Oro Ermelinda
2012

Abstract

This paper presents a methodology for evaluating table understanding algorithms for PDF documents. The evaluation covers three major tasks: table detection, table structure recognition, and functional analysis. We provide a general and flexible output model for each task, along with corresponding evaluation metrics and methods. We also present a methodology for collecting and ground-truthing PDF documents based on consensus-reaching principles, and we provide a publicly available ground-truthed dataset. Copyright © 2012 by the Association for Computing Machinery, Inc. (ACM).
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
9781450311168
Document analysis
Document understanding
Ground-truth dataset
Metrics
Performance evaluation
Table processing

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/281434