A fully visual based business document classification system

Infantino I; Maniscalco U; Vella F
2014

Abstract

A fully visual approach to business document classification is presented. The paper describes how SURF visual features extracted from documents can be used to recognize and classify business documents. A subset of the extracted features is used to compute a prototype for each document class, which speeds up comparison against that class while retaining the best recognition rate. Moreover, we can determine which features are relevant and select zones of interest in the documents. Experiments were performed on a set of real business documents of different types and from different companies. We also tested the robustness of the approach by adding artificial defects and noise to the original documents and classifying them using exclusively visual and graphical features. Because classification requires no text analysis of any kind, the system is fully independent of the language in which the documents are written.
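The abstract does not say how the class prototype is built or how a document is compared against it, so the following is only a minimal sketch under stated assumptions. It assumes a prototype is the subset of one reference document's descriptors that recur in most other documents of the same class, and that a query is assigned to the class whose prototype it matches best. The function names, the thresholds `max_dist` and `min_support`, and the toy 2-D descriptors are hypothetical; real SURF descriptors have 64 or 128 dimensions and would come from an image processing library.

```python
import math

def nearest_distance(desc, candidates):
    """Euclidean distance from desc to its closest descriptor in candidates."""
    return min(math.dist(desc, c) for c in candidates)

def build_prototype(class_docs, max_dist=0.25, min_support=0.75):
    """Assumed prototype rule: keep the descriptors of the first document
    that have a close match in at least min_support of the other documents."""
    reference, others = class_docs[0], class_docs[1:]
    prototype = []
    for desc in reference:
        hits = sum(1 for doc in others
                   if doc and nearest_distance(desc, doc) <= max_dist)
        if not others or hits / len(others) >= min_support:
            prototype.append(desc)
    return prototype

def match_score(prototype, doc, max_dist=0.25):
    """Fraction of prototype descriptors that have a close match in doc."""
    if not prototype or not doc:
        return 0.0
    matched = sum(1 for desc in prototype
                  if nearest_distance(desc, doc) <= max_dist)
    return matched / len(prototype)

def classify(doc, prototypes, max_dist=0.25):
    """Return the label of the prototype with the highest match score."""
    return max(prototypes,
               key=lambda label: match_score(prototypes[label], doc, max_dist))

# Toy 2-D "descriptors" standing in for SURF output (hypothetical data).
invoice_docs = [
    [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)],
    [(0.1, 0.0), (1.0, 0.9), (9.0, 9.0)],
    [(0.0, 0.1), (1.1, 1.0), (7.0, 3.0)],
]
order_docs = [
    [(3.0, 3.0), (4.0, 0.0), (8.0, 8.0)],
    [(3.0, 3.1), (4.1, 0.0), (2.0, 6.0)],
]
prototypes = {
    "invoice": build_prototype(invoice_docs),
    "order": build_prototype(order_docs),
}
query = [(0.05, 0.05), (1.0, 1.05), (6.0, 2.0)]
predicted = classify(query, prototypes)
```

In this sketch a query is compared against a few prototype descriptors per class rather than against every training document, which is one plausible reading of how a prototype could speed up comparison. The descriptors that survive in a prototype also indicate which features recur across a class, which relates loosely to the idea of relevant features and zones of interest.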
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
978-0-9893-1933-1
document classification
image features extraction
image processing
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/288821