An integrated system for the analysis and the recognition of characters in ancient documents

Tonazzini A
2002

Abstract

This paper describes an integrated system for processing and analyzing highly degraded ancient printed documents. For each page, the system reduces noise by wavelet-based filtering, extracts the text lines and segments them into characters by fast adaptive thresholding, and performs OCR with a feed-forward multilayer neural network trained by back-propagation. The recognition probability is used as a discriminant parameter that determines whether a feedback process is automatically activated, leading back to a block that refines the segmentation. This block acts only on the small portions of the text where the recognition was not reliable, and makes use of blind deconvolution and MRF-based segmentation techniques. The experimental results highlight the good performance of the whole system in the analysis of even strongly degraded texts.
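
The confidence-gated feedback described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `recognize_with_feedback`, the `classify` and `refine` callables, the 0.8 threshold, and the rule of accepting the refined result unconditionally are all assumptions made for illustration. In the paper, `classify` would correspond to the neural-network OCR stage and `refine` to the blind deconvolution and MRF-based segmentation block.

```python
from typing import Callable, List, Sequence, Tuple

# (label, probability) pair returned by a classifier; hypothetical type alias.
Recognition = Tuple[str, float]


def recognize_with_feedback(
    segments: Sequence[object],
    classify: Callable[[object], Recognition],
    refine: Callable[[object], List[object]],
    threshold: float = 0.8,  # assumed value, not taken from the paper
) -> List[Recognition]:
    """Classify each segment; re-segment only the low-confidence ones."""
    results: List[Recognition] = []
    for segment in segments:
        label, prob = classify(segment)
        if prob >= threshold:
            # Recognition is trusted: keep it and skip the costly refinement.
            results.append((label, prob))
            continue
        # Recognition is not trusted: refine the segmentation of this
        # portion only, then classify the refined pieces again.
        refined = [classify(piece) for piece in refine(segment)]
        # Fall back to the original guess if refinement yields nothing.
        results.extend(refined if refined else [(label, prob)])
    return results


# Toy stand-ins (hypothetical): a "segment" is a string, a single character
# is recognized confidently, and a merged blob of characters is not.
def toy_classify(segment: str) -> Recognition:
    return (segment, 0.9) if len(segment) == 1 else ("?", 0.2)


def toy_refine(segment: str) -> List[str]:
    # Pretend the refined segmentation separates touching characters.
    return list(segment)


demo = recognize_with_feedback(["a", "bc"], toy_classify, toy_refine)
```

In this toy run the blob `"bc"` falls below the threshold, so it alone is passed to the refinement step and split into two characters, while `"a"` is accepted on the first pass.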
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Characters recognition

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/114017