Nonlinear model and constrained ML for removing back-to-front interferences from recto-verso documents

Salerno E; Tonazzini A
2012

Abstract

In this paper, we treat the removal of back-to-front interferences from scans of double-sided documents as a blind source separation problem, and extend our previous linear mixing model to a more effective nonlinear one. We consider the ideal front and back images as two individual patterns that overlap in the observed recto and verso scans, and apply an unsupervised constrained maximum likelihood technique to separate them. Through several real examples, we show that this approach gives much better results than data decorrelation or independent component analysis. Approaches based on segmentation/classification often aim at cleaning a foreground text by removing all the textured background; by contrast, one advantage of our method is that cleaning does not alter genuine features of the document, such as color or other structures it may contain. This is particularly valuable when the document is of historical importance, since its readability can be improved while its original appearance is maintained.
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Document restoration
Nonlinear data model
Back-to-front interferences
Files in this item:
prod_274659-doc_77943.pdf (authorized users only)
Description: Nonlinear model and constrained ML for removing back-to-front interferences from recto-verso documents
Type: Publisher's version (PDF)
Size: 1.84 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/256049
Citations
  • PMC: n/a
  • Scopus: 16
  • ISI: 16