Non-stationary modeling for the separation of overlapped texts in documents

Tonazzini A;Savino P;Salerno E
2014

Abstract

In this paper, we address the removal of severe back-to-front interferences in archival documents when both the recto and verso images of the page are available. We approach the problem from a modeling point of view, treating the ideal images of the two separated texts as individual source patterns that overlap in the observed images through a parametric mixing operator. Earlier approaches were based on linear mixtures of the ideal reflectance maps, or of the ideal optical densities and absorptance maps, through unknown coefficients or blur kernels. Approximations and/or partial user supervision were then adopted to jointly estimate the sources and the model parameters. However, a feasible and reliable data model for this problem should at least be non-linear and space-variant, to cope with occlusions, ink saturation, and the large variability of the mixing level. This is especially true for ancient documents affected by ink seeping (bleed-through). The search for such a model is still far from concluded, and may even be impossible to pursue, because information about the chemical and physical processes that cause the phenomenon is unavailable. We therefore propose the use of pixel-dependent parameters, within a model that is additive in the optical densities, to compensate not only for non-stationarity but also for missing or imprecise knowledge of the non-linearity and, more generally, for modeling errors.
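
As a rough illustration of what "additive in the optical densities with pixel-dependent parameters" could look like, the following minimal sketch mixes two density maps with one weight per pixel and then inverts the mixture. It is not the authors' implementation: the function names, the logarithmic density definition, the absence of any blur kernel, and the assumption that the mixing weights are already known (in the paper they have to be estimated together with the sources) are all simplifications introduced here. The verso is assumed to be already mirrored and registered to the recto, and images are represented as flat lists of pixels.

```python
import math

def to_density(reflectance, background=1.0):
    """Optical density of one pixel (assumed definition: -log(s / background))."""
    return -math.log(reflectance / background)

def to_reflectance(density, background=1.0):
    """Inverse of to_density."""
    return background * math.exp(-density)

def mix_densities(d_recto, d_verso, q_verso_to_recto, q_recto_to_verso):
    """Hypothetical forward model: each observed side is its own density
    plus the opposite side's density scaled by a per-pixel weight."""
    obs_recto = [dr + q * dv
                 for dr, dv, q in zip(d_recto, d_verso, q_verso_to_recto)]
    obs_verso = [dv + q * dr
                 for dr, dv, q in zip(d_recto, d_verso, q_recto_to_verso)]
    return obs_recto, obs_verso

def unmix_densities(obs_recto, obs_verso, q_verso_to_recto, q_recto_to_verso):
    """Invert the 2x2 system independently at each pixel.
    Assumes the weights are known and that their product is never 1."""
    rec_recto, rec_verso = [], []
    for o_r, o_v, q_vr, q_rv in zip(obs_recto, obs_verso,
                                    q_verso_to_recto, q_recto_to_verso):
        det = 1.0 - q_vr * q_rv
        rec_recto.append((o_r - q_vr * o_v) / det)
        rec_verso.append((o_v - q_rv * o_r) / det)
    return rec_recto, rec_verso

# Toy data: four pixels with a different mixing level at each one.
d_r = [0.0, 1.2, 0.0, 0.8]
d_v = [0.9, 0.0, 0.0, 1.1]
q_v2r = [0.5, 0.3, 0.2, 0.1]
q_r2v = [0.4, 0.2, 0.6, 0.0]

obs_r, obs_v = mix_densities(d_r, d_v, q_v2r, q_r2v)
rec_r, rec_v = unmix_densities(obs_r, obs_v, q_v2r, q_r2v)
```

Because each pixel carries its own pair of weights, the same additive form can represent strong seeping in one region and none in another, which is the sense in which such a model is space-variant.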
2014
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
English
2014 IEEE 22nd Signal Processing and Communications Applications Conference (SIU) (Publisher: IEEE)
SIU 2014 - 2014 22nd Signal Processing and Communications Applications Conference
Pages 2314-2318 (5 pages)
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=6830727&pageNumber%3D44291%26rowsPerPage%3D75
IEEE
New York
United States of America
Yes, but type not specified
23-25 April 2014
Trabzon, Turkey
Document restoration
Non-stationary data model
Back-to-front interferences
Project: Innovative Tools for cultural heritage ArChiving and restorAtion. Acronym: ITACA. Project type: EU_FP7. Puma code: /cnr.isti/2014-A2-019
3
restricted
Tonazzini, A; Savino, P; Salerno, E
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
Files in this item:
prod_281751-doc_80255.pdf (authorized users only)
Description: Non-stationary modeling for the separation of overlapped texts in documents
Type: Publisher's version (PDF)
Size: 467.7 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/254587