A shallow neural net with model-based learning for the virtual restoration of recto-verso manuscript
Savino P; Tonazzini A
2022
Abstract
We propose a fast procedure based on neural networks (NN) to correct the typically complex background of recto-verso historical manuscripts, where the texts of the two sides often appear mixed. The purpose is to remove the interfering, shining-through text, facilitating both the work of philologists and paleographers and the automatic analysis of the linguistic content. We adapt the learning phase of a very simple shallow NN so that it exploits the information in the registered recto and verso sides of the manuscript, without requiring a large collection of similar manuscripts. The training set is therefore self-generated from the data images, based on a theoretical mixing model that accounts for ink spreading through the paper fiber and for ink saturation in the areas where the texts overlap. Operationally, we select pairs of patches containing clean text from the manuscript and then mix them symmetrically using the model, with parameters varied across the allowed range. This enables the NN to generalize to varying degrees of ink seepage and hence to classify different manuscripts. We compare the results obtained with this NN on heavily damaged manuscripts against those of other approaches. Qualitatively, the proposed method appears promising.
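The self-generation of the training set described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the mixing model, so the sketch assumes an additive model in optical density, uses a box blur as a crude stand-in for ink spreading, and models saturation as a cap on density. All names (`box_blur`, `mix_pair`, `make_training_set`), the seepage levels, and the per-pixel feature layout are hypothetical.

```python
import numpy as np

def box_blur(img, radius=1):
    """Mean filter with edge padding (assumed stand-in for ink spreading)."""
    if radius == 0:
        return img.astype(float).copy()
    padded = np.pad(img, radius, mode="edge")
    k = 2 * radius + 1
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def mix_pair(recto, verso, q, radius=1, eps=1e-3):
    """Mix two clean patches symmetrically with seepage level q.

    Patches hold intensities in [0, 1] (1 = blank paper); the verso is
    assumed to be already mirrored and registered to the recto.
    """
    # Work in optical density (assumption): ink contributions add up.
    d_r = -np.log(np.clip(recto, eps, 1.0))
    d_v = -np.log(np.clip(verso, eps, 1.0))
    d_max = -np.log(eps)  # assumed saturation level of the ink
    obs_r = np.minimum(d_r + q * box_blur(d_v, radius), d_max)
    obs_v = np.minimum(d_v + q * box_blur(d_r, radius), d_max)
    return np.exp(-obs_r), np.exp(-obs_v)

def make_training_set(recto, verso, levels=(0.1, 0.3, 0.5)):
    """Build per-pixel samples (observed recto, observed verso) -> clean recto."""
    samples, targets = [], []
    for q in levels:  # hypothetical seepage levels spanning the range
        obs_r, obs_v = mix_pair(recto, verso, q)
        samples.append(np.stack([obs_r.ravel(), obs_v.ravel()], axis=1))
        targets.append(recto.ravel())
    return np.concatenate(samples), np.concatenate(targets)

# Usage with synthetic patches standing in for clean manuscript text.
rng = np.random.default_rng(0)
recto = rng.uniform(0.2, 1.0, size=(8, 8))
verso = rng.uniform(0.2, 1.0, size=(8, 8))
X, y = make_training_set(recto, verso)
```

The resulting arrays could then be fed to any small classifier or regressor; the choice of per-pixel pairs as features is one plausible reading of "shallow NN" and is not confirmed by the abstract.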
File: prod_471459-doc_192630.pdf (open access). Description: A shallow neural net with model-based learning for the virtual restoration of recto-verso manuscript. Type: publisher's version (PDF). Size: 6.76 MB. Format: Adobe PDF.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.