Document image retrieval without OCRing using a video scanning system

Kuruoglu EE
2001

Abstract

We propose a technique for efficient document retrieval from digital libraries whose document images are compressed with token-based compression. The technique uses the layout information given by the relative positions of the character tokens on the page of a 'query' paper document to retrieve the original document from the image database. The query image is captured from a paper document by a multimedia system composed of a PC and a video scanning tool. The technique avoids OCRing both the query document and the documents in the database; moreover, it avoids decompressing the database documents, which are stored in token-based compressed form, and thereby achieves significant savings in time and computation. It makes it possible to retrieve the original document stored in a digital library using only part of a previously produced paper copy.
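
The abstract does not specify which layout features are used or how they are matched, so the following is only a minimal illustrative sketch of the general idea, not the author's algorithm. It assumes each page is already available as a list of character-token positions in reading order (as a token-based compressed file would provide without decompression), builds a translation-invariant signature from quantized offsets between consecutive tokens, and ranks database pages by cosine similarity to the query. The names `layout_signature`, `cosine`, `retrieve`, and the `cell` quantization parameter are hypothetical.

```python
from collections import Counter
from math import sqrt


def layout_signature(tokens, cell=5):
    """Histogram of quantized (dx, dy) offsets between consecutive tokens.

    tokens: list of (x, y) token positions in reading order (assumed input).
    cell:   quantization step in pixels (hypothetical parameter).
    Using offsets rather than absolute positions makes the signature
    independent of where the query fragment sits on the page.
    """
    sig = Counter()
    for (x0, y0), (x1, y1) in zip(tokens, tokens[1:]):
        sig[(round((x1 - x0) / cell), round((y1 - y0) / cell))] += 1
    return sig


def cosine(a, b):
    """Cosine similarity between two offset histograms."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_tokens, database):
    """Return the name of the database page whose layout best matches the query."""
    query_sig = layout_signature(query_tokens)
    return max(
        database,
        key=lambda name: cosine(query_sig, layout_signature(database[name])),
    )


# Toy data: two pages with different token spacing (invented for illustration).
database = {
    "doc_a": [(0, 0), (10, 0), (20, 0), (30, 0), (40, 0)],
    "doc_b": [(0, 0), (25, 0), (50, 0), (75, 0)],
}
# A fragment of doc_a, shifted as if captured from part of a paper copy.
query = [(110, 200), (120, 200), (130, 200)]
best = retrieve(query, database)
```

A real system would also need to handle scale, skew, and noise in the video-captured query, which this sketch ignores.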
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Image similarity retrieval
Index generation
Multimedia information systems
Document capture (scanning, document analysis)
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/148824