A hierarchical classification strategy for digital documents
Brambilla C
2002
Abstract
In this work we report our experience using CART classifiers for the difficult problem of distinguishing among photographs, graphics, texts and compound digital documents. To cope with the great variety of compound documents, we have designed a hierarchical strategy that first classifies documents as compound or non-compound by verifying their homogeneity. Non-compound documents are then classified as photographs, graphics or texts. Documents are indexed only by low-level perceptual features such as color, texture and shape.
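
The two-stage strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the small Gini-based tree below is only a stand-in for a full CART classifier, and the three features (`homogeneity`, `color_richness`, `edge_density`), the toy values and the names `build_tree`, `fit_hierarchy` and `classify` are all hypothetical, chosen only to show how a compound/non-compound stage can feed a photograph/graphic/text stage.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, depth=0, max_depth=3):
    """Tiny CART-style tree: binary threshold splits chosen by Gini impurity."""
    if len(set(y)) == 1 or depth == max_depth:
        return Counter(y).most_common(1)[0][0]  # leaf: majority class
    best = None
    for j in range(len(X[0])):
        # Skip the largest value so the right branch is never empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:  # no feature offers a valid split
        return Counter(y).most_common(1)[0][0]
    _, j, t, left, right = best
    return (j, t,
            build_tree([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth))

def predict(tree, x):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are labels."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree

def fit_hierarchy(X, y):
    """Stage 1: compound vs non-compound. Stage 2: photo / graphic / text."""
    stage1_y = ["compound" if lab == "compound" else "non-compound" for lab in y]
    stage1 = build_tree(X, stage1_y)
    idx = [i for i, lab in enumerate(y) if lab != "compound"]
    stage2 = build_tree([X[i] for i in idx], [y[i] for i in idx])
    return stage1, stage2

def classify(model, x):
    """Apply stage 2 only to documents that stage 1 judges non-compound."""
    stage1, stage2 = model
    if predict(stage1, x) == "compound":
        return "compound"
    return predict(stage2, x)

# Hypothetical toy features: [homogeneity, color_richness, edge_density].
X = [[0.20, 0.50, 0.5], [0.30, 0.40, 0.6],   # compound
     [0.90, 0.90, 0.2], [0.80, 0.80, 0.3],   # photo
     [0.90, 0.30, 0.4], [0.85, 0.20, 0.5],   # graphic
     [0.95, 0.05, 0.9], [0.90, 0.10, 0.8]]   # text
y = ["compound", "compound", "photo", "photo",
     "graphic", "graphic", "text", "text"]

model = fit_hierarchy(X, y)
predictions = [classify(model, x) for x in X]
```

Splitting the task this way lets the first tree concentrate on homogeneity alone, so the second tree never has to model the wide variety of compound layouts.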


