Assistance in retrieving documents on the World Wide Web is provided either by search engines, through keyword-based queries, or by catalogues, which organise documents into hierarchical collections. Maintaining catalogues manually is becoming increasingly difficult due to the sheer amount of material on the Web, and therefore it will soon be necessary to resort to techniques for automatic classification of documents. Classification is traditionally performed by extracting information for indexing a document from the document itself. The paper describes the technique of categorisation by context, which exploits the context perceivable from the structure of HTML documents to extract useful information for classifying the documents they refer to. We present the results of experiments with a preliminary implementation of the technique. © Springer Pub. Co.

Summary not available.

Categorisation by context

Sebastiani F
1998

1998
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
English
David G. Schwartz, Monica Divitini
Proceedings of the First International Workshop. Innovative Internet Information Systems
Innovative Internet Information Systems : IIIS-98. First International Workshop
4
719
736
19
http://www.scopus.com/inward/record.url?eid=2-s2.0-0342726569&partnerID=q2rCbXpz
Yes, but type not specified
1998
Pisa
Text categorization
Content analysis and indexing
PuMa code: /cnr.iei/1998-A2-008
1
open
Attardi G.; Di Marco S.; Salvi D.; Sebastiani F.
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
Files in this item:
File: prod_410135-doc_144299.pdf (open access)
Description: Categorisation by context
Type: Publisher's version (PDF)
Size: 1.56 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/389192
Citations
  • Scopus 13