CNR Institutional Research Information System

Categorisation by context

Attardi G.; Di Marco S.; Salvi D.; Sebastiani F.

1998

Abstract

Assistance in retrieving documents on the World Wide Web is provided either by search engines, through keyword-based queries, or by catalogues, which organise documents into hierarchical collections. Maintaining catalogues manually is becoming increasingly difficult due to the sheer amount of material on the Web, and it will therefore soon be necessary to resort to techniques for the automatic classification of documents. Classification is traditionally performed by extracting information for indexing a document from the document itself. The paper describes the technique of categorisation by context, which exploits the context perceivable from the structure of HTML documents to extract useful information for classifying the documents they refer to. We present the results of experiments with a preliminary implementation of the technique. © Springer Pub. Co.

Year	1998
Organisational units	Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Language(s)	English
External supervisors and coordinators	David G. Schwartz, Monica Divitini
Volume title	Proceedings of the First International Workshop. Innovative Internet Information Systems
Journal	JOURNAL OF UNIVERSAL COMPUTER SCIENCE (ONLINE)
Conference title	Innovative Internet Information Systems : IIIS-98. First International Workshop
Volume	4
From page	719
To page	736
Number of pages	19
URL	http://www.scopus.com/inward/record.url?eid=2-s2.0-0342726569&partnerID=q2rCbXpz
Refereed	Yes, but type not specified
Conference period	1998
Conference location	Pisa
Brief description of contents (Abstract)	Abstract not available.
Keywords	Text categorization; Content analysis and indexing
Other information	PuMa code: /cnr.iei/1998-A2-008
Scopus code	2-s2.0-0342726569
Number of authors	1
Full text	open
All authors	Attardi G.; Di Marco S.; Salvi D.; Sebastiani F.
MIUR login type	273
Type	info:eu-repo/semantics/conferenceObject
Type	04 Conference contribution::04.01 Contribution in conference proceedings
Appears in types	04.01 Contribution in conference proceedings

Files in this item:

File	Size	Format
prod_410135-doc_144299.pdf (open access; description: Categorisation by context; type: publisher's version (PDF))	1.56 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/389192
