Automatic text classification (ATC) is a discipline at the crossroads of information retrieval (IR), machine learning (ML), and computational linguistics (CL), and consists in the realization of text classifiers, i.e. software systems capable of assigning texts to one or more categories, or classes, from a predefined set. Applications range from the automated indexing of scientific articles, to e-mail routing, spam filtering, authorship attribution, and automated survey coding. This article will focus on the ML approach to ATC, whereby a software system (called the learner) automatically builds a classifier for the categories of interest by generalizing from a 'training' set of pre-classified texts.
Classification of text, automatic
Sebastiani F
2006
Abstract
Automatic text classification (ATC) is a discipline at the crossroads of information retrieval (IR), machine learning (ML), and computational linguistics (CL), and consists in the realization of text classifiers, i.e. software systems capable of assigning texts to one or more categories, or classes, from a predefined set. Applications range from the automated indexing of scientific articles, to e-mail routing, spam filtering, authorship attribution, and automated survey coding. This article will focus on the ML approach to ATC, whereby a software system (called the learner) automatically builds a classifier for the categories of interest by generalizing from a 'training' set of pre-classified texts.File | Dimensione | Formato | |
---|---|---|---|
prod_138962-doc_128679.pdf
solo utenti autorizzati
Descrizione: Classification of text, automatic
Tipologia:
Versione Editoriale (PDF)
Dimensione
96.78 kB
Formato
Adobe PDF
|
96.78 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.