JaTeCS is an open source Java library focused on automatic text categorization. It covers all the steps of an experimental activity, from reading the corpus to evaluating the results. JaTeCS treats text as the central input, and its code is optimized for this type of data. Like many other machine learning (ML) frameworks, JaTeCS provides data readers for many formats and well-known corpora, NLP tools, feature selection and weighting methods, and implementations of many ML algorithms, as well as wrappers for well-known external software (e.g., libSVM, SVMlight). JaTeCS also implements methods related to text classification that are rarely, if ever, provided by other ML frameworks (e.g., active learning, quantification, transfer learning).
JaTeCS, a Java library focused on automatic text categorization
Esuli A; Fagni T; Moreo Fernández A
2016