SparkBOOST, an Apache spark-based boosting library

Fagni T; Esuli A
2016

Abstract

SparkBOOST is a Java library built on Apache Spark that provides a distributed implementation of the AdaBoost.MH and MP-Boost machine learning algorithms. These boosting algorithms are known to be very effective and robust to overfitting in many application domains, e.g. in natural language processing. SparkBOOST offers developers a fast way to scale these algorithms to large-scale problems, where one needs to build classifiers from very large training datasets or simply needs to quickly classify huge streams of documents. The library can be integrated into custom programs through a simple API. The SparkBOOST implementation also provides command-line tools for learning and classification on data sources available in LibSVM format.
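
For readers unfamiliar with the LibSVM format mentioned above, the following is a minimal sketch of how one line of the common single-label form (`label index:value index:value ...`) could be read in Java. The class `LibSvmLineParser` and its `Example` type are hypothetical names introduced for illustration only; they are not part of the SparkBOOST API, and any multi-label variant of the format used by the library is not covered here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, NOT part of SparkBOOST: parses one line of LibSVM-format text. */
final class LibSvmLineParser {

    /** One parsed example: a numeric label plus sparse features (index -> value). */
    static final class Example {
        final double label;
        final Map<Integer, Double> features;

        Example(double label, Map<Integer, Double> features) {
            this.label = label;
            this.features = features;
        }
    }

    /** Parses a line of the form "label index:value index:value ...". */
    static Example parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length == 0 || tokens[0].isEmpty()) {
            throw new IllegalArgumentException("Empty line");
        }
        // The first token is the label; the remaining tokens are sparse features.
        double label = Double.parseDouble(tokens[0]);
        Map<Integer, Double> features = new LinkedHashMap<>();
        for (int i = 1; i < tokens.length; i++) {
            int sep = tokens[i].indexOf(':');
            if (sep < 0) {
                throw new IllegalArgumentException("Expected index:value, got " + tokens[i]);
            }
            int index = Integer.parseInt(tokens[i].substring(0, sep));
            double value = Double.parseDouble(tokens[i].substring(sep + 1));
            features.put(index, value);
        }
        return new Example(label, features);
    }
}
```

Calling `LibSvmLineParser.parse("1 3:0.5 7:1.25")` yields an example with label 1 and two sparse features, at indices 3 and 7. Only non-zero features are listed in this format, which is why a map is used here rather than a dense array.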
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Boosting
Spark
Big data
Software engineering
Design tools and techniques
Software architectures
Artificial intelligence learning
Files in this item:
prod_354509-doc_114888.pdf (open access, Adobe PDF, 209.54 kB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/325355