
Energy-efficient ranking on FPGAs through ensemble model compression (Abstract)

Molina R.; Nardini F. M.; Perego R.; Trani S.
2022

Abstract

In this talk, we present the main results of a paper accepted at ECIR 2022 [1]. We investigate novel SoC-FPGA solutions for fast and energy-efficient ranking based on machine-learned ensembles of decision trees. Since the memory footprint of ranking ensembles limits the effective exploitation of programmable logic for large-scale inference tasks [2], we investigate binning and quantization techniques to reduce the memory occupation of the learned model, and we optimize the state-of-the-art ensemble-traversal algorithm for deployment on low-cost, energy-efficient FPGA devices. Experiments on publicly available Learning-to-Rank datasets show that our model compression techniques do not significantly affect ranking accuracy. Moreover, the reduced space requirements allow the models and the logic to be replicated on the FPGA device, so that several inference tasks can be executed in parallel. We discuss in detail the experimental settings and the feasibility of deploying the proposed solution in a real setting. The experiments also show that our FPGA solution achieves state-of-the-art performance and consumes from 9× up to 19.8× less energy than an equivalent multi-threaded CPU implementation.
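As a rough illustration of what binning and quantization can mean for a tree ensemble, the sketch below replaces each split threshold with its rank among that feature's thresholds and stores leaf scores as small integers. It is a generic example and not the authors' implementation: the toy tree, the 8-bit leaf range, and all function names are assumptions made here for illustration.

```python
from bisect import bisect_left

# Hypothetical toy tree: internal node = (feature, threshold, left, right);
# a leaf is a plain float score.
TREE = (0, 0.5, (1, 2.0, -0.25, 0.75), 1.5)

def collect_thresholds(node, out):
    """Gather the set of split thresholds used by each feature."""
    if isinstance(node, tuple):
        feature, threshold, left, right = node
        out.setdefault(feature, set()).add(threshold)
        collect_thresholds(left, out)
        collect_thresholds(right, out)
    return out

def compress(node, bins, scale):
    """Replace thresholds by their bin index and leaves by scaled integers."""
    if not isinstance(node, tuple):
        return round(node / scale)  # lossy: leaf score stored as a small integer
    feature, threshold, left, right = node
    k = bins[feature].index(threshold)  # rank of this threshold for the feature
    return (feature, k, compress(left, bins, scale), compress(right, bins, scale))

def bin_document(features, bins):
    """Map each raw feature value to the count of thresholds strictly below it.
    With this mapping, value <= bins[f][k] holds exactly when bin <= k."""
    return [bisect_left(bins[f], x) if f in bins else 0
            for f, x in enumerate(features)]

def score(node, x):
    """Plain root-to-leaf traversal; works on raw and on compressed trees."""
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
    return node

bins = {f: sorted(t) for f, t in collect_thresholds(TREE, {}).items()}
SCALE = 1.5 / 127  # assumed 8-bit range: largest |leaf| maps to 127
SMALL_TREE = compress(TREE, bins, SCALE)

doc = [0.5, 3.0]
exact = score(TREE, doc)
approx = score(SMALL_TREE, bin_document(doc, bins)) * SCALE
```

In this sketch the binned comparisons follow exactly the same path as the original ones, so the only approximation comes from rounding the leaf scores; `approx` differs from `exact` by at most about half of `SCALE`.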
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
English
Pasi G., Cremonesi P., Orlando S., Zanker M., Massimo D., Turati G.
Italian Information Retrieval Workshop
IIR 2022 - 12th Italian Information Retrieval Workshop 2022
1
http://ceur-ws.org/Vol-3177/paper9.pdf
Yes, but type not specified
19-22/06/2022
Tirrenia, Pisa, Italy
Learning to rank
Model compression
Efficient inference
SoC FPGA
Electronic
6
info:eu-repo/semantics/conferenceObject
open
274
04 Conference contribution::04.02 Abstract in conference proceedings
Gil-Costa, V.; Loor, F.; Molina, R.; Nardini, F. M.; Perego, R.; Trani, S.
Files in this record:
File: prod_471851-doc_191803.pdf (open access)
Description: Energy-Efficient Ranking on FPGAs through Ensemble Model Compression (Abstract)
Type: Publisher's version (PDF)
License: Creative Commons
Size: 342.02 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/417708