The exponential growth of devices and data at the edges of the Internet is rising scalability and privacy concerns on approaches based exclusively on remote cloud platforms. Data gravity, a fundamental concept in Fog Computing, points towards decentralisation of computation for data analysis, as a viable alternative to address those concerns. Decentralising AI tasks on several cooperative devices means identifying the optimal set of locations or Collection Points (CP for short) to use, in the continuum between full centralisation (i.e., all data on a single device) and full decentralisation (i.e., data on source locations). We propose an analytical framework able to find the optimal operating point in this continuum, linking the accuracy of the learning task with the corresponding network and computational cost for moving data and running the distributed training at the CPs. We show through simulations that the model accurately predicts the optimal trade-off, quite often an intermediate point between full centralisation and full decentralisation, showing also a significant cost saving w.r.t. both of them. Finally, the analytical model admits closed-form or numeric solutions, making it not only a performance evaluation instrument but also a design tool to configure a given distributed learning task optimally before its deployment.

Optimising Cost vs Accuracy of Decentralised Analytics in Fog Computing Environments

L Valerio;A Passarella;M Conti
2022

Abstract

The exponential growth of devices and data at the edges of the Internet is rising scalability and privacy concerns on approaches based exclusively on remote cloud platforms. Data gravity, a fundamental concept in Fog Computing, points towards decentralisation of computation for data analysis, as a viable alternative to address those concerns. Decentralising AI tasks on several cooperative devices means identifying the optimal set of locations or Collection Points (CP for short) to use, in the continuum between full centralisation (i.e., all data on a single device) and full decentralisation (i.e., data on source locations). We propose an analytical framework able to find the optimal operating point in this continuum, linking the accuracy of the learning task with the corresponding network and computational cost for moving data and running the distributed training at the CPs. We show through simulations that the model accurately predicts the optimal trade-off, quite often an intermediate point between full centralisation and full decentralisation, showing also a significant cost saving w.r.t. both of them. Finally, the analytical model admits closed-form or numeric solutions, making it not only a performance evaluation instrument but also a design tool to configure a given distributed learning task optimally before its deployment.
2022
Istituto di informatica e telematica - IIT
Distributed machine learning
DSVRG
optimal configuration
edge computing
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/412427
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact