Optimal Trade-off between Accuracy and Network Cost of Distributed Learning in Mobile Edge Computing

L. Valerio; A. Passarella; M. Conti
2017

Abstract

The most widely adopted approach for knowledge extraction from raw data generated at the edges of the Internet (e.g., by IoT or personal mobile devices) is through global cloud platforms, where data is collected from devices and analysed. However, with the increasing number of devices spread in the physical environment, this approach raises several concerns. The data gravity concept, one of the bases of Fog and Mobile Edge Computing, points towards a decentralisation of computation for data analysis, whereby the latter is performed closer to where data is generated, for both scalability and privacy reasons. Hence, data produced by devices might be processed according to one of the following approaches: (i) directly on the devices that collected it, (ii) in the cloud, or (iii) through fog/mobile edge computing techniques, i.e., at intermediate nodes in the network, running distributed analytics after collecting subsets of the data. Clearly, (i) and (ii) are the two extreme cases of (iii). It is worth noting that the same analytics task, executed at different collection points in the network, comes at different costs in terms of traffic generated over the network. Precisely, these costs refer to the traffic generated to move data towards the selected collection point (e.g., the Edge or the Cloud) and the traffic induced by the distributed analytics process. Until now, deciding whether to use intermediate collection points, and which ones they should be in order to both obtain a target accuracy and minimise the network traffic, has been an open question. In this paper, we propose an analytical framework able to cope with this problem. Precisely, we consider learning tasks, and define a model linking the accuracy of the learning task performed with a certain set of collection points to the corresponding network traffic. The model can be used to identify, given the specification of the learning problem (e.g., binary classification, regression, etc.) and its target accuracy, the optimal level for collecting data in order to minimise the total network cost. We validate our model through simulations, showing that setting the level of intermediate collection indicated by our model leads to the minimum cost for the target accuracy.
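The selection problem described in the abstract can be illustrated with a minimal sketch. It is not the paper's analytical model: the `CollectionOption` class, the `cheapest_option` helper, and all numbers below are hypothetical and chosen only for illustration. The sketch assumes that, for each candidate collection level, the achieved accuracy and the two traffic components named in the abstract (moving data to the collection points, and running the distributed analytics) are already known.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CollectionOption:
    """One hypothetical collection level (not taken from the paper)."""
    name: str
    accuracy: float            # accuracy reached by the learning task, in [0, 1]
    collection_traffic: float  # traffic to move raw data to the collection points
    learning_traffic: float    # traffic induced by the distributed analytics

    @property
    def total_traffic(self) -> float:
        # Total network cost is the sum of the two components.
        return self.collection_traffic + self.learning_traffic


def cheapest_option(options: Iterable[CollectionOption],
                    target_accuracy: float) -> Optional[CollectionOption]:
    """Return the lowest-traffic option that meets the target accuracy."""
    feasible = [o for o in options if o.accuracy >= target_accuracy]
    if not feasible:
        return None  # no collection level reaches the target
    return min(feasible, key=lambda o: o.total_traffic)


# Illustrative, made-up values for the three approaches (i), (ii), (iii).
options = [
    CollectionOption("devices only", 0.88, 0.0, 120.0),
    CollectionOption("edge nodes", 0.92, 40.0, 30.0),
    CollectionOption("cloud", 0.93, 200.0, 0.0),
]

best = cheapest_option(options, target_accuracy=0.90)
print(best.name if best else "no feasible option")
```

With these made-up values, the intermediate level is selected because it meets the target accuracy at a lower total traffic than the cloud, while the device-only option falls short of the target.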
Istituto di informatica e telematica - IIT
communication efficient
Distributed Learning
Mobile edge computing
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/356301