TLC Pointer - The Use of Geo-Spatial Data for Non-Technical Loss Detection

P. Santi
2019

Abstract

Several machine-learning models designed to provide revenue protection for power distribution companies can be found in the literature. However, traditional approaches have limitations: because these models rely solely on internal company data, they can be blind to some frequent types of anomalies, e.g., illegal connections, absence of consumption drops, etc. An innovative approach is to combine proprietary data with third-party data sets, preferably linked to geographical coordinates, so that together they provide a model of the territory. By integrating multiple sources of data, it is in principle possible to identify inconsistencies between activity patterns across data sets that would be impossible to identify by relying solely on proprietary data. In this paper, we instantiate this idea by combining fine-grained smart meter consumption data with cellular phone data records. The rationale for combining power consumption and cellular phone data is that both data sets have been shown to be good proxies of human activity. Hence, the identification of emergent activity patterns in the two data sets, and their spatio-temporal comparison, has the potential to substantially increase the effectiveness of non-technical loss detection with respect to current machine learning practices.
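As a rough illustration of the cross-data-set comparison described in the abstract, the sketch below correlates hourly power consumption with hourly cellular activity for each grid cell and flags cells where the two profiles disagree. It is not the paper's method: the grid-cell keys, the toy values, the use of Pearson correlation, the 0.5 threshold, and the function names `pearson` and `flag_inconsistent_cells` are all assumptions made for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Assumption for this sketch: a constant series is treated as
    uncorrelated (0.0) instead of raising a division-by-zero error.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def flag_inconsistent_cells(power, cellular, threshold=0.5):
    """Return, sorted, the cells whose two activity profiles correlate below threshold.

    Only cells present in both data sets are compared.
    """
    flagged = []
    for cell in power.keys() & cellular.keys():
        if pearson(power[cell], cellular[cell]) < threshold:
            flagged.append(cell)
    return sorted(flagged)


# Toy, made-up hourly profiles for two hypothetical grid cells.
power = {
    "cell_A": [1.0, 2.0, 4.0, 3.0],  # metered consumption follows activity
    "cell_B": [1.0, 1.0, 1.0, 1.0],  # metered consumption stays flat
}
cellular = {
    "cell_A": [10, 20, 40, 30],  # phone activity has the same shape
    "cell_B": [10, 50, 90, 40],  # phone activity varies, metered power does not
}

print(flag_inconsistent_cells(power, cellular))
```

In this toy setting only `cell_B` is flagged, since phone activity there varies while metered consumption does not. A flagged cell is a candidate for inspection, not evidence of a non-technical loss.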
Istituto di informatica e telematica - IIT
data analysis

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/386309