This paper introduces a novel method for flare forecasting, combining prediction accuracy with the ability to identify the most relevant predictive variables. This result is obtained by means of a two-step approach: first, a supervised regularization method for regression, namely, LASSO is applied, where a sparsity-enhancing penalty term allows the identification of the significance with which each data feature contributes to the prediction; then, an unsupervised fuzzy clustering technique for classification, namely, Fuzzy C-Means, is applied, where the regression outcome is partitioned through the minimization of a cost function and without focusing on the optimization of a specific skill score. This approach is therefore hybrid, since it combines supervised and unsupervised learning; realizes classification in an automatic, skill-score-independent way; and provides effective prediction performances even in the case of imbalanced data sets. Its prediction power is verified against NOAA Space Weather Prediction Center data, using as a test set, data in the range between 1996 August and 2010 December and as training set, data in the range between 1988 December and 1996 June. To validate the method, we computed several skill scores typically utilized in flare prediction and compared the values provided by the hybrid approach with the ones provided by several standard (non-hybrid) machine learning methods. The results showed that the hybrid approach performs classification better than all other supervised methods and with an effectiveness comparable to the one of clustering methods; but, in addition, it provides a reliable ranking of the weights with which the data properties contribute to the forecast.

A Hybrid Supervised/Unsupervised Machine Learning Approach to Solar Flare Prediction

Campi C;Massone AM
2018

Abstract

This paper introduces a novel method for flare forecasting, combining prediction accuracy with the ability to identify the most relevant predictive variables. This result is obtained by means of a two-step approach: first, a supervised regularization method for regression, namely, LASSO is applied, where a sparsity-enhancing penalty term allows the identification of the significance with which each data feature contributes to the prediction; then, an unsupervised fuzzy clustering technique for classification, namely, Fuzzy C-Means, is applied, where the regression outcome is partitioned through the minimization of a cost function and without focusing on the optimization of a specific skill score. This approach is therefore hybrid, since it combines supervised and unsupervised learning; realizes classification in an automatic, skill-score-independent way; and provides effective prediction performances even in the case of imbalanced data sets. Its prediction power is verified against NOAA Space Weather Prediction Center data, using as a test set, data in the range between 1996 August and 2010 December and as training set, data in the range between 1988 December and 1996 June. To validate the method, we computed several skill scores typically utilized in flare prediction and compared the values provided by the hybrid approach with the ones provided by several standard (non-hybrid) machine learning methods. The results showed that the hybrid approach performs classification better than all other supervised methods and with an effectiveness comparable to the one of clustering methods; but, in addition, it provides a reliable ranking of the weights with which the data properties contribute to the forecast.
2018
Istituto Superconduttori, materiali innovativi e dispositivi - SPIN
methods: data analysis
methods: statistical
Sun: flares
sunspots
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/344645
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 57
  • ???jsp.display-item.citation.isi??? ND
social impact