Deriving realistic mathematical models from support vector machines for scientific applications
Murari A;
2017
Abstract
Many scientific applications require classification, that is, discriminating between examples that belong to different classes. Machine learning tools perform very well at this task and can achieve very high success rates. However, the "realism" and interpretability of their results are very low, which limits their applicability. This paper presents a method to derive manageable equations for the hypersurface separating the classes. The main objective is to formulate the results of machine learning tools in a way that represents the actual "physics" behind the phenomena under investigation. The proposed approach is based on a suitable combination of Support Vector Machines and Symbolic Regression via Genetic Programming; it has been investigated with a series of systematic numerical tests, for different types of equations and classification problems, and tested with various experimental databases. The results indicate that the proposed method achieves a good trade-off between classification accuracy and the complexity of the derived mathematical equations. Moreover, the derived models can be tuned to reflect the actual phenomena, providing a very useful tool to bridge the gap between data, machine learning tools and scientific theories.
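The general idea of turning a trained Support Vector Machine into an explicit boundary equation can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic circular dataset, the SVM hyperparameters, the hand-written library of candidate equations and the complexity penalty are all assumptions made for illustration, and a brute-force search over three fixed forms stands in for Symbolic Regression via Genetic Programming. It assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): true boundary is x1^2 + x2^2 = 1.
X = rng.uniform(-2.0, 2.0, size=(600, 2))
y = np.where((X ** 2).sum(axis=1) < 1.0, 1, -1)

# Step 1: train a kernel SVM, accurate but not directly interpretable.
svm = SVC(kernel="rbf", C=10.0, gamma=1.0).fit(X, y)

# Step 2: sample points on the SVM boundary by locating sign changes of the
# decision function along each grid row and interpolating linearly.
g = np.linspace(-2.0, 2.0, 201)
G1, G2 = np.meshgrid(g, g)
grid = np.column_stack([G1.ravel(), G2.ravel()])
F = svm.decision_function(grid).reshape(G1.shape)
rows, cols = np.nonzero(F[:, :-1] * F[:, 1:] < 0)
f0, f1 = F[rows, cols], F[rows, cols + 1]
t = f0 / (f0 - f1)
B = np.column_stack([g[cols] + t * (g[cols + 1] - g[cols]), g[rows]])

# Step 3: hypothetical library of candidate equations of the form
# features(x) @ coef = 1, each paired with its number of terms.
candidates = {
    "a*x1 + b*x2 = 1": (
        lambda P: np.column_stack([P[:, 0], P[:, 1]]), 2),
    "a*x1^2 + b*x2^2 = 1": (
        lambda P: np.column_stack([P[:, 0] ** 2, P[:, 1] ** 2]), 2),
    "a*x1^2 + b*x2^2 + c*x1*x2 + d*x1 + e*x2 = 1": (
        lambda P: np.column_stack([P[:, 0] ** 2, P[:, 1] ** 2,
                                   P[:, 0] * P[:, 1], P[:, 0], P[:, 1]]), 5),
}

penalty = 0.01  # assumed weight of complexity against fit quality
best_name, best_score, best_coef = None, np.inf, None
for name, (features, n_terms) in candidates.items():
    Phi = features(B)
    coef, *_ = np.linalg.lstsq(Phi, np.ones(len(B)), rcond=None)
    mse = np.mean((Phi @ coef - 1.0) ** 2)
    score = mse + penalty * n_terms
    if score < best_score:
        best_name, best_score, best_coef = name, score, coef

# Step 4: check how often the explicit equation reproduces the SVM labels.
g_val = 1.0 - candidates[best_name][0](grid) @ best_coef
pred = np.where(g_val > 0, 1, -1)
agreement = np.mean(pred == svm.predict(grid))
agreement = max(agreement, 1.0 - agreement)  # the sign of g is arbitrary

print(best_name, np.round(best_coef, 3), round(float(agreement), 3))
```

The score combines the fit residual on the boundary points with a term count, which mirrors the accuracy versus complexity trade-off mentioned in the abstract. A genetic programming search would explore a far larger space of expressions than the three fixed forms used here.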


