CNR Institutional Research Information System

A New Approach to the Planning of New Experiments based on Learning in Non-Stationary Conditions

Murari A; Lungaroni M; Peluso E; Gelfusa M; Craciunescu T; JET Contributors

2019

Abstract

In recent decades, advanced statistical and machine-learning tools have made enormous progress and have found applications in many fields. Their penetration in the scientific domain, however, is delayed by various factors, a fundamental one being that they assume stationary conditions: traditional machine-learning tools guarantee their results only if the data in the training set, the test set and the final application are sampled from the same probability distribution function. In most scientific applications, by contrast, the main objective of new experiments is precisely to explore uncharted regions of the parameter space in order to acquire new knowledge. Traditional covariate-shift methods are insufficient to address this issue. In this paper, a new method is proposed, based on the falsification of data-driven models. The technique relies on symbolic manipulation with evolutionary programmes. The performance of the approach has been extensively tested numerically, proving its competitive advantages. The capability of the methodology to handle practical experimental cases has been shown with the example of determining scaling laws for the design of new experiments, a typical problem that violates the assumption of stationarity. The same methodology can also be adopted to investigate large databases or the outputs of complex simulations, in order to focus the analysis effort on the most promising entries.

Year: 2019

Organizational units:
Istituto gas ionizzati - IGI - Sede Padova
Istituto per la Scienza e Tecnologia dei Plasmi - ISTP

Keywords:
Symbolic regression
Genetic programming
Planning of experiments
Concept shift
Covariate shift
Scaling laws

Appears in types: 01.01 Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/363318
