CNR Institutional Research Information System

Model predictive control guided imitation learning for optimal control of PCM thermal energy storage

Chen Y.; Marotta I.; Palomba V.; Ohlson Timoudas T.; Wang Q.

2026

Abstract

The integration of Phase Change Material (PCM) storage with Heat Pump (HP) systems offers significant potential for demand-side flexibility, but it is difficult to control because of the complex thermodynamics of phase change. To overcome the computational burden of online optimization and the training instability of model-free reinforcement learning, this study proposes a framework based on Model Predictive Control (MPC)-guided Imitation Learning (IL). A high-fidelity Functional Mock-Up Unit (FMU) simulates the PCM-HP system, and an MPC expert agent generates optimal control trajectories on it. Two IL agents, Behavior Cloning (BC) and Generative Adversarial Imitation Learning (GAIL), are trained to mimic this expert under dynamic pricing signals. Both IL agents learn the load-shifting behavior, but GAIL generalizes better than BC: BC shows limited robustness in states it did not observe during training, whereas GAIL captures the underlying policy distribution and achieves a Mean Absolute Percentage Error (MAPE) of approximately 9% during testing. The framework bridges model-based and model-free paradigms, offering a scalable, real-time control alternative that retains optimality without requiring complex physical modeling during deployment.
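The behavior-cloning step and the MAPE metric mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_expert` is a hypothetical linear stand-in for the MPC expert, the two-dimensional state (normalized price and storage state of charge) is invented, and the cloned policy is a plain least-squares fit. It does not reproduce the FMU simulation, the MPC formulation, or the GAIL agent described in the paper.

```python
import numpy as np


def mape(y_true, y_pred, eps=1e-8):
    """Mean Absolute Percentage Error in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # eps guards against division by zero when a true value is 0
    denom = np.maximum(np.abs(y_true), eps)
    return 100.0 * np.mean(np.abs(y_true - y_pred) / denom)


def toy_expert(states):
    """Hypothetical stand-in for the MPC expert (NOT the paper's controller).

    states[:, 0] is a normalized price, states[:, 1] a normalized storage
    state of charge. The heat pump setpoint drops when the price is high
    and when the storage is already charged.
    """
    price, soc = states[:, 0], states[:, 1]
    return 1.0 - 0.5 * price - 0.3 * soc


def add_bias(states):
    """Append a constant column so the linear policy has an intercept."""
    return np.hstack([states, np.ones((len(states), 1))])


def fit_bc(states, actions):
    """Behavior cloning as supervised regression on (state, action) pairs."""
    weights, *_ = np.linalg.lstsq(add_bias(states), actions, rcond=None)
    return weights


def bc_policy(weights, states):
    """Evaluate the cloned linear policy on a batch of states."""
    return add_bias(states) @ weights


rng = np.random.default_rng(0)
train_states = rng.uniform(0.0, 1.0, size=(200, 2))  # expert demonstrations
test_states = rng.uniform(0.0, 1.0, size=(50, 2))    # held-out states

weights = fit_bc(train_states, toy_expert(train_states))
test_mape = mape(toy_expert(test_states), bc_policy(weights, test_states))
```

Because the toy expert is itself linear, the cloned policy matches it almost exactly here. With a nonlinear expert, or with test states outside the demonstrated range, the error of a policy fitted this way would generally grow, which is the robustness limitation the abstract attributes to BC.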

Year: 2026

Organizational units: Istituto di Tecnologie Avanzate per l'Energia - ITAE

Keywords: Demand-side management; Energy flexibility; Imitation Learning; Model predictive control; PCM storage

Appears in types: 01.01 Journal article

Files in this item:

File	Size	Format
mpc pcm ate.pdf (open access; type: publisher's version (PDF); license: Creative Commons)	4.68 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/582543
