Policy optimization for berth allocation problems
C. Cervellera; M. Gaggero; D. Maccio
2021
Abstract
This paper investigates the policy optimization paradigm, in which a learning model is trained to find solutions of complex Markov decision problems, as a tool to address the berth allocation problem in multimodal terminals. To this end, we drop the typical formulation of the latter as a static mixed-integer scheduling problem and model it instead as an evolving scenario in which berths are assigned to ships according to a parameterized policy function that drives the temporal evolution of the environment. To optimize the policy parameters, we adopt a cross-entropy optimization scheme, a simple and highly parallelizable gradient-free technique. Compared with the static mixed-integer formulation, the proposed approach relies on a much lighter optimization problem in the continuous space of the policy parameters, thus making it feasible to replan in real time when needed. Furthermore, the generality of the policy optimization approach makes it straightforward to take into account any performance metric and any specific feature of the scenario, without the need to devise ad hoc heuristics. Simulation tests showcase the good performance of the policy approach under various conditions.
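To illustrate the kind of gradient-free scheme the abstract refers to, the sketch below implements a generic cross-entropy method over a continuous parameter vector. It is a minimal example under assumed choices (a diagonal-Gaussian sampling distribution, a cost-minimization convention, and a user-supplied `evaluate` function standing in for a terminal simulation); it is not the authors' exact formulation or parameterization of the berthing policy.

```python
import numpy as np

def cross_entropy_optimize(evaluate, dim, n_iter=50, pop_size=200,
                           elite_frac=0.1, seed=0):
    """Generic cross-entropy method (CEM) over continuous parameters.

    `evaluate(theta)` must return a scalar cost to minimize, e.g. the
    total waiting/service time obtained by simulating the terminal
    under the policy parameterized by `theta`.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)          # sampling distribution: diagonal Gaussian
    std = np.ones(dim)
    n_elite = max(1, int(elite_frac * pop_size))

    for _ in range(n_iter):
        # Sample candidate parameter vectors. This step is embarrassingly
        # parallel: each candidate can be scored by an independent
        # simulation run, which is why CEM parallelizes well.
        thetas = mean + std * rng.standard_normal((pop_size, dim))
        costs = np.array([evaluate(theta) for theta in thetas])

        # Keep the elite fraction with the lowest cost and refit the
        # Gaussian to those elites.
        elite = thetas[np.argsort(costs)[:n_elite]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6

    return mean
```

A hypothetical call might look like `theta_star = cross_entropy_optimize(lambda th: simulate_terminal(policy_params=th), dim=20)`, where `simulate_terminal` is an assumed simulator that rolls out berth assignments under the parameterized policy and returns the resulting performance metric; any metric the simulator can compute can be plugged in without changing the optimizer.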