Dynamic Treatment Regimes (DTRs) are adaptive treatment strategies that allow clinicians to personalize dynamically the treatment for each patient based on their step-by-step response to their treatment. There are a series of predefined alternative treatments for each disease and any patient may associate with one of these treatments according to his/her demographics. DTRs for a certain disease are studied and evaluated by means of statistical approaches where patients are randomized at each step of the treatment and their responses are observed. Recently, the Reinforcement Learning (RL) paradigm has also been applied to determine DTRs. However, such approaches may be limited by the need to design a true reward function, which may be difficult to formalize when the expert knowledge is not well assessed, as when the DTR is in the design phase. To address this limitation, an extension of the RL paradigm, namely Inverse Reinforcement Learning (IRL), has been adopted to learn the reward function from data, such as those derived from DTR trials. In this paper, we define a Projection Based Inverse Reinforcement Learning (PB-IRL) approach to learn the true underlying reward function for given demonstrations (DTR trials). Such a reward function can be used both to evaluate the set of DTRs determined for a certain disease, as well as to enable an RL-based intelligent agent to self-learn the best way and then act as a decision support system for the clinician.

Projection based inverse reinforcement learning for the analysis of dynamic treatment regimes

De Pietro G.;Paragliola G.;Coronato A.
2023

Abstract

Dynamic Treatment Regimes (DTRs) are adaptive treatment strategies that allow clinicians to personalize dynamically the treatment for each patient based on their step-by-step response to their treatment. There are a series of predefined alternative treatments for each disease and any patient may associate with one of these treatments according to his/her demographics. DTRs for a certain disease are studied and evaluated by means of statistical approaches where patients are randomized at each step of the treatment and their responses are observed. Recently, the Reinforcement Learning (RL) paradigm has also been applied to determine DTRs. However, such approaches may be limited by the need to design a true reward function, which may be difficult to formalize when the expert knowledge is not well assessed, as when the DTR is in the design phase. To address this limitation, an extension of the RL paradigm, namely Inverse Reinforcement Learning (IRL), has been adopted to learn the reward function from data, such as those derived from DTR trials. In this paper, we define a Projection Based Inverse Reinforcement Learning (PB-IRL) approach to learn the true underlying reward function for given demonstrations (DTR trials). Such a reward function can be used both to evaluate the set of DTRs determined for a certain disease, as well as to enable an RL-based intelligent agent to self-learn the best way and then act as a decision support system for the clinician.
2023
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR - Sede Secondaria Napoli
Decision Support System (DSS)
Dynamic Treatment Regime (DTR)
Inverse Reinforcement Learning (IRL)
Reinforcement Learning (RL)
File in questo prodotto:
File Dimensione Formato  
Projection based inverse reinforcement learning for the analysis of dynamic treatment regimes.pdf

non disponibili

Tipologia: Versione Editoriale (PDF)
Licenza: Altro tipo di licenza
Dimensione 1.59 MB
Formato Adobe PDF
1.59 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/517922
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? ND
social impact