From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain

Bonfigli A.; Bacco L.; Merone M.; Dell'Orletta F.
2024

Abstract

In this study, we delve into the adaptation and effectiveness of Transformer-based, pre-trained Large Language Models (LLMs) within the biomedical domain, a field that poses unique challenges due to its complexity and the specialized nature of its data. Building on the foundation laid by the transformative architecture of Transformers, we investigate the nuanced dynamics of LLMs through a multifaceted lens, focusing on two domain-specific tasks, i.e., Natural Language Inference (NLI) and Named Entity Recognition (NER). Our objective is to bridge the knowledge gap regarding how these models’ downstream performances correlate with their capacity to encapsulate task-relevant information. To achieve this goal, we probed and analyzed the inner encoding and attention mechanisms in LLMs, both encoder- and decoder-based, tailored for either general or biomedical-specific applications. This examination occurs before and after the models are fine-tuned across various data volumes. Our findings reveal that the models’ downstream effectiveness is intricately linked to specific patterns within their internal mechanisms, shedding light on the nuanced ways in which LLMs process and apply knowledge in the biomedical context. The source code for this paper is available at https://github.com/agnesebonfigli99/LLMs-in-the-Biomedical-Domain.
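To make the probing methodology concrete, the sketch below trains a layer-wise linear probe on a pre-trained encoder's hidden states, in the spirit of the analysis described above. It assumes the HuggingFace transformers and scikit-learn libraries; the model name, mean-pooling strategy, and toy premise/hypothesis examples are illustrative placeholders, not the paper's exact configuration.

```python
# Minimal sketch of layer-wise linear probing over a pre-trained encoder.
# Assumptions: HuggingFace `transformers` + `scikit-learn`; the model choice,
# pooling strategy, and toy data are illustrative, not the paper's setup.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # placeholder; the study also covers biomedical and decoder models

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

def layerwise_embeddings(texts):
    """Mean-pooled sentence embeddings for every layer: shape (n_layers+1, n_texts, dim)."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    mask = batch["attention_mask"].unsqueeze(-1)  # exclude padding tokens from the pool
    layers = []
    for h in out.hidden_states:  # embedding layer + one tensor per Transformer layer
        pooled = (h * mask).sum(1) / mask.sum(1)
        layers.append(pooled.numpy())
    return np.stack(layers)

# Toy NLI-style premise/hypothesis pairs (hypothetical examples, not from a real dataset).
texts = ["The patient has a fever. The patient is febrile.",
         "The patient has a fever. The patient is afebrile."]
labels = np.array([1, 0])  # 1 = entailment, 0 = contradiction

reps = layerwise_embeddings(texts)
for layer, X in enumerate(reps):
    probe = LogisticRegression(max_iter=1000).fit(X, labels)  # one linear probe per layer
    print(f"layer {layer}: train accuracy = {probe.score(X, labels):.2f}")
```

In this style of analysis, the layer at which probe accuracy peaks suggests where task-relevant information is most linearly accessible; repeating the probe before and after fine-tuning, as the study does, shows how that information redistributes across layers.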
DC Field | Value | Language
dc.authority.ancejournal ARTIFICIAL INTELLIGENCE IN MEDICINE en
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Bonfigli A. en
dc.authority.people Bacco L. en
dc.authority.people Merone M. en
dc.authority.people Dell'Orletta F. en
dc.collection.id.s b3f88f24-048a-4e43-8ab1-6697b90e068e *
dc.collection.name 01.01 Journal article *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.area Not assigned *
dc.date.accessioned 2024/12/16 17:11:25 -
dc.date.available 2024/12/16 17:11:25 -
dc.date.firstsubmission 2024/12/13 19:18:45 *
dc.date.issued 2024 -
dc.date.submission 2025/01/24 18:16:48 *
dc.description.allpeople Bonfigli, A.; Bacco, L.; Merone, M.; Dell'Orletta, F. -
dc.description.allpeopleoriginal Bonfigli A.; Bacco L.; Merone M.; Dell'Orletta F. en
dc.description.fulltext open en
dc.description.international no en
dc.description.numberofauthors 4 -
dc.identifier.doi 10.1016/j.artmed.2024.103003 en
dc.identifier.isi WOS:001348580100001 en
dc.identifier.scopus 2-s2.0-85207363735 en
dc.identifier.source scopus *
dc.identifier.uri https://hdl.handle.net/20.500.14243/518430 -
dc.language.iso eng en
dc.relation.volume 157 en
dc.subject.keywordseng Biomedical domain -
dc.subject.keywordseng Domain adaptation -
dc.subject.keywordseng Large Language Models -
dc.subject.singlekeyword Biomedical domain *
dc.subject.singlekeyword Domain adaptation *
dc.subject.singlekeyword Large Language Models *
dc.title From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain en
dc.type.driver info:eu-repo/semantics/article -
dc.type.full 01 Journal Contribution::01.01 Journal article it
dc.type.miur 262 -
iris.isi.extIssued 2024 -
iris.isi.extTitle From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain -
iris.mediafilter.data 2025/04/04 04:06:16 *
iris.orcid.lastModifiedDate 2025/02/05 10:55:37 *
iris.orcid.lastModifiedMillisecond 1738749337111 *
iris.scopus.extIssued 2024 -
iris.scopus.extTitle From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain -
iris.sitodocente.maxattempts 1 -
iris.unpaywall.bestoahost publisher *
iris.unpaywall.bestoaversion publishedVersion *
iris.unpaywall.doi 10.1016/j.artmed.2024.103003 *
iris.unpaywall.hosttype publisher *
iris.unpaywall.isoa true *
iris.unpaywall.journalisindoaj false *
iris.unpaywall.landingpage https://doi.org/10.1016/j.artmed.2024.103003 *
iris.unpaywall.license cc-by *
iris.unpaywall.metadataCallLastModified 28/04/2026 04:36:48 -
iris.unpaywall.metadataCallLastModifiedMillisecond 1777343808767 -
iris.unpaywall.oastatus hybrid *
isi.authority.ancejournal ARTIFICIAL INTELLIGENCE IN MEDICINE (ISSN 0933-3657) *
isi.category PT *
isi.category EP *
isi.category IG *
isi.contributor.affiliation University Campus Bio-Medico - Rome Italy -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation University Campus Bio-Medico - Rome Italy -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.name Agnese -
isi.contributor.name Luca -
isi.contributor.name Mario -
isi.contributor.name Felice -
isi.contributor.researcherId LRV-8816-2024 -
isi.contributor.researcherId AHA-7493-2022 -
isi.contributor.researcherId AAA-8945-2019 -
isi.contributor.researcherId AAX-1864-2020 -
isi.contributor.subaffiliation Dept Engn -
isi.contributor.subaffiliation Inst Computat Linguist Antonio Zampolli -
isi.contributor.subaffiliation Dept Engn -
isi.contributor.subaffiliation Inst Computat Linguist Antonio Zampolli -
isi.contributor.surname Bonfigli -
isi.contributor.surname Bacco -
isi.contributor.surname Merone -
isi.contributor.surname Dell'Orletta -
isi.date.issued 2024 *
isi.description.allpeopleoriginal Bonfigli, A; Bacco, L; Merone, M; Dell'Orletta, F; *
isi.document.sourcetype WOS.SCI *
isi.document.type Article *
isi.document.types Article *
isi.identifier.doi 10.1016/j.artmed.2024.103003 *
isi.identifier.eissn 1873-2860 *
isi.identifier.isi WOS:001348580100001 *
isi.journal.journaltitle ARTIFICIAL INTELLIGENCE IN MEDICINE *
isi.journal.journaltitleabbrev ARTIF INTELL MED *
isi.language.original English *
isi.publisher.place RADARWEG 29, 1043 NX AMSTERDAM, NETHERLANDS *
isi.relation.volume 157 *
isi.title From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain *
scopus.authority.ancejournal ARTIFICIAL INTELLIGENCE IN MEDICINE (ISSN 0933-3657) *
scopus.category 2701 *
scopus.category 1702 *
scopus.contributor.affiliation National Research Council -
scopus.contributor.affiliation Università Campus Bio-Medico di Roma -
scopus.contributor.affiliation Università Campus Bio-Medico di Roma -
scopus.contributor.affiliation National Research Council -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60005308 -
scopus.contributor.afid 60005308 -
scopus.contributor.afid 60021199 -
scopus.contributor.auid 58973576400 -
scopus.contributor.auid 57220927387 -
scopus.contributor.auid 56102657200 -
scopus.contributor.auid 57540567000 -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.dptid 121833164 -
scopus.contributor.dptid 116307659 -
scopus.contributor.dptid 116307659 -
scopus.contributor.dptid 121833164 -
scopus.contributor.name Agnese -
scopus.contributor.name Luca -
scopus.contributor.name Mario -
scopus.contributor.name Felice -
scopus.contributor.subaffiliation ItaliaNLP Lab; Institute of Computational Linguistics “Antonio Zampolli” -
scopus.contributor.subaffiliation Research Unit of Computer Systems and Bioinformatics; Department of Engineering -
scopus.contributor.subaffiliation Research Unit of Intelligent Technology for Health and Wellbeing; Department of Engineering -
scopus.contributor.subaffiliation ItaliaNLP Lab; Institute of Computational Linguistics “Antonio Zampolli” -
scopus.contributor.surname Bonfigli -
scopus.contributor.surname Bacco -
scopus.contributor.surname Merone -
scopus.contributor.surname Dell'Orletta -
scopus.date.issued 2024 *
scopus.description.allpeopleoriginal Bonfigli A.; Bacco L.; Merone M.; Dell'Orletta F. *
scopus.differences scopus.subject.keywords *
scopus.document.type ar *
scopus.document.types ar *
scopus.identifier.doi 10.1016/j.artmed.2024.103003 *
scopus.identifier.eissn 1873-2860 *
scopus.identifier.pmid 39471773 *
scopus.identifier.pui 2035290733 *
scopus.identifier.scopus 2-s2.0-85207363735 *
scopus.journal.sourceid 24140 *
scopus.language.iso eng *
scopus.publisher.name Elsevier B.V. *
scopus.relation.article 103003 *
scopus.relation.volume 157 *
scopus.subject.keywords BERT; Biomedical domain; Domain adaptation; GPT; Large Language Models; Probing tasks; *
scopus.title From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain *
scopus.titleeng From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain *
Appears in types: 01.01 Journal article
Files in this item:
1-s2.0-S0933365724002458-main-1.pdf

Open access

Type: Publisher's version (PDF)
License: Creative Commons (CC BY)
Size: 5.99 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14243/518430
Citations
  • PMC: not available
  • Scopus: 14
  • Web of Science: 9