From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain

Bonfigli A.; Bacco L.; Merone M.; Dell'Orletta F.
2024

Abstract

In this study, we delve into the adaptation and effectiveness of Transformer-based, pre-trained Large Language Models (LLMs) within the biomedical domain, a field that poses unique challenges due to its complexity and the specialized nature of its data. Building on the foundation laid by the transformative architecture of Transformers, we investigate the nuanced dynamics of LLMs through a multifaceted lens, focusing on two domain-specific tasks, i.e., Natural Language Inference (NLI) and Named Entity Recognition (NER). Our objective is to bridge the knowledge gap regarding how these models’ downstream performances correlate with their capacity to encapsulate task-relevant information. To achieve this goal, we probed and analyzed the inner encoding and attention mechanisms in LLMs, both encoder- and decoder-based, tailored for either general or biomedical-specific applications. This examination occurs before and after the models are fine-tuned across various data volumes. Our findings reveal that the models’ downstream effectiveness is intricately linked to specific patterns within their internal mechanisms, shedding light on the nuanced ways in which LLMs process and apply knowledge in the biomedical context. The source code for this paper is available at https://github.com/agnesebonfigli99/LLMs-in-the-Biomedical-Domain.
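To make the probing methodology concrete, the sketch below trains a layer-wise linear probe on a pre-trained encoder's hidden states, in the spirit of the analysis described above. It assumes the HuggingFace transformers and scikit-learn libraries; the model name, mean-pooling strategy, and toy premise/hypothesis examples are illustrative placeholders, not the paper's exact configuration.

```python
# Minimal sketch of layer-wise linear probing over a pre-trained encoder.
# Assumptions: HuggingFace `transformers` + `scikit-learn`; the model choice,
# pooling strategy, and toy data are illustrative, not the paper's setup.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # placeholder; the study also covers biomedical and decoder models

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

def layerwise_embeddings(texts):
    """Mean-pooled sentence embeddings for every layer: shape (n_layers+1, n_texts, dim)."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    mask = batch["attention_mask"].unsqueeze(-1)  # exclude padding tokens from the pool
    layers = []
    for h in out.hidden_states:  # embedding layer + one tensor per Transformer layer
        pooled = (h * mask).sum(1) / mask.sum(1)
        layers.append(pooled.numpy())
    return np.stack(layers)

# Toy NLI-style premise/hypothesis pairs (hypothetical examples, not from a real dataset).
texts = ["The patient has a fever. The patient is febrile.",
         "The patient has a fever. The patient is afebrile."]
labels = np.array([1, 0])  # 1 = entailment, 0 = contradiction

reps = layerwise_embeddings(texts)
for layer, X in enumerate(reps):
    probe = LogisticRegression(max_iter=1000).fit(X, labels)  # one linear probe per layer
    print(f"layer {layer}: train accuracy = {probe.score(X, labels):.2f}")
```

In this style of analysis, the layer at which probe accuracy peaks suggests where task-relevant information is most linearly accessible; repeating the probe before and after fine-tuning, as the study does, shows how that information redistributes across layers.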
DC Field | Value | Language
dc.authority.ancejournal ARTIFICIAL INTELLIGENCE IN MEDICINE en
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Bonfigli A. en
dc.authority.people Bacco L. en
dc.authority.people Merone M. en
dc.authority.people Dell'Orletta F. en
dc.collection.id.s b3f88f24-048a-4e43-8ab1-6697b90e068e *
dc.collection.name 01.01 Journal article *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.area Not assigned *
dc.date.accessioned 2024/12/16 17:11:25 -
dc.date.available 2024/12/16 17:11:25 -
dc.date.firstsubmission 2024/12/13 19:18:45 *
dc.date.issued 2024 -
dc.date.submission 2025/01/24 18:16:48 *
dc.description.allpeople Bonfigli, A.; Bacco, L.; Merone, M.; Dell'Orletta, F. -
dc.description.allpeopleoriginal Bonfigli A.; Bacco L.; Merone M.; Dell'Orletta F. en
dc.description.fulltext open en
dc.description.international no en
dc.description.numberofauthors 4 -
dc.identifier.doi 10.1016/j.artmed.2024.103003 en
dc.identifier.isi WOS:001348580100001 en
dc.identifier.scopus 2-s2.0-85207363735 en
dc.identifier.source scopus *
dc.identifier.uri https://hdl.handle.net/20.500.14243/518430 -
dc.language.iso eng en
dc.relation.volume 157 en
dc.subject.keywordseng Biomedical domain -
dc.subject.keywordseng Domain adaptation -
dc.subject.keywordseng Large Language Models -
dc.subject.singlekeyword Biomedical domain *
dc.subject.singlekeyword Domain adaptation *
dc.subject.singlekeyword Large Language Models *
dc.title From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain en
dc.type.driver info:eu-repo/semantics/article -
dc.type.full 01 Journal Contribution::01.01 Journal article it
dc.type.miur 262 -
iris.isi.extIssued 2024 -
iris.isi.extTitle From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain -
iris.mediafilter.data 2025/04/04 04:06:16 *
iris.orcid.lastModifiedDate 2025/02/05 10:55:37 *
iris.orcid.lastModifiedMillisecond 1738749337111 *
iris.scopus.extIssued 2024 -
iris.scopus.extTitle From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain -
iris.sitodocente.maxattempts 1 -
iris.unpaywall.bestoahost publisher *
iris.unpaywall.bestoaversion publishedVersion *
iris.unpaywall.doi 10.1016/j.artmed.2024.103003 *
iris.unpaywall.hosttype publisher *
iris.unpaywall.isoa true *
iris.unpaywall.journalisindoaj false *
iris.unpaywall.landingpage https://doi.org/10.1016/j.artmed.2024.103003 *
iris.unpaywall.license cc-by *
iris.unpaywall.metadataCallLastModified 28/04/2026 04:36:48 -
iris.unpaywall.metadataCallLastModifiedMillisecond 1777343808767 -
iris.unpaywall.oastatus hybrid *
isi.authority.ancejournal ARTIFICIAL INTELLIGENCE IN MEDICINE (ISSN 0933-3657) *
isi.category PT *
isi.category EP *
isi.category IG *
isi.contributor.affiliation University Campus Bio-Medico - Rome Italy -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation University Campus Bio-Medico - Rome Italy -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.name Agnese -
isi.contributor.name Luca -
isi.contributor.name Mario -
isi.contributor.name Felice -
isi.contributor.researcherId LRV-8816-2024 -
isi.contributor.researcherId AHA-7493-2022 -
isi.contributor.researcherId AAA-8945-2019 -
isi.contributor.researcherId AAX-1864-2020 -
isi.contributor.subaffiliation Dept Engn -
isi.contributor.subaffiliation Inst Computat Linguist Antonio Zampolli -
isi.contributor.subaffiliation Dept Engn -
isi.contributor.subaffiliation Inst Computat Linguist Antonio Zampolli -
isi.contributor.surname Bonfigli -
isi.contributor.surname Bacco -
isi.contributor.surname Merone -
isi.contributor.surname Dell'Orletta -
isi.date.issued 2024 *
isi.description.allpeopleoriginal Bonfigli, A; Bacco, L; Merone, M; Dell'Orletta, F; *
isi.document.sourcetype WOS.SCI *
isi.document.type Article *
isi.document.types Article *
isi.identifier.doi 10.1016/j.artmed.2024.103003 *
isi.identifier.eissn 1873-2860 *
isi.identifier.isi WOS:001348580100001 *
isi.journal.journaltitle ARTIFICIAL INTELLIGENCE IN MEDICINE *
isi.journal.journaltitleabbrev ARTIF INTELL MED *
isi.language.original English *
isi.publisher.place RADARWEG 29, 1043 NX AMSTERDAM, NETHERLANDS *
isi.relation.volume 157 *
isi.title From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain *
scopus.authority.ancejournal ARTIFICIAL INTELLIGENCE IN MEDICINE (ISSN 0933-3657) *
scopus.category 2701 *
scopus.category 1702 *
scopus.contributor.affiliation National Research Council -
scopus.contributor.affiliation Università Campus Bio-Medico di Roma -
scopus.contributor.affiliation Università Campus Bio-Medico di Roma -
scopus.contributor.affiliation National Research Council -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60005308 -
scopus.contributor.afid 60005308 -
scopus.contributor.afid 60021199 -
scopus.contributor.auid 58973576400 -
scopus.contributor.auid 57220927387 -
scopus.contributor.auid 56102657200 -
scopus.contributor.auid 57540567000 -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.dptid 121833164 -
scopus.contributor.dptid 116307659 -
scopus.contributor.dptid 116307659 -
scopus.contributor.dptid 121833164 -
scopus.contributor.name Agnese -
scopus.contributor.name Luca -
scopus.contributor.name Mario -
scopus.contributor.name Felice -
scopus.contributor.subaffiliation ItaliaNLP Lab; Institute of Computational Linguistics “Antonio Zampolli” -
scopus.contributor.subaffiliation Research Unit of Computer Systems and Bioinformatics; Department of Engineering -
scopus.contributor.subaffiliation Research Unit of Intelligent Technology for Health and Wellbeing; Department of Engineering -
scopus.contributor.subaffiliation ItaliaNLP Lab; Institute of Computational Linguistics “Antonio Zampolli” -
scopus.contributor.surname Bonfigli -
scopus.contributor.surname Bacco -
scopus.contributor.surname Merone -
scopus.contributor.surname Dell'Orletta -
scopus.date.issued 2024 *
scopus.description.allpeopleoriginal Bonfigli A.; Bacco L.; Merone M.; Dell'Orletta F. *
scopus.differences scopus.subject.keywords *
scopus.document.type ar *
scopus.document.types ar *
scopus.identifier.doi 10.1016/j.artmed.2024.103003 *
scopus.identifier.eissn 1873-2860 *
scopus.identifier.pmid 39471773 *
scopus.identifier.pui 2035290733 *
scopus.identifier.scopus 2-s2.0-85207363735 *
scopus.journal.sourceid 24140 *
scopus.language.iso eng *
scopus.publisher.name Elsevier B.V. *
scopus.relation.article 103003 *
scopus.relation.volume 157 *
scopus.subject.keywords BERT; Biomedical domain; Domain adaptation; GPT; Large Language Models; Probing tasks; *
scopus.title From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain *
scopus.titleeng From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain *
Appears in types: 01.01 Journal article
Files in this item:
1-s2.0-S0933365724002458-main-1.pdf

Open access

Type: Publisher's version (PDF)
License: Creative Commons (CC BY)
Size: 5.99 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14243/518430
Citations
  • PMC: not available
  • Scopus: 14
  • Web of Science: 9