Efficient multi-task learning with instance selection for biomedical NLP
Bonfigli A.; Bacco L.; Pecchia L.; Merone M.; Dell'Orletta F.
2025
Abstract
Background: Biomedical natural language processing (NLP) increasingly relies on large language models and extensive datasets, presenting significant computational challenges. Methods: We propose Blue5, a multi-task model based on SciFive that incorporates instance selection (IS) to enable efficient multi-task learning (MTL) on biomedical data. We adapt the E2SC-IS framework for the biomedical domain, integrating a calibrated SVM classifier to reduce computational costs. Results: Our approach achieves an average data reduction of 26.6% across several tasks of the BLUE (Biomedical Language Understanding Evaluation) benchmark, while maintaining performance comparable to state-of-the-art models. The multi-task SVM configuration emerges as the most effective, demonstrating the power of combining IS with MTL for biomedical NLP. As a result of the unified framework, Blue5 effectively selects the most informative instances across tasks, ensuring model generalization while efficiently handling multiple NLP tasks. Conclusion: Our work offers a practical solution to address growing computational demands, enabling more scalable and accessible applications of advanced NLP techniques in biomedical research and healthcare.

| DC field | Value | Language |
|---|---|---|
| dc.authority.ancejournal | COMPUTERS IN BIOLOGY AND MEDICINE | en |
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Bonfigli A. | en |
| dc.authority.people | Bacco L. | en |
| dc.authority.people | Pecchia L. | en |
| dc.authority.people | Merone M. | en |
| dc.authority.people | Dell'Orletta F. | en |
| dc.collection.id.s | b3f88f24-048a-4e43-8ab1-6697b90e068e | * |
| dc.collection.name | 01.01 Articolo in rivista | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.contributor.area | Non assegn | * |
| dc.date.accessioned | 2026/03/03 14:57:20 | - |
| dc.date.available | 2026/03/03 14:57:20 | - |
| dc.date.firstsubmission | 2026/03/02 18:50:58 | * |
| dc.date.issued | 2025 | - |
| dc.date.submission | 2026/03/02 18:50:58 | * |
| dc.description.allpeople | Bonfigli, A.; Bacco, L.; Pecchia, L.; Merone, M.; Dell'Orletta, F. | - |
| dc.description.allpeopleoriginal | Bonfigli A.; Bacco L.; Pecchia L.; Merone M.; Dell'Orletta F. | en |
| dc.description.fulltext | open | en |
| dc.description.international | no | en |
| dc.description.numberofauthors | 5 | - |
| dc.identifier.doi | 10.1016/j.compbiomed.2025.110050 | en |
| dc.identifier.scopus | 2-s2.0-105001252768 | en |
| dc.identifier.source | scopus | * |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/570501 | - |
| dc.language.iso | eng | en |
| dc.relation.volume | 190 | en |
| dc.subject.keywords | Biomedical NLP | - |
| dc.subject.keywords | BLUE benchmark | - |
| dc.subject.keywords | Computational efficiency | - |
| dc.subject.keywords | Instance selection | - |
| dc.subject.keywords | Multi-task learning | - |
| dc.subject.singlekeyword | Biomedical NLP | * |
| dc.subject.singlekeyword | BLUE benchmark | * |
| dc.subject.singlekeyword | Computational efficiency | * |
| dc.subject.singlekeyword | Instance selection | * |
| dc.subject.singlekeyword | Multi-task learning | * |
| dc.title | Efficient multi-task learning with instance selection for biomedical NLP | en |
| dc.type.driver | info:eu-repo/semantics/article | - |
| dc.type.full | 01 Contributo su Rivista::01.01 Articolo in rivista | it |
| dc.type.miur | 262 | - |
| iris.mediafilter.data | 2026/03/04 02:52:12 | * |
| iris.orcid.lastModifiedDate | 2026/03/03 14:57:20 | * |
| iris.orcid.lastModifiedMillisecond | 1772546240668 | * |
| iris.scopus.extIssued | 2025 | - |
| iris.scopus.extTitle | Efficient multi-task learning with instance selection for biomedical NLP | - |
| iris.sitodocente.maxattempts | 1 | - |
| iris.unpaywall.doi | 10.1016/j.compbiomed.2025.110050 | * |
| iris.unpaywall.isoa | false | * |
| iris.unpaywall.journalisindoaj | false | * |
| iris.unpaywall.metadataCallLastModified | 04/03/2026 04:34:02 | - |
| iris.unpaywall.metadataCallLastModifiedMillisecond | 1772595242548 | - |
| iris.unpaywall.oastatus | closed | * |
| scopus.authority.ancejournal | COMPUTERS IN BIOLOGY AND MEDICINE###0010-4825 | * |
| scopus.category | 2718 | * |
| scopus.category | 1706 | * |
| scopus.contributor.affiliation | Università Campus Bio-Medico di Roma | - |
| scopus.contributor.affiliation | Università Campus Bio-Medico di Roma | - |
| scopus.contributor.affiliation | Fondazione Policlinico Universitario Campus Bio-Medico di Roma | - |
| scopus.contributor.affiliation | Università Campus Bio-Medico di Roma | - |
| scopus.contributor.affiliation | National Research Council | - |
| scopus.contributor.afid | 60005308 | - |
| scopus.contributor.afid | 60005308 | - |
| scopus.contributor.afid | 60276021 | - |
| scopus.contributor.afid | 60005308 | - |
| scopus.contributor.afid | 60021199 | - |
| scopus.contributor.auid | 58973576400 | - |
| scopus.contributor.auid | 57220927387 | - |
| scopus.contributor.auid | 35746897300 | - |
| scopus.contributor.auid | 56102657200 | - |
| scopus.contributor.auid | 57540567000 | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.dptid | 116307659 | - |
| scopus.contributor.dptid | 116307659 | - |
| scopus.contributor.dptid | - | |
| scopus.contributor.dptid | 116307659 | - |
| scopus.contributor.dptid | 121833164 | - |
| scopus.contributor.name | Agnese | - |
| scopus.contributor.name | Luca | - |
| scopus.contributor.name | Leandro | - |
| scopus.contributor.name | Mario | - |
| scopus.contributor.name | Felice | - |
| scopus.contributor.subaffiliation | Research Unit of Intelligent Technology for Health and Wellbeing;Department of Engineering; | - |
| scopus.contributor.subaffiliation | Research Unit of Computer Systems and Bioinformatics;Department of Engineering; | - |
| scopus.contributor.subaffiliation | - | |
| scopus.contributor.subaffiliation | Research Unit of Intelligent Technology for Health and Wellbeing;Department of Engineering; | - |
| scopus.contributor.subaffiliation | ItaliaNLP Lab;Institute of Computational Linguistics "Antonio Zampolli"; | - |
| scopus.contributor.surname | Bonfigli | - |
| scopus.contributor.surname | Bacco | - |
| scopus.contributor.surname | Pecchia | - |
| scopus.contributor.surname | Merone | - |
| scopus.contributor.surname | Dell'Orletta | - |
| scopus.date.issued | 2025 | * |
| scopus.description.allpeopleoriginal | Bonfigli A.; Bacco L.; Pecchia L.; Merone M.; Dell'Orletta F. | * |
| scopus.differences | scopus.subject.keywords | * |
| scopus.document.type | ar | * |
| scopus.document.types | ar | * |
| scopus.identifier.doi | 10.1016/j.compbiomed.2025.110050 | * |
| scopus.identifier.eissn | 1879-0534 | * |
| scopus.identifier.pmid | 40168806 | * |
| scopus.identifier.pui | 2038116232 | * |
| scopus.identifier.scopus | 2-s2.0-105001252768 | * |
| scopus.journal.sourceid | 17957 | * |
| scopus.language.iso | eng | * |
| scopus.publisher.name | Elsevier Ltd | * |
| scopus.relation.article | 110050 | * |
| scopus.relation.volume | 190 | * |
| scopus.subject.keywords | Biomedical NLP; BLUE benchmark; Computational efficiency; Instance selection; Multi-task learning; | * |
| scopus.title | Efficient multi-task learning with instance selection for biomedical NLP | * |
| scopus.titleeng | Efficient multi-task learning with instance selection for biomedical NLP | * |

Appears in collections: 01.01 Articolo in rivista

| File | Size | Format |
|---|---|---|
| 1-s2.0-S0010482525004019-main.pdf (open access; license: Creative Commons) | 1.94 MB | Adobe PDF |

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


