Brain Tumor MRI Interpretation: Towards a Benchmark for Medical Visual Question Answering
Minutolo, Aniello; Esposito, Massimo
2025
Abstract
This work presents a new dataset created specifically for Visual Question Answering (VQA) on brain tumor MRI images. The dataset comprises 750 brain tumor MRI images at a resolution of 512 × 512 pixels, together with two kinds of expert-annotated natural-language question-answer pairs (What/Which and Yes/No) associated with three brain tumor categories (glioma, meningioma, and pituitary). To establish a benchmark for this dataset, we propose a dual-stream VQA framework that uses two transformer-based models to handle image feature extraction, question interpretation, and answer generation. We evaluate this baseline model on the dataset; the results reveal the task’s inherent complexity and highlight the difficulty of achieving precise medical VQA. They also underscore the dataset’s utility in advancing multimodal medical support systems and lay the groundwork for future progress in this domain.
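As a rough illustration of the dataset structure described above, one question-answer pair could be represented as in the following sketch. Only the three tumor categories, the two question types, and the image resolution come from the abstract; the class name `VQARecord`, its field names, the label strings `what_which` and `yes_no`, and the sample question are hypothetical and do not reflect the dataset's actual file format.

```python
from collections import Counter
from dataclasses import dataclass

# Values stated in the abstract.
TUMOR_CATEGORIES = ("glioma", "meningioma", "pituitary")
IMAGE_SIZE = (512, 512)  # pixels

# Hypothetical labels for the two question kinds (What/Which and Yes/No).
QUESTION_TYPES = ("what_which", "yes_no")


@dataclass(frozen=True)
class VQARecord:
    """One question-answer pair attached to an MRI image (assumed layout)."""
    image_id: str
    tumor_category: str
    question_type: str
    question: str
    answer: str

    def __post_init__(self):
        # Reject values outside the categories named in the abstract.
        if self.tumor_category not in TUMOR_CATEGORIES:
            raise ValueError(f"unknown tumor category: {self.tumor_category}")
        if self.question_type not in QUESTION_TYPES:
            raise ValueError(f"unknown question type: {self.question_type}")
        # Yes/No questions can only be answered with "yes" or "no".
        if self.question_type == "yes_no" and self.answer.lower() not in ("yes", "no"):
            raise ValueError("a yes_no question needs a 'yes' or 'no' answer")


def count_by_question_type(records):
    """Return how many records fall under each question type."""
    return dict(Counter(r.question_type for r in records))


# Illustrative record; the question text is invented for this example.
example = VQARecord(
    image_id="img_0001",
    tumor_category="glioma",
    question_type="yes_no",
    question="Is a tumor visible in this image?",
    answer="yes",
)
```

Validating records at construction time keeps malformed annotations out of any later training or evaluation step, whichever model is used.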

| File | Size | Format | Access |
|---|---|---|---|
| Brain Tumor MRI Interpretation.pdf (post-print document) | 753.94 kB | Adobe PDF | Authorized users only (not public; private/restricted access) |


