Enhancing Early Detection of Alzheimer’s Disease via Vision Transformer Machine Learning Architecture Using MRI Images
Marco Leo; Pierluigi Carcagnì; Marco Del-Coco
2026
Abstract
Computer-aided diagnosis (CAD) systems based on deep learning have shown significant potential for Alzheimer’s disease (AD) stage classification from Magnetic Resonance Imaging (MRI). Nevertheless, challenges such as class imbalance, small sample sizes, and the presence of multiple slices per subject may lead to biased evaluation and statistically unreliable performance, particularly for minority classes. In this study, a Vision Transformer (ViT)-based framework is proposed for multi-class AD classification using a Kaggle dataset containing 6400 MRI slices across four cognitive stages. A subject-wise data-splitting strategy is employed to prevent information leakage between the training and testing sets, and the statistical unreliability of near-perfect scores in underrepresented classes is critically examined. An ablation study is conducted to assess the contribution of key architectural components, demonstrating the effectiveness of self-attention and patch embedding in capturing discriminative features. Furthermore, attention-based visualization maps are incorporated to highlight brain regions influencing the model’s decisions and to illustrate subtle anatomical differences between MildDemented and VeryMildDemented cases. The proposed approach achieves a test accuracy of 97.98%, outperforming existing methods on the same dataset while providing improved interpretability, thereby supporting early and accurate AD stage identification.
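The subject-wise data-splitting strategy described in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the authors' actual code: it assumes each slice record carries a `subject_id` field (a hypothetical name), and it partitions at the subject level so that no subject contributes slices to both the training and testing sets.

```python
# Hedged sketch of subject-wise splitting to prevent slice-level leakage.
# Assumes each MRI slice record is a dict with a "subject_id" key
# (field name is an assumption for illustration).
from collections import defaultdict
import random


def subject_wise_split(slices, test_fraction=0.2, seed=0):
    """Split slice records so no subject appears in both train and test."""
    # Group all slices by their owning subject.
    by_subject = defaultdict(list)
    for s in slices:
        by_subject[s["subject_id"]].append(s)

    # Shuffle subjects (not slices) and reserve a fraction for testing.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_fraction))
    test_subjects = subjects[:n_test]
    train_subjects = subjects[n_test:]

    # Expand back to slice level: every slice of a subject stays on one side.
    train = [s for subj in train_subjects for s in by_subject[subj]]
    test = [s for subj in test_subjects for s in by_subject[subj]]
    return train, test
```

In contrast, a naive random split over individual slices would likely place slices from the same subject on both sides of the split, inflating test accuracy through near-duplicate images.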


