Robust computer vision for POCUS: achieving reliability in ultrasound imaging

Ignesti, Giacomo

The rapid proliferation of Deep Learning (DL) has fundamentally transformed the scientific landscape, shifting the paradigm of problem-solving from explicit, algorithmic logic to data-driven feature learning. However, as these technologies permeate high-stakes domains such as medicine and precision agriculture, a critical tension has emerged. The very non-linearity that grants Neural Networks their immense expressive power, as described by the Universal Approximation Theorem, also brings intrinsic opacity, gradient instability, and a susceptibility to spurious correlations. Consequently, the deployment of these black-box models faces significant hurdles in terms of reliability and trust, necessitating the development of Artificial Intelligence (AI) that is not merely accurate but also fundamentally reliable and interpretable.

The Clinical Context: Telemedicine and the TiAssisto Project
This thesis grounds its research in the critical context of the SARS-CoV-2 pandemic, which exposed systemic fragilities in healthcare and highlighted the need for resilient, decentralised care models. A significant portion of the applied research was conducted within the TiAssisto project, a server-based web platform designed to support multi-pathological patients by bridging the gap between hospitalisation and home care. To operationalise the vision of Medicine 5.0, a human-centric approach to healthcare, TiAssisto integrates a hybrid AI framework. This comprises a Strong AI component that uses rule-based algorithms and logic to monitor vital signs, and a Weak AI component that employs a DL model for the automatic classification of medical images. The platform prioritises patient security through strict local encryption and anonymisation protocols, anticipating the growing necessity for Reliable AI architectures.

The Challenge of Point-of-Care Ultrasound (POCUS)
Within this framework, the thesis identifies Lung Ultrasound (LUS) as a pivotal technology for telemedicine, given its portability and safety. However, LUS presents unique challenges for Computer Vision: unlike MRI or CT scans, LUS images are characterised by a low signal-to-noise ratio and are artefact-rich. Diagnosis often relies on interpreting acoustic artefacts, such as "A-lines" (horizontal repetitions indicating a healthy lung) or "B-lines" (vertical comet-tail artefacts indicating pathology), rather than on direct anatomical visualisation. To address the computational constraints of Point-of-Care Ultrasound (POCUS) devices and the scarcity of labelled data, the research proposes a novel Efficient Adaptive Ensembling methodology. This approach uses the EfficientNet-B0 DL architecture as a backbone because of its balance between accuracy and complexity. By training weak learners on data subsets to induce diversity and subsequently combining their deep features via a trainable linear combination layer, the model achieves robustness without the computational cost of traditional ensembles. When validated on a large public LUS dataset, this ensemble model achieved 100% accuracy in distinguishing between COVID-19, Pneumonia, and Healthy classes, significantly outperforming previous state-of-the-art models. Crucially, it maintained high efficiency, requiring only 0.78 GFLOPs, making it suitable for real-time deployment on edge devices.

Foundations of Reliable AI: From Diagnosis to Constraint
While achieving high accuracy is essential, the thesis argues that clinical adoption hinges on trust. Initial efforts used post-hoc Explainable AI (XAI) methods, specifically Grad-CAM, to visually validate that the LUS model focused on clinically relevant areas, such as the pleural line. However, the research identifies a critical limitation in post-hoc analysis: it is purely diagnostic and cannot correct a model's flawed reasoning.
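The ensembling idea described above (weak learners fitted on different data subsets, with their features merged by a trainable linear combination layer) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the weak learners are toy centroid projections standing in for EfficientNet-B0 backbones, the data are synthetic, the weak learners are kept frozen while the combiner is trained, and names such as `fit_weak_learner` and `train_combiner` are hypothetical. It is not the thesis's implementation.

```python
import numpy as np

def fit_weak_learner(x_sub, y_sub, n_classes):
    """Toy weak learner (stand-in for a CNN backbone): class centroids from its own subset."""
    centroids = np.stack([x_sub[y_sub == c].mean(axis=0) for c in range(n_classes)])
    return lambda x: x @ centroids.T          # feature vector of size n_classes

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)      # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(feats, labels, W, b):
    p = softmax(feats @ W + b)
    return float(-np.log(p[np.arange(len(labels)), labels] + 1e-12).mean())

def train_combiner(feats, labels, n_classes, lr=0.1, epochs=300):
    """Gradient descent on the linear combination layer only (weak learners stay frozen)."""
    n, d = feats.shape
    W, b = np.zeros((d, n_classes)), np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        grad = (softmax(feats @ W + b) - onehot) / n   # d(loss)/d(logits)
        W -= lr * feats.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

# Synthetic three-class data (assumption: stands in for image inputs).
rng = np.random.default_rng(0)
n_classes, in_dim, n = 3, 16, 300
labels = np.arange(n) % n_classes
centres = 3.0 * rng.normal(size=(n_classes, in_dim))
x = centres[labels] + rng.normal(size=(n, in_dim))

# Each weak learner sees a different half of the data to induce diversity.
subsets = [np.arange(0, n // 2), np.arange(n // 2, n)]
learners = [fit_weak_learner(x[idx], labels[idx], n_classes) for idx in subsets]

# Concatenate the learners' features, standardise them, and train the combiner.
feats = np.concatenate([f(x) for f in learners], axis=1)
feats = (feats - feats.mean(axis=0)) / feats.std(axis=0)

loss_before = cross_entropy(feats, labels, np.zeros((feats.shape[1], n_classes)), np.zeros(n_classes))
W, b = train_combiner(feats, labels, n_classes)
loss_after = cross_entropy(feats, labels, W, b)
print(f"cross-entropy: {loss_before:.3f} -> {loss_after:.3f}")
```

In this sketch only the small matrix `W` and bias `b` are learned at the ensemble stage, which is one way to read the claim that the combination adds little cost on top of the individual learners. The loss is measured on the training data for brevity; a real evaluation would use a held-out split.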
To transcend this limitation, the thesis introduces a paradigm shift towards Intrinsically Guided Training. The research proposes moving the generation of Class Activation Maps (CAMs) into the training loop itself. By incorporating a custom Contrastive Loss term, the training process actively penalises the model if its internal attention does not align with the relevant input features. This concept evolved into the Batch-CAM paradigm, which replaces instance-specific reconstruction with a Class Prototype approach: the model compares batch-averaged CAMs against a learned prototype of the target class. This enforces semantic consistency, compelling the model to learn a generalised representation of a class (e.g., the concept of a "T-shirt" or a specific digit) rather than memorising individual examples. Experimental validation on MNIST and Fashion-MNIST demonstrated that this approach yields models with competitive accuracy and significantly superior, human-intelligible internal representations.

Conclusion and Future Outlook
The thesis concludes that the path to Reliable AI lies in shifting from passive analysis to active constraint during the learning process. By embedding interpretability into the loss function, models can be guided to make correct predictions for the right reasons, mitigating the risk of spurious correlations. A first application of the method to LUS interpretation yields high accuracy, and the partially guided models appear to learn meaningful anatomical details to inform their decisions. Looking forward, the research outlines a trajectory towards Continual Learning, to prevent catastrophic forgetting as medical protocols evolve, and Federated Learning, to facilitate privacy-preserving training on decentralised data.
Ultimately, integrating these Plausible AI models into hybrid Decision Support Systems offers a blueprint for future healthcare systems that act as efficient, transparent, and trustworthy partners in human decision-making.
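As an illustration of the Batch-CAM idea summarised in this abstract, the sketch below computes standard Class Activation Maps (a class-weighted sum of the last convolutional feature maps, using the weights of the linear classifier that follows global average pooling) and a penalty that compares batch-averaged CAMs with per-class prototypes. The abstract does not give the exact form of the loss, so the mean-squared-error penalty, the min-max normalisation, the weighting factor `lam`, and all function names are assumptions; in actual training the prototypes would be learnable parameters and the computation would run in an automatic-differentiation framework rather than plain NumPy.

```python
import numpy as np

def class_activation_maps(feature_maps, fc_weights, labels):
    """CAM of each sample for its target class.

    feature_maps: (N, K, H, W) activations of the last convolutional layer
    fc_weights:   (C, K) weights of the linear classifier after global average pooling
    labels:       (N,) integer class targets
    returns:      (N, H, W)
    """
    w = fc_weights[labels]                                # (N, K): one weight row per sample
    return np.einsum("nk,nkhw->nhw", w, feature_maps)

def normalise(cams, eps=1e-8):
    """Min-max scale every map to [0, 1] so maps are comparable across samples."""
    lo = cams.min(axis=(-2, -1), keepdims=True)
    hi = cams.max(axis=(-2, -1), keepdims=True)
    return (cams - lo) / (hi - lo + eps)

def batch_cam_penalty(feature_maps, fc_weights, labels, prototypes):
    """Assumed penalty: MSE between each class's batch-averaged CAM and its prototype."""
    cams = normalise(class_activation_maps(feature_maps, fc_weights, labels))
    terms = []
    for c in np.unique(labels):                           # only classes present in the batch
        mean_cam = cams[labels == c].mean(axis=0)
        terms.append(np.mean((mean_cam - prototypes[c]) ** 2))
    return float(np.mean(terms))

def guided_loss(classification_loss, penalty, lam=0.5):
    """Total objective: task loss plus the weighted CAM penalty (lam is hypothetical)."""
    return classification_loss + lam * penalty

# Toy batch: 8 samples, 4 feature channels, 7x7 maps, 3 classes (all values synthetic).
rng = np.random.default_rng(0)
fmaps = rng.random((8, 4, 7, 7))
fc_w = rng.normal(size=(3, 4))
y = np.array([0, 0, 1, 1, 2, 2, 0, 1])
protos = np.zeros((3, 7, 7))                              # placeholder prototypes

penalty = batch_cam_penalty(fmaps, fc_w, y, protos)
print("CAM penalty:", penalty, "total:", guided_loss(1.0, penalty))
```

Averaging the CAMs over the batch before comparing them with the prototype is what distinguishes this from an instance-by-instance comparison: the penalty constrains what the class looks like on average rather than each individual map.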

Robust computer vision for POCUS: achieving reliability in ultrasound imaging / Ignesti, Giacomo. - ELETTRONICO. - (2026 May 04).