Persistent Homology vs. Learning Methods: A Comparative Study in Limited Data Scenarios
Ulderico Fugacci
2024
Abstract
This exploratory study compares persistent homology methods with traditional machine learning and deep learning techniques for label-efficient classification. We propose purely topological approaches, including persistence thresholding and Bottleneck distance classification, and explore hybrid methods that combine persistent homology with machine learning. These are evaluated against conventional machine learning algorithms and deep neural networks on two binary classification tasks: surface crack detection and malaria cell identification. We assess performance across varying numbers of labeled samples per class, ranging from 1 to 500. Our study highlights the efficacy of persistent homology-based methods in low-data scenarios. Using the Bottleneck distance approach, we achieve 95.95% accuracy in crack detection and 93.11% in malaria diagnosis with only one labeled sample per class. These results far exceed the best machine learning models, which achieve 69.40% and 39.75% accuracy, respectively. Deep learning models attain up to 95.96% in crack detection, essentially on par with the topological approach, but only 62.72% in malaria diagnosis. This demonstrates that topological methods can match or surpass learning-based methods in classification tasks with few labeled data. Hybrid approaches improve as the number of labeled samples increases, effectively leveraging topological features to boost classification accuracy. This study highlights the robustness of topological methods in extracting meaningful features from limited data, offering promising directions for efficient, label-conserving classification strategies. The results underscore the value of persistent homology, both as a standalone tool and in combination with machine learning, particularly in domains where the scarcity of labeled data challenges traditional deep learning approaches.
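To make the idea of Bottleneck distance classification with one labeled sample per class more concrete, the following is a minimal sketch, not the authors' implementation. It assumes persistence diagrams have already been computed from the images (the abstract does not say how) and are given as lists of finite `(birth, death)` pairs. The function names `bottleneck_distance` and `classify`, the labels, and the toy diagrams are all hypothetical. The distance is computed by a simple threshold search over a bipartite matching, which is only practical for small diagrams; in practice a dedicated library such as GUDHI or persim would likely be used instead.

```python
def bottleneck_distance(dgm_a, dgm_b):
    """Bottleneck distance between two small, finite persistence diagrams.

    Each diagram is a list of (birth, death) pairs. Points may be matched
    to each other (L-infinity cost) or to the diagonal (half persistence).
    """
    n, m = len(dgm_a), len(dgm_b)
    size = n + m  # each side is padded with diagonal slots for the other side
    if size == 0:
        return 0.0

    def cost(i, j):
        if i < n and j < m:  # point of A matched to point of B
            (b1, d1), (b2, d2) = dgm_a[i], dgm_b[j]
            return max(abs(b1 - b2), abs(d1 - d2))
        if i < n:  # point of A matched to the diagonal
            b, d = dgm_a[i]
            return (d - b) / 2
        if j < m:  # point of B matched to the diagonal
            b, d = dgm_b[j]
            return (d - b) / 2
        return 0.0  # diagonal slot matched to diagonal slot

    costs = [[cost(i, j) for j in range(size)] for i in range(size)]

    def has_perfect_matching(threshold):
        # Kuhn's augmenting-path algorithm on edges with cost <= threshold.
        match_right = [-1] * size

        def try_augment(i, seen):
            for j in range(size):
                if costs[i][j] <= threshold and not seen[j]:
                    seen[j] = True
                    if match_right[j] == -1 or try_augment(match_right[j], seen):
                        match_right[j] = i
                        return True
            return False

        return all(try_augment(i, [False] * size) for i in range(size))

    # The answer is one of the edge costs: binary search for the smallest
    # threshold that still admits a perfect matching.
    candidates = sorted({c for row in costs for c in row})
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if has_perfect_matching(candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]


def classify(query, prototypes):
    """Return the label whose single prototype diagram is closest to `query`."""
    return min(prototypes, key=lambda label: bottleneck_distance(query, prototypes[label]))


# Hypothetical toy diagrams: one labeled prototype per class.
prototypes = {
    "crack": [(0.0, 4.0)],
    "no_crack": [(0.0, 0.5)],
}
query = [(0.0, 3.5), (1.0, 1.25)]
predicted = classify(query, prototypes)
```

With these toy values, the query has one long-lived feature that resembles the hypothetical "crack" prototype, so it is assigned to that class. The short-lived point `(1.0, 1.25)` is matched to the diagonal and barely affects the distance, which illustrates why the Bottleneck distance is robust to small, noisy features.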


