A bridge between persistent homology and group equivariant non-expansive operators: theory and applications / Conti, F.; Frosini, P.; Moroni, D.; Pascali, M. A. - Electronic. - (2024).
A bridge between persistent homology and group equivariant non-expansive operators: theory and applications
Conti F.; Moroni D. (internal supervisor); Pascali M. A. (internal co-supervisor)
2024
Abstract
Topological Data Analysis (TDA) is proving to be an excellent tool for shape analysis of digital data. Its recently found synergy with artificial intelligence has given rise to Topological Machine Learning (TML), which aims to combine the expressive power of computational topology with the accuracy of machine learning to provide a comprehensive and automatic framework for data classification. The aim of this thesis is twofold: to develop current applications of TML in practical scenarios, with emphasis on the most overlooked aspects of its pipeline, and to connect the theory of TDA with a broader class of maps, the Group Equivariant Non-Expansive Operators (GENEOs). In the first part of this dissertation, we develop a pipeline to study digital data by means of TML in order to validate the practical aspects of our theory. We apply this pipeline to benchmark and experimental datasets, achieving state-of-the-art accuracies in biomedical scenarios. Moreover, we perform an empirical but extensive study of how stable the features arising from the various homological dimensions are with respect to noise and to the distribution of points in the persistence diagram. Such a comparison is novel in the TML literature, and our findings show that concatenating the features from all available homological dimensions is the best approach in the vectorization step. We later expand on the main concept of TDA, proving that the functor that computes persistence diagrams can be seen as a particular instance of a GENEO (Theorem 4.1.4). The GENEO framework allows us to inject arbitrary equivariances in a machine learning setting and represents a possible new approach to neural network architecture. Next, we fully present the theory of GENEOs and their properties, such as convexity and concavity, under suitable assumptions.
This thesis expands the GENEO theory with two new tools to define such operators, namely a construction using symmetric functions (Theorem 5.3.24) and a characterization theorem for linear GENEOs between arbitrary functional spaces (Theorem 6.2.2). Finally, we develop a new neural network architecture with GENEOs instead of neurons and show its potential in a couple of applications.

File | Access | License | Size | Format
---|---|---|---|---
Tesi_dottorato_finale.pdf | Open access | Creative Commons | 29.15 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.