A topological pipeline for machine learning
Conti F
2022
Abstract
The development of a topological pipeline for machine learning involves two crucial steps that strongly influence its performance. The first is the choice of the filtration that associates a persistence diagram with digital data. The second is the choice of the representation method for the persistence diagrams, which often depends on several parameters. In this work we develop a pipeline that associates persistence diagrams with digital data via the filtration most appropriate for the type of data considered. Using a grid search, the pipeline determines the optimal representation method and parameters. We assess the performance of the pipeline and, in parallel, compare the different representation methods on popular benchmark datasets. This work is a first step both towards an easy, ready-to-use pipeline for data classification using persistent homology and machine learning, and towards understanding the theoretical reasons why, given a dataset and a task to be performed, one pair (filtration, topological representation) is better than another.