CNR Institutional Research Information System

A topological pipeline for machine learning

Conti F

2022

Abstract

The development of a topological pipeline for machine learning involves two crucial steps that strongly influence its performance. The first step is the choice of the filtration that associates a persistence diagram with digital data. The second step is the choice of the representation method for the persistence diagrams, which often depends on several parameters. In this work we develop a pipeline that associates persistence diagrams with digital data via the most appropriate filtration for the type of data considered. Using a grid search approach, this pipeline determines the optimal representation method and its parameters. We assess the performance of our pipeline and, in parallel, compare the different representation methods on popular benchmark datasets. This work is a first step towards both an easy, ready-to-use pipeline for data classification using persistent homology and machine learning, and an understanding of the theoretical reasons why, given a dataset and a task to be performed, one pair (filtration, topological representation) performs better than another.
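The two steps described in the abstract (filtration, then representation chosen by grid search) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the sublevel-set filtration of a 1D signal, the two toy representations (`betti_curve`, `persistence_stats`), the 1-nearest-neighbour classifier, the synthetic sine-wave data, and every function and parameter name below are assumptions made purely for illustration.

```python
import math
import random

def sublevel_persistence_0d(signal):
    """0-dimensional persistence pairs (birth, death) of the sublevel-set
    filtration of a 1D signal, computed with union-find and the elder rule.
    The essential (never-dying) class is omitted."""
    n = len(signal)
    parent = [None] * n          # None = vertex not yet in the filtration
    birth = {}                   # component root -> birth value

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=lambda k: signal[k]):
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the younger component (larger birth) dies here.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if signal[i] > birth[young]:        # skip zero-persistence pairs
                    pairs.append((birth[young], signal[i]))
                parent[young] = old
    return pairs

def betti_curve(pairs, resolution, lo=-2.0, hi=2.0):
    """Number of classes alive at each of `resolution` evenly spaced thresholds."""
    step = (hi - lo) / (resolution - 1)
    return [sum(b <= lo + k * step < d for b, d in pairs) for k in range(resolution)]

def persistence_stats(pairs):
    """Count, total and maximum persistence of a diagram."""
    pers = [d - b for b, d in pairs]
    return [len(pers), sum(pers), max(pers, default=0.0)]

# Hypothetical registry of representation methods explored by the grid search.
REPRESENTATIONS = {"betti_curve": betti_curve, "stats": persistence_stats}

def loo_accuracy(vectors, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    hits = 0
    for i, v in enumerate(vectors):
        others = [j for j in range(len(vectors)) if j != i]
        nearest = min(others, key=lambda j: sum((a - b) ** 2 for a, b in zip(v, vectors[j])))
        hits += labels[nearest] == labels[i]
    return hits / len(vectors)

def grid_search(signals, labels, grid):
    """Return (score, method, params) for the best-scoring entry of `grid`."""
    diagrams = [sublevel_persistence_0d(s) for s in signals]
    results = []
    for name, params in grid:
        vectors = [REPRESENTATIONS[name](d, **params) for d in diagrams]
        results.append((loo_accuracy(vectors, labels), name, params))
    return max(results, key=lambda r: r[0])

def make_signal(periods, rng, n=60, noise=0.05):
    """Synthetic noisy sine wave; the class label is the number of periods."""
    return [math.sin(2 * math.pi * periods * t / n) + rng.gauss(0, noise) for t in range(n)]

rng = random.Random(0)
signals = [make_signal(p, rng) for p in (2, 5) for _ in range(10)]
labels = [p for p in (2, 5) for _ in range(10)]
grid = [("stats", {})] + [("betti_curve", {"resolution": r}) for r in (5, 20)]
score, method, params = grid_search(signals, labels, grid)
print(method, params, score)
```

In a realistic setting the filtration would be chosen per data type (for example, a cubical filtration for images), and the grid would range over established representations and their parameters rather than the two toy vectorizations used here.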

Year: 2022
Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Keywords: Topological data analysis; Machine learning; Persistent homology; Pipeline
Appears in types: 04.03 Poster in conference proceedings

Files in this record:

File	Description	Type	Access	Size	Format
prod_467058-doc_183749.pdf	A topological pipeline for machine learning	Publisher's version (PDF)	Open access	958.96 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/440806
