Unsupervised curve clustering using wavelets

Amato, Umberto; De Feis, Italia
2024

Abstract

Clustering of univariate functional data is mostly based on projecting the curves onto an adequate basis and applying a distance or similarity measure to the coefficients. The basis functions should be chosen according to the features of the function being estimated. Commonly used bases are Fourier, polynomial and spline bases, but these may not be well suited to curves that exhibit inhomogeneous behavior. Wavelets, on the contrary, are well suited to identifying highly discriminant local time and scale features, and are able to adapt to the smoothness of the data. In recent years, only a few methods relying on wavelet-based similarity measures have been proposed for clustering curves, and these assume curves observed at equidistant points. In this work, we present a wavelet-based method for non-equidistant designs that non-parametrically estimates and clusters a large number of curves. The method consists of four stages: fitting the functional data by non-equispaced design wavelet regression, screening out nearly flat curves, denoising the remaining curves with wavelet thresholding, and finally clustering the denoised curves. Simulation studies compare the proposed method with other functional clustering methods. The method is also applied to clustering real functional data profiles.
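The four stages listed in the abstract can be illustrated with a rough, self-contained sketch. It is not the authors' implementation and rests on several simplifying assumptions: curves are sampled on an equispaced grid of dyadic length and transformed with a plain Haar wavelet (the paper targets non-equispaced designs), denoising uses the standard universal soft threshold, a curve is treated as "nearly flat" when no detail coefficient survives thresholding (a stand-in for whatever screening rule the paper uses), and clustering is a basic k-means on the thresholded coefficients. All function names (`haar_dwt`, `soft_threshold`, `kmeans`, `cluster_curves`) are hypothetical.

```python
import math

def haar_dwt(x):
    """Orthonormal Haar transform; len(x) must be a power of two."""
    approx, details = list(x), []
    while len(approx) > 1:
        idx = range(0, len(approx), 2)
        d = [(approx[i] - approx[i + 1]) / math.sqrt(2) for i in idx]
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2) for i in idx]
        details = d + details            # coarser levels go in front
    return approx + details              # [scaling, coarse ... fine details]

def soft_threshold(coeffs):
    """Universal soft threshold; noise level from the finest details."""
    n = len(coeffs)
    finest = sorted(abs(c) for c in coeffs[n // 2:])
    sigma = finest[len(finest) // 2] / 0.6745      # median-based estimate
    lam = sigma * math.sqrt(2 * math.log(n))
    shrunk = [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs[1:]]
    return [coeffs[0]] + shrunk                    # scaling coeff untouched

def sqdist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def kmeans(points, k, iters=20):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(sqdist(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sqdist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_curves(curves, k):
    """Transform, denoise, drop flat curves, cluster. Returns {index: label}."""
    denoised = [soft_threshold(haar_dwt(c)) for c in curves]
    # Assumption: "nearly flat" means every detail coefficient was zeroed.
    kept = [i for i, d in enumerate(denoised) if any(c != 0.0 for c in d[1:])]
    labels = kmeans([denoised[i] for i in kept], k)
    return dict(zip(kept, labels))

# Toy data: a small deterministic ripple stands in for observation noise.
n = 64
ripple = [0.1 * (-1) ** i for i in range(n)]
step = [0.0] * 32 + [1.0] * 32
bump = [0.0] * 16 + [1.0] * 16 + [0.0] * 32
flat = [1.0] * n

def make(shape, height):
    return [height * s + r for s, r in zip(shape, ripple)]

curves = ([make(flat, c) for c in (1.0, -0.5)]
          + [make(step, h) for h in (3.0, 3.2, 2.8)]
          + [make(bump, h) for h in (3.0, 3.2, 2.8)])
result = cluster_curves(curves, k=2)
print(result)   # flat curves (indices 0 and 1) are absent from the mapping
```

In this toy setting the two flat curves are screened out, and the step-shaped and bump-shaped curves end up in separate clusters. The sketch needs at least `k` non-flat curves. For real data, a wavelet library and a transform suited to irregular sampling grids would replace the hand-written Haar step.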
Istituto per le applicazioni del calcolo - IAC - Sede Secondaria Napoli
Istituto di Scienze Applicate e Sistemi Intelligenti "Eduardo Caianiello" - ISASI - Sede Secondaria Napoli
Keywords: False discovery rate; Functional data; High-dimensional testing; k-means
Files in this record:
File: s11634-024-00612-7.pdf
Access: authorized users only
Type: Publisher's version (PDF)
License: Not public - private/restricted access
Size: 1.57 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/512791