
Exploration and generalization in deep learning with SwitchPath activations

Metta C.
2025

Abstract

This work provides a comprehensive theoretical and empirical analysis of SwitchPath, a stochastic activation function that improves learning dynamics by probabilistically toggling between a neuron's standard activation and its negation. We develop theoretical foundations and demonstrate the method's impact in multiple scenarios. By maintaining gradient flow and injecting controlled stochasticity, it improves generalization, uncertainty estimation, and training efficiency. Classification experiments show consistent gains over ReLU and Leaky ReLU across CNNs and Vision Transformers, with reduced overfitting and higher test accuracy. In generative modeling, a novel two-phase training scheme significantly mitigates mode collapse and accelerates convergence. Our theoretical analysis reveals that SwitchPath introduces a form of multiplicative noise that acts as a structural regularizer. Additional empirical investigations show improved information propagation and reduced model complexity. These results establish this activation mechanism as a simple yet effective way to enhance exploration, regularization, and reliability in modern neural networks.
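The abstract describes the mechanism concretely enough to sketch: each unit stochastically toggles between its standard activation and the negation of that activation, which the authors characterize as a form of multiplicative noise. Below is a minimal PyTorch sketch of one plausible reading of that description; the module name `SwitchPath`, the toggle probability `p`, the ReLU base activation, and the train/eval behavior are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchPath(nn.Module):
    """Minimal sketch of a SwitchPath-style stochastic activation.

    With probability p, each unit outputs the negation of its standard
    activation instead of the activation itself, i.e. s * f(x) with a
    random sign s in {+1, -1}. This mirrors the abstract's description
    of multiplicative noise; the paper's exact formulation may differ.
    """

    def __init__(self, p: float = 0.1):
        super().__init__()
        self.p = p  # probability of taking the negated path (assumed hyperparameter)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Per-element Bernoulli draw: 1 -> negated path, 0 -> standard path.
            flip = torch.bernoulli(torch.full_like(x, self.p))
            # Map {0, 1} to {+1, -1} and apply as multiplicative sign noise.
            sign = 1.0 - 2.0 * flip
            return sign * F.relu(x)
        # Deterministic base activation at evaluation time (assumed behavior).
        return F.relu(x)
```

In use, such a module would simply replace the activation in a network, e.g. `nn.Sequential(nn.Linear(128, 64), SwitchPath(p=0.1), nn.Linear(64, 10))`; because the random sign acts multiplicatively on the activation, gradients continue to flow through whichever path is sampled.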
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Keywords: Deep learning, Neural network algorithms, Generative networks
Files in this record:

File: s10994-025-06840-y.pdf
Description: Exploration and generalization in deep learning with SwitchPath activations
Type: Publisher's version (PDF)
License: NOT PUBLIC - Private/restricted access (authorized users only)
Format: Adobe PDF
Size: 3.83 MB

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/552044
Citazioni
  • PMC: ND
  • Scopus: 1
  • Web of Science: 1