Normal Approximation of Random Gaussian Neural Networks
Nicola Apollonio; Daniela De Canditiis; Giovanni Franzina; Paola Stolfi; Giovanni Luca Torrisi
2024
Abstract
In this paper, we provide explicit upper bounds on some distances between the (law of the) output of a random Gaussian neural network and (the law of) a random Gaussian vector. Our main results concern deep random Gaussian neural networks with a rather general activation function. The upper bounds show how the widths of the layers, the activation function, and other architecture parameters affect the Gaussian approximation of the output. Our techniques, relying on Stein's method and integration by parts formulas for the Gaussian law, yield estimates on distances that are integral probability metrics and include the convex distance. This latter metric is defined by testing against indicator functions of measurable convex sets, and so allows for accurate estimates of the probability that the output is localized in some region of the space, an aspect of significant interest from both a practitioner's and a theorist's perspective. We illustrate our results with some numerical examples.
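
For readers unfamiliar with the metrics named in the abstract, the standard definitions can be written as follows. This is a sketch in notation chosen here for illustration (the symbols $X$, $Y$, $d$, $\mathcal{H}$, $d_{\mathcal{H}}$ and $d_c$ are assumptions and need not match the paper's own notation): $X$ stands for the network output, $Y$ for the Gaussian vector, both taking values in $\mathbb{R}^d$.

```latex
% Integral probability metric induced by a class \mathcal{H} of test functions
d_{\mathcal{H}}(X, Y) \;=\; \sup_{h \in \mathcal{H}}
  \bigl| \mathbb{E}[h(X)] - \mathbb{E}[h(Y)] \bigr|

% Convex distance: the special case in which \mathcal{H} consists of the
% indicator functions \mathbf{1}_C of measurable convex sets C \subseteq \mathbb{R}^d
d_c(X, Y) \;=\; \sup_{\substack{C \subseteq \mathbb{R}^d \\ C \text{ measurable, convex}}}
  \bigl| \mathbb{P}(X \in C) - \mathbb{P}(Y \in C) \bigr|
```

Under these definitions, a bound $d_c(X, Y) \le \varepsilon$ means that, for every measurable convex region $C$, the probability that the output lies in $C$ differs from the corresponding Gaussian probability by at most $\varepsilon$.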