Denoising using generative models

This post is about my latest preprint with Dustin Mixon about signal denoising with generative networks.

Exploiting the structure of signals is a key idea in signal processing. For instance, natural images are known to be sparse in a wavelet basis, and sparsity allows for recovering signals from few measurements (à la compressed sensing).

Generative models

The current trend, which takes advantage of the empirical success of deep learning, is to first learn the structure of the signals and then exploit it. One way to represent the structure of signals is through a generative model, i.e. a parametrization of the data

$$G_\theta : \mathbb{R}^k \to \mathbb{R}^n, \qquad z \mapsto G_\theta(z),$$

such that the distribution of $G_\theta(z)$, for $z$ drawn from a standard Gaussian, is close to the probability density of the data of interest.
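To make the parametrization concrete, here is a minimal numpy sketch of such a map $G_\theta$. Everything in it is a stand-in: a hypothetical two-layer ReLU network whose random weights play the role of the learned parameters $\theta$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer generator G_theta : R^k -> R^n.
# theta = (W1, W2) collects the parameters; here they are random
# placeholders where a trained model would have learned weights.
k, h, n = 8, 64, 128
W1 = rng.normal(size=(h, k)) / np.sqrt(k)
W2 = rng.normal(size=(n, h)) / np.sqrt(h)

def G(z):
    """Map a latent code z in R^k to a signal in R^n."""
    return W2 @ np.maximum(W1 @ z, 0.0)   # ReLU nonlinearity

# Pushing a standard Gaussian on R^k through G gives the model's
# distribution over signals; training would match it to the data.
x = G(rng.standard_normal(k))
print(x.shape)   # (128,)
```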

Very impressive generative models have been produced with autoencoders and generative adversarial networks. However, producing them is a hard engineering task that involves tuning a lot of parameters and hours, if not days, of training neural networks on highly dedicated hardware. There doesn't seem to exist a provable way to produce them successfully, and it is not clear whether a reasonable data distribution is actually learned. But there exist very impressive examples, like this one from Google:

These people do not exist; their faces were randomly generated

Inverse problems with generative models

Once you have a good generative model you can do amazing things with it. Bora, Jalal, Price and Dimakis from UT Austin empirically showed that they can solve the compressed sensing problem with 10 times fewer measurements than classical compressed sensing requires, by replacing the sparse signal assumption with an appropriate generative model $G$. Their theoretical result shows that under mild hypotheses, if $y = Ax^* + e$ ($e$ is the noise), then

$$\hat z = \operatorname*{argmin}_z \|A\,G(z) - y\|_2 \qquad (1)$$

satisfies $\|G(\hat z) - x^*\|_2 \lesssim \min_z \|G(z) - x^*\|_2 + \|e\|_2$.

However, it is not obvious that one can efficiently solve the optimization problem (1), since its landscape may a priori have many local minima. Recent work by Hand and Voroninski shows that the optimization problem (1) can be solved using local methods provided that $G(z) = \mathrm{ReLU}(W_2\,\mathrm{ReLU}(W_1 z))$, where $W_1$ and $W_2$ are matrices with i.i.d. Gaussian entries, properly scaled. There is no learning in this setting.
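As an illustration of both results, here is a toy numpy sketch: a random two-layer ReLU generator of the Hand–Voroninski form, Gaussian measurements with $m \ll n$, and plain gradient descent on the objective in (1) with the backpropagation written out by hand. All sizes and the step size are ad hoc choices, and convergence is of course not guaranteed at these small scales.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random ReLU network: G(z) = relu(W2 relu(W1 z)), Gaussian weights.
k, h, n, m = 5, 50, 200, 40          # latent, hidden, signal, measurement dims
W1 = rng.normal(size=(h, k)) / np.sqrt(k)
W2 = rng.normal(size=(n, h)) / np.sqrt(h)
A = rng.normal(size=(m, n)) / np.sqrt(m)   # Gaussian measurement matrix

relu = lambda u: np.maximum(u, 0.0)
G = lambda z: relu(W2 @ relu(W1 @ z))

z_true = rng.standard_normal(k)
y = A @ G(z_true)                    # noiseless measurements, m << n

# Gradient descent on f(z) = ||A G(z) - y||^2, backpropagating through
# the two ReLU layers by hand (the factor 2 is folded into the step size).
z = rng.standard_normal(k)
for _ in range(2000):
    u1 = W1 @ z
    u2 = W2 @ relu(u1)
    r = A @ relu(u2) - y             # residual in measurement space
    g = W1.T @ ((W2.T @ ((A.T @ r) * (u2 > 0))) * (u1 > 0))
    z -= 0.05 * g

print(np.linalg.norm(G(z) - G(z_true)) / np.linalg.norm(G(z_true)))
```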

SUNLayer and denoising with generative models

Motivated by both works, Dustin and I studied the simpler inverse problem of signal denoising with generative networks. The aim is to explain the phenomenon illustrated in the figure at the top of this post, i.e. given a noisy signal $G(z_0) + \varepsilon$, one can denoise it by finding

$$\hat z = \operatorname*{argmin}_z \|G(z) - (G(z_0) + \varepsilon)\|^2, \qquad (2)$$

and the optimization problem can be solved by local methods like gradient descent (i.e. there are no spurious critical points).
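A quick way to probe the no-spurious-critical-points claim numerically is to run gradient descent on (2) from several random initializations and check that they all reach essentially the same reconstruction. The sketch below does this for the same kind of random two-layer ReLU generator as above (again with ad hoc sizes, noise level and step size).

```python
import numpy as np

rng = np.random.default_rng(2)

# Random two-layer ReLU generator (same toy setup as above).
k, h, n = 5, 50, 200
W1 = rng.normal(size=(h, k)) / np.sqrt(k)
W2 = rng.normal(size=(n, h)) / np.sqrt(h)
relu = lambda u: np.maximum(u, 0.0)
G = lambda z: relu(W2 @ relu(W1 @ z))

z0 = rng.standard_normal(k)
w = G(z0) + 0.1 * rng.standard_normal(n)   # noisy signal G(z0) + eps

def denoise(z, steps=2000, lr=0.05):
    """Gradient descent on ||G(z) - w||^2 (factor 2 folded into lr)."""
    for _ in range(steps):
        u1 = W1 @ z
        u2 = W2 @ relu(u1)
        r = relu(u2) - w
        z = z - lr * (W1.T @ ((W2.T @ (r * (u2 > 0))) * (u1 > 0)))
    return z

# With no spurious critical points, every restart should land on
# essentially the same denoised signal, close to G(z0).
errs = [np.linalg.norm(G(denoise(rng.standard_normal(k))) - G(z0))
        for _ in range(5)]
print(np.round(errs, 3))
```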

In order to do that we consider a simpler model of a generative network, inspired by neural networks. One layer of the SUNLayer (spherical uniform neural layer) is defined as

$$G_\sigma : S^{d-1} \to L^2(S^{d-1}), \qquad G_\sigma(z) = \sigma(\langle z, \cdot \rangle),$$

where $\sigma$ is an arbitrary activation function. We aim to answer what properties good activation functions need to have in order to allow denoising with local methods in this simplified model.
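For numerical experiments one can discretize the SUNLayer by evaluating $\sigma(\langle z, \cdot \rangle)$ at many roughly uniform sample points on the sphere, so that $L^2$ inner products become averages. A minimal sketch, with $\sigma = \tanh$ as an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(3)

# Discretized SUNLayer: represent G_sigma(z) = sigma(<z, .>) by its
# values at N roughly uniform sample points on the sphere S^{d-1},
# so that L^2 inner products become averages over the samples.
d, N = 3, 5000
X = rng.standard_normal((N, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)

def sunlayer(z, sigma=np.tanh):      # sigma = tanh is an arbitrary choice
    """Evaluate sigma(<z, x>) at the sample points x."""
    return sigma(X @ z)

z = rng.standard_normal(d)
z /= np.linalg.norm(z)
fz = sunlayer(z)
print(fz @ fz / N)   # ~ ||G_sigma(z)||^2, independent of z by symmetry
```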

Consider the decomposition of $\sigma$ in the Gegenbauer polynomials $P_{k,d}$ (choosing the correct normalization of the polynomials will be important). If $\sigma = \sum_{k \ge 0} \hat\sigma(k)\, P_{k,d}$, let $\tilde\sigma := \sum_{k \ge 0} \hat\sigma(k)^2\, P_{k,d}$. Then the critical points of (2) are close to $\pm z_0$ provided $\tilde\sigma'$ is not too small in comparison with the noise level $\|\varepsilon\|$.
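For $d = 3$ the Gegenbauer polynomials are the Legendre polynomials $P_k$, and one convenient normalization takes $\sqrt{2k+1}\,P_k$, which is orthonormal with respect to the measure $dt/2$ on $[-1,1]$. Under that convention the kernel identity derived in the next section reads $\langle G_\sigma(z), G_\sigma(z_0)\rangle = \tilde\sigma(\langle z, z_0\rangle)$ with $\tilde\sigma = \sum_k \hat\sigma(k)^2 P_k$. The sketch below (again with $\sigma = \tanh$ as an arbitrary choice) computes the coefficients by Gauss–Legendre quadrature and checks the identity by Monte Carlo; the paper's normalization for general $d$ differs only in constants.

```python
import numpy as np
from numpy.polynomial import legendre as L

rng = np.random.default_rng(4)

sigma, K = np.tanh, 20
t, w = L.leggauss(200)   # Gauss-Legendre nodes/weights on [-1, 1]

# chat[k] = <sigma, sqrt(2k+1) P_k> with respect to dt/2 on [-1, 1].
chat = np.array([
    np.sqrt(2*k + 1) / 2 * (w @ (sigma(t) * L.legval(t, np.eye(K)[k])))
    for k in range(K)
])

def sigma_tilde(s):
    """tilde-sigma(s) = sum_k chat[k]^2 P_k(s)."""
    return L.legval(s, chat**2)

# Monte Carlo check of <G(z), G(z0)> = tilde-sigma(<z, z0>) on S^2.
z, z0 = rng.standard_normal(3), rng.standard_normal(3)
z, z0 = z / np.linalg.norm(z), z0 / np.linalg.norm(z0)
X = rng.standard_normal((200000, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)
print(np.mean(sigma(X @ z) * sigma(X @ z0)))   # empirical inner product
print(sigma_tilde(z @ z0))                     # should agree to ~3 decimals
```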

Spherical harmonics

Squaring the coefficients in the Gegenbauer decomposition may look a priori mysterious. However, it shows up naturally due to the nice properties of the spherical harmonics. Consider the simplified setting where there is no noise. We have

$$\|G_\sigma(z) - G_\sigma(z_0)\|^2 = \|G_\sigma(z)\|^2 - 2\langle G_\sigma(z), G_\sigma(z_0)\rangle + \|G_\sigma(z_0)\|^2,$$

and $\|G_\sigma(z)\|$ is independent of $z$. Now we use the relationship between the Gegenbauer polynomials and the spherical harmonics (see chapter 2 of [Morimoto, Analytic functionals on the sphere, 1998]). Decompose $L^2(S^{d-1}) = \bigoplus_{k \ge 0} \mathcal{H}_k^d$, where $\mathcal{H}_k^d$ are the spherical harmonics of degree $k$ (homogeneous polynomials in $d$ variables, of degree $k$, with Laplacian 0, restricted to the sphere). In particular, each $\mathcal{H}_k^d$ is a finite-dimensional vector space. Let $\{Y_{k,j}\}_{j=1}^{\dim \mathcal{H}_k^d}$ be an orthonormal basis of $\mathcal{H}_k^d$, then define the bilinear form

$$F_k(x, y) := \sum_{j=1}^{\dim \mathcal{H}_k^d} Y_{k,j}(x)\, Y_{k,j}(y) \qquad \text{for } x, y \in S^{d-1}.$$

It turns out $F_k$ is a reproducing kernel: $\langle F_k(x, \cdot), Y \rangle = Y(x)$ for every $Y \in \mathcal{H}_k^d$. It also satisfies that $F_k(x, y)$ only depends on the inner product $\langle x, y \rangle$, and in fact $F_k(x, y) = \frac{\dim \mathcal{H}_k^d}{\mathrm{vol}(S^{d-1})}\, P_{k,d}(\langle x, y \rangle)$ (the addition theorem). Then

$$\langle G_\sigma(z), G_\sigma(z_0)\rangle = \sum_{k \ge 0} \hat\sigma(k)^2\, P_{k,d}(\langle z, z_0 \rangle) = \tilde\sigma(\langle z, z_0 \rangle).$$
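For $d = 3$ both properties can be checked directly with scipy's (orthonormalized) spherical harmonics, for which the addition theorem reads $F_k(x,y) = \frac{2k+1}{4\pi} P_k(\langle x, y\rangle)$:

```python
import numpy as np
from scipy.special import sph_harm, eval_legendre

# Check the addition theorem on S^2:
#   sum_m Y_{k,m}(x) conj(Y_{k,m}(y)) = (2k+1)/(4 pi) * P_k(<x, y>).
def Y_all(k, theta, phi):
    """All degree-k spherical harmonics (azimuth theta, polar angle phi)."""
    return np.array([sph_harm(m, k, theta, phi) for m in range(-k, k + 1)])

def to_xyz(theta, phi):
    return np.array([np.sin(phi) * np.cos(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(phi)])

k = 4
theta1, phi1 = 1.1, 0.7                 # point x on the sphere
theta2, phi2 = 2.3, 2.0                 # point y

lhs = np.sum(Y_all(k, theta1, phi1) * np.conj(Y_all(k, theta2, phi2)))
cosg = to_xyz(theta1, phi1) @ to_xyz(theta2, phi2)
rhs = (2*k + 1) / (4*np.pi) * eval_legendre(k, cosg)
print(lhs.real, rhs)    # equal: F_k depends on x, y only through <x, y>
```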

Therefore

$$\|G_\sigma(z) - G_\sigma(z_0)\|^2 = 2\,\tilde\sigma(1) - 2\,\tilde\sigma(\langle z, z_0 \rangle),$$

and a simple computation shows that the only critical points are $\pm z_0$ provided $\hat\sigma(k) \neq 0$ for all $k$. The analysis for the noisy case that we do in the paper is still simple but more interesting, and it involves considering tight frames in $\mathcal{H}_k^d$.
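To see the noiseless landscape claim in action, here is a projected-gradient sketch on the discretized SUNLayer from before: from a random start, $z$ is driven to $\pm z_0$. ($\sigma = \tanh$ and the step size are ad hoc choices; since $\tanh$ is odd, its even coefficients vanish and it does not satisfy the "all $\hat\sigma(k) \neq 0$" hypothesis literally, but the descent still locks onto $\pm z_0$ here.)

```python
import numpy as np

rng = np.random.default_rng(5)

# Discretized SUNLayer objective ||G(z) - G(z0)||^2 on S^{d-1},
# minimized by gradient descent with projection back to the sphere.
d, N = 3, 4000
X = rng.standard_normal((N, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
sigma, dsigma = np.tanh, lambda u: 1.0 - np.tanh(u) ** 2

z0 = rng.standard_normal(d)
z0 /= np.linalg.norm(z0)
target = sigma(X @ z0)

z = rng.standard_normal(d)
z /= np.linalg.norm(z)
for _ in range(1000):
    u = X @ z
    g = X.T @ ((sigma(u) - target) * dsigma(u)) / N   # gradient (factor 2 dropped)
    z = z - 1.0 * g
    z /= np.linalg.norm(z)                            # project to the sphere

print(min(np.linalg.norm(z - z0), np.linalg.norm(z + z0)))   # ~ 0
```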