Yang Song, Stefano Ermon.
Generative modeling by estimating gradients of the data distribution. In Advances in Neural Information Processing Systems, 2019.
@article{song2019generative,
title={Generative modeling by estimating gradients of the data distribution},
author={Song, Yang and Ermon, Stefano},
journal={Advances in neural information processing systems},
volume={32},
year={2019}
}
Before reading this, please check the paper: https://www.notion.so/Song-et-a-UAI-2020-Sliced-Score-Matching-A-Scalable-Approach-to-Density-and-Score-Estimation-123658e019b480688970f7b59db617e0?pvs=4
TL;DR: The first score-based generative model to outperform GANs in Inception Score (on CIFAR-10).
Given a large dataset, we can use principled statistical methods such as score matching to train a score model that estimates the underlying score function.
To build a generative model, we then need a way to create new data points from the vector field defined by the score function.
But how can we do this?
Suppose we are given the score function $s_\theta(\mathbf{x})$ (Fig. 1, left) and imagine that many random points are scattered across it (Fig. 1, middle). Can we move those points so that they become samples from the distribution the score function describes? One idea is to move each point along the direction predicted by the score function. However, this does not give valid samples, because all of the points eventually collapse into each other (Fig. 1, right).
This problem can be addressed by following a noisy version of the score function. Equivalently, we inject Gaussian noise into the score function and follow the noise-perturbed scores (Fig. 1, right). If we run this sampling procedure long enough to reach convergence, and if the step size is sufficiently small, the method gives correct samples from the distribution described by the score function.
This method is the well-known approach of Langevin dynamics [Parisi 1981], [Grenander and Miller 1994].
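To make the contrast between the two procedures concrete, here is a minimal sketch, not the paper's implementation. It assumes a 1D Gaussian target $\mathcal{N}(\mu, \sigma^2)$ whose score is known in closed form and stands in for the learned $s_\theta(\mathbf{x})$. The names `score` and `run_sampler`, the step size, and the number of steps are all illustrative choices. The update used is the standard discretized Langevin step, $\mathbf{x} \leftarrow \mathbf{x} + \frac{\epsilon}{2} s(\mathbf{x}) + \sqrt{\epsilon}\,\mathbf{z}$ with $\mathbf{z} \sim \mathcal{N}(0, I)$.

```python
import numpy as np

def score(x, mu=2.0, sigma=0.5):
    """Analytic score of N(mu, sigma^2); a stand-in for a learned s_theta (assumption)."""
    return -(x - mu) / sigma**2

def run_sampler(x0, step_size, n_steps, rng, add_noise=True):
    """Move points along the score; optionally inject Gaussian noise (Langevin step)."""
    x = x0.copy()
    for _ in range(n_steps):
        # Deterministic drift: follow the direction given by the score.
        x = x + 0.5 * step_size * score(x)
        if add_noise:
            # Noise injection: this is what keeps the points from collapsing.
            x = x + np.sqrt(step_size) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(0)
x0 = rng.uniform(-5.0, 5.0, size=5000)  # random initial points

# Following the score alone: every point converges to the mode.
collapsed = run_sampler(x0, step_size=1e-2, n_steps=1000, rng=rng, add_noise=False)

# Following the noise-perturbed score: points spread like the target distribution.
samples = run_sampler(x0, step_size=1e-2, n_steps=1000, rng=rng, add_noise=True)

print("no noise:   mean=%.3f std=%.5f" % (collapsed.mean(), collapsed.std()))
print("with noise: mean=%.3f std=%.3f" % (samples.mean(), samples.std()))
```

In this toy setting the noise-free run ends with a standard deviation near zero (all points sit at the mode), while the noisy run should recover a spread close to the target $\sigma = 0.5$, up to a small bias from the finite step size.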
Langevin dynamics describes a type of stochastic differential equation (SDE) commonly used to model the evolution of systems subject to both deterministic forces and random fluctuations (noise). In its most basic form, Langevin dynamics for a particle can be written as:
\[ \frac{d\mathbf{x}}{dt} = -\nabla U(\mathbf{x}) + \gamma \mathbf{z}(t), \]