Class mcmc_stepper_hmc (o2scl)¶
-
template<class func_t, class data_t, class vec_t, class grad_t = std::function<int(size_t, const vec_t&, func_t&, vec_t&, data_t&)>, class vec_bool_t = std::vector<bool>>
class mcmc_stepper_hmc : public o2scl::mcmc_stepper_base<func_t, data_t, vec_t>¶
Hamiltonian Monte Carlo for MCMC.
The vectors step_fac, mom_step, and auto_grad allow a different value to be specified for each parameter. If any of these vectors is smaller than the number of parameters, the vector index wraps around to the beginning (using the modulus operator).
The Hamiltonian is
\[ H(q,p) = \frac{1}{2} p^{T} M^{-1} p - \log {\cal{L}} (q) \]
where \( M \) is the mass matrix and \( {\cal{L}} \) is the likelihood. (The mcmc_para_base class presumes flat prior distributions.) This class assumes the mass matrix is the identity matrix.
The step is then accepted with probability
\[ \mathrm{min}\left\{ 1,\frac{\exp \left[-H_{\mathrm{new}}\right]} {\exp \left[-H_{\mathrm{old}} \right]} \right\} \]
Because the user-specified function, \( f \), computes \( \log {\cal{L}} (q) \), the step is accepted with probability
\[ \mathrm{min} \left\{ 1,\exp \left[ - \sum_i^{N} p_{i,\mathrm{new}}^2 \mu_i/2 + f_{\mathrm{new}} + \sum_i^{N} p_{i,\mathrm{old}}^2 \mu_i/2 - f_{\mathrm{old}} \right] \right\} \]
where \( i \) is an index over \( N \) parameters and \( \mu_i \) is the inverse mass for parameter \( i \).
This class can compute the gradients automatically by finite-differencing, or it can use a gradient function specified by the user. The vector auto_grad controls this behavior. If the value of auto_grad[ i % auto_grad.size() ] is true, then the derivative with respect to the parameter with index i is computed automatically by this stepper class. If it is false, then the value is obtained from the user-specified gradient function. The gradient function, if specified, should be of the form
int grad(size_t nv, const vec_t &x, func_t &f, vec_t &g, data_t &dat);
If the initial gradient calculation fails, then the HMC step cannot proceed, and the random-walk (RW) algorithm from mcmc_stepper_rw is used as a fallback. If this fallback is required frequently over the course of a simulation, the combined HMC plus RW method may not be sampling the target distribution. This class does not yet have a method for tracking this.
The algorithm is taken from [Neal11].
See the class documentation for mcmc_stepper_base for more information.
Public Functions
-
inline virtual const char *step_type()¶
Stepper type, “HMC”.
-
inline mcmc_stepper_hmc()¶
-
inline virtual ~mcmc_stepper_hmc()¶
-
inline void allocate(size_t n_params, size_t n_threads)¶
Allocate cache.
-
inline virtual void write_params(o2scl_hdf::hdf_file &hf)¶
Write stepper parameters to the HDF5 file.
-
inline void set_gradients(std::vector<grad_t> &vg)¶
Set the vector of user-specified gradients.
Note that this function stores a pointer to the vector of gradient objects and thus the user must ensure that this object is in scope when the MCMC is performed.
-
inline int grad_pot(size_t n_params, const vec_t &x, func_t &f, vec_t &g, data_t &dat)¶
Automatically compute the gradient using finite-differencing.
Note
The potential energy is the negative of the log-likelihood. This function returns the gradient of the log-likelihood, which is the negative of the gradient of the potential energy.
-
inline virtual void step(size_t i_thread, size_t n_params, func_t &f, const vec_t &current, vec_t &next, double w_current, double &w_next, const vec_t &low, const vec_t &high, int &func_ret, bool &accept, data_t &dat, rng<> &r, int verbose)¶
Construct a step.
This function constructs next and w_next, the next point and log weight in parameter space. The objective function f is then evaluated at the new point, the return value is placed in func_ret, and the step acceptance or rejection is stored in accept.
The first half step is:
\[\begin{split}\begin{eqnarray*} p^{1/2} &=& p^{0} - \frac{\epsilon}{2} \frac{\partial U}{\partial q}(q^{0}) \\ \end{eqnarray*}\end{split}\]
Then for \( i \in [1,N] \),
\[\begin{split}\begin{eqnarray*} q^{i} &=& q^{i-1} + \epsilon p^{i-1/2} \\ \mathrm{if~(i<N)}\quad~p^{i+1/2} &=& p^{i-1/2} - \epsilon \frac{\partial U}{\partial q}(q^{i}) \\ \end{eqnarray*}\end{split}\]
The last half step is:
\[\begin{split}\begin{eqnarray*} p^{N} &=& p^{N-1/2} - \frac{\epsilon}{2} \frac{\partial U}{\partial q}(q^{N}) \\ \end{eqnarray*}\end{split}\]
Public Members
-
vec_t step_fac¶
The factor controlling the step size for the fallback random walk (default is a 1-element vector containing 10.0)
-
size_t traj_length¶
Trajectory length (default 20)
-
prob_dens_gaussian pdg¶
Standard Gaussian for kinetic energy.
-
vec_t mom_step¶
Initial distribution in momentum space (default is a one-element vector containing 1.0)
-
double epsilon¶
Scale for gradient (default 0.2)
-
vec_bool_t auto_grad¶
Indicate which elements of the gradient need to be computed automatically (default is a one-element vector containing true).
For parameters in which this vector has an entry of true, it is assumed that the user-specified gradient object (if present) cannot compute the gradient, and thus a simple numerical finite-differencing gradient is required.
-
double epsrel¶
The relative stepsize for finite-differencing (default \( 10^{-6} \) )
-
double epsmin¶
The minimum stepsize (default \( 10^{-15} \))
-
ubmatrix param_cache¶
Cache for parameters.
-
ubmatrix grad_cache¶
Cache for gradients.
Public Static Attributes
-
static const size_t grad_failed = 30¶
Error if gradient failed.