Class mcmc_para_base (o2scl)¶
-
template<class func_t, class measure_t, class data_t, class vec_t = ubvector, class stepper_t = mcmc_stepper_rw<func_t, data_t, vec_t>>
class mcmc_para_base¶ A generic MCMC simulation class.
This class performs a Markov chain Monte Carlo simulation of a user-specified function using OpenMP and/or MPI. The class works with a stepper object of type mcmc_stepper_base or an internal implementation of an affine-invariant sampling method. See also mcmc_stepper_rw (the default), mcmc_stepper_mh, and mcmc_stepper_hmc.
The affine-invariant sampling method follows algorithm [Goodman10].
The function type is a template type,
func_t
, which should be of the formwhich computesint f(size_t num_of_parameters, const vec_t ¶meters, double &log_pdf, data_t &dat)
log_pdf
, the natural logarithm of the function value, for any point in parameter space (any point betweenlow
andhigh
).If the function being simulated returns mcmc_skip then the point is automatically rejected. After each acceptance or rejection, a user-specified “measurement” function (of type
measure_t
) is called, which can be used to store the results. In either the measurement function or the probability distribution function returns the value mcmc_done, then the MCMC stops.A generic proposal distribution can be specified in set_proposal(). To go back to the default random walk method, one can call the function unset_proposal().
If aff_inv is set to true, then affine-invariant sampling is used. For affine-invariant sampling, the variable step_fac represents the value of \( a \), the limits of the distribution for \( z \) (which defaults to 2). If aff_inv is true and an initial point fails, then mcmc() chooses random points inside the hypercube to attempt to find enough initial points to proceed. This method of finding initial points, however, is often too slow for large parameter spaces.
Affine-invariant sampling works best when the number of walkers is much larger than the number of parameters. If n_walk is 0 or 1, then this class automatically sets n_walk to three times the number of parameters. This class will otherwise allow the user to set a smaller number of walkers than parameters without notifying the user.
In order to store data at each point, the user can store this data in any object of type
data_t
. If affine-invariant sampling is used, then each chain has it’s own data object. The class keeps twice as many copies of these data object as would otherwise be required, in order to avoid copying of data objects in the case that the steps are accepted or rejected.Whether or not aff_inv is true, there is a virtual function called outside_parallel() which is called during the MCMC. Class descendants can replace this function with code which must be run outside of an OpenMP parallel region. Note that this is not implemented via, e.g. an OpenMP CRITICAL region so that no loss of performance is expected. If aff_inv is false, then outside_parallel() is called every steps_in_parallel MCMC steps (for each OpenMP thread). If aff_inv is true, then outside_parallel() is called after all the walkers have completed for each thread.
Verbose output: If verbose is 0, no output is generated (the default). If verbose is 1, then output to
cout
occurs only if the settings are somehow misconfigured and the class attempts to recover from them, for example if not enough functions are specified for the requested number of OpenMP threads, or if more than one thread was requested but O2SCL_SET_OPENMP was not defined, or if a negative value for step_fac was requested. When verbose is 1, a couple messages are written to scr_out : a summary of the number of walkers, chains, and threads at the beginning of the MCMC simulation, a message indicating why the MCMC simulation stopped, a message when the warm up iterations are completed, a message every time files are written to disk, and a message at the end counting the number of acceptances and rejections. If verbose is 2, then the file prefix is output tocout
during initialization.Todos
Todo
In class ref mcmc_para_base:
The main loop with the affine-invariant sampling could be modified with a new inner loop to do many function evaluations for each thread. However, I think this would demand combining the two sequential parallel loops.
There is a little code in mcmc_init() and mcmc_cleanup() and I should document why that code needs to be there.
Subclassed by o2scl::mcmc_para_table< func_t, fill_t, data_t, ubvector, mcmc_stepper_rw< func_t, data_t, ubvector > >, o2scl::mcmc_para_table< func_t, fill_t, data_t, vec_t, stepper_t >
MPI properties
-
int mpi_rank¶
The MPI processor rank.
-
int mpi_size¶
The MPI number of processors.
-
std::ofstream scr_out¶
The screen output file.
-
std::vector<o2scl::prob_cond_mdim<vec_t>*> prop_dist¶
Pointer to proposal distribution for each thread.
-
bool pd_mode¶
If true, then use the user-specified proposal distribution (default false)
-
bool warm_up¶
If true, we are in the warm up phase (default false)
-
std::vector<vec_t> current¶
Current points in parameter space for each walker and each OpenMP thread.
This is an array of size n_threads times n_walk initial guesses, indexed by
thread_index*n_walk+walker_index
.
-
std::vector<bool> switch_arr¶
Data switch array for each walker and each OpenMP thread.
This is an array of size n_threads times n_walk initial guesses, indexed by
thread_index*n_walk+walker_index
.
-
std::vector<std::vector<size_t>> ret_value_counts¶
Return value counters, one vector for each independent chain.
Interface customization
-
std::vector<size_t> curr_walker¶
Index of the current walker, one for each OpenMP thread.
This quantity has to be a vector because different threads may have different values for the current walker during the initialization phase for the affine sampling algorithm.
-
bool meas_for_initial¶
If true, call the measurement function for the initial point.
-
static const int mcmc_done = -10¶
Integer to indicate completion.
-
static const int mcmc_skip = -20¶
Integer to indicate rejection.
-
inline virtual int mcmc_init()¶
Initializations before the MCMC.
-
inline virtual void mcmc_cleanup()¶
Cleanup after the MCMC.
Output quantities
Settings
-
double mpi_start_time¶
The MPI starting time (defaults to 0.0)
This can be set by the user before mcmc() is called, so that the time required for initializations before the MCMC starts can be counted.
-
size_t max_iters¶
If non-zero, the maximum number of MCMC iterations (default 0)
If both max_iters and max_time are nonzero, the MCMC will stop when either the number of iterations exceeds max_iters or the time exceeds max_time, whichever occurs first.
-
double max_time¶
Time in seconds (default is 0)
If both max_iters and max_time are nonzero, the MCMC will stop when either the number of iterations exceeds max_iters or the time exceeds max_time, whichever occurs first.
-
std::string prefix¶
Prefix for output filenames (default “mcmc”)
-
bool aff_inv¶
If true, use affine-invariant Monte Carlo (default false)
-
double step_fac¶
Stepsize factor for affine-invariant sampling (default 2.0)
-
bool couple_threads¶
If true, couple the walkers across threads during affine-invariant sampling (default false)
-
size_t n_warm_up¶
Number of warm up steps (successful steps not iterations) (default 0)
Note
Not to be confused with
warm_up
, which is a protected boolean local variable in some functions which indicates whether we’re in warm up mode or not.
-
int user_seed¶
If non-zero, use this number as the seed for the random number generator (default 0)
The random number generator is modified so that each OpenMP thread and each MPI rank has a different set of random numbers.
If this value is zero, then the random number generators are seeded by the clock time in seconds, so that if two separate simulations begin at the same time (to within 1 second) they will produce identical results. This can be avoided simply by ensuring that user_seed is different between the two simulations.
-
int verbose¶
Output control (default 0)
-
size_t max_bad_steps¶
Maximum number of failed steps when generating initial points with affine-invariant sampling (default 1000)
-
size_t n_walk¶
Number of walkers (per openMP thread) for affine-invariant MC or 1 otherwise (default 1)
-
bool err_nonconv¶
If true, call the error handler when either the object function or the measure function does not return success (default true)
-
bool always_accept¶
If true, accept all steps.
-
double ai_initial_step¶
Initial step fraction for affine-invariance sampling walkers (default 0.1)
-
size_t n_threads¶
Number of OpenMP threads.
-
std::vector<ubvector> initial_points¶
Initial points in parameter space.
To fully specify all of the initial points, this should be a vector of size n_walk times n_threads . Initial points are used for multiple threads and/or walkers if the full number of initial points is not specified.
If this is empty, then the midpoint between
low
andhigh
is used as the initial point for all threads and walkers. All initial points must be betweenlow
andhigh
, or the error handler will be called.
-
size_t steps_in_parallel¶
The number of steps in parallel when affine invariant sampling is not used (default 100)
-
inline mcmc_para_base()¶
-
inline virtual void outside_parallel()¶
Function outside parallel region.
Basic usage
-
inline virtual int mcmc(size_t n_params, vec_t &low, vec_t &high, std::vector<func_t> &func, std::vector<measure_t> &meas, std::vector<data_t> &data)¶
Perform a MCMC simulation.
Perform a MCMC simulation over
n_params
parameters starting at initial pointinit
, limiting the parameters to be betweenlow
andhigh
, usingfunc
as the objective function and calling the measurement functionmeas
at each MC point.The vector
data
should be of size2*n_walk*n_threads
.
Proposal distribution
-
template<class prob_vec_t>
inline void set_proposal(prob_vec_t &pv)¶ Set the proposal distribution.
-
template<class prob_ptr_vec_t>
inline void set_proposal_ptrs(prob_ptr_vec_t &pv)¶ Set pointers to proposal distributions.
-
inline virtual void unset_proposal()¶
Go back to random-walk Metropolis with a uniform distribution.