| Trees | Indices | Help |
|
|---|
|
|
Adaptive Markov-Chain Monte-Carlo algorithm. This can be used to generate samples from a probability distribution, or also as a simulated annealing algorithm for maximization. This can be used as a script, from the command line, or (more likely) it can be imported and its functions and classes can be used.
The central interfaces are the BootStepper
class, and within it, the step() method is used iteratively
to take a Markov step.
To use it as a script, you create your own module that implements a
few functions, and run mcmc.py. It will import your module
and call your functions repeatedly.
The algorithm is described in Kochanski and Rosner 2010, and earlier
versions have been used in ZZZ. It evolved originally from amoeba_anneal
(in Numerical Recipes, Press et al.). The essential feature is that it
keeps a large archive of previous positions (possibly many times more
than N of them). It samples two positions from the archive
and subtracts them to generate candidate steps. This has the nice
property that when sampling from a multivariate Gaussian distribution,
the candidate steps match the distribution nicely.
It can be operated in two modes (or set to automatically switch). One is optimization mode where it heads for the maximum of the probability distribution. The other mode is sampling mode, where it asymptotically follows a Markov sampling procedure and has the proper statistical properties.
As a script, it takes the following arguments:
-o fff Specifies a log file for some basic logging.
ddd instead of
the python search path, looking for a module named mm.
If no slash, then it looks up the module in PYTHONPATH.
-NI NN Sets the number of iterations. Required,
unless the module defines yourmod.NI.
-- Stops interpretation of the argument list. The
remainder is passed to yourmod.init(), if it is defined, then to
yourmod.start(), if that is defined.
It uses the following functions and variables:
yourmod.c (required). This is an object that is passed
into yourmod.resid() or yourmod.logp().
You can make it be anything you want, as it is only touched by your
functions. Often, it contains data or parameters for these functions,
but it can be None if you have no need for
it.
yourmod.start(arglist, c) (Not required). This is where
you initialize the Monte-Carlo iteration. If you define
yourmod.start(), it is passed the arglist, after the standard
mcmc.py flags and arguments are removed from it.
Yourmod.start() must return a list of one or more
starting vectors, (either a list of numpy arrays or a list of
sequences of floats). These starting vectors are used to seed the
iteration.
If it is not defined, mcmc.py will read a list of starting vectors in ascii format from the standard input using _read().
yourmod.init(ndim, ni, c, arglist) (Not required).
This is primarily there to open a log file. Init is passed the
remainder of the scripts argument list, after yourmod.start() looks
at it. It can do anything it wishes with the argument list. It
can return an object or None.
If it returns something other than None,
it will be used this way:
x = yourmod.init(ndim, ni, yourmod.c, arglist) x.add(parameters, iteration) # add is there to accumulate averages of parameters # or to log parameter vectors. x.finish(sys.stdout) # finish can be used to print a summary or close a log file.
Ndim is the length of the parameter vector,
ni is the number of iterations, c is
yourmod.c.
yourmod.resid(x, c) (Either yourmod.logp() or
yourmod.resid() is required.)
yourmod.logp(x, c) (Either yourmod.logp or
yourmod.resid is required.) These funtions are the thing
which is to be optimized. Yourmod.log returns the
logarithm (base e) of the probability density at position
x. (It does not need to be normalized, so it really
only needs something proportional to the probability density.)
X is a numpy array (a 1-dimensional vector), and
c is an arbitrary object that you define.
Yourmod.logp should return None if x is a silly value. Either that, or it
can raise a NotGoodPosition exception. Note that the optimizer
will tend to increase values of logp. It's a maximizer, not a
minimizer!
If you give yourmod.resid() instead, logp() will be calculated as -0.5 times the sum of the squares of the residuals. It should return a numpy vector. It may return None, or raise NotGoodPosition, which will cause the optimizer to treat the position as extremely bad.
If it is not specified, mcmc will use some crude approximation and will probably start more slowly, but should work OK. If it is not specified and yourmod.start() returns only one starting position, then it will just use an identity matrix as the covariance matrix, and it might still work, but don't expect too much.
| Classes | |
| def_logger | |
| Functions | |||
|
|||
|
|||
| Variables | |
Debug = 0
|
|
MEMORYS_WORTH_OF_PARAMETERS = 100000000.0
|
|
__package__ = Nonehash(x) |
|
Imports: sys, numpy, die, mcmc, gpkmisc, g_implements, MVN
| Trees | Indices | Help |
|
|---|
| Generated by Epydoc 3.0.1 on Thu Sep 22 04:25:02 2011 | http://epydoc.sourceforge.net |