The VEGAS algorithm of Lepage is based on importance sampling. It samples points from the probability distribution described by the function @math{|f|}, so that the points are concentrated in the regions that make the largest contribution to the integral.
In general, if the Monte Carlo integral of @math{f} is sampled with points distributed according to a probability distribution described by the function @math{g}, we obtain an estimate @math{E_g(f; N)},
@math{E_g(f; N) = E(f/g; N)}
with a corresponding variance,
@math{V_g(f; N) = V(f/g; N).}
If the probability distribution is chosen as @math{g = |f|/I(|f|)} then it can be shown that the variance @math{V_g(f; N)} vanishes, and the error in the estimate will be zero. In practice it is not possible to sample from the exact distribution @math{g} for an arbitrary function, so importance sampling algorithms aim to produce efficient approximations to the desired distribution.
The VEGAS algorithm approximates the exact distribution by making a number of passes over the integration region while histogramming the function @math{f}. Each histogram is used to define a sampling distribution for the next pass. Asymptotically this procedure converges to the desired distribution. With @math{K} bins per axis in @math{d} dimensions a full histogram would need @math{K^d} bins. To avoid this growth the probability distribution is approximated by a separable function: @math{g(x_1, x_2, ...) = g_1(x_1) g_2(x_2) ...} so that the number of bins required is only @math{Kd}. This is equivalent to locating the peaks of the function from the projections of the integrand onto the coordinate axes. The efficiency of VEGAS depends on how well this separable approximation holds. It is most efficient when the peaks of the integrand are well-localized. If an integrand can be rewritten in a form which is approximately separable, this will increase the efficiency of integration with VEGAS.
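To make the separable form concrete, the following sketch draws one point from a product of per-axis piecewise-constant densities. It is an illustration rather than the GSL implementation: the names `DIM`, `BINS`, `uniform01` and `sample_separable` are hypothetical, and the convention that every bin on an axis is chosen with equal probability (so that narrow bins carry high density) is an assumption about how such a grid is typically used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIM 2   /* number of dimensions d (illustrative) */
#define BINS 4  /* bins per axis K (illustrative) */

/* Hypothetical helper (not part of GSL): uniform deviate strictly inside
   (0,1), built from the top 52 bits of a 64-bit linear congruential step. */
static double uniform01(uint64_t *state)
{
  *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
  return ((double)(*state >> 12) + 0.5) / 4503599627370496.0; /* 2^52 */
}

/* Draw one point from g(x) = g_1(x_1) ... g_d(x_d) on the unit cube.
   Axis j is described by BINS + 1 increasing edges, so only DIM * BINS
   bins are stored.  Each bin is picked with probability 1/BINS and the
   point is placed uniformly inside it.  Returns 1/g(x), the factor that
   multiplies f(x) when forming the estimate. */
static double sample_separable(const double edges[DIM][BINS + 1],
                               uint64_t *rng, double x[DIM])
{
  double inv_g = 1.0;
  size_t j;

  for (j = 0; j < DIM; j++) {
    double t = uniform01(rng) * BINS;   /* position measured in bins */
    size_t k = (size_t)t;               /* bin index, 0 .. BINS - 1 */
    double width = edges[j][k + 1] - edges[j][k];

    assert(width > 0.0);
    x[j] = edges[j][k] + (t - (double)k) * width;
    inv_g *= BINS * width;              /* g_j(x_j) = 1 / (BINS * width) */
  }
  return inv_g;
}
```

Averaging the returned weight over many draws estimates the volume of the region (here 1), whatever the bin edges are, which is a quick sanity check on any such grid.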
VEGAS incorporates a number of additional features, and combines both stratified sampling and importance sampling. The integration region is divided into a number of "boxes", with each box getting a fixed number of points (the goal is 2). Each box can then have a fractional number of bins, but if the ratio of bins to boxes is less than two, VEGAS switches to a kind of variance reduction (rather than importance sampling).
The VEGAS algorithm computes a number of independent estimates of the integral internally, according to the iterations parameter described below, and returns their weighted average. Random sampling of the integrand can occasionally produce an estimate where the error is zero, particularly if the function is constant in some regions. An estimate with zero error causes the weighted average to break down and must be handled separately. In the original Fortran implementations of VEGAS the error estimate is made non-zero by substituting a small value (typically 1e-30). The implementation in GSL differs from this and avoids the use of an arbitrary constant: it either assigns the value a weight which is the average weight of the preceding estimates, or discards it.
The VEGAS algorithm is highly configurable. The following variables can be accessed through the gsl_monte_vegas_state struct,
The parameter alpha controls the stiffness of the rebinning algorithm. It is typically set between one and two. A value of zero prevents rebinning of the grid. The default value is 1.5.
The variable stage determines the point at which the calculation starts. Normally stage = 0, which begins with a new uniform grid and empty weighted average. Calling VEGAS with stage = 1 retains the grid from the previous run but discards the weighted average, so that one can "tune" the grid using a relatively small number of points and then do a large run with stage = 1 on the optimized grid. Setting stage = 2 keeps the grid and the weighted average from the previous run, but may increase (or decrease) the number of histogram bins in the grid depending on the number of calls available. Choosing stage = 3 enters at the main loop, so that nothing is changed, and is equivalent to performing additional iterations in a previous call.
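One way to picture the four stages is as a cascade in which each stage skips the setup steps of the stages before it. The sketch below uses a hypothetical `sketch_state` type, not gsl_monte_vegas_state, and it assumes that the bin count is also recomputed for stages 0 and 1, which the description above does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the integrator state. */
typedef struct {
  int grid_is_uniform;   /* 1 once the grid has been reset to uniform */
  int bins_resized;      /* 1 if the bin count was recomputed on entry */
  double sum_weights;    /* weighted-average accumulators */
  double weighted_sum;
} sketch_state;

/* Apply the entry steps for a given stage (0 to 3). */
static void enter_stage(sketch_state *s, int stage)
{
  assert(stage >= 0 && stage <= 3);
  s->bins_resized = 0;

  if (stage == 0) {
    s->grid_is_uniform = 1;   /* start from a new uniform grid */
  }
  if (stage <= 1) {
    s->sum_weights = 0.0;     /* start from an empty weighted average */
    s->weighted_sum = 0.0;
  }
  if (stage <= 2) {
    s->bins_resized = 1;      /* bin count may follow the call budget */
  }
  /* stage == 3: nothing is changed before the main loop */
}
```

Read this way, the "tune, then run" pattern is a short call at stage 0 followed by a long call at stage 1: the second call inherits the adapted grid but none of the rough estimates made while adapting it.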
The possible choices for the variable mode are GSL_VEGAS_MODE_IMPORTANCE, GSL_VEGAS_MODE_STRATIFIED, and GSL_VEGAS_MODE_IMPORTANCE_ONLY. This determines whether VEGAS will use importance sampling or stratified sampling, or whether it can pick on its own. In low dimensions VEGAS uses strict stratified sampling (more precisely, stratified sampling is chosen if there are fewer than 2 bins per box).
The variable verbose sets the level of information printed by VEGAS. The default value is -1, which turns off all output. A verbose value of 0 prints summary information about the weighted average and final result, while a value of 1 also displays the grid coordinates. A value of 2 prints information from the rebinning procedure for each iteration.