Boost.Random and Boost.Accumulators – Part 1


Bonjour! This is the first post of two which will discuss Boost’s random number generators and statistical accumulator functionality. They are both useful for Monte Carlo and statistical studies and so I will outline their use in turn in separate postings.

Boost.Random

To begin with, we will take a look at how to set up a random number generator for a given distribution. I will first point out which random distributions Boost comes pre-packed with, then give an example using a uniform distribution, and finally link to a wrapper class that lets you use 9 different random distributions. The official documentation for Boost.Random is always the best place to start! I hope to provide some insight into what I would deem the most common distributions, but you will come across far more with some light reading of the Boost.Random pages (Boost.Random), which, in my opinion, are some of the better documented in the Boost collection.

Why Boost.Random?

Random number generation is built on software-implemented algorithms that, given a “seed” or initial state, go on to generate a sequence of random-looking numbers. These algorithms are usually known as pseudo-random number generators (PRNGs). There are many use cases for them, including but not limited to encryption (with a cryptographically secure generator), simulations, games and modelling random processes. Of course there is no such thing as a completely random number generating algorithm: if we know the initial conditions then we can work out every number that will be produced (if we could be bothered to follow the calculation). Nevertheless, if we take a large enough ensemble of these random events, they will display similar behaviour to that of the requested distribution. I want to introduce the following distributions provided by the Boost.Random library:

  1. Uniform distribution – Constant probability density on a bounded region, f(x;a,b) with x \in (a,b). An example is some form of combinatorial background, where random combinations populate the probability space with no peaking structures, giving a flat (or uniform) distribution.
  2. Gaussian distribution – Characterised by a mean and the spread about that mean, taking two parameters f(x;\mu,\sigma). This is one of the most common statistical distributions there is!
  3. Exponential distribution – Generally used to model the waiting time between events in a Poisson process, i.e. the time we need to wait before a given event occurs. If this waiting time is unknown, it is often appropriate to think of it as a random variable having an exponential distribution.
  4. Gamma distribution – A two-parameter family f(x;\alpha,\beta), with shape \alpha and scale \beta, whose density involves the gamma function \Gamma(\alpha).
  5. Chi-squared distribution – One of the most widely used distributions in hypothesis testing and in the construction of confidence intervals. It can, for example, provide a goodness-of-fit measure for a fit to data points.
  6. Cauchy distribution (to those particle physicists out there, known as the non-relativistic Breit-Wigner distribution) – A distribution which has no mean, variance or higher moments defined and spans the range x \in (-\infty,\infty).
  7. Poisson distribution – A discrete distribution giving the probability for the number of events occurring within some time frame or spatial window. The classic example is radioactivity (nuclear decay), but there are many other applications.
  8. Binomial distribution – Again a discrete distribution; it gives the probability of getting k successes in n trials, each with success probability p. The classic example is a coin toss.
  9. Triangular distribution – A continuous distribution on a bounded interval whose density rises linearly from the lower edge a to a peak at b and then falls linearly to the upper edge c, so it is specified by 3 numbers.

For those interested, there are plenty of resources on-line, for instance see probability distributions.


Boost.Random the Code!

It all starts with the inclusion of the “boost/random.hpp” header file. This header pulls in all the classes and constructs that we need for the random number distributions. For now I will outline a very simple example of how to set up a random number distribution, and from there introduce a simple class I made that provides those listed above in a configurable way. The following example is a simple, explicit display of how to set up a uniform integer random number generator in the range 0 to 10 (both end points included).

#include <iostream>
#include "boost/random.hpp"

int main() {
    // Initialise Boost random numbers, uniform integers from min to max (inclusive)
    const int rangeMin = 0;
    const int rangeMax = 10;
    typedef boost::uniform_int<> NumberDistribution; // choose a distribution
    typedef boost::mt19937 RandomNumberGenerator;    // pick the random number generator method
    typedef boost::variate_generator< RandomNumberGenerator&, NumberDistribution > Generator;  // link the generator to the distribution

    NumberDistribution distribution( rangeMin, rangeMax );
    RandomNumberGenerator generator;
    Generator numberGenerator( generator, distribution ); // holds a reference to generator
    const unsigned int seed = 42; // any fixed value works; a time-based seed changes the sequence per run
    generator.seed( seed );       // seed with some initial value

    const int N = 100;
    for (int i = 0; i < N; ++i) {
        std::cout << numberGenerator() << std::endl; // each call to operator() draws a new number
    }
    return 0;
}

There are several typedefs in the variate_generator example above which just make for convenient coding, and each could easily be swapped for another of the types Boost provides. For instance, the random number generator is the Mersenne Twister algorithm, boost::mt19937, and the distribution is a uniform integer variate between 0 and 10, boost::uniform_int<>. Feel free to browse generators or distributions respectively for an exhaustive list of other generators and distributions. The generator and distribution are then bound together by boost::variate_generator< RandomNumberGenerator&, NumberDistribution >. Each time numberGenerator() is called (that is, its operator()), the variate_generator advances the underlying generator and so a new random value is produced. So hopefully that gives a brief overview.

What I outline next is a wrapper class I made to make these a little more accessible and easy to configure. You can download a header file called BoostRandom.hpp which configures all 9 of the distributions I outline above. To get the code and some examples, run the following:

# Download, build and run the BoostRandom examples.
wget https://dl.dropboxusercontent.com/u/88131281/BoostRandom.tar.gz
tar xzvf BoostRandom.tar.gz
cd BoostRandom
make
./bin/gauss.exe
./bin/all.exe

The first example shows a Gaussian distribution using the configurable class; the second shows how we can set up many distributions at the same time, generating various random numbers at once. Have fun with it, I hope it is useful. The only thing that must always be done is to set the parameters for the distribution you want to use: there are no defaults!

Also, do not set the template to integers. Quite simply, there is a type trait set in Boost.Random such that a uniform distribution between (0,1) always requires floating point precision. This kind of makes sense, since an integer in this range would simply be 0 or 1, which would not require much effort. The Gaussian distribution (as well as others) uses such a uniform distribution as a building block, since a transformation such as Box-Muller turns uniform variates into Gaussian ones. Hence only use this code for the types float, double and long double. Enjoy! A simple example is posted below so you can see how easy it is to configure the class.

// simple example: one draw from a Poisson distribution
#include <iostream>

// BoostRandom header
#include "BoostRandom.hpp"

using namespace boost::distribution;
int main() {
    
    // BoostRandom is templated, so you need to pass < NumberGenerator, precision type >, e.g. BoostRandom< boost::mt19937, double >.
    // That combination uses the Mersenne Twister algorithm and is the one I use most of the time, so there is a typedef for it: BoostRandomD
    BoostRandomD random( Poisson, 0 );  // 0 could be std::time(0);
    random.setParams( Poisson_Mean, 5 );  
    std::cout << random() << std::endl;
    return 0;
}

The above code would simply print out 1 random number sampled from a Poisson distribution, but I think this highlights the ease of generation. There are a few other useful functions in there, such as “simulate”, which fills STL containers directly with however many variates of your desired distribution you choose.


Example Distributions!

See the below links to view all the distributions available in the BoostRandom class.

uniform triangular poisson gauss gamma exponential chi2 cauchy binomial