Random Number Generating

Vectorized Random Number Generating

The generic function rand provides vectorized random number generating. There are two main variants.

The first operates on RNG engines and generates unsigned random integers,

namespace mckl
{

template <typename RNGType>
void rand(RNGType &rng, size_t n, typename RNGType::result_type *r);

}

For example,

::mckl::RNG rng;
constexpr size_t n = 1024;
::mckl::RNG::result_type r[n];
::mckl::rand(rng, n, r);

The effect of the function call above is equivalent to the following loop,

for (size_t i = 0; i != n; ++i)
    r[i] = rng();

The results are always the same unless a Non-Deterministic Random Number Generators is used. For some RNGs implemented in MCKL, the vectorized function may have considerable performance advantage.

The second variant of rand is for generating distribution random variates,

namespace mckl
{

template <typename RNGType, typename DistributionType>
void rand(RNGType &rng, DistributionType &distribution, size_t n,
    typename DistributionType::result_type *r);

}

For example,

::mckl::RNG rng;
::mckl::NormalDistribution<double> normal(mean, sd);
constexpr size_t n = 1024;
double r[n];
::mckl::rand(rng, normal, n, r);

This is similar to the following loop,

for (size_t i = 0; i != n; ++i)
    r[i] = normal(rng);

Depending on the type of RNG and the distribution (including its parameters), the vectorized function may have superior performance. However, the results are not always the same as using a loop. The Uniform Bits Distribution and the Standard Uniform Distribution will always produce exactly the same results as using a loop. For some distributions, such as those using the inverse method, the difference is up to rounding errors. For others, completely different sequences of random numbers might be generated.

Counter-Based Random Number Generators

The development by [Salmon2011] made high performance parallel RNGs much more accessible than before. The RNGs introduced in the paper use bijection \(f_k\), such that, for a sequence \(\{c_i = i\}_{i\ge0}\), the sequence \(\{y_i = f_k(c_i)\}_{i\ge0}\) appears random. In addition, for \(k_1 \ne k_2\), \(f_{k_1}\) and \(f_{k_2}\) generate two sequences that appear statistically independent. Compared to more conventional RNGs which use recursions \(y_i = f_k(y_{i - 1})\), these counter-based RNGs are much easier to use in a parallelized environment. If \(c\), the counter, is an unsigned integer with \(b\) bits, and \(k\), the key, is an unsigned integer with \(d\) bits. Then for each \(k\), the RNG has a period \(2^b\). And there can be at most \(2^d\) independent streams. MCKL defines the following class template as the interface,

namespace mckl
{

template <typename ResultType, typename Generator>
class CounterEngine;

}

where ResultType shall be an unsigned integer type and it is the output type of the RNG engine. An instance of this class template is compatible with standard library RNG engines, and can be used as a drop-in replacement of classes such as std::mt19937. A few classes that can be used as the Generator template argument are implemented in MCKL and discussed briefly in this section. See [Salmon2011] for details of each algorithm.

AES Round Function Based Random Number Generators

AES round function based RNGs in [Salmon2011] are implemented in the following generator.

namespace mckl
{

template <typename KeySeqType>
class AESGenerator;

}

The corresponding RNG engine is,

namespace mckl
{

template <typename ResultType, typename KeySeqType>
using AESEngine = CounterEngine<ResultType, AESGenerator<KeySeqType>;

}

where KeySeqType is the class used to generate the sequences of round keys. When AESNI instructions are available, they are used for performance boost. Without going into details, there are four types of sequences of round keys implemented by MCKL,

namespace mckl
{

template <size_t Rounds = MCKL_AES128_ROUNDS>
class AES128KeySeq;

template <size_t Rounds = MCKL_AES192_ROUNDS>
class AES192KeySeq;

template <size_t Rounds = MCKL_AES256_ROUNDS>
class AES256KeySeq;

template <size_t Rounds = MCKL_ARS_ROUNDS,
    typename Constants = ARSConstants>
class ARSKeySeq;

}

The default rounds of the first three are 10, 12 and 14, respectively. And thus they are equivalent to the AES-128, AES-192, and AES-256 block ciphers, respectively. The last one is the ARS algorithm in [Salmon2011]. The default rounds is 5, instead of 7 as in the paper, but the same as its Intel MKL implementation.

The trait class Constants defines the Weyl’s sequence constants. The only restriction on this trait class is that the following expressions are valid,

constexpr uint64_t w0 = Constants::weyl::value[0];
constexpr uint64_t w1 = Constants::weyl::value[1];

The member data value will not be ODR used. The default constants are taken from the paper. Correspondingly, there are four RNG engines,

namespace mckl
{

template <typename ResultType, size_t Rounds = MCKL_AES128_ROUNDS>
using AES128Engine = AESEngine<ResultType, AES128KeySeq<Rounds>>;

template <typename ResultType, size_t Rounds = MCKL_AES192_ROUNDS>
using AES192Engine = AESEngine<ResultType, AES192KeySeq<Rounds>>;

template <typename ResultType, size_t Rounds = MCKL_AES256_ROUNDS>
using AES256Engine = AESEngine<ResultType, AES256KeySeq<Rounds>>;

template <typename ResultType, size_t Rounds = MCKL_ARS_ROUNDS,
    typename Constants = ARSConstants>
using ARSEngine = AESEngine<ResultType, ARSKeySeq<Rounds, Constants>>;

}

A few type aliases are defined for convenience.

namespace mckl
{

using AES128 = AES128Engine<uint32_t>;
using AES192 = AES192Engine<uint32_t>;
using AES256 = AES256Engine<uint32_t>;
using ARS    = ARSEngine<uint32_t>;

using AES128_64 = AES128Engine<uint64_t>;
using AES192_64 = AES192Engine<uint64_t>;
using AES256_64 = AES256Engine<uint64_t>;
using ARS_64    = ARSEngine<uint64_t>;

}

Philox

The Philox algorithm in [Salmon2011] is implemented in the following generator,

namespace mckl
{

template <typename T, size_t K, size_t Rounds = MCKL_PHILOX_ROUNDS,
    typename Constants = PhiloxConstants<T, K>>
class PhiloxGenerator;

}

The corresponding RNG engine is,

namespace mckl
{

template <typename ResultType, typename T, size_t K,
    size_t Rounds = MCKL_PHILOX_ROUNDS,
    typename Constants = PhiloxConstants<T, K>>
using PhiloxEngine =
    CounterEngine<ResultType, PhiloxGenerator<T, K, Rounds, Constants>>;

}

The template parameter Constants is a trait class that defines the Weyl’s sequence constants and the multipliers. The only restriction on this trait class is that the following expressions are valid,

// i is a compile time constant expression
constexpr T w = Constants::weyl::value[i];       // i = 0, … , K / 2 - 1
constexpr T m = Constants::multiplier::value[i]; // i = 0, … , K / 2 - 1

The member data value will not be ODR used. The defaults are taken from [Salmon2011]. Four engines are defined in MCKL,

namespace mckl
{

template <typename ResultType>
using Philox2x32Engine = PhiloxEngine<ResultType, uint32_t, 2>;

template <typename ResultType>
using Philox4x32Engine = PhiloxEngine<ResultType, uint32_t, 4>;

template <typename ResultType>
using Philox2x64Engine = PhiloxEngine<ResultType, uint64_t, 2>;

template <typename ResultType>
using Philox4x64Engine = PhiloxEngine<ResultType, uint64_t, 4>;

}

A few type aliases are defined for convenience,

namespace mckl
{

using Philox2x32 = Philox2x32Engine<uint32_t>;
using Philox4x32 = Philox4x32Engine<uint32_t>;
using Philox2x64 = Philox2x64Engine<uint32_t>;
using Philox4x64 = Philox4x64Engine<uint32_t>;

using Philox2x32_64 = Philox2x32Engine<uint64_t>;
using Philox4x32_64 = Philox4x32Engine<uint64_t>;
using Philox2x64_64 = Philox2x64Engine<uint64_t>;
using Philox4x64_64 = Philox4x64Engine<uint64_t>;

}

Threefry

The Threefry algorithm in [Salmon2011] is implemented in the following generator,

namespace mckl
{

template <typename T, size_t K, size_t Rounds = MCKL_THREEFRY_ROUNDS,
    typename Constants = ThreefryConstants<T, K>>
class ThreefryGenerator;

}

The corresponding RNG engine is,

namespace mckl
{

template <typename ResultType, typename T, size_t K,
    size_t Rounds = MCKL_THREEFRY_ROUNDS,
    typename Constants = ThreefryConstants<T, K>>
using ThreefryEngine =
    CounterEngine<ResultType, ThreefryGenerator<T, K, Rounds, Constants>>;

}

The template parameter Constants is a trait class that defines the parity constants, the rotation constants, and the permutation. The only restriction on this trait class is that the following expressions are valid,

// i, j are compile time constant expression
constexpr T k = Constants::parity::value;
constexpr int r = Constants::rotate::value[i][j];  // i = 0, … , K / 2 - 1
                                                   // j = 1, … , 8
constexpr size_t p = Constants::permute::value[i]; // i = 0, … , K - 1

The member data value will not be ODR used. The defaults are taken from the skein hash function and [Salmon2011]. Six engines are defined in MCKL,

namespace mckl
{

template <typename ResultType>
using Threefry2x32Engine = ThreefryEngine<ResultType, uint32_t, 2>;

template <typename ResultType>
using Threefry4x32Engine = ThreefryEngine<ResultType, uint32_t, 4>;

template <typename ResultType>
using Threefry2x64Engine = ThreefryEngine<ResultType, uint64_t, 2>;

template <typename ResultType>
using Threefry4x64Engine = ThreefryEngine<ResultType, uint64_t, 4>;

template <typename ResultType>
using Threefry8x64Engine = ThreefryEngine<ResultType, uint64_t, 8>;

template <typename ResultType>
using Threefry16x64Engine = ThreefryEngine<ResultType, uint64_t, 16>;

}

In addition, three engines that are equivalent to Threefish-256, Threefish-512 and Threefish-1024 block ciphers, respectively, are also defined,

namespace mckl
{

template <typename ResultType>
using Threefish256Engine = ThreefryEngine<ResultType, uint64_t, 4, 72>;

template <typename ResultType>
using Threefish512Engine = ThreefryEngine<ResultType, uint64_t, 8, 72>;

template <typename ResultType>
using Threefish1024Engine = ThreefryEngine<ResultType, uint64_t, 16, 80>;

}

A few type aliases are defined for convenience,

namespace mckl
{

using Threefry2x32  = Threefry2x32Engine<uint32_t>;
using Threefry4x32  = Threefry4x32Engine<uint32_t>;
using Threefry2x64  = Threefry2x64Engine<uint32_t>;
using Threefry4x64  = Threefry4x64Engine<uint32_t>;
using Threefry8x64  = Threefry8x64Engine<uint32_t>;
using Threefry16x64 = Threefry16x64Engine<uint32_t>;

using Threefry2x32_64  = Threefry2x32Engine<uint64_t>;
using Threefry4x32_64  = Threefry4x32Engine<uint64_t>;
using Threefry2x64_64  = Threefry2x64Engine<uint64_t>;
using Threefry4x64_64  = Threefry4x64Engine<uint64_t>;
using Threefry8x64_64  = Threefry8x64Engine<uint64_t>;
using Threefry16x64_64 = Threefry16x64Engine<uint64_t>;

using Threefish256  = Threefish256Engine<uint32_t>;
using Threefish512  = Threefish512Engine<uint32_t>;
using Threefish1024 = Threefish1024Engine<uint32_t>;

using Threefish256_64  = Threefish256Engine<uint64_t>;
using Threefish512_64  = Threefish512Engine<uint64_t>;
using Threefish1024_64 = Threefish1024Engine<uint64_t>;

}

MKL Random Number Generators

Intel MKL provides some high performance RNGs. MCKL implements a wrapper class

namespace mckl
{

template <MKL_INT BRNG, int Bits>
class MKLEngine;

}

that makes them accessible as C++11 engines. The output is either 32- or 64-bit unsigned integers. This is determined by the template parameter Bits, which can only take one of these two values. The template parameter BRNG can be any Intel MKL basic RNG that supports viRngUniformBits32 (Bits is 32) or viRngUniformBits64 (Bits is 64). Type aliases are listed below,

namespace mckl
{

using MKL_ARS5             = MKLEngine<VSL_BRNG_ARS5, 32>;
using MKL_ARS5_64          = MKLEngine<VSL_BRNG_ARS5, 64>;
using MKL_PHILOX4X32X10    = MKLEngine<VSL_BRNG_PHILOX4X32X10, 32>;
using MKL_PHILOX4X32X10_64 = MKLEngine<VSL_BRNG_PHILOX4X32X10, 64>;
using MKL_MCG59            = MKLEngine<VSL_BRNG_MCG59, 32>;
using MKL_MCG59_64         = MKLEngine<VSL_BRNG_MCG59, 64>;
using MKL_MT19937          = MKLEngine<VSL_BRNG_MT19937, 32>;
using MKL_MT19937_64       = MKLEngine<VSL_BRNG_MT19937, 64>;
using MKL_MT2203           = MKLEngine<VSL_BRNG_MT2203, 32>;
using MKL_MT2203_64        = MKLEngine<VSL_BRNG_MT2203, 64>;
using MKL_SFMT19937        = MKLEngine<VSL_BRNG_SFMT19937, 32>;
using MKL_SFMT19937_64     = MKLEngine<VSL_BRNG_SFMT19937, 64>;
using MKL_NONDETERM        = MKLEngine<VSL_BRNG_NONDETERM, 32>;
using MKL_NONDETERM_64     = MKLEngine<VSL_BRNG_NONDETERM, 64>;

}

Note that, Intel MKL RNGs perform the best when they are used to generate vectors of random numbers. These wrappers use a buffer to store such vectors. And thus they have much larger state space than usual RNGs. When there are Intel MKL routines for generating distribution random variates for one of the distributions discussed later in Continuous Distribution and Discrete Distribution, MCKL automatically uses these routines for vectorized random number generating if the RNG is one of that listed above. For example,

::mckl::MKL_MT2203 rng;
::mckl::NormalDistribution<double> normal;
normal(rng, n, r);               // MKL rountines used
::mckl::rand(rng, normal, n, r); // MKL rountines used

Note that, this is applicable when the distribution is a class in MCKL. It does not work with classes such as std::normal_distribution. This is also applicable when the distribution is not directly supported by Intel MKL, but can be easily generated using other distributions, e.g., the Student’s t-distribution. In addition, it is also applicable if a distribution is a special case of one of the distributions supported by Intel MKL, e.g., the \(\chi^2\)-distribution.

Non-Deterministic Random Number Generators

If RDRAND instructions are supported, MCKL also implements three non-deterministic RNGs, RDRAND16, RDRAND32 and RDRAND64. They output 16-, 32-, and 64-bit random integers, respectively. RDRAND instructions may not return a random integer at all. The RNG engine keeps trying until it succeeds. One can limit the maximum number of trials by defining the configuration macro MCKL_RDRAND_NTRIAL_MAX. A value of zero, the default, means that the number of trials is unlimited. If it is a positive number, and if after the specified number of trials no random integer is returned by RDRAND instructions, zero is returned.

Seeding Random Number Generators

The following singleton class template,

namespace mckl
{

template <typename ResultType,
    typename ID = std::integral_constant<size_t, sizeof(ResultType)>,
    bool Randomize = true, bool Atomic = true>
class SeedGenerator;

}

can be used to generate distinctive seeds. The method instance returns a reference to the singleton. For example,

// Generate integers
auto &seed = SeedGenerator<unsigned>::instance();

// Generate keys for ARS
auto &keys = SeedGenerator<ARS::key_type>::instance();

Different combinations of the template parameters will create different instances of the singleton. To generate new seeds,

::mckl::RNG rng1(seed.get()); // Construct with a random seed
::mckl::RNG rng2(seed.get()); // Construct with a different random seed
::mckl::ARS ars1(keys.get()); // Construct with a random key
::mckl::ARS ars2(keys.get()); // Construct with a different random key

The procedure for generating the seeds is described here. Let \(N\) be the total number of bits of ResultType, that is,

constexpr int N = sizeof(ResultType) * CHAR_BIT;

If \(N\) is not a multiple of 32, then a compile time error will be raised. Otherwise, let \(S = \min\{N,64\}\) and \(M = N / S\). The generator keeps an \(S\)-bit unsigned integer as its internal counter, say \(c\). Use the set method to set the value of this internal integer. For example,

seed.set(101);

Each time a new seed is requested, this counter is incremented. If the template parameter Atomic is true, then this increment is atomic and thread-safe. That is, let the old value be \(s\),

\[\begin{split}s &\leftarrow c,\\ c &\leftarrow c + 1.\end{split}\]

The generator first calculates an \(N\)-bit unsigned integer \(t\) as the following.

\[\begin{split}t = \begin{cases} (s \bmod m)p + r & \text{if } M = 1 \\ s + 2^{N - S}r & \text{otherwise} \end{cases},\end{split}\]

where \(m = (2^S - 1 - r) / p + 1\), \(p = 1\) and \(r = 0\) by default. To change their values, use the partition method. For example,

seed.partition(10, 3); // p = 10, r = 3

Last, if Randomize is false, then \(t\) is returned as the requested seed, with possible reordering of bytes on big-endian platforms such that the results are exactly the same as on little-endian platforms. Otherwise, i.e., Randomize is true, then \(t\) is transformed through a randomize function. If \(N = 32\), then it is transformed using a 32-bit Speck block cipher with a zero key. If \(N = 2^W\), \(W = 5,\dots,10\), then it is transformed with the bijection of Threefry2x32, Threefry2x64, Threefry4x64, Threefry8x64, and Threefry16x64 RNGs with zero keys, respectively. Otherwise, it is transformed with the Skein-512 hash function.

It is clear that, if \(N = 2^W\), \(W = 5,\dots,10\), or Randomize is false, then the seeds generated are always distinctive if the values of the internal counter \(c\) are distinctive. Therefore, the seed generator has a period of \(\lfloor 2^S / p \rfloor\) if \(M = 1\) and \(2^S\) if \(M > 1\). Moreover, the values of \(t\) belongs to the equivalent class \(t \equiv r \pmod{p}\) if \(M = 1\) and \(\lfloor t / 2^{N - S} \rfloor \equiv r\) if \(M > 1\). Therefore, using the partition method, one can generate distinctive seeds across multiple computing nodes or multiple programs.

One can save and restore the seed generator using standard library streams. For example,

std::ifstream is("seed.txt");
if (is)
    is >> ::mckl::Seed<RNG>::instance();    // Read seed from a file
if (!is)
    ::mckl::Seed<RNG>::instance().set(101); // Set it manually
is.close();
// Using Seed
std::ofstream os("seed.txt");
os << Seed<RNG>::instance();        // Write the seed to a file
os.close();

This is useful when one need to run a simulation program multiple times, but need a different set of seeds for each run.

Last, the following class is defined for convenience,

template <typename RNGType>
class Seed;

which is a derived class of the following generator,

::mckl::SeedGenerator<typename SeedType<RNGType>>;

where SeedType is an alias to SeedTrait<RNGType>::type,

namespace mckl
{

template <typename RNGType>
class SeedTrait
{
    public:
    using type = unsigned;
};

template <typename ResultType, typename Generator>
class SeedTrait<CounterEngine<ResultType, Generator>>
{
    public:
    using type = typename CounterEngine<ResultType, Generator>::key_type;
};

}

Therefore, for most RNGs, the unsigned integers are generated as seeds and they share the same internal counter \(c\). For counter-based RNGs implemented in MCKL, keys will be generated as seeds and RNGs with the same key width will share the same internal counter \(c\).

Using Multiple Random Number Generators

The class template RNGSet can be used to manage multiple RNG instances within a parallel program. Three of them are implemented in MCKL. They all have the same interface,

::mckl::RNGSet<RNG> rng_set(N); // A set of N RNGs
rng_set.resize(n);              // Change the size of the set
rng_set[i];                     // Get a reference to the i-th RNG
rng_set.reset();                // Re-seed each RNG in the set

The reset method use Seed<RNG> discussed earlier to generate new seeds.

The first implementation is RNGSetScalar. As its name suggests, it is only a wrapper of a single RNG. All calls to rng_set[i] returns a reference to the same RNG. It is only useful when an RNGSet interface is required while the thread-safety and other issues are not important. The second implementation is RNGSetVector. It is an array of RNGs with length \(N\). It has memory cost \(O(N)\). Many of the Counter-Based Random Number Generators have small state sizes and thus for moderate \(N\), this cost is not significant. The method calls rng_set[i] and rng_set[j] return independent RNGs if \(i \ne j\). If Intel TBB is available, there is a third implementation, RNGSetTBB, which uses thread-local storage (TLS). It has much smaller memory footprint than RNGSetVector while maintains better thread-safety. The type alias RNGSet is defined to be RNGSetTBB if tbb is available. Otherwise it is defined to be RNGSetVector.

Uniform Bits Distribution

The class template,

namespace mckl
{

template <typename UIntType>
class UniformBitsDistribution;

}

is similar to the standard library’s std::independent_bits_engine, except that it always generates full size random integers and UIntType must have a size of at least of 16 bits. That is, let \(W\) be the number of bits of UIntType, then the output is uniform on the set \(\{0,\dots,2^W - 1\}\). For example,

::mckl::UniformBitsDistribution<uint32_t> ubits;
ubits(rng); // Return 32-bit random integers

Let \(r_{\mathrm{min}}\) and \(r_{\mathrm{max}}\) be the minimum and maximum of the random integers generated by rng. Let \(R = r_{\mathrm{max}} - r_{\mathrm{min}} + 1\). Let \(r_i\) be consecutive output of rng(). If there exists an integer \(V > 0\) such that \(R = 2^V\), then the result is,

\[U = \sum_{k = 0}^{K - 1} (r_k - r_{\mathrm{min}}) 2^{kV} \bmod 2^W\]

where \(K = \lceil W / V \rceil\). Unlike std::independent_bits_engine, the calculation can be vectorized, which leads to better performance. Note that, all constants in the algorithm are computed at compile-time and the summation is fully unrolled. There is no runtime overhead. In the case \(r_{\mathrm{min}} = 0\) and \(V = W\), most optimizing compilers shall be able to generate instructions such that the distribution does exactly nothing and returns the results of rng() directly. If there does not exist an integer \(V > 0\) such that \(R = 2^V\), then std::indepdent_bits_engine is used.

Standard Uniform Distribution

MCKL provides five standard uniform distributions. They are all class templates with a single template type parameter RealType. The random integers produced by RNGs are transferred to 32- or 64-bit random integers through the class UniformBitsDistribution before they are mapped to floating point numbers within the interval \([0, 1]\). The integer type depends on RealType and the range of the RNG, \(R\). If \(\log_2 R \ge 64\) or RealType is long double, then the integer type is uint64_t. If MCKL_U01_USE_64BITS_DOUBLE is true and RealType is double, then the integer type is also uint64_t. Otherwise, the integer type is uint32_t.

In the following, let \(W\) be the number of bits of the integer type, and \(M\) be the number of significant bits (including the implicit one) of RealType. We also denote the input random integers as \(U\) and the output random real numbers as \(X\). The type U01Distribution is aliased to U01CanonicalDistribution if the MCKL_U01_USE_FIXED_POINT is set to false (the default). Otherwise it is aliased to U01CODistribution.

Canonical Form

The class template,

namespace mckl
{

template <typename RealType = double>
class U01CanonicalDistribution;

}

implements the uniform distribution on \([0, 1)\). It is implemented through the mapping,

\[\begin{split}P & = \lfloor (W + M - 1) / W \rfloor,\\ K & = \max\{1, P\},\\ X & = \sum_{k=0}^{K - 1} U_k 2^{-(K - k)W}\end{split}\]

This is equivalent to the standard library std::generate_canonical. The minimum and maximum are \(0\) and \(1 - 2^{-KW}\), respectively.

Closed-Closed Interval

The class template,

namespace mckl
{

template <typename RealType = double>
class U01CCDistribution;

}

implements the uniform distribution on \([0, 1]\) through the mapping,

\[\begin{split}P &= \min\{W - 1, M\},\\ V &= \begin{cases} U & \text{if } P + 1 < W \\ \lfloor (U \bmod 2^{W - 1}) / 2^{W - P -2} \rfloor & \text{otherwise} \end{cases},\\ Z &= (V \bmod 2) + V,\\ X &= 2^{-(P + 1)} Z.\end{split}\]

The minimum and maximum are \(0\) and \(1\), respectively.

Closed-Open Interval

The class template,

namespace mckl
{

template <typename RealType = double>
class U01CODistribution;

}

implements the uniform distribution on \([0, 1)\) through the mapping,

\[\begin{split}P &= \min\{W, M\},\\ V &= \lfloor U / 2^{W - P} \rfloor,\\ X &= 2^{-P} V.\end{split}\]

The minimum and maximum are \(0\) and \(1 - 2^{-P}\), respectively.

Open-Closed Interval

The class template,

namespace mckl
{

template <typename RealType = double>
class U01OCDistribution;

}

implements the uniform distribution on \((0, 1]\) through the mapping,

\[\begin{split}P &= \min\{W, M\},\\ V &= \lfloor U / 2^{W - P} \rfloor,\\ X &= 2^{-P} V + 2^{-P}.\end{split}\]

The minimum and maximum are \(2^{-P}\) and \(1\), respectively.

Open-Open Interval

The class template,

namespace mckl
{

template <typename RealType = double>
class U01CODistribution;

}

implements the uniform distribution on \((0, 1)\) through the mapping,

\[\begin{split}P &= \min\{W + 1, M\},\\ V &= \lfloor U / 2^{W + 1 - P} \rfloor,\\ X &= 2^{-(P - 1)} V + 2^{-P}.\end{split}\]

The minimum and maximum are \(2^{-P}\) and \(1 - 2^{-P}\), respectively.

Continuous Distribution

Arcsine Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class ArcsineDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;\alpha,\beta) = \frac{1}{\pi\sqrt{(x - \alpha)(\beta - x)}},\\ x \in [a, b],\quad a \in (0, \infty),\quad b \in (0, \infty),\end{split}\]

using the inverse method.

Beta Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class BetaDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;\alpha,\beta) = \frac{\Gamma(\alpha + \beta)} {\Gamma(\alpha)\Gamma(\beta)} x^{\alpha - 1}(1 - x)^{\beta - 1},\\ x \in (0, 1),\quad \alpha \in (0, \infty),\quad \beta \in (0, \infty).\end{split}\]

The specific algorithm used depends on the parameters. If \(\alpha = 1/2\) and \(\beta = 1/2\), or \(\alpha = 1\) or \(\beta = 1\), then the inverse method is used. If \(\alpha > 1\) and \(\beta > 1\), the method in [Cheng1978] is used. Otherwise, let \(K = 0.852\), \(C = -0.956\), and \(D = \beta + K\alpha^2 + C\). If \(\alpha < 1\), \(\beta < 1\) and \(D \le 0\), then Jöhnk’s method [Devroye1986] (sec. 3.5) is used. In all other cases, one of the switching algorithms in [Atkinson1979] is used.

Cauchy Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class CauchyDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;a,b) = \frac{1}{\pi b\Bigl(1 + \Bigl(\frac{x - a}{b}\Bigr)^2\Bigr)},\\ x \in \mathbb{R},\quad a \in \mathbb{R},\quad b \in (0,\infty),\end{split}\]

using the inverse method.

\(\chi^2\)-Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class ChiSquaredDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;n) = \frac{x^{n/2 - 1}\mathrm{e}^{-x/2}}{2^{n/2}\Gamma(n/2)},\\ x \in (0, \infty),\quad n \in (0, \infty).\end{split}\]

The implementation uses the fact that if \(X\) is a Gamma random variable with shape \(n / 2\) and scale \(2\), then \(X\) is also \(\chi^2\)-distributed with degree of freedom \(n\).

Exponential Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class ExponentialDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;\lambda) = \lambda\mathrm{e}^{-\lambda x},\\ x \in [0, \infty),\quad \lambda \in (0, \infty),\end{split}\]

using the inverse method.

Extreme Value Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class ExtremeValueDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;a,b) = \frac{1}{b} \exp\Bigl\{ \frac{a - x}{b} - \exp\Bigl\{\frac{a - x}{b}\} \Bigr\},\\ x \in \mathbb{R},\quad a \in \mathbb{R},\quad b \in (0, \infty),\end{split}\]

using the inverse method.

Fisher’s F-distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class FisherFDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;m,n) = \frac{\Gamma\Bigl(\frac{m + n}{2}\Bigr)} {\Gamma\Bigl(\frac{m}{2}\Bigr)\Gamma\Bigl(\frac{n}{2}\Bigr)} \Bigl(\frac{m}{n}\Bigr)^{m/2} x^{m / 2 - 1} \Bigl(1 + \frac{m}{n}x\Bigr)^{-(m + n) / 2} \\ x \in [0, \infty),\quad m \in (0, \infty),\quad n \in (0, \infty).\end{split}\]

The implementation uses the fact that if \(U\) and \(V\) are \(\chi^2\)-distributed random variables with degrees of freedom \(m\) and \(n\), respectively, and they are independent, then \(X = (U / V)(m / n)\) is a Fisher’s F-distributed random variable with the respective degrees of freedom.

Gamma Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class GammaDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;\alpha,\beta) = \frac{\mathrm{e}^{-x/\beta}}{\Gamma(\alpha)} \beta^{-\alpha}x^{\alpha-1},\\ x \in (0, \infty),\quad \alpha \in (0, \infty),\quad \beta \in (0, \infty).\end{split}\]

The specific algorithm used depends on the parameters. If \(\alpha = 1\), it becomes the exponential distribution. If \(0 < \alpha < 0.6\), it is generated through transformation of exponential power distribution [Devroye1986] (sec 2.6). If \(0.6\le\alpha<1\), then rejection method from the Weibull distribution is used [Devroye1986] (sec. 3.4). If \(\alpha > 1\), then the method in [Marsaglia2000vq] is used.

Laplace Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class LaplaceDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;a,b) = \frac{1}{2b}\exp\Bigl\{-\frac{\lvert{x - a}\rvert}{b}\Bigr\},\\ x \in \mathbb{R},\quad b \in (0, \infty),\end{split}\]

using the inverse method.

Levy Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class LevyDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;a,b) = \sqrt{\frac{b}{2\pi}} \frac{\exp\Bigl\{-\frac{b}{2(x - a)}\Bigr\}}{(x - a)^{3/2}},\\ x \in [a, \infty),\quad a \in \mathbb{R},\quad b \in (0, \infty).\end{split}\]

The implementation uses the fact that if \(Z\) is a standard Normal random variable, then \(X = a + b / Z^2\) is Levy distributed with location \(a\) and scale \(b\).

Logistic Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class LaplaceDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;a,b) = \frac{1}{4b}\mathrm{sech}^2\Bigl(\frac{x - a}{2b}\Bigr),\\ x \in \mathbb{R},\quad a \in \mathbb{R},\quad b \in (0, \infty),\end{split}\]

using the inverse method.

Log-Normal Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class LognormalDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;m,s) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\Bigl\{-\frac{(\ln x - m)^2}{2\sigma^2}\Bigr\},\\ x \in (0, \infty),\quad m \in \mathbb{R},\quad s \in (0, \infty).\end{split}\]

The implementation uses the fact that if \(Z\) is a standard Normal random variable, then \(X = \exp\{m + sZ\}\) is Log-normal distributed with location \(m\) and scale \(s\).

Normal Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class NormalDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Bigl\{-\frac{(x-\mu)^2}{2\sigma^2}\Bigr\},\\ x \in \mathbb{R},\quad \mu \in \mathbb{R},\quad \sigma \in (0, \infty),\end{split}\]

using the Box-Muller method [Box1958].

Pareto Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class ParetoDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;a,b) = \frac{a b^a}{x^{a + 1}},\\ x \in [b, \infty),\quad a \in [0, \infty),\quad b \in [0, \infty),\end{split}\]

using the inverse method.

Rayleigh Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class RayleighDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;\sigma) = \frac{x}{\sigma^2}\exp\Bigl\{-\frac{x^2}{2\sigma^2}\Bigr\},\\ x \in [0, \infty),\quad \sigma \in (0, \infty),\end{split}\]

using the inverse method.

Stable Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class StableDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;\alpha,\beta,a,b) = \frac{1}{2\pi}\int_{\infty}^{\infty} \varphi(t;\alpha,\beta,a,b)\mathrm{e}^{-ixt} \,dt\\ \varphi(t;\alpha,\beta,a,b) = \exp\{ ita - \lvert{bt}\rvert^{\alpha} (1 - i\beta\mathrm{sgn}(t)\Phi(t;\alpha)) \} \\ \Phi(t;\alpha) = \begin{cases} \tan\Bigl(\frac{\pi}{2}\alpha\Bigr) & \alpha \ne 1 \\ -\frac{2}{\pi}\log\lvert{t}\rvert & \alpha = 1 \end{cases}, \\ x \in \mathbb{R},\quad \alpha \in (0, 2],\quad \beta \in [-1, 1],\quad a \in \mathbb{R},\quad b \in (0, \infty).\end{split}\]

The implementation uses the method in [Chambers1976].

Student’s t-Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class StudentTDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;n) = \frac{\Gamma\Bigl(\frac{n + 1}{2}\Bigr)} {\sqrt{n\pi}\Gamma\Bigl(\frac{n}{2}\Bigr)} \Bigl(1 + \frac{x^2}{n}\Bigr)^{-(n + 1)/2},\\ x \in \mathbb{R},\quad n \in (0, \infty).\end{split}\]

The implementation uses the fact that if \(Z\) is a standard Normal random variable, \(V\) is a \(\chi^2\)-distributed random variable with degree of freedom \(n\), and they are independent, then \(X = Z/\sqrt{V / n}\) is Student’s t-distributed with the respective degree of freedom.

Uniform Real Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class UniformRealDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;a,b) = \frac{1}{b - a},\\ x \in [a, b),\quad a \in \mathbb{R},\quad b \in (a, \infty),\end{split}\]

using the inverse method.

Weibull Distribution

The class template,

namespace mckl
{

template <typename RealType = double>
class WeibullDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x;a,b) = \frac{a}{b}\Bigl(\frac{x}{b}\Bigr)^{a - 1} \exp\Bigl\{-\Bigl(\frac{x}{b}\Bigr)^a\Bigr\},\\ x \in [0, \infty),\quad a \in (0, \infty),\quad b \in (0, \infty),\end{split}\]

using the inverse method.

Discrete Distribution

Bernoulli Distribution

The class template,

namespace mckl
{

template <typename IntType = bool>
class BernoulliDistribution;

}

implements the distribution with PDF,

\[\begin{split}\mathbb{P}(X = k;p) = kp + (1 - k)(1 - p),\\ k \in \{0, 1\},\quad p \in [0, 1].\end{split}\]

Unlike other discrete distributions, the Bernoulli distribution supports any integer type, while others require an integer type with size larger than 16 bits. The implementation uses the simple fact that if \(U\) is a standard uniform random variable, than \(\mathbb{I}_{[0,p)}(U)\) is Bernoulli distributed with success probability \(p\). This is not a drop-in replacement for std::bernoulli_distribution, which is not a class template.

Geometric Distribution

The class template,

namespace mckl
{

template <typename IntType = int>
class GeometricDistribution;

}

implements the distribution with PDF,

\[\begin{split}\mathbb{P}(X = k;p) = p(1-p)^k,\\ k \in \mathrm{N},\quad p \in (0, 1].\end{split}\]

The implementation uses the fact that if \(U\) is a standard uniform random variable, then \(X = \lfloor{\ln U / \ln(1-p)}\rfloor\) is a Geometric random variable with success probability \(p\).

Uniform Integer Distribution

The class template,

namespace mckl
{

template <typename IntType = int>
class UniformIntDistribution;

}

implements the distribution with PDF,

\[\begin{split}\mathbb{P}(X = k;a,b) = \frac{1}{b - a + 1},\\ k \in \{a,\dots,b\},\quad a \in \mathbb{Z},\quad b \in \{x \in \mathbb{Z} \mid x \ge a\}.\end{split}\]

The specific algorithm used depends on the parameters. If \(a = b\), then it simply returns \(a\). If \(b - a + 1 = 2^W\), where \(W\) is the number of bits of IntType, then UniformBitsDistribution is used (see Uniform Bits Distribution). If \(\max\{\lvert{a}\rvert, b\} < 2^{32}\), then it uses the fact that if \(U\) is a standard uniform random variable, then \(X = \lfloor{a + (b - a + 1) U}\rfloor\) is uniform on the set \(\{a,\dots,b\}\). Otherwise the standard library is used.

Discrete Distribution

The class template,

namespace mckl
{

template <typename IntType = int>
class DiscreteDistribution;

}

implements the discrete distribution. It supports the same interface as the standard library with two modificiations. First, the probabilities method return an mckl::Vector<double> object instead of std::vector<double>. Second, it support the following additonal callable operator,

template <typename RNGType, typename InputIter>
result_type operator()(RNGType &rng, InputIter first, InputIter last,
    bool normalized = false) const;

Where the iterators points to the range of probabilies. The optional argument normalized specify if they are normalized. This operator has \(O(1)\) memory cost and \(O(n)\) runtime cost.

Sampling Distribution

The class template,

namespace mckl
{

template <typename IntType>
class SamplingDistribution;

}

implements the sampling distribution with PDF,

\[\begin{split}\mathbb{P}(S = s; n, m) = \binom{n}{m}^{-1} \\ s \in \{\text{size } m \text{ subsets of }\{0,\dots,n-1\}\},\quad n\in\mathbb{N},\quad m\in\mathbb{N},\quad 0 < m \le n\quad.\end{split}\]

Multivariate Distribution

Dirichlet Distribution

The class template,

namespace mckl
{

template <typename RealType = double, size_t Dim = Dynamic>
class DirichletDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x_{1:d};\alpha_{1:d}) = \frac{\Gamma\Bigl(\sum_{i=1}^d\alpha_i\Bigr)} {\prod_{i=1}^d\Gamma(\alpha_i)} \prod_{i=1}^d x_i^{\alpha_i - 1},\\ \sum_{i=1}^d x_i = 1,\quad x_{1:d}\in(0,1)^d,\quad \alpha_{1:d}\in(0,\infty)^d.\end{split}\]

The template parameter Dim is the dimension of the distribution. If it is positive, then the dimension is fixed. The distribution generator can be constructed by,

::mckl::DirichletDistribution<double, Dim> dirichlet(alpha);

and if it is zero (recall that Dynamic is just an enumerator with value zero), then the dimension has to be specified at runtime. The distribution generator can be constructed by,

::mckl::DirichletDistribution<double> dirichlet(dim, alpha);

The parameter alpha can be either a pointer to a \(d\)-vector or a scalar. If it is a scalar, say \(\alpha\), then \(\alpha_i = \alpha\) for \(i = 1,\dots,d\). To generate one random variate,

dirichlet(rng, r);
::mckl::rand(rng, dirichlet, r);

where the output parameter r is a pointer to a \(d\)-vector. Vectorized generating is also possible,

dirichlet(rng, n, r);
::mckl::rand(rng, dirichlet, n, r);

where the output parameter r is a pointer to an \(n \times d\) matrix of row major order.

Multivariate Normal Distribution

The class template,

namespace mckl
{

template <typename RealType = double, size_t Dim = Dynamic>
class NormalMVDistribution;

}

implements the distribution with PDF,

\[\begin{split}f(x_{1:d};\mu_{1:d},\Sigma) = \frac{1}{\sqrt{(2\pi)^d\lvert{\Sigma}\rvert}} \exp\Bigl\{ -\frac{1}{2}(x_{1:d} - \mu_{1:d})^{\mathrm{T}}\Sigma^{-1}(x_{1:d} - \mu_{1:d}) \Bigr\},\\ x_{1:d}\in\mathbb{R}^d,\quad \mu_{1:d}\in\mathbb{R}^d,\quad \Sigma\in\{\text{positive semi-definite }d \times d\text{ matrix}\}.\end{split}\]

At the time of writing, only float and double are supported types for the template parameter RealType. The second template parameter Dim specify the dimension of the distribution, \(d\). If Dim is positive, then the dimension is fixed. The distribution generator can be constructed by,

::mckl::NormalMVDistribution<double, Dim> normal_mv(mean, chol);

Otherwise, if Dim is zero, the dimension has to be specified at runtime. The distribution generator can be constructed by,

::mckl::NormalMVDistribution<double> normal_mv(d, mean, chol);

In either case, the parameter mean is a pointer to the mean vector of length \(d\), and chol is a pointer to the lower triangular of the Cholesky decomposition of the covariance matrix packed row by row. For those unfamiliar with matrix storage schemes, this means that, chol is a vector of length \(d(d + 1) / 2\). Let \(L\) be the lower triangular of the Cholesky decomposition, that is \(LL^{\mathrm{T}} = \Sigma\), then the vector is \((L_{1,1},L_{2,1},L_{2,2},\dots,L_{d,d})^{\mathrm{T}}\). Further, both mean and chol can also be scalars instead of pointers to vectors. If mean is a scalar, say \(\mu\), then the mean vector is assumed to be a \(d\)-vector with all elements equal to \(\mu\). If chol is a scalar, say \(\sigma\), then it is assumed that \(L = \sigma I_d\) and thus the covariance matrix is \(\Sigma = \sigma^2 I_d\), where \(I_d\) is the identity matrix. To generate a single multivariate Normal random number,

normal_mv(rng, r);
::mckl::rand(rng, normal_mv, r);

where the output parameter r is a pointer to a \(d\)-vector. Vectorized generating is also possible,

normal_mv(rng, n, r);
::mckl::rand(rng, normal_mv, n, r);

where the output parameter r is a pointer to an \(n \times d\) matrix of row major order.

[Atkinson1979]Atkinson, A.C. (1979). “A family of switching algorithms for the computer generation of beta random variables.” Biometrika, 66(1), 141–145.
[Box1958]Box, G.E.P., & Muller, M.E. (1958). “A note on the generation of random Normal deviates.” The Annals of Mathematical Statistics, 29(2), 610–611.
[Chambers1976]Chambers, J.M., Mallows, C.L., & Stuck, B.W. (1976). “A method for simulating stable random variables.” Journal of the American Statistical Association, 71(354), 340–344.
[Cheng1978]Cheng, R.C.H. (1978). “Generating Beta variates with nonintegral shape parameters.” Communications of the ACM, 21(4), 317–322.
[Devroye1986](1, 2, 3) Devroye, L. (1986). Non-Uniform Random Variate Generation. New York, NY: Springer New York.
[Salmon2011](1, 2, 3, 4, 5, 6, 7, 8) Salmon, J.K., Moraes, M.A., Dror, R.O., & Shaw, D.E. (2011). “Parallel random numbers: As easy as 1, 2, 3.” Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, 1–12.