c++ - Generating all possible values of uniform_real_distribution

Question

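I am currently working on importance sampling, and for testing purposes I need to be able to generate every value that uniform_real_distribution<float> may produce for the interval [0,1] (yes, it is closed on the right too). My idea was to generate integers and then convert them to floating-point numbers. From the tests I have run, there seems to be a perfect bijection between uniformly spaced single-precision floats in [0,1] and the integers in [0,2^24]. (It bothers me a little that the range is not [0,2^24-1], and I am still trying to work out why. My best guess is that 0 is simply special for floats, and that 1 through 2^24 all yield floats with the same exponent.) My question is whether the floats generated this way are exactly the floats that uniform_real_distribution<float> can generate. My integer <-> float tests are below: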

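#include <cstdint>
#include <iostream>

using uint32 = std::uint32_t; // the post uses uint32 without defining it

void floatIntegerBitsBijectionTest()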
{
    uint32 two24 = 1 << 24;
    bool bij24Bits = true;
    float delta = float(1.0) / float(two24);
    float prev = float(0) / float(two24);
    for (uint32 i = 1; i <= two24; ++i)
    {
        float uintMap = float(i) / float(two24);
        if (uintMap - prev != delta || uint32(uintMap*float(two24)) != i)
        {
            std::cout << "No bijection exists between uniform floats in [0,1] and integers in [0,2^24].\n";
            bij24Bits = false;
            break;
        }
        prev = uintMap;
    }
    if(bij24Bits) std::cout << "A bijection exists between uniform floats in [0,1] and integers in [0,2^24].\n";
    std::cout << "\n";

    uint32 two25 = 1 << 25;
    bool bij25Bits = true;
    delta = float(1.0) / float(two25);
    prev = float(0) / float(two25);
    for (uint32 i = 1; i <= two25; ++i)
    {
        float uintMap = float(i) / float(two25);
        if (uintMap - prev != delta || uint32(uintMap*float(two25)) != i)
        {
            std::cout << "No bijection exists between uniform floats in [0,1] and integers in [0,2^25].\n";
            if (i == ((1 << 24) + 1)) std::cout << "The first non-uniformly distributed float corresponds to the integer 2^24+1.\n";

            bij25Bits = false;
            break;
        }
        prev = uintMap;
    }
    if (bij25Bits) std::cout << "A bijection exists between uniform floats in [0,1] and integers in [0,2^25].\n";
    std::cout << "\n";


    bool bij25BitsS = true;
    delta = 1.0f / float(two24);
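    // Negate as a signed int: two24 is unsigned, so -two24 would wrap around
    // and the int/unsigned comparison in the loop condition would never hold.
    prev = float(-int(two24)) / float(two24);
    for (int i = -int(two24) + 1; i <= int(two24); ++i)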
    {
        float uintMap = float(i) / float(two24);
        if (uintMap - prev != delta || int(uintMap*float(two24)) != i)
        {
            std::cout << i << " " << uintMap - prev << " " << delta << "\n";
            std::cout << "No bijection exists between uniform floats in [-1,1] and integers in [-2^24,2^24].\n";
            bij25BitsS = false;
            break;
        }
        prev = uintMap;
    }
    if (bij25BitsS) std::cout << "A bijection exists between uniform floats in [-1,1] and integers in [-2^24,2^24].\n";
}

Edit:

Somewhat related:

https://crypto.stackexchange.com/questions/31657/uniformly-distributed-secure-floating-point-numbers-in-0-1

http://xoroshiro.di.unimi.it/random_real.c

https://www.reddit.com/r/programming/comments/29ducz/obtaining_uniform_random_floats_is_trickier_than/

https://lemire.me/blog/2017/02/28/how-many-floating-point-numbers-are-in-the-interval-01/

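Edit 2:

I finally managed to figure out what uniform_real_distribution<float> does, at least when it is used with the mt19937 engine and its default template arguments (I am talking about the implementation that ships with VS2017). Sadly, it simply generates a random integer in [0,2^32-1], converts it to float, and then divides it by 2^32. Needless to say, this produces non-uniformly distributed floating-point numbers. I am guessing, however, that this works for most practical purposes, unless one is working close to the precision of the deltas between the generated numbers.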
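To illustrate why that conversion is lossy, here is a minimal sketch. It only mimics the behaviour described above and does not call into the VS2017 library; the helper names `to_unit_float` and `check_rounding_examples` are hypothetical, and IEEE-754 binary32 with round-to-nearest is assumed. A float has only 24 significant bits, so a 32-bit integer is rounded during conversion: neighbouring integers collapse onto the same float, and the largest ones round up to exactly 1.0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mimicking the described behaviour:
// convert a 32-bit integer to float, then divide by 2^32.
float to_unit_float(std::uint32_t n) {
    const float two32 = 4294967296.0f; // 2^32, exactly representable
    return static_cast<float>(n) / two32;
}

// A few spot checks, assuming IEEE-754 binary32 and round-to-nearest.
void check_rounding_examples() {
    assert(to_unit_float(0u) == 0.0f);
    // 2^32 - 1 is not representable; it rounds up to 2^32, giving 1.0f.
    assert(to_unit_float(0xFFFFFFFFu) == 1.0f);
    // Near 2^31 floats are 256 apart, so 2^31 and 2^31 + 1 collide.
    assert(to_unit_float(0x80000000u) == to_unit_float(0x80000001u));
}
```

Calling `check_rounding_examples()` should pass silently under those assumptions.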

score 2 · Accepted Answer

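I will assume the C++ implementation uses the IEEE-754 32-bit basic binary format for float. In this format, the representable floating-point values in [1, 2] are regularly spaced, at a distance of 2^-23.

Define x with: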

std::uniform_real_distribution<float> x(1, 2);

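Then, assuming uniform_real_distribution is well implemented and a proper engine is used, x(engine) - 1 will generate values equal to n / 2^23 for integers n in [0, 2^23), with a uniform distribution.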
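The regular spacing can be checked directly. The sketch below assumes IEEE-754 binary32; the function name `count_floats_in_1_2` is hypothetical. It walks every representable float from 1 to 2 with `std::nextafter` and verifies that each step is exactly epsilon, so every value v in [1, 2) corresponds to the integer n = (v - 1) / 2^-23.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Walk every representable float from 1 to 2 (both included) and check
// that consecutive values differ by exactly epsilon.
// Returns the number of values visited.
std::uint32_t count_floats_in_1_2() {
    const float eps = std::numeric_limits<float>::epsilon();
    std::uint32_t count = 1; // counts 1.0f itself
    for (float v = 1.0f; v < 2.0f; ++count) {
        const float next = std::nextafter(v, 2.0f);
        assert(next - v == eps); // regular spacing
        v = next;
    }
    return count;
}
```

Under the stated assumption the function returns 2^23 + 1: the 2^23 values in [1, 2) plus the endpoint 2.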

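Notes

I have doubts about the specification of uniform_real_distribution in C++. It is defined in terms of real arithmetic. The requirement that it return values with constant probability density needs a continuous set of numbers, which a floating-point format does not provide. Also, I am not sure how implementations will treat the endpoints.

Since the distribution is forced to be discrete, one might as well draw samples from a uniform_int_distribution and multiply them by 2^-23 (available as numeric_limits<float>::epsilon()). This has the benefit of making the endpoints explicit and easily supporting either [0, 1) or [0, 1], as desired.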
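A minimal sketch of that suggestion follows. The helper name `sample_unit` and its `include_one` flag are hypothetical, and the code assumes a float with a 24-bit binary significand. It draws n from [0, 2^23 - 1] for the half-open interval, or from [0, 2^23] for the closed one, and scales it by epsilon.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical helper: draws an integer n uniformly and returns n * 2^-23.
// include_one == false -> values in [0, 1); include_one == true -> [0, 1].
template <class Engine>
float sample_unit(Engine& engine, bool include_one) {
    using lim = std::numeric_limits<float>;
    static_assert(lim::radix == 2 && lim::digits == 24,
                  "sketch assumes a 24-bit binary significand");
    const std::uint32_t steps = std::uint32_t{1} << 23;
    std::uniform_int_distribution<std::uint32_t> dist(
        0, include_one ? steps : steps - 1);
    // n <= 2^23 converts exactly, and scaling by a power of two is exact.
    const float result = static_cast<float>(dist(engine)) * lim::epsilon();
    assert(result >= 0.0f && result <= 1.0f);
    return result;
}
```

For example, with `std::mt19937 engine(42);`, the call `sample_unit(engine, false)` yields a multiple of 2^-23 strictly below 1.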

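Even if the C++ implementation does not use IEEE-754, the representable values in [1, 2] should be evenly spaced, because the C++ standard describes floating-point values as a significand¹ of some number of digits in some base, multiplied by that base raised to some power. For the power zero, the values from 1 to 2 are spaced according to the value of the least significant digit in the format. As noted above, that distance is numeric_limits<float>::epsilon().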
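In terms of numeric_limits, that spacing is radix^(1 - digits). The sketch below (function names are hypothetical) computes it with `std::scalbn`, which scales by powers of FLT_RADIX, and compares it with epsilon and with the gap above 1.0f.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Spacing predicted by the digits-and-base model: radix^(1 - digits).
float model_spacing() {
    return std::scalbn(1.0f, 1 - std::numeric_limits<float>::digits);
}

// Checks the model against epsilon and the actual gap just above 1.0f.
void check_model_spacing() {
    const float eps = std::numeric_limits<float>::epsilon();
    assert(model_spacing() == eps);
    assert(std::nextafter(1.0f, 2.0f) - 1.0f == eps);
}
```

For IEEE-754 binary32 this evaluates to 2^-23, matching the figure used in the answer.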

Footnote

¹ The C++ standard uses the legacy term "mantissa", but the preferred term is "significand".

score 1 · Accepted Answer

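You can force the issue. Roll your own random float generator.

Edit: I just discovered std::generate_canonical<float>(), which does the same thing but without relying on the magic number 24. It works that out from std::numeric_limits<float>::digits and so on...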

#include <random>

static const unsigned long big = 1 << 24;
static std::default_random_engine re;
static std::uniform_int_distribution<unsigned long> uint(0, big - 1);

float rand_float() {
    return uint(re) / static_cast<float>(big);
}
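For comparison, a sketch of the std::generate_canonical route mentioned in the edit. The wrapper name `canonical_float` is hypothetical, and the exact set of values returned depends on the library implementation, so this is not guaranteed to match the hand-rolled generator value for value.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>

// Hypothetical wrapper around std::generate_canonical that asks for as
// many random bits as the float significand holds.
float canonical_float(std::mt19937& engine) {
    constexpr std::size_t bits = std::numeric_limits<float>::digits;
    const float value = std::generate_canonical<float, bits>(engine);
    // Nominally in [0, 1). Some library versions have been reported to
    // return exactly 1.0f because of rounding, so the check is inclusive.
    assert(value >= 0.0f && value <= 1.0f);
    return value;
}
```

If the half-open interval matters, it is worth testing the result against 1.0f explicitly rather than relying on the nominal range.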
