c++ - C ++中的加权RNG速度问题

Question

编辑：澄清一下，问题出在第二种算法上。

我有一些 C++ 代码可以从 52 张卡片组中采样卡片，效果很好：

void sample_allcards(int table[5], int holes[], int players) {
    int temp[5 + 2 * players];
    bool try_again;
    int c, n, i;

    for (i = 0; i < 5 + 2 * players; i++) {
        try_again = true;
        while (try_again == true) {
            try_again = false;
            c = fast_rand52();
            // reject collisions
            for (n = 0; n < i + 1; n++) {
                try_again = (temp[n] == c) || try_again;
            }
            temp[i] = c;
        }
    }
    copy_cards(table, temp, 5);
    copy_cards(holes, temp + 5, 2 * players);
}

我正在实现代码以根据已知分布（存储为二维表）对底牌进行采样。我的代码如下：

void sample_allcards_weighted(double weights[][HOLE_CARDS], int table[5], int holes[], int players) {
    // weights are distribution over hole cards
    int temp[5 + 2 * players];
    int n, i;

    // table cards
    for (i = 0; i < 5; i++) {
        bool try_again = true;
        while (try_again == true) {
            try_again = false;
            int c = fast_rand52();
            // reject collisions
            for (n = 0; n < i + 1; n++) {
                try_again = (temp[n] == c) || try_again;
            }
            temp[i] = c;
        }
    }

    for (int player = 0; player < players; player++) {
        // hole cards according to distribution
        i = 5 + 2 * player;
        bool try_again = true;
        while (try_again == true) {
            try_again = false;
            // weighted-sample c1 and c2 at once
            // h is a number < 1325
            int h = weighted_randi(&weights[player][0], HOLE_CARDS);
            // i2h uses h and sets temp[i] to the 2 cards implied by h
            i2h(&temp[i], h);
            // reject collisions
            for (n = 0; n < i; n++) {
                try_again = (temp[n] == temp[i]) || (temp[n] == temp[i+1]) || try_again;
            }
        }
    }

    copy_cards(table, temp, 5);
    copy_cards(holes, temp + 5, 2 * players);
}

我的问题？加权采样算法慢了 10 倍。速度对我的应用程序非常重要。

有没有办法将我的算法速度提高到更合理的程度？我在实施中做错了吗？

谢谢。

编辑：有人问我这个功能，我应该发布的，因为它是关键

inline int weighted_randi(double *w, int num_choices) {
double r = fast_randd();
double threshold = 0;
int n;

for (n = 0; n < num_choices; n++) {
    threshold += *w;
    if (r <= threshold) return n;
    w++;
}
// shouldn't get this far
cerr << n << "\t" << threshold << "\t" << r << endl;
assert(n < num_choices);
return -1;

}

...而 i2h() 基本上只是一个数组查找。

score 1 · Accepted Answer

您的拒绝冲突正在将O (n) 算法变成（我认为）O (n^2) 操作。

从一副牌中选择卡片有两种方法：洗牌和弹出，或者选择套牌直到套牌中的元素是唯一的；你正在做后者，这需要大量的回溯。

我没有看代码的细节，只是快速扫描。

score 1 · Accepted Answer

您可以通过替换所有检查卡是否使用位掩码的循环来获得一些速度，例如，对于 52 张卡的池，我们可以防止这样的冲突：

DWORD dwMask[2] = {0}; //64 bits
//...
int nCard;
while(true)
{
    nCard = rand_52();
    if(!(dwMask[nCard >> 5] & 1 << (nCard & 31)))
    {
        dwMask[nCard >> 5] |= 1 << (nCard & 31);
        break;
    }
}
//...

score 0 · Accepted Answer

我的猜测是重试循环中的 memcpy(1326*sizeof(double)) 。好像没变，所以每次都要复制吗？

score 0 · Accepted Answer

与其告诉你问题是什么，让我建议你如何找到它。要么 1) 在 IDE 中单步执行它，要么 2)随机停止它以查看它在做什么。

也就是说，如果您拒绝大多数样本，那么正如您所做的那样，通过拒绝抽样可能会花费不合理的长时间。

score 0 · Accepted Answer

一旦将 try_again 设置为 true，您的内部“try_again”for 循环就应该停止 - 在您知道需要重试之后再做更多的工作是没有意义的。

for (n = 0; n < i && !try_again; n++) {
    try_again = (temp[n] == temp[i]) || (temp[n] == temp[i+1]);
}

score 0 · Accepted Answer

回答关于从加权集合中挑选的第二个问题也有一个算法替换，它应该不那么复杂。这是基于预先计算的不需要重新计算的原则。

在普通选择中，您有整数个 bin，这使得选择一个 bin 成为O (1) 操作。您的weighted_randi函数具有实际长度的 bin，因此当前版本中的选择在O (n) 时间内运行。由于您没有说（但确实暗示）权重向量w是恒定的，因此我假设它是恒定的。

You aren't interested in the width of the bins, per se, you are interested in the locations of their edges that you re-compute on every call to weighted_randi using the variable threshold. If the constancy of w is true, pre-computing a list of edges (that is, the value of threshold for all *w) is your O(n) step which need only be done once. If you put the results in a (naturally) ordered list, a binary search on all future calls yields an O(log n) time complexity with an increase in space needed of only sizeof w / sizeof w[0].

c++ - C ++中的加权RNG速度问题

6 回答 6

Related

Reference