algorithm - Very basic radix sort

Question

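I just wrote a simple iterative radix sort and would like to know whether I have the right idea.
Recursive implementations seem to be much more common.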

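I am sorting 4-byte integers (unsigned, to keep things simple).
I use 1 byte as the "digit", so I have 2^8 = 256 buckets.
I sort on the most significant digit (MSD) first.
After each pass I put the values back into the array in the order in which they sit in the buckets, then run the next pass.
So I end up doing 4 bucket sorts.
It seems to work for a small set of data. Since I am sorting MSD first, I am guessing it is not stable and may fail on different data.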

Am I missing anything important?

#include <iostream>
#include <vector>
#include <list>

using namespace std;

void radix(vector<unsigned>&);
void print(const vector<list<unsigned> >& listBuckets);
unsigned getMaxForBytes(unsigned bytes);
void merge(vector<unsigned>& data, vector<list<unsigned> >& listBuckets);

int main()
{
    unsigned d[] = {5,3,6,9,2,11,9, 65534, 4,10,17,13, 268435455, 4294967294,4294967293, 268435454,65537};
    vector<unsigned> v(d,d+17);

    radix(v);
    return 0;
}

void radix(vector<unsigned>& data)
{
    int bytes = 1;                                  //  How many bytes to compare at a time
    unsigned numOfBuckets = getMaxForBytes(bytes) + 1;
    cout << "Numbuckets: " << numOfBuckets << endl;
    int chunks = sizeof(unsigned) / bytes;

    for(int i = chunks - 1; i >= 0; --i) 
    {
        vector<list<unsigned> > buckets;            // lazy, wasteful allocation
        buckets.resize(numOfBuckets);

        unsigned mask = getMaxForBytes(bytes);
        unsigned shift = i * bytes * 8;
        mask = mask << shift;

        for(unsigned j = 0; j < data.size(); ++j)
        {
            unsigned bucket = data[j] & mask;       //  isolate bits of current chunk
            bucket = bucket >> shift;               //  bring bits down to least significant

            buckets[bucket].push_back(data[j]); 
        }

        print(buckets);

        merge(data,buckets);
    }
}

unsigned getMaxForBytes(unsigned bytes)
{
    unsigned max = 0;
    for(unsigned i = 1; i <= bytes; ++i)
    {
        max = max << 8;
        max |= 0xFF;
    }

    return max;
}

void merge(vector<unsigned>& data, vector<list<unsigned> >& listBuckets)
{
    int index = 0;
    for(unsigned i = 0; i < listBuckets.size(); ++i)
    {
        list<unsigned>& list = listBuckets[i];
        std::list<unsigned>::const_iterator it = list.begin();

        for(; it != list.end(); ++it)
        {
            data[index] = *it;
            ++index;
        }
    }
}

void print(const vector<list<unsigned> >& listBuckets)
{
    cout << "Printing listBuckets: " << endl;
    for(unsigned i = 0; i < listBuckets.size(); ++i)
    {
        const list<unsigned>& list = listBuckets[i];

        if(list.size() == 0) continue;

        std::list<unsigned>::const_iterator it = list.begin();  //  Why do I need std here!?
        for(; it != list.end(); ++it)
        {
            cout << *it << ", ";
        }

        cout << endl;
    }
}

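Update:
It seems to work fine in LSD form. The code can be converted by changing the chunk loop in radix so that it starts at the least significant byte, as follows:

for(int i = 0; i < chunks; ++i)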
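
For comparison, here is a minimal sketch of an LSD radix sort that replaces the per-pass vector of lists with a counting pass and a scratch array. It is not the asker's code: the name radixSortLsd and the use of std::uint32_t are assumptions made for illustration. Each pass is stable, which is what lets the least-significant-first order work.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// LSD radix sort for 32-bit unsigned integers, one byte per digit.
// radixSortLsd is a hypothetical name, not taken from the question.
void radixSortLsd(std::vector<std::uint32_t>& data)
{
    std::vector<std::uint32_t> scratch(data.size());

    for (unsigned shift = 0; shift < 32; shift += 8)
    {
        // Count how many keys fall into each of the 256 buckets.
        std::array<std::size_t, 256> count{};
        for (std::uint32_t value : data)
            ++count[(value >> shift) & 0xFFu];

        // Turn the counts into the starting offset of each bucket.
        std::size_t offset = 0;
        for (std::size_t& c : count)
        {
            std::size_t bucketSize = c;
            c = offset;
            offset += bucketSize;
        }

        // Scatter in input order, so keys with equal digits keep
        // their relative order (this is what makes the pass stable).
        for (std::uint32_t value : data)
            scratch[count[(value >> shift) & 0xFFu]++] = value;

        data.swap(scratch);
    }
}
```

Calling radixSortLsd(v) on a std::vector<std::uint32_t> sorts it in ascending order. The scratch array costs one extra copy of the input but avoids allocating 256 lists on every pass.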

score 3 · Accepted Answer

Let's look at an example with two-digit decimal numbers:

49, 25, 19, 27, 87, 67, 22, 90, 47, 91

Sorting by the first digit yields

19, 25, 27, 22, 49, 47, 67, 87, 90, 91

Next, you sort by the second digit, which yields

90, 91, 22, 25, 27, 47, 67, 87, 19, 49

That doesn't look right, does it? Or is this not what you are doing? If I have misunderstood, perhaps you could show us the code.
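
The two passes in this example can be reproduced with a small sketch. The helper bucketPass is a hypothetical name, not code from the question, and it assumes non-negative integers with decimal digits; it performs one stable bucket pass on the digit selected by place (10 for the tens digit, 1 for the ones digit).

```cpp
#include <vector>

// One stable bucket pass on a single decimal digit.
// `place` is 1 for the ones digit and 10 for the tens digit.
// bucketPass is a hypothetical helper, not code from the question.
std::vector<int> bucketPass(const std::vector<int>& input, int place)
{
    std::vector<std::vector<int>> buckets(10);
    for (int value : input)
        buckets[(value / place) % 10].push_back(value);

    // Reassemble by concatenating bucket 0 through bucket 9.
    std::vector<int> output;
    output.reserve(input.size());
    for (const std::vector<int>& bucket : buckets)
        output.insert(output.end(), bucket.begin(), bucket.end());
    return output;
}
```

Applying bucketPass(bucketPass(input, 10), 1) to the ten numbers reproduces the unsorted sequence that starts with 90, 91. Swapping the order of the passes, bucketPass(bucketPass(input, 1), 10), gives the fully sorted sequence, which is the LSD variant.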

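If you run the second bucket sort separately on each group of numbers that share the same first digit, your algorithm is equivalent to the recursive version. It is also stable. The only difference is that you perform the bucket sorts breadth-first rather than depth-first.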
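
One way to read the breadth-first idea is sketched below for 4-byte keys with one byte per digit, as in the question. This is an illustration under assumptions, not the answerer's code: the name radixSortMsdBreadthFirst and the group bookkeeping are invented here. Each pass handles one byte, but only reorders keys within a group whose higher bytes are already identical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Breadth-first MSD radix sort: each pass handles one byte, but only
// reorders keys inside groups that already share all higher bytes.
// radixSortMsdBreadthFirst is a hypothetical name used for illustration.
void radixSortMsdBreadthFirst(std::vector<std::uint32_t>& data)
{
    using Range = std::pair<std::size_t, std::size_t>;   // [first, last)
    std::vector<Range> groups;
    groups.emplace_back(std::size_t{0}, data.size());
    std::vector<std::uint32_t> scratch(data.size());

    for (int shift = 24; shift >= 0; shift -= 8)
    {
        std::vector<Range> nextGroups;
        for (const Range& group : groups)
        {
            // start[b] = number of keys in this group with digit < b.
            std::array<std::size_t, 257> start{};
            for (std::size_t i = group.first; i < group.second; ++i)
                ++start[((data[i] >> shift) & 0xFFu) + 1];
            for (std::size_t b = 0; b < 256; ++b)
                start[b + 1] += start[b];

            // Stable scatter of the group's keys into scratch.
            std::array<std::size_t, 257> cursor = start;
            for (std::size_t i = group.first; i < group.second; ++i)
                scratch[group.first + cursor[(data[i] >> shift) & 0xFFu]++] = data[i];
            for (std::size_t i = group.first; i < group.second; ++i)
                data[i] = scratch[i];

            // Buckets holding two or more keys become groups for the next byte.
            for (std::size_t b = 0; b < 256; ++b)
                if (start[b + 1] - start[b] > 1)
                    nextGroups.emplace_back(group.first + start[b],
                                            group.first + start[b + 1]);
        }
        groups.swap(nextGroups);
    }
}
```

The list of groups plays the role that the call stack plays in a recursive version: all groups at one byte position are processed before any group at the next position.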

score 2 · Accepted Answer

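You also need to make sure that each bucket is sorted from MSD down to LSD before you reassemble. Example: 19, 76, 90, 34, 84, 12, 72, 38. Sorting on the MSD into 10 buckets [0-9] gives B0=[]; B1=[19,12]; B2=[]; B3=[34,38]; B4=[]; B5=[]; B6=[]; B7=[76,72]; B8=[84]; B9=[90]. If you reassemble at this point and then sort again, it will not work. Instead, sort each bucket recursively: B1 is split into B1B2=[12]; B1B9=[19]. Once every bucket has been sorted, you can reassemble correctly.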
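
The recursive scheme described above might look like the following sketch. It assumes non-negative integers and decimal digits to match the example; msdSort is a hypothetical helper, not code from the question or the answer.

```cpp
#include <vector>

// Recursive MSD radix sort on decimal digits, for non-negative integers.
// `place` is the weight of the digit to examine: 10 for the tens digit
// of two-digit numbers, then 1 for the ones digit.
// msdSort is a hypothetical helper, not code from the question.
std::vector<int> msdSort(const std::vector<int>& input, int place)
{
    if (input.size() < 2 || place == 0)
        return input;                       // nothing left to order

    // Distribute by the current digit.
    std::vector<std::vector<int>> buckets(10);
    for (int value : input)
        buckets[(value / place) % 10].push_back(value);

    // Sort each bucket on the next lower digit, then reassemble.
    std::vector<int> output;
    output.reserve(input.size());
    for (const std::vector<int>& bucket : buckets)
    {
        std::vector<int> sorted = msdSort(bucket, place / 10);
        output.insert(output.end(), sorted.begin(), sorted.end());
    }
    return output;
}
```

For the two-digit example, the call is msdSort(values, 10): bucket B1=[19,12] is sorted on the ones digit into [12,19] before it is appended to the output.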
