I have this code:

#include <iostream>
#include <string>

#include <boost/tokenizer.hpp>

typedef boost::tokenizer<boost::char_separator<char> > tokenizer;

int main() {
    using namespace std;
    boost::char_separator<char> sep(",");

    string s1 = "hello, world";
    tokenizer tok1(s1, sep);
    for (auto& token : tok1) {
        cout << token << " ";
    }
    cout << endl;

    tokenizer tok2(string("hello, world"), sep);
    for (auto& token : tok2) {
        cout << token << " ";
    }
    cout << endl;

    tokenizer tok3(string("hello, world, !!"), sep);
    for (auto& token : tok3) {
        cout << token << " ";
    }
    cout << endl;

    return 0;
}

This code produces the following output:

hello  world 
hello  
hello  world  !!

Obviously, the second line is wrong. I was expecting hello world instead. What is the problem?

1 Answer

The tokenizer does not create a copy of the string you pass as the first argument to its constructor, nor does it compute all the tokens upon construction and then cache them. Token extraction is performed in a lazy way, on demand.

However, for that to be possible, the object on which the token extraction is performed must stay alive for as long as tokens are being extracted.

Here, the string from which tokens are to be extracted is a temporary, and it is destroyed as soon as the initialization of tok2 completes (the same applies to tok3). This means you get undefined behavior when the tokenizer object tries to use its iterators into that string.

Note that tok3 gives you the expected output purely by chance: the expected output is just one of the possible outcomes of a program with undefined behavior.

Answered 2013-06-09T16:51:44.807