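I'd like to use Boost's Regex library to split a string containing labels and numbers into tokens. For example, 'abc1def002g30' would be split into {'abc','1','def','002','g','30'}. I modified the example given in the Boost documentation to come up with the following code: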

#include <iostream>
#include <string>
#include <boost/regex.hpp>

using namespace std;

int main(int argc,char **argv){
    string s,str;
    int count;
    do{
        count=0;
        if(argc == 1)
        {
            cout << "Enter text to split (or \"quit\" to exit): ";
            getline(cin, s);
            if(s == "quit") break;
        }
        else
            s = "This is a string of tokens";

        boost::regex re("[0-9]+|[a-z]+");
        boost::sregex_token_iterator i(s.begin(), s.end(), re, 0);
        boost::sregex_token_iterator j;
        while(i != j)
        {
            str=*i;
            cout << str << endl;
            count++;
            i++;
        }
        cout << "There were " << count << " tokens found." << endl;

    }while(argc == 1);
    return 0;
}

The number of tokens stored in count is correct. However, *it contains only an empty string, so nothing is printed. Any guesses as to what I'm doing wrong?

Edit: Following the fix suggested below, I modified the code, and it now works correctly.

1 Answer

From the documentation on sregex_token_iterator:

Effects: constructs a regex_token_iterator that will enumerate one string for each regular expression match of the expression re found within the sequence [a,b), using match flags m (see match_flag_type). The string enumerated is the sub-expression submatch for each match found; if submatch is -1, then enumerates all the text sequences that did not match the expression re (that is to performs field splitting)

Since your regex matches all items (unlike the sample code, which only matched the strings), you get empty results.

Try replacing the -1 submatch argument with a 0.
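
To illustrate how the submatch argument changes what the iterator returns, here is a minimal sketch. It assumes std::regex from the C++11 standard library in place of Boost (std::sregex_token_iterator accepts the same arguments for this purpose), and split_tokens is a hypothetical helper name, not part of either library.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): collects whatever the
// token iterator yields for the given submatch index.
//   submatch == 0  -> the text of each whole match
//   submatch == -1 -> the text between matches (field splitting)
std::vector<std::string> split_tokens(const std::string& s, int submatch) {
    // Same pattern as in the question: a run of digits or a run of lowercase letters.
    static const std::regex re("[0-9]+|[a-z]+");

    std::vector<std::string> out;
    std::sregex_token_iterator it(s.begin(), s.end(), re, submatch);
    std::sregex_token_iterator end;  // default-constructed: end-of-sequence
    for (; it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

Calling split_tokens("abc1def002g30", 0) should yield the six tokens abc, 1, def, 002, g, 30. Passing -1 instead returns only the text between matches, which is empty for this input because every character belongs to some match.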

Answered 2011-05-24T17:13:03.893