c++ - Empty strings in vector returned from boost Spirit x3 parser

Question

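I want to scan a file for all enums (this is just an MCVE, so nothing complicated), and the names of the enums should be stored in a std::vector. I built my parser like this: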

auto const any = x3::rule<class any_id, const x3::unused_type>{"any"}
               = ~x3::space;

auto const identifier = x3::rule<class identifier_id, std::string>{"identifier"}
                      = x3::lexeme[x3::char_("A-Za-z_") >> *x3::char_("A-Za-z_0-9")];

auto const enum_finder = x3::rule<class enum_finder_id, std::vector<std::string>>{"enum_finder"}
                       = *(("enum" >> identifier) | any);

When I try to parse a string with this enum_finder into a std::vector, the std::vector also contains a lot of empty strings. Why does this parser also put empty strings into the vector?

score 2 · Accepted Answer

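I'm assuming you want to parse "enums" out of free-form text while ignoring whitespace.

What you'd really want is for ("enum" >> identifier | any) to synthesize an optional<string>. Sadly, what you get instead is variant<string, unused_type> or something like it.

The same happens when you wrap any in x3::omit[any]: it is still the same unused type.

Plan B: since you're really just parsing repeated enum IDs separated by "anything", why not use the list operator: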
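
To make the source of the empty strings concrete, here is a rough standard-library model of what appears to happen; it is not Spirit's actual implementation. The assumption is that each successful iteration of *(...) appends one std::string to the vector, and that nothing is ever written into that string when the any branch matches. The function name model_kleene_of_alternative is made up for illustration, and the model works on whole words, whereas the any in the question consumes a single character at a time (so the real parser would be expected to produce even more empty entries).

```cpp
#include <sstream>
#include <string>
#include <vector>

// Simplified model (NOT Spirit's real code) of how a repeated parser fills a
// std::vector<std::string>: every successful iteration appends one element,
// whether or not the branch that matched produced a value.
std::vector<std::string> model_kleene_of_alternative(const std::string& input) {
    std::vector<std::string> out;
    std::istringstream words(input);
    std::string word;
    while (words >> word) {
        std::string element;  // default-constructed, i.e. empty
        if (word == "enum") {
            // Models the `"enum" >> identifier` branch: it yields a string.
            if (!(words >> element))
                break;  // "enum" with nothing after it: stop
        }
        // Otherwise this models the `any` branch: it matched, but its
        // attribute is unused, so `element` stays empty.
        out.push_back(element);  // appended in both cases
    }
    return out;
}
```

With this model, the input "enum one bogus enum two" yields three elements: "one", an empty string, and "two".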

     ("enum" >> identifier) % any

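This sort of works. Now for some tweaks: let's avoid eating "any" one character at a time. In fact, we can most likely just consume whole whitespace-delimited words (note that, for printable text, +~space is equivalent to +graph):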

auto const any = x3::rule<class any_id>{"any"}
               = x3::lexeme [+x3::graph];
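
The equivalence of +~space and +graph holds for ordinary printable text, but it may be worth checking where the two character classes differ. The sketch below assumes that X3's default x3::space and x3::graph behave like std::isspace and std::isgraph in the "C" locale for 7-bit ASCII; the helper name ascii_disagreements is hypothetical.

```cpp
#include <cctype>
#include <vector>

// Collects the 7-bit ASCII codes for which "not a space" and "graphic"
// give different answers.
std::vector<int> ascii_disagreements() {
    std::vector<int> codes;
    for (int c = 0; c < 128; ++c) {
        const bool not_space = std::isspace(c) == 0;
        const bool graphic   = std::isgraph(c) != 0;
        if (not_space != graphic)
            codes.push_back(c);
    }
    return codes;
}
```

In the "C" locale the only codes returned are non-whitespace control characters (such as NUL or DEL), which ~space would accept and graph would reject under the stated assumption. For text without such characters the two spellings match the same words.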

Next, to allow several bogus words in a row, there is a trick: make the subject parser of the list optional:

       -("enum" >> identifier) % any;

This now parses correctly. See the full demo:

Demo

Live On Coliru

#include <boost/spirit/home/x3.hpp>
namespace x3 = boost::spirit::x3;

namespace parser {
    using namespace x3;
    auto any         = lexeme [+~space];
    auto identifier  = lexeme [char_("A-Za-z_") >> *char_("A-Za-z_0-9")];
    auto enum_finder = -("enum" >> identifier) % any;
}

#include <iostream>
#include <string>
#include <vector>
int main() {

    for (std::string input : {
            "",
            "  ",
            "bogus",
            "enum one",
            "enum one enum two",
            "enum one bogus bogus more bogus enum two !@#!@#Yay",
        })
    {
        auto f = input.begin(), l = input.end();
        std::cout << "------------ parsing '" << input << "'\n";

        std::vector<std::string> data;
        if (phrase_parse(f, l, parser::enum_finder, x3::space, data))
        {
            std::cout << "parsed " << data.size() << " elements:\n";
            for (auto& el : data)
                std::cout << "\t" << el << "\n";
        } else {
            std::cout << "Parse failure\n";
        }

        if (f!=l)
            std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
    }

}

Prints:

------------ parsing ''
parsed 0 elements:
------------ parsing '  '
parsed 0 elements:
------------ parsing 'bogus'
parsed 0 elements:
------------ parsing 'enum one'
parsed 1 elements:
    one
------------ parsing 'enum one enum two'
parsed 1 elements:
    one
------------ parsing 'enum one bogus bogus more bogus enum two !@#!@#Yay'
parsed 2 elements:
    one
    two
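
If the grammar from the question has to stay as it is, one fallback (not part of the answer above, just a common idiom) is to strip the empty entries after parsing. A minimal sketch, assuming the result is a std::vector<std::string>; the name drop_empty is made up for illustration:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Removes every empty string from `names`, keeping the order of the rest
// (the classic erase-remove idiom).
void drop_empty(std::vector<std::string>& names) {
    names.erase(std::remove_if(names.begin(), names.end(),
                               [](const std::string& s) { return s.empty(); }),
                names.end());
}
```

This treats the symptom only; the list-operator grammar avoids producing the empty entries in the first place.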
