c++ - boost::spirit::qi expectation parser and parser grouping: unexpected behaviour

Question

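I'm hoping someone can shine a light through my ignorance of using the > and >> operators in Spirit parsing.

I have a working grammar whose top-level rule looks like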

test = identifier >> operationRule >> repeat(1,3)[any_string] >> arrow >> any_string >> conditionRule;

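It relies on attributes to automatically assign the parsed values to a fusion-adapted struct (i.e. a boost tuple).

However, I know that once we have matched operationRule, we must continue or fail (i.e. we don't want to allow backtracking to try other rules that start with identifier).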

test = identifier >> 
           operationRule > repeat(1,3)[any_string] > arrow > any_string > conditionRule;

This results in a cryptic compiler error ('boost::Container' : use of class template requires template argument list). After futzing around a bit, the following compiles:

test = identifier >> 
           (operationRule > repeat(1,3)[any_string] > arrow > any_string > conditionRule);

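but attribute setting no longer works - my data structure contains garbage after parsing. This can be fixed by adding actions like [at_c<0>(_val) = _1], but that seems a little clunky - as well as making things slower according to the boost docs.

So, my questions are

Is it worth preventing back-tracking?
Why do I need the grouping operator ()
Does my last example above really stop backtracking after operationRule is matched (I suspect not, it seems that if the entire parser inside the (...) fails, backtracking will be allowed)?
If the answer to the previous question is /no/, how do I construct a rule that allows backtracking if operation is /not/ matched, but does not allow backtracking once operation /is/ matched?
Why does the grouping operator destroy the attribute grammar - requiring actions?

I realise this is quite a broad question - any hints that point in the right direction will be greatly appreciated!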

score 5 · Accepted Answer

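Is it worth preventing back-tracking?

Absolutely. In general, preventing backtracking is a proven way to improve parser performance:
- reduce the use of (negative) lookahead (operator !, operator - and some uses of operator &)
- order branches (operator |, operator ||, operator ^ and some uses of operator */-/+) so that the most frequent/likely branch comes first, or so that the most costly branch is tried last

Using expectation points (>) does not essentially reduce backtracking: it just disallows it. This enables targeted error messages and prevents useless 'parsing-into-the-unknown'.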
Why do I need the grouping operator ()

I'm not sure. I checked using my what_is_the_attr helper:
- ident >> op >> repeat(1,3)[any] >> "->" >> any
  Synthesized attribute:
```
fusion::vector4<string, string, vector<string>, string>
```
- ident >> op > repeat(1,3)[any] > "->" > any
  Synthesized attribute:
```
fusion::vector3<fusion::vector2<string, string>, vector<string>, string>
```
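I haven't found the need to group subexpressions using parentheses (things compile), but obviously DataT needs to be modified to match the changed layout.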
```
typedef boost::tuple<
    boost::tuple<std::string, std::string>, 
    std::vector<std::string>, 
    std::string
> DataT;
```

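The full code below shows how I'd prefer to do that, using adapted structs.

Does my above example really stop backtracking after operationRule is matched (I suspect not, it seems that if the entire parser inside the (...) fails, backtracking will be allowed)?

Absolutely. If an expectation is not met, a qi::expectation_failure<> exception is thrown. By default, this aborts the parse. You can use qi::on_error to retry, fail, accept or rethrow. The MiniXML example has very good examples of using expectation points with qi::on_error.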
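If the answer to the previous question is /no/, ~~how do I construct a rule that allows backtracking if operation is /not/ matched, but does not allow backtracking once operation /is/ matched?~~

Why does the grouping operator destroy the attribute grammar - requiring actions?

It doesn't destroy the attribute grammar, it just changes the exposed type. So, if you bind an appropriate attribute reference to the rule/grammar, you won't need semantic actions. Now, I felt there should be a way to do without the grouping ~~, so let me try it (preferably on your short self-contained sample).~~ In fact, I have found no such need. I've added a full example to help you see what is happening in my testing, without using semantic actions.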

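Full code

The full code shows 5 scenarios:

Option 1: Original without expectations

(no relevant changes)

Option 2: With expectations

Uses the modified typedef for DataT (as shown above)

Option 3: Adapted struct, without expectations

Uses a user-defined struct with BOOST_FUSION_ADAPT_STRUCT

Option 4: Adapted struct, with expectations

Modifies the adapted struct from option 3

Option 5: Lookahead hack

This one leverages a 'clever' (?) hack: it turns every >> into an expectation and detects the presence of an operationRule match beforehand. This is of course suboptimal, but it allows you to keep DataT unmodified without using semantic actions.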

Obviously, define OPTION to the desired value before compiling.

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/adapted.hpp>
#include <iostream>

namespace qi    = boost::spirit::qi; 
namespace karma = boost::spirit::karma; 

#ifndef OPTION
#define OPTION 5
#endif

#if OPTION == 1 || OPTION == 5 // original without expectations (OR lookahead hack)
    typedef boost::tuple<std::string, std::string, std::vector<std::string>, std::string> DataT;
#elif OPTION == 2 // with expectations
    typedef boost::tuple<boost::tuple<std::string, std::string>, std::vector<std::string>, std::string> DataT;
#elif OPTION == 3 // adapted struct, without expectations
    struct DataT
    {
        std::string identifier, operation;
        std::vector<std::string> values;
        std::string destination;
    };

    BOOST_FUSION_ADAPT_STRUCT(DataT, (std::string, identifier)(std::string, operation)(std::vector<std::string>, values)(std::string, destination));
#elif OPTION == 4 // adapted struct, with expectations
    struct IdOpT
    {
        std::string identifier, operation;
    };
    struct DataT
    {
        IdOpT idop;
        std::vector<std::string> values;
        std::string destination;
    };

    BOOST_FUSION_ADAPT_STRUCT(IdOpT, (std::string, identifier)(std::string, operation));
    BOOST_FUSION_ADAPT_STRUCT(DataT, (IdOpT, idop)(std::vector<std::string>, values)(std::string, destination));
#endif

template <typename Iterator>
struct test_parser : qi::grammar<Iterator, DataT(), qi::space_type, qi::locals<char> >
{
    test_parser() : test_parser::base_type(test, "test")
    {
        using namespace qi;

        quoted_string = 
               omit    [ char_("'\"") [_a =_1] ]             
            >> no_skip [ *(char_ - char_(_a))  ]
             > lit(_a); 

        any_string = quoted_string | +qi::alnum;

        identifier = lexeme [ alnum >> *graph ];

        operationRule = string("add") | "sub";
        arrow = "->";

#if OPTION == 1 || OPTION == 3   // without expectations
        test = identifier >> operationRule >> repeat(1,3)[any_string] >> arrow >> any_string;
#elif OPTION == 2 || OPTION == 4 // with expectations
        test = identifier >> operationRule  > repeat(1,3)[any_string]  > arrow  > any_string;
#elif OPTION == 5                // lookahead hack
        test = &(identifier >> operationRule) > identifier > operationRule > repeat(1,3)[any_string] > arrow > any_string;
#endif
    }

    qi::rule<Iterator, qi::space_type/*, qi::locals<char> */> arrow;
    qi::rule<Iterator, std::string(), qi::space_type/*, qi::locals<char> */> operationRule;
    qi::rule<Iterator, std::string(), qi::space_type/*, qi::locals<char> */> identifier;
    qi::rule<Iterator, std::string(), qi::space_type, qi::locals<char> > quoted_string, any_string;
    qi::rule<Iterator, DataT(),       qi::space_type, qi::locals<char> > test;
};

int main()
{
    std::string str("addx001 add 'str1'   \"str2\"       ->  \"str3\"");
    test_parser<std::string::const_iterator> grammar;
    std::string::const_iterator iter = str.begin();
    std::string::const_iterator end  = str.end();

    DataT data;
    bool r = phrase_parse(iter, end, grammar, qi::space, data);

    if (r)
    {
        using namespace karma;
        std::cout << "OPTION " << OPTION << ": " << str << " --> ";
#if OPTION == 1 || OPTION == 3 || OPTION == 5 // without expectations (OR lookahead hack)
        std::cout << format(delimit[auto_ << auto_ << '[' << auto_ << ']' << " --> " << auto_], data) << "\n";
#elif OPTION == 2 || OPTION == 4 // with expectations
        std::cout << format(delimit[auto_ << '[' << auto_ << ']' << " --> " << auto_], data) << "\n";
#endif
    }
    if (iter!=end)
        std::cout << "Remaining: " << std::string(iter,end) << "\n";
}

Output for all options:

for a in 1 2 3 4 5; do g++ -DOPTION=$a -I ~/custom/boost/ test.cpp -o test$a && ./test$a; done
OPTION 1: addx001 add 'str1'   "str2"       ->  "str3" --> addx001 add [ str1 str2 ]  -->  str3 
OPTION 2: addx001 add 'str1'   "str2"       ->  "str3" --> addx001 add [ str1 str2 ]  -->  str3 
OPTION 3: addx001 add 'str1'   "str2"       ->  "str3" --> addx001 add [ str1 str2 ]  -->  str3 
OPTION 4: addx001 add 'str1'   "str2"       ->  "str3" --> addx001 add [ str1 str2 ]  -->  str3 
OPTION 5: addx001 add 'str1'   "str2"       ->  "str3" --> addx001 add [ str1 str2 ]  -->  str3
