c++ - How do I use a slash in a Spirit Lex pattern?

Question

The code below compiles fine with

clang++ -std=c++11 test.cpp -o test

but at runtime it throws an exception:

terminate called after throwing an instance of 'boost::lexer::runtime_error'
  what():  Lookahead ('/') is not supported yet.

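The problem is the slash (/) in the regular expression and/or the input (lines 12 and 39 of the original listing: the definitions of regex and first), but I can't find out how to escape it properly. Any hints?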

#include <string>
#include <cstring>
#include <iostream>   // std::cout, std::endl used in main()
#include <boost/spirit/include/lex.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>

namespace lex        = boost::spirit::lex;
namespace qi         = boost::spirit::qi;
namespace phoenix    = boost::phoenix;

std::string regex("FOO/BAR");

template <typename Type>
struct Lexer : boost::spirit::lex::lexer<Type> {
    Lexer() : foobar_(regex) {
        this->self.add(foobar_);
    }
    boost::spirit::lex::token_def<std::string> foobar_;
};

template <typename Iterator, typename Def>
struct Grammar
  : qi::grammar <Iterator, qi::in_state_skipper<Def> > {
    template <typename Lexer> Grammar(const Lexer & _lexer);
    typedef qi::in_state_skipper<Def> Skipper;
    qi::rule<Iterator, Skipper> rule_;
};
template <typename Iterator, typename Def>
template <typename Lexer>
Grammar<Iterator, Def>::Grammar(const Lexer & _lexer)
  : Grammar::base_type(rule_) {
    rule_ = _lexer.foobar_;
}

int main() {
    // INPUT
    char const * first("FOO/BAR");
    char const * last(first + strlen(first));

    // LEXER
    typedef lex::lexertl::token<const char *> Token;
    typedef lex::lexertl::lexer<Token> Type;
    Lexer<Type> l;

    // GRAMMAR
    typedef Lexer<Type>::iterator_type Iterator;
    typedef Lexer<Type>::lexer_def Def;
    Grammar<Iterator, Def> g(l);

    // PARSE
    bool ok = lex::tokenize_and_phrase_parse (
        first
      , last
      , l
      , g
      , qi::in_state("WS")[l.self]
    );

    // CHECK
    if (!ok || first != last) {
        std::cout << "Failed parsing input file" << std::endl;
        return 1;
    }
    return 0;
}

score 1 · Accepted Answer

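As sehe pointed out, / was probably intended to be the lookahead operator, most likely following flex's syntax. Unfortunately, Spirit doesn't use the more common lookahead syntax (not that I find the other syntaxes more elegant; it just adds to the confusion caused by all the subtle variations in regex syntax).

If you look at re_tokeniser.hpp: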

// Not an escape sequence and not inside a string, so
// check for meta characters.
switch (ch_)
{
    ...
    case '/':
        throw runtime_error("Lookahead ('/') is not supported yet.");
        break;
    ...
}

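The tokeniser decides that you are neither in an escape sequence nor inside a string, so it checks for meta characters. / is treated as the lookahead meta character (even though that feature isn't implemented) and therefore has to be escaped, although the Boost documentation doesn't mention this at all.

Try escaping the / in the pattern (not in the input) with a backslash, i.e. "\\/", or "\/" if you use a raw string. Alternatively, others have suggested using [/].

I consider this a bug in the Spirit Lex documentation, since it never points out that / must be escaped.

Edit: Thanks to sehe and cv_and_he, who helped correct some of my earlier thinking. If they post answers here, be sure to give them +1.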
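The spellings suggested above can be sketched as follows. This is only an illustration under stated assumptions: the variable names and the escape_slashes helper are hypothetical and not part of Boost, and the character-class form is the one suggested by others rather than something verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Three spellings of a pattern meant to match the literal text FOO/BAR.
// All names here are illustrative, not part of Spirit Lex.
const std::string escaped_pattern     = "FOO\\/BAR";    // backslash doubled for the C++ literal
const std::string raw_escaped_pattern = R"(FOO\/BAR)";  // C++11 raw string, no doubling needed
const std::string char_class_pattern  = "FOO[/]BAR";    // slash wrapped in a character class

// Hypothetical helper: backslash-escape every '/' that is not already
// part of an escape sequence. It does not treat character classes or
// quoted sections specially, so it is only a rough sketch.
std::string escape_slashes(const std::string& pattern) {
    std::string out;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c == '\\' && i + 1 < pattern.size()) {
            // Copy an existing escape sequence unchanged.
            out += c;
            out += pattern[++i];
        } else if (c == '/') {
            // Turn a bare slash into an escaped one.
            out += "\\/";
        } else {
            out += c;
        }
    }
    return out;
}
```

In the question's code, only the regex definition would change, for example to std::string regex("FOO\\/BAR"); the input string first stays "FOO/BAR".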
