c++ - Boost Spirit lex: writing a token value back to the input stream

Question

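Is there a way in boost::spirit::lex to write a token value back to the input stream (possibly after editing it) and rescan it? What I'm basically looking for is the functionality that unput() provides in Flex.

Thanks!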

score 3 · Accepted Answer

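It sounds like you just want to accept tokens that arrive in a different order but mean the same thing.

Without further ado, here is a complete sample showing how to do that. It exposes the identifier regardless of the input order. Output: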

Input 'abc(' Parsed as: '(abc'
Input '(abc' Parsed as: '(abc'

Code

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <iostream>
#include <string>

using namespace boost::spirit;

///// LEXER
template <typename Lexer>
struct tokens : lex::lexer<Lexer>
{
    tokens()
    {
        identifier = "[a-zA-Z][a-zA-Z0-9]*";
        paren_open = '(';

        this->self.add
            (identifier)
            (paren_open)
            ;
    }

    lex::token_def<std::string> identifier;
    lex::token_def<lex::omit> paren_open;
};

///// GRAMMAR
template <typename Iterator>
struct grammar : qi::grammar<Iterator, std::string()>
{
    template <typename TokenDef>
        grammar(TokenDef const& tok) : grammar::base_type(ident_w_parenopen)
    {
        ident_w_parenopen = 
              (tok.identifier >> tok.paren_open)
            | (tok.paren_open >> tok.identifier) 
            ;
    }
  private:
    qi::rule<Iterator, std::string()> ident_w_parenopen;
};

///// DEMONSTRATION
typedef std::string::const_iterator It;

template <typename T, typename G>
void DoTest(std::string const& input, T const& tokens, G const& g)
{
    It first(input.begin()), last(input.end());

    std::string parsed;
    bool r = lex::tokenize_and_parse(first, last, tokens, g, parsed);

    if (r) {
        std::cout << "Input '" << input << "' Parsed as: '(" << parsed << "'\n";
    }
    else {
        std::string rest(first, last);
        std::cerr << "Parsing '" << input << "' failed\n" << "stopped at: \"" << rest << "\"\n";
    }
}

int main(int argc, char* argv[])
{
    typedef lex::lexertl::token<It, boost::mpl::vector<std::string> > token_type;
    typedef lex::lexertl::lexer<token_type> lexer_type;
    typedef tokens<lexer_type>::iterator_type iterator_type;

    tokens<lexer_type> tokens;
    grammar<iterator_type> g (tokens);

    DoTest("abc(", tokens, g);
    DoTest("(abc", tokens, g);
}

score 0 · Accepted Answer

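I ended up implementing my own unput() functionality, as follows:

   // Phoenix function object: rewrites the input buffer in place so that
   // `str` ends where the matched token ended, then rewinds the match end.
   struct unputImpl
   {
      template <typename Iter1T, typename Iter2T, typename StrT>
      struct result {
         typedef void type;
      };

      template <typename Iter1T, typename Iter2T, typename StrT>
      typename result<Iter1T, Iter2T, StrT>::type operator()(Iter1T& start, Iter2T& end, StrT str) const {
         // Shift start by the length difference between str and the match.
         start -= (str.length() - std::distance(start, end));
         // Overwrite the buffer with the replacement text.
         std::copy(str.begin(), str.end(), start);
         // Resume scanning at the beginning of the replacement.
         end = start;
      }
   };

   phoenix::function<unputImpl> const unput = unputImpl();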

It can then be used like this:

   this->self += lex::token_def<lex::omit>("{SYMBOL}\\(")
        [
           unput(_start, _end, "(" + construct<string>(_start, _end - 1) + " "),
           _pass = lex::pass_flags::pass_ignore
        ];

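If the unput string is longer than the matched token, it will overwrite some of the previously parsed input. The thing to watch out for is that the input string needs enough spare room at the beginning to handle the case where unput() is called for the very first matched token.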
