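To deal with long compile times and to reuse grammars, I have split my grammar into several sub-grammars that are invoked in sequence. One of them (call it the SETUP grammar) provides some configuration for the parser (via a symbols parser), so the later sub-grammars logically depend on that configuration (again via different symbols parsers). So after SETUP has been parsed, the symbols parsers of the following sub-grammars need to be altered.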

My question is: how can I approach this efficiently while keeping the sub-grammars loosely coupled?

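Currently I see only two possibilities:

  • An on_success handler on the SETUP grammar, which could do the work, but this would introduce some coupling.
  • After SETUP, parse everything else into a string, build a new parser (from the altered symbols) and parse that string in a second step. This would add quite some overhead.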

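What I would like to have is an on_before_parse handler, which could be implemented by any grammar that needs to do some work before each parse. From my point of view, this would introduce less coupling, and some setup of the parser could also come in handy in other situations. Is something like this possible?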

Update:

Sorry for being sketchy, that wasn't my intention.

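The task is to parse an input I with some keywords (like #task1 and #task2). But in some cases these keywords need to be different, e.g. $$task1 and $$task2.

So the parsed file will start with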

setup {
  #task1=$$task1
  #task2=$$task2
}

realwork {
  ...
}

Some code sketches: given is a main parser, composed of several (at least two) parsers.

template<typename Iterator>
struct MainParser: qi::grammar<Iterator, Skipper<Iterator>> {

  MainParser() : MainParser::base_type(start) {
    start = setup >> realwork;
  }

  Setup<Iterator>    setup;
  RealWork<Iterator> realwork;

  qi::rule<Iterator, Skipper<Iterator> > start;
};

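Setup and RealWork are themselves parsers (my sub-parsers from above). During the setup part, some keywords of the grammar may be changed, so the setup part has a qi::symbols<char, keywords> rule. At the start, these symbols contain #task1 and #task2. After the first part of the file has been parsed, they contain $$task1 and $$task2.

Since the keywords have changed, and since RealWork needs to parse I, it needs to know about the new keywords. So I have to transfer the symbols from Setup to RealWork while the file is being parsed.

The two approaches I see are: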

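  • Make Setup aware of RealWork and transfer the symbols from Setup to RealWork in the qi::on_success handler of Setup. (Bad: coupling.)
  • Switch to two parsing steps. The start rule of MainParser would then look like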

    start = setup >> unparsed_rest
    

    and a second parser would run after MainParser. Schematically:

    SymbolTable Table;
    string Unparsed_Rest;
    MainParser.parse(Input, (Unparsed_Rest, Table));
    
    RealWorkParser.setupFromAlteredSymbolTable(Table);
    RealWorkParser.parse(Unparsed_Rest);
    

    This has the overhead of several parsing steps.

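So, up to now, attributes do not come into play. It is only about changing the parser at parse time in order to handle several kinds of input files.

My hope is a handler qi::on_before_parse, analogous to qi::on_success. The idea is that this handler would be triggered every time the parser starts parsing the input. In theory it is just an interception at the start of a parse, like the interceptions we already have with on_success and on_error.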

1 Answer

Sadly, you showed no code, and your description is a bit... sketchy. So here's a fairly generic answer that addresses some of the points I was able to distill from your question:

Separation of concerns

It sounds very much like you need to separate AST building from transformation/processing steps.
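As a rough illustration of that separation, the sketch below assumes the parser has already produced a plain AST and only shows the processing step that applies the keyword renames afterwards. The names `Document`, `Task` and `resolve` are hypothetical and not part of the question's code; the default keywords `#task1` and `#task2` are taken from the question.

```cpp
#include <map>
#include <string>
#include <vector>

enum class Task { task1, task2 };

// Hypothetical AST: what a parser could produce without interpreting anything.
struct Document {
    std::map<std::string, std::string> aliases; // old keyword -> new keyword (setup block)
    std::vector<std::string> work;              // raw keyword tokens (realwork block)
};

// Processing step: apply the renames, then translate the raw tokens.
// Returns false if a rename or a token refers to an unknown keyword.
bool resolve(Document const& doc, std::vector<Task>& out) {
    std::map<std::string, Task> table = {
        {"#task1", Task::task1},
        {"#task2", Task::task2},
    };
    for (auto const& alias : doc.aliases) {
        auto it = table.find(alias.first);
        if (it == table.end()) return false;
        Task const task = it->second;
        table.erase(it);
        table[alias.second] = task;
    }
    out.clear();
    for (auto const& token : doc.work) {
        auto it = table.find(token);
        if (it == table.end()) return false;
        out.push_back(it->second);
    }
    return true;
}
```

With this split, the grammar itself never has to change while parsing; only the lookup table used after parsing does.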

Parser composition

Of course you can compose grammars. Simply compose grammars as you would rules and hide the implementation of these grammars in any traditional way you would (pImpl idiom, const static internal rules, whatever fits the bill).

However, the composition usually doesn't require an 'event' driven element: if you feel the need to parse in two phases, it sounds to me you're just struggling to keep the overview, but recursive descent or PEG grammars are naturally well-suited to describe grammars like that in one swoop (or one pass, if you will).

However, if you find that

(a) your grammar gets complicated
(b) or you want to be able to selectively plug in subgrammars depending on runtime features

You could consider

  1. The Nabialek trick (I've shown/mentioned this on several occasions in my [tag:boost-spirit] answers on this site)
  2. You could build rules dynamically (this is not readily recommended because you'll run into deadly traps having to do with copying Proto expression trees, which leads to dangling references). I have also shown some answers doing this on occasion:

    REPEAT: don't try this unless you know how to detect UB and fix things with Proto

Hope these things help you on track. If not, I suggest you come back with a concrete question. I'm much more at home with code than 'ideas' because ideas often mean something else to you than to me.

answered 2013-07-22T19:41:42.613