scala - Scala parser combinators: keeping the original input

Question

I want to compose a parser from another parser so that the consumed input is available as an argument when constructing the AST.

Say I have

def ingredient = amount ~ nameOfIngredient ^^ {
  case amount ~ name => Ingredient(name, amount)
}

What I'm looking for is a way to have another parser construct the following element:

case class RecipeRow(orginalText: String, ingredient: Ingredient)

So I'm looking for a way to retrieve the raw input consumed by a parser within a composition. Maybe something like this:

def recipeRow = ingredient withConsumedInput ^^ {
  case (amount ~ name, consumed) => RecipeRow(consumed, Ingredient(name, amount))
}

I guess the signature in this case would be:

def withConsumedInput [U](p: => Parser[U]): Parser[(U, String)]

Is there another simple way to get what I want, or do I need to write that thing myself? It feels like there is probably a better way...

score 4 · Accepted Answer

Not easy, actually.

Let's start with Parser: what can it give us? Well, a Parser extends Input => ParseResult, so we have to extract the information from one of those two.

The type Input is an alias, on RegexParsers anyway, for scala.util.parsing.input.Reader[Char]. There is very little there to help us, unless it happens to be a Reader of a CharSequence, in which case we can use source and offset. So let's use that.
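To make source and offset concrete, here is a minimal sketch using CharSequenceReader from the scala-parser-combinators library, which is a Reader backed by a CharSequence. The object name ReaderDemo and the sample text are made up for illustration; the snippet only shows that the text between two reader positions can be recovered from source.

```scala
import scala.util.parsing.input.CharSequenceReader

// Hypothetical demo object, not part of the original answer.
object ReaderDemo {
  // A Reader positioned at the start of the input (offset 0).
  val start = new CharSequenceReader("2 cups flour")

  // The same input six characters later, i.e. just past "2 cups".
  val later = start.drop(6)

  // Both readers share one underlying source, so the text between
  // the two offsets is exactly what was "consumed" in between.
  val consumed: String =
    start.source.subSequence(start.offset, later.offset).toString
}
```

The same subSequence call works on the Input a parser receives and the next reader it returns, provided the Input really is backed by a CharSequence.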

Now, ParseResult has many subclasses, but we are only interested in Success, which has a next: Input field. Using that, we can try this:

def withConsumedInput [U](p: => Parser[U]): Parser[(U, String)] = new Parser[(U, String)] {
  def apply(in: Input) = p(in) match {
    case Success(result, next) =>
      val parsedString = in.source.subSequence(in.offset, next.offset).toString
      Success(result -> parsedString, next)
    case other: NoSuccess      => other
  }
}
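As a quick check of what this version captures, the sketch below wraps it in a small RegexParsers object. The object name ConsumedDemo, the word parser, and the sample input are assumptions made for illustration; the explicit return type on apply is only added for clarity.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical demo object, not part of the original answer.
object ConsumedDemo extends RegexParsers {
  def withConsumedInput[U](p: => Parser[U]): Parser[(U, String)] = new Parser[(U, String)] {
    def apply(in: Input): ParseResult[(U, String)] = p(in) match {
      case Success(result, next) =>
        // Everything between the position before p and the position after it.
        val parsedString = in.source.subSequence(in.offset, next.offset).toString
        Success(result -> parsedString, next)
      case other: NoSuccess => other
    }
  }

  // Made-up token parser: a run of lowercase letters.
  def word: Parser[String] = """[a-z]+""".r

  // Only the second word is wrapped, so its reader starts right after "hello".
  def two: Parser[String ~ (String, String)] = word ~ withConsumedInput(word)

  val result = parseAll(two, "hello   world")
}
```

Here the wrapped parser starts at the position right after "hello", so the captured string begins with the three spaces that RegexParsers skips before matching "world".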

It will capture any skipped whitespace, though. You can adapt it to avoid that automatically:

def withConsumedInput [U](p: => Parser[U]): Parser[(U, String)] = new Parser[(U, String)] {
  def apply(in: Input) = p(in) match {
    case Success(result, next) =>
      val parsedString = in.source.subSequence(handleWhiteSpace(in.source, in.offset), next.offset).toString
      Success(result -> parsedString, next)
    case other: NoSuccess      => other
  }
}
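Putting it together with the parsers from the question might look like the sketch below. The question does not show Ingredient, amount, or nameOfIngredient, so their definitions here (and the object name RecipeParser and the sample input) are assumptions made purely for illustration.

```scala
import scala.util.parsing.combinator.RegexParsers

// Assumed shape of Ingredient; the question does not define it.
case class Ingredient(name: String, amount: String)
case class RecipeRow(orginalText: String, ingredient: Ingredient)

object RecipeParser extends RegexParsers {
  def withConsumedInput[U](p: => Parser[U]): Parser[(U, String)] = new Parser[(U, String)] {
    def apply(in: Input): ParseResult[(U, String)] = p(in) match {
      case Success(result, next) =>
        // Skip leading whitespace the same way RegexParsers does before matching.
        val start = handleWhiteSpace(in.source, in.offset)
        Success(result -> in.source.subSequence(start, next.offset).toString, next)
      case other: NoSuccess => other
    }
  }

  // Hypothetical token parsers standing in for the ones in the question.
  def amount: Parser[String] = """\d+\s*(g|ml)""".r
  def nameOfIngredient: Parser[String] = """[a-z]+""".r

  def ingredient: Parser[Ingredient] = amount ~ nameOfIngredient ^^ {
    case amount ~ name => Ingredient(name, amount)
  }

  // The wrapped parser already yields an Ingredient, so we pair it with the text.
  def recipeRow: Parser[RecipeRow] = withConsumedInput(ingredient) ^^ {
    case (ing, consumed) => RecipeRow(consumed, ing)
  }

  val row = parseAll(recipeRow, "  200 g flour")
}
```

Note that only the leading whitespace is trimmed; whitespace between the tokens of the wrapped parser is still part of the captured text, which is usually what you want for the "original text" of a row.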
