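I am trying to use FParsec to parse a list of tokens, where each token is either a block of text or a tag, for example:

This is a {type of test} test, and it {succeeds or fails}

Here is the parser: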

type Parser<'t> = Parser<'t, unit>

type Token =
| Text of string
| Tag of string

let escape fromString toString : Parser<_> =
    pstring fromString |>> (fun c -> toString)

let content : Parser<_> =
    let contentNormal = many1Satisfy (fun c -> c <> '{' && c <> '}')
    let openBraceEscaped = escape "{{" "{"
    let closeBraceEscaped = escape "}}" "}"
    let contentEscaped = openBraceEscaped <|> closeBraceEscaped
    stringsSepBy contentNormal contentEscaped

let ident : Parser<_> =
    let isIdentifierFirstChar c = isLetter c || c = '_'
    let isIdentifierChar c = isLetter c || isDigit c || c = '_'
    spaces >>. many1Satisfy2L isIdentifierFirstChar isIdentifierChar "identifier" .>> spaces

let text = content |>> Text

let tag = 
    ident |> between (skipString "{") (skipString "}")
    |>> Tag

let token = text <|> tag
let tokens = many token .>>. eof   

The following tests work:

> run token "abc def" ;;
val it : ParserResult<Token,unit> = Success: Text "abc def"

> run token "{abc def}" ;;
val it : ParserResult<Token,unit> = Success: Tag "abc def"

But trying to run tokens throws an exception:

> run tokens "{abc} def" ;;
System.InvalidOperationException: (Ln: 1, Col: 10): The combinator 'many' was 
    applied to a parser that succeeds without consuming input and without
    changing the parser state in any other way. (If no exception had been raised,
    the combinator likely would have entered an infinite loop.)

I have gone through this Stack Overflow question, but nothing I have tried works. I even added the following, but I get the same exception:

let tokenFwd, tokenRef = createParserForwardedToRef<Token, unit>()
do tokenRef := choice [tag; text]
let readEndOfInput : Parser<unit, unit> = spaces >>. eof
let readExprs = many tokenFwd
let readExprsTillEnd = readExprs .>> readEndOfInput

run readExprsTillEnd "{abc} def"  // System.InvalidOperationException ... The combinator 'many' was applied  ...

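I think the problem is stringsSepBy in content, but I can't think of any other way to get a string with escaped items.

Any help would be much appreciated. I have been at this for a couple of days and can't figure it out.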

2 Answers

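stringsSepBy accepts zero strings, which causes token to accept an empty string, which in turn causes many to complain.

I changed it to the following to verify that this is the line you need to work on.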

many1 (contentNormal <|> contentEscaped) |>> fun l -> String.concat "" l

I also got rid of stringsSepBy contentNormal contentEscaped, because it says that you need to match contentNormals with contentEscapeds between them. So a{{b}}c is OK, but {{b}}, {{b}}c and a{{b}} would fail.
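To make the change above concrete, here is a minimal sketch of the content parser with many1 swapped in. The names contentFixed and check are hypothetical helpers introduced only for this illustration, stringReturn stands in for the asker's escape helper, and the snippet assumes the FParsec package is already referenced (for example via #r "nuget: FParsec" in a script).

```fsharp
open FParsec

// Hypothetical name: the asker's `content` parser with many1 in place of stringsSepBy.
let contentFixed : Parser<string, unit> =
    let contentNormal = many1Satisfy (fun c -> c <> '{' && c <> '}')
    // stringReturn s x parses the literal s and returns x,
    // which is what the asker's `escape` helper does.
    let contentEscaped = stringReturn "{{" "{" <|> stringReturn "}}" "}"
    // many1 requires at least one piece, so this parser fails without
    // consuming input instead of succeeding with an empty string.
    many1 (contentNormal <|> contentEscaped) |>> fun l -> String.concat "" l

// Hypothetical helper: Some result on success, None on failure.
let check input =
    match run (contentFixed .>> eof) input with
    | Success (s, _, _) -> Some s
    | Failure _ -> None

// check "a{{b}}c" gives Some "a{b}c"
// check "{{b}}"   gives Some "{b}"   (escapes may now start or end the text)
// check ""        gives None         (empty input is rejected)
```

Because the parser now fails on empty input rather than succeeding, many can stop cleanly when there is no more text to read.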

answered 2014-05-17T11:52:31.567

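notEmpty can be used to ensure that input is consumed. If a parser succeeds without consuming any input, the parser's "current position" does not move forward, so when it runs inside a many it would go into an infinite loop if that exception were not raised. stringsSepBy is succeeding while parsing zero elements; you can use notEmpty to make it fail when it gets zero elements: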

stringsSepBy contentNormal contentEscaped |> notEmpty

Also, I tried to get your full example to parse. Tags can contain spaces, so you need to allow ident to contain spaces to match:

let isIdentifierChar c = isLetter c || isDigit c || c = '_' || c = ' '

Another small tweak is to return just a Token list rather than a Token list * unit tuple (the unit is the result of eof):

let tokens = many token .>> eof  
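
Putting the three changes together, the whole parser could look roughly like the sketch below. It assumes the FParsec package is referenced, uses stringReturn in place of the asker's escape helper, and adds a hypothetical parseTokens helper only to show the result. Since it keeps stringsSepBy, text that begins or ends with an escaped brace is still not handled.

```fsharp
open FParsec

type Token =
    | Text of string
    | Tag of string

let content : Parser<string, unit> =
    let contentNormal = many1Satisfy (fun c -> c <> '{' && c <> '}')
    let contentEscaped = stringReturn "{{" "{" <|> stringReturn "}}" "}"
    // notEmpty turns "succeeded without consuming input" into a failure.
    stringsSepBy contentNormal contentEscaped |> notEmpty

let ident : Parser<string, unit> =
    let isIdentifierFirstChar c = isLetter c || c = '_'
    // Spaces are allowed inside a tag, e.g. {abc def}.
    let isIdentifierChar c = isLetter c || isDigit c || c = '_' || c = ' '
    spaces >>. many1Satisfy2L isIdentifierFirstChar isIdentifierChar "identifier" .>> spaces

let text = content |>> Text
let tag = ident |> between (skipString "{") (skipString "}") |>> Tag
let token = text <|> tag
let tokens = many token .>> eof   // .>> drops the unit returned by eof

// Hypothetical helper: Some tokens on success, None on failure.
let parseTokens input =
    match run tokens input with
    | Success (ts, _, _) -> Some ts
    | Failure _ -> None

// parseTokens "{abc} def" gives Some [Tag "abc"; Text " def"]
```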
answered 2014-05-17T12:32:28.820