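The first problem (duplication) can be solved with a small refactoring. The basic idea is to wrap each parser in a wrapper that does the skipping.
Note that this code is still far from perfect; I only tried to keep the refactoring as small as possible.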
// Parser<'a> is assumed to be a one-parameter alias, e.g. type Parser<'a> = Parser<'a, unit>
let jobId: Parser<int> = pint32
let status: Parser<Status> =
    (skipStringCI "Running" >>% Running) <|> (skipStringCI "Done" >>% Done)
let count: Parser<int> = pint32

// skip ahead (at most 1000 chars) to the prefix, then parse "prefix = value"
let skipAndParse prefix parser =
    skipCharsTillStringCI prefix false 1000
    >>. ws >>. skipStringCI prefix >>. ws >>. skipChar '=' >>. ws >>. parser

let parse: Parser<Job> = parse {
    let! id = skipAndParse "Job id" jobId
    let! status = skipAndParse "Status" status
    let! count = skipAndParse "Count" count
    return { Id = id; Status = status; Count = count } }
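The second problem is more complicated. If you want the data lines to appear in any order, you have to consider the cases where a data line is missing or appears more than once.
To handle this, you need to produce a list of the data lines found, check that everything required is there, and decide what to do with any duplicates.
Note that a data line parser can no longer have a "skip" part, because it might skip over an informative line before that line's actual parser gets to it.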
let skipAndParse2 prefix parser =
    ws >>. skipStringCI prefix >>. ws >>. skipChar '=' >>. ws >>. parser

// Here, you create a DU that says which data line was found
type Result =
    | Id of int
    | Status of Status
    | Count of int
    | Irrelevant of string

// Here's a combinator parser
let parse2 =
    // list of possible data line parsers
    // Note: they are intentionally reordered
    [
        skipAndParse2 "Count" count |>> Count
        skipAndParse2 "Status" status |>> Status
        skipAndParse2 "Job id" jobId |>> Id
        // The trailing parser skips a line if none of the prior
        // parsers has parsed it.
        // A guard rule is needed because of how restOfLine behaves
        // at the end of input: it would succeed without consuming
        // any input, which would lead to an infinite loop.
        // (FParsec actually detects this and raises an exception.)
        restOfLine true .>> notFollowedByEof |>> Irrelevant
    ]
    |> List.map attempt // each parser is optional
    |> choice           // on each iteration, one of the parsers must succeed
    |> many             // a loop
Running the code:
let sample = "
Some irrelevant text.\n\
Job id = 33\n\
Some other text.\n\
Status = Done\n\
And another text.\n\
Count = 10\n\
Trailing text.\n\
"
sample |> run parse2 |> printfn "%A "
will produce the following output:
Success: [Irrelevant ""; Irrelevant "Some irrelevant text."; Id 33;
Irrelevant ""; Irrelevant "Some other text."; Status Done; Irrelevant "";
Irrelevant "And another text."; Count 10; Irrelevant ""]
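Further processing requires filtering out the Irrelevant elements, checking for duplicate or missing items, and then either building the Job record or raising an error.
UPDATE: a simple example of further processing that hides Result and returns a Job option instead: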
// naive implementation of the record maker
// returns Job option
// ignores duplicate fields (uses the first one)
// returns None if any field is missing
let MakeJob arguments =
    let a' =
        arguments
        |> List.filter (function | Irrelevant _ -> false | _ -> true)
    try
        // List.pick raises KeyNotFoundException when no element matches
        let theId = a' |> List.pick (function | Id x -> Some x | _ -> None)
        let theStatus = a' |> List.pick (function | Status x -> Some x | _ -> None)
        let theCount = a' |> List.pick (function | Count x -> Some x | _ -> None)
        Some { Id = theId; Status = theStatus; Count = theCount }
    with
    | :? System.Collections.Generic.KeyNotFoundException -> None
To use it, simply append the following line to the end of the parse2 pipeline:
    |>> MakeJob
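
If you would rather report missing and duplicate lines than silently take the first match, something along these lines might work. This is only a sketch: the Status and Job definitions are assumed from how they are used above, and JobOutcome, describe and makeJobStrict are hypothetical names introduced here, not part of the original code or of FParsec.

```fsharp
// Status and Job are assumed from their usage above; Result mirrors the DU from this answer.
type Status = Running | Done
type Job = { Id: int; Status: Status; Count: int }

type Result =
    | Id of int
    | Status of Status
    | Count of int
    | Irrelevant of string

// Hypothetical outcome type, introduced only for this sketch.
type JobOutcome =
    | Built of Job
    | Problems of string list

// Describes a field that did not occur exactly once; returns None when it is fine.
let describe name occurrences =
    match occurrences with
    | 1 -> None
    | 0 -> Some (sprintf "%s is missing" name)
    | n -> Some (sprintf "%s appears %d times" name n)

// Builds a Job only when every field occurs exactly once;
// otherwise lists what is missing or duplicated.
let makeJobStrict (lines: Result list) : JobOutcome =
    let ids = lines |> List.choose (function Id x -> Some x | _ -> None)
    let statuses = lines |> List.choose (function Status x -> Some x | _ -> None)
    let counts = lines |> List.choose (function Count x -> Some x | _ -> None)
    match ids, statuses, counts with
    | [ i ], [ s ], [ c ] -> Built { Id = i; Status = s; Count = c }
    | _ ->
        [ describe "Job id" (List.length ids)
          describe "Status" (List.length statuses)
          describe "Count" (List.length counts) ]
        |> List.choose (fun problem -> problem)
        |> Problems

// Irrelevant lines are ignored; duplicates and gaps are reported together.
let complete = makeJobStrict [ Irrelevant "x"; Id 33; Status Done; Count 10 ]
// complete = Built { Id = 33; Status = Done; Count = 10 }
let broken = makeJobStrict [ Id 1; Id 2; Count 10 ]
// broken = Problems ["Job id appears 2 times"; "Status is missing"]
```

It plugs into parse2 the same way as MakeJob, via `|>> makeJobStrict`. Whether a duplicate should be an error or just a warning depends on your input, so treat the policy here as one possible choice.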