I am trying to use Parsec to parse something like this:

property :: CharParser SomeObject
property = do
    name
    parameters
    value
    return SomeObjectInstance { fill in records here }

I am implementing the iCalendar spec, and on every line there is a name:parameters:value triplet, very much like the way that XML has a name:attributes:content triplet. In fact, you could very easily convert an iCalendar file into XML format (though I can't really see the advantages).

My point is that the parameters do not have to come in any particular order, and each parameter may have a different type. One parameter may be a string while another is the numeric id of another element. They may share no similarity, yet in the end I want to place them in the right record fields for whatever 'SomeObjectInstance' I wanted the parser to return. How do I go about doing this sort of thing (or can you point me to an example of where somebody had to parse data like this)?

Thank you. I know that my question is probably a little confused, but that reflects my level of understanding of what I need to do.

Edit: I was trying to avoid giving the expected output (because it is large, not because it is hidden), but here is an example of an input file (from Wikipedia):

BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//hacksw/handcal//NONSGML v1.0//EN
BEGIN:VEVENT
UID:uid1@example.com
DTSTAMP:19970714T170000Z
ORGANIZER;CN=John Doe:MAILTO:john.doe@example.com
DTSTART:19970714T170000Z
DTEND:19970715T035959Z
SUMMARY:Bastille Day Party
END:VEVENT
END:VCALENDAR

As you can see, it contains one VEvent inside a VCalendar; I have made data structures that represent them here.

I am trying to write a parser that parses that type of file into my data structures, and I am stuck on the part where I need to handle properties coming in any order with any type: date, time, int, string, uid, etc. I hope that makes more sense without repeating the entire iCalendar spec.


2 Answers

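Parsec has the Parsec.Perm module, which exists precisely to parse unordered but linear (i.e. at the same level in the syntax tree) elements, such as attribute tags in XML files.

Unfortunately, the Perm module is mostly undocumented. The best reference is the Parsing Permutation Phrases paper that the Haddock documentation page refers to, but even that is mostly a description of the technique rather than a guide to using it.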
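As a rough illustration of how the permutation combinators fit together, here is a minimal sketch using Text.Parsec.Perm from the current parsec package. The Event record, propertyLine, and eventBody are hypothetical names made up for this example, and it assumes plain NAME:value lines with no parameters. Treating DTEND as optional is also just a choice made for the sketch.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Perm (permute, (<$$>), (<||>), (<|?>))

-- Hypothetical record for illustration; not the asker's actual data type.
data Event = Event
  { summary :: String
  , dtStart :: String
  , dtEnd   :: Maybe String
  } deriving (Show, Eq)

-- Parse one "NAME:value" line for a fixed property name.
-- 'try' makes a line with a different name fail without consuming input,
-- so the permutation parser can move on to the next alternative.
propertyLine :: String -> Parser String
propertyLine name =
  try (string name *> char ':') *> manyTill anyChar (try endOfLine)

-- SUMMARY and DTSTART are required, DTEND is optional (default Nothing);
-- the three lines may appear in any order.
eventBody :: Parser Event
eventBody = permute $
  Event <$$> propertyLine "SUMMARY"
        <||> propertyLine "DTSTART"
        <|?> (Nothing, Just <$> propertyLine "DTEND")

main :: IO ()
main = do
  -- Required properties only, then all three in a different order.
  print (parse eventBody "" "SUMMARY:Bastille Day Party\nDTSTART:19970714T170000Z\n")
  print (parse eventBody "" "DTEND:19970715T035959Z\nDTSTART:19970714T170000Z\nSUMMARY:Bastille Day Party\n")
```

Both calls should yield a Right value with the same summary and dtStart; only the second fills in dtEnd. The record fields are filled positionally from the order of the combinators, not from the order of the input lines.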

answered 2010-09-14T07:50:52.787

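OK, so between BEGIN:VEVENT and END:VEVENT you have many key-value pairs. So write a rule keyValuePair that returns (key, value). Then, inside the rule for VEVENT, use many keyValuePair to get a list of pairs. Once you have that, you can use a fold to populate a VEVENT record with the given values. In the function you give to the fold, you use pattern matching to find out which field to store each value in. As the starting value for the accumulator, you use a VEvent record in which the optional fields are set to Nothing. Example: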

pairs <- many keyValuePair
let vevent = foldr f (VEvent {sequence = Nothing}) pairs
        where f ("SUMMARY", v) ve = ve {summary = v}
              f ("DTSTART", v) ve = ve {dstart = read v}
              f _ ve = ve  -- ignore properties not handled here

...and so on. Do the same for the other components.
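The keyValuePair rule itself is not spelled out in this answer, so here is one possible sketch of it in Parsec. The names lineEnd and vevent are made up for this example. It assumes that parameters (the ";CN=John Doe" part) can simply be skipped and that parameter values contain no quoted ':' or ';', which the real iCalendar grammar does allow. It also uses manyTill rather than many, so that END:VEVENT is not swallowed as just another pair.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- End of a content line: a newline (LF or CRLF) or the end of the input.
lineEnd :: Parser ()
lineEnd = (() <$ try endOfLine) <|> eof

-- One content line of the form NAME[;PARAM=value...]:VALUE.
-- Parameters are skipped here (an assumption of this sketch).
keyValuePair :: Parser (String, String)
keyValuePair = do
  key <- many1 (alphaNum <|> char '-')
  skipMany (char ';' *> many1 (noneOf ";:\r\n"))  -- drop ";PARAM=value" parts
  _ <- char ':'
  value <- manyTill anyChar lineEnd
  return (key, value)

-- Everything between BEGIN:VEVENT and END:VEVENT as a list of pairs.
vevent :: Parser [(String, String)]
vevent = do
  _ <- string "BEGIN:VEVENT" *> endOfLine
  manyTill keyValuePair (try (string "END:VEVENT" *> lineEnd))

main :: IO ()
main = print (parse vevent "" input)
  where
    input = unlines
      [ "BEGIN:VEVENT"
      , "ORGANIZER;CN=John Doe:MAILTO:john.doe@example.com"
      , "SUMMARY:Bastille Day Party"
      , "END:VEVENT"
      ]
```

The resulting list of pairs can then be handed to the fold described in this answer.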

Edit: Here is some runnable example code for the fold:

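-- Minimal event record; sequenceSt is the only optional field.
data VEvent = VEvent {
        summary :: String,
        dstart :: String,
        sequenceSt :: Maybe String
        } deriving Show

-- Fold the (key, value) pairs into a record, starting with the optional field unset.
vevent :: [(String, String)] -> VEvent
vevent pairs = foldr f (VEvent {sequenceSt = Nothing}) pairs
    where f ("SUMMARY", v) ve = ve {summary = v}
          f ("DTSTART", v) ve = ve {dstart = v}
          f ("SEQUENCEST", v) ve = ve {sequenceSt = Just v}
          f _ ve = ve  -- ignore keys this example does not handle

main :: IO ()
main = do print $ vevent [("SUMMARY", "lala"), ("DTSTART", "lulu")]
          print $ vevent [("SUMMARY", "lala"), ("DTSTART", "lulu"), ("SEQUENCEST", "lili")]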

Output:

VEvent {summary = "lala", dstart = "lulu", sequenceSt = Nothing}
VEvent {summary = "lala", dstart = "lulu", sequenceSt = Just "lili"}

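Note that this will produce warnings at compile time, because the required fields are not initialised. To avoid the warnings, explicitly initialise all non-optional fields to undefined.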
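If you would rather not leave undefined in the record, one alternative (a different technique from the fold in this answer, shown only as a sketch) is to collect the pairs into a Data.Map and build the record in the Maybe applicative. The function name veventFromPairs is hypothetical. The record is repeated so that the block stands alone, and the key is spelled DTSTART as in the question's sample file.

```haskell
import qualified Data.Map as Map

-- Same record shape as the runnable example in this answer.
data VEvent = VEvent
  { summary    :: String
  , dstart     :: String
  , sequenceSt :: Maybe String
  } deriving (Show, Eq)

-- Build a VEvent from key/value pairs. Required fields are looked up in
-- the Maybe applicative, so a missing one makes the whole result Nothing
-- rather than a record containing undefined.
veventFromPairs :: [(String, String)] -> Maybe VEvent
veventFromPairs pairs =
  VEvent <$> Map.lookup "SUMMARY" m
         <*> Map.lookup "DTSTART" m
         <*> pure (Map.lookup "SEQUENCEST" m)  -- optional: stays a Maybe
  where
    -- If a key occurs more than once, Map.fromList keeps the last value.
    m = Map.fromList pairs

main :: IO ()
main = do
  print (veventFromPairs [("SUMMARY", "lala"), ("DTSTART", "lulu")])
  print (veventFromPairs [("SUMMARY", "lala")])  -- DTSTART is missing
```

The first call should print a Just value and the second Nothing. The trade-off is that the caller has to handle the Nothing case explicitly. Swapping Maybe for Either would let you report which field was missing.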

answered 2010-09-14T05:16:53.770