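I am trying to understand HXT, a Haskell library for parsing XML that uses arrows. For my specific use case I would rather not use deep, because there are cases where <outer_tag><payload_tag>value</payload_tag></outer_tag> is distinct from <outer_tag><inner_tag><payload_tag>value</payload_tag></inner_tag></outer_tag>. However, I ran into some weirdness: something that feels like it should work does not.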
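To make the distinction concrete, here is a minimal sketch in plain Haskell rather than HXT. The Node type and the helpers children, named, deepL and directPayload are hypothetical stand-ins for XmlTree, getChildren, hasName and deep; the sketch assumes that deep tries its filter on the current node and only descends into the children when the filter fails.

```haskell
-- Hypothetical stand-in for an XML tree; this is not HXT's XmlTree.
data Node = Elem String [Node] | Text String
  deriving (Show, Eq)

-- Direct children only (models getChildren).
children :: Node -> [Node]
children (Elem _ cs) = cs
children (Text _)    = []

-- Keep a node only if it is an element with the given name (models hasName).
named :: String -> Node -> [Node]
named n e@(Elem m _) | n == m = [e]
named _ _ = []

-- Top-down search that stops descending once the filter matches (models deep).
deepL :: (Node -> [Node]) -> Node -> [Node]
deepL f t = case f t of
  [] -> concatMap (deepL f) (children t)
  rs -> rs

-- The two shapes from the question.
direct, nested :: Node
direct = Elem "outer_tag" [Elem "payload_tag" [Text "value"]]
nested = Elem "outer_tag" [Elem "inner_tag" [Elem "payload_tag" [Text "value"]]]

-- Matches payload_tag only when it is an immediate child of the given node.
directPayload :: Node -> [Node]
directPayload t = concatMap (named "payload_tag") (children t)
```

In this model deepL (named "payload_tag") finds the payload in both direct and nested, while directPayload finds it only in direct, which is the distinction I want to keep.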
I have managed to come up with a test case based on this example from the docs:
{-# LANGUAGE Arrows, NoMonomorphismRestriction #-}
module Main where

import Text.XML.HXT.Core

data Guest = Guest { firstName, lastName :: String }
  deriving (Show, Eq)

-- Version from the docs: deep finds fname and lname anywhere below guest.
getGuest = deep (isElem >>> hasName "guest") >>>
  proc x -> do
    fname <- getText <<< getChildren <<< deep (hasName "fname") -< x
    lname <- getText <<< getChildren <<< deep (hasName "lname") -< x
    returnA -< Guest { firstName = fname, lastName = lname }

-- deep replaced by an explicit getChildren inside the proc block.
getGuest' = deep (isElem >>> hasName "guest") >>>
  proc x -> do
    fname <- getText <<< getChildren <<< (hasName "fname") <<< getChildren -< x
    lname <- getText <<< getChildren <<< (hasName "lname") <<< getChildren -< x
    returnA -< Guest { firstName = fname, lastName = lname }

-- getChildren factored out in front of the proc block; this version returns [].
getGuest'' = deep (isElem >>> hasName "guest") >>> getChildren >>>
  proc x -> do
    fname <- getText <<< getChildren <<< (hasName "fname") -< x
    lname <- getText <<< getChildren <<< (hasName "lname") -< x
    returnA -< Guest { firstName = fname, lastName = lname }

-- Run the given arrow against guestbook.xml and collect all results.
driver finalArrow = runX (readDocument [withValidate no] "guestbook.xml" >>> finalArrow)

main = do
  guests <- driver getGuest
  print "getGuest"
  print guests

  guests' <- driver getGuest'
  print "getGuest'"
  print guests'

  guests'' <- driver getGuest''
  print "getGuest''"
  print guests''
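Between getGuest and getGuest' I expand deep into the correct number of getChildren calls. The resulting function still works. I then factor the getChildren out of the do block (getGuest''), but this causes the resulting function to fail. The output is: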
"getGuest"
[Guest {firstName = "John", lastName = "Steinbeck"},Guest {firstName = "Henry", lastName = "Ford"},Guest {firstName = "Andrew", lastName = "Carnegie"},Guest {firstName = "Anton", lastName = "Chekhov"},Guest {firstName = "George", lastName = "Washington"},Guest {firstName = "William", lastName = "Shakespeare"},Guest {firstName = "Nathaniel", lastName = "Hawthorne"}]
"getGuest'"
[Guest {firstName = "John", lastName = "Steinbeck"},Guest {firstName = "Henry", lastName = "Ford"},Guest {firstName = "Andrew", lastName = "Carnegie"},Guest {firstName = "Anton", lastName = "Chekhov"},Guest {firstName = "George", lastName = "Washington"},Guest {firstName = "William", lastName = "Shakespeare"},Guest {firstName = "Nathaniel", lastName = "Hawthorne"}]
"getGuest''"
[]
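The three variants can be modelled with ordinary list functions. This is only a sketch: it assumes that an HXT arrow behaves roughly like a function a -> [b], and that the lines of a proc block combine their results as a cartesian product, much like the list monad. Node, children, named, text and deepL are hypothetical stand-ins, not HXT's API, and sample is a single made-up guest element.

```haskell
import Control.Monad ((>=>))

-- Hypothetical stand-in for an XML tree; this is not HXT's XmlTree.
data Node = Elem String [Node] | Text String
  deriving (Show, Eq)

data Guest = Guest { firstName, lastName :: String }
  deriving (Show, Eq)

-- Models getChildren: direct children only.
children :: Node -> [Node]
children (Elem _ cs) = cs
children (Text _)    = []

-- Models hasName: keep the node only if it is an element with this name.
named :: String -> Node -> [Node]
named n e@(Elem m _) | n == m = [e]
named _ _ = []

-- Models getText: succeed only on text nodes.
text :: Node -> [String]
text (Text s) = [s]
text _        = []

-- Models deep: try the filter here, descend only if it fails.
deepL :: (Node -> [Node]) -> Node -> [Node]
deepL f t = case f t of
  [] -> concatMap (deepL f) (children t)
  rs -> rs

-- Models getGuest: x is the guest element.
guestV1 :: Node -> [Guest]
guestV1 x = do
  f <- (deepL (named "fname") >=> children >=> text) x
  l <- (deepL (named "lname") >=> children >=> text) x
  return (Guest f l)

-- Models getGuest': x is still the guest element.
guestV2 :: Node -> [Guest]
guestV2 x = do
  f <- (children >=> named "fname" >=> children >=> text) x
  l <- (children >=> named "lname" >=> children >=> text) x
  return (Guest f l)

-- Models getGuest'': x is now one child of guest at a time.
guestV3 :: Node -> [Guest]
guestV3 g = do
  x <- children g
  f <- (named "fname" >=> children >=> text) x
  l <- (named "lname" >=> children >=> text) x
  return (Guest f l)

-- Made-up input: a single guest element.
sample :: Node
sample = Elem "guest" [Elem "fname" [Text "John"], Elem "lname" [Text "Steinbeck"]]
```

In this model guestV1 sample and guestV2 sample each yield one Guest, while guestV3 sample yields [], because after the outer children step each x is a single child that cannot be both fname and lname. Whether the model matches HXT closely enough to explain the real output is an assumption.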
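I feel like this should be a valid transformation to perform, but my understanding of arrows is a little shaky. Am I doing something wrong? Is this a bug that I should report?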
I am using HXT version 9.3.1.3 (the latest at the time of writing). ghc --version prints "The Glorious Glasgow Haskell Compilation System, version 7.4.1". I have also tested on a box with ghc 7.6.3 and got the same result.
The XML file has the following repeating structure (the full file can be found here):
<guestbook>
  <guest>
    <fname>John</fname>
    <lname>Steinbeck</lname>
  </guest>
  <guest>
    <fname>Henry</fname>
    <lname>Ford</lname>
  </guest>
  <guest>
    <fname>Andrew</fname>
    <lname>Carnegie</lname>
  </guest>
</guestbook>