这是一个对我来说似乎相当不错的解决方案
{-# LANGUAGE Arrows #-}
import Data.Maybe
import Text.Read
import Text.XML.HXT.Core
import Control.Applicative
data Gpx = Gpx [Trk] deriving (Show)
data Trk = Trk [TrkSeg] deriving (Show)
data TrkSeg = TrkSeg [TrkPt] deriving (Show)
data TrkPt = TrkPt Double Double deriving (Show)
最棘手的可能是parseTrkPt
因为为了正确执行此操作,您必须处理将String
s 解析为Double
,这可能会失败。我已经决定让它返回 a Maybe TrkPt
,然后进一步处理:
elemsNamed :: ArrowXml cat => String -> cat XmlTree XmlTree
elemsNamed name = isElem >>> hasName name
parseTrkPt :: ArrowXml cat => cat XmlTree (Maybe TrkPt)
parseTrkPt = elemsNamed "trkpt" >>>
proc trkpt -> do
lat <- getAttrValue "lat" -< trkpt
lon <- getAttrValue "lon" -< trkpt
returnA -< TrkPt <$> readMaybe lat <*> readMaybe lon
我也使用了proc
这里的语法,因为我认为它更简洁。具有TrkPt <$> readMaybe lat <*> readMaybe lon
类型Maybe TrkPt
,Nothing
如果其中一个readMaybe
s 返回,则将返回Nothing
。我们现在可以汇总所有成功的结果:
parseTrkSeg :: (ArrowXml cat, ArrowList cat) => cat XmlTree TrkSeg
parseTrkSeg =
elemsNamed "trkseg" >>>
(getChildren >>> parseTrkPt >>. catMaybes) >. TrkSeg
括号在这里很重要,我花了一段时间才弄清楚那部分。根据您放置括号的位置,您将获得不同的结果,例如[TrkSeg [TrkPt a b], TrkSeg [TrkPt c d]]
代替[TrkSeg [TrkPt a b, TrkPt c d]]
. 解析器的 next 都遵循类似的模式:
parseTrk :: ArrowXml cat => cat XmlTree Trk
parseTrk =
elemsNamed "trk" >>>
(getChildren >>> parseTrkSeg) >. Trk
parseGpx :: ArrowXml cat => cat XmlTree Gpx
parseGpx =
elemsNamed "gpx" >>>
(getChildren >>> parseTrk) >. Gpx
然后你可以很简单地运行它,尽管你仍然必须钻过根元素:
main :: IO ()
main = do
gpxs <- runX $ readDocument [withRemoveWS yes] "ana.gpx"
>>> getChildren
>>> parseGpx
-- Pretty print the document
forM_ gpxs $ \(Gpx trks) -> do
putStrLn "GPX:"
forM_ trks $ \(Trk segs) -> do
putStrLn "\tTRK:"
forM_ segs $ \(TrkSeg pts) -> do
putStrLn "\t\tSEG:"
forM_ pts $ \pt -> do
putStr "\t\t\t"
print pt
诀窍是使用类型类中的方法ArrowList
,特别是>.
具有 type的方法a b c -> ([c] -> d) -> a b d
。它聚合来自 的元素ArrowList
,将其传递给将其转换为新类型的函数,然后ArrowList
在该新类型上输出一个新的d
。
如果你愿意,你甚至可以为最后 3 个解析器抽象一点:
nestedListParser :: ArrowXml cat => String -> cat XmlTree a -> ([a] -> b) -> cat XmlTree b
nestedListParser name subparser constructor
= elemsNamed name
>>> (getChildren >>> subparser)
>. constructor
parseTrkSeg :: (ArrowXml cat, ArrowList cat) => cat XmlTree TrkSeg
parseTrkSeg = nestedListParser "trkseg" (parseTrkPt >>. catMaybes) TrkSeg
parseTrk :: ArrowXml cat => cat XmlTree Trk
parseTrk = nestedListParser "trk" parseTrkSeg Trk
parseGpx :: ArrowXml cat => cat XmlTree Gpx
parseGpx = nestedListParser "gpx" parseTrk Gpx
如果您想完成 GPX 文件的其余语法,这可能会派上用场。