
I am trying to

  1. load a CSV file
  2. read the IDs from the file
  3. load an external XML file for each ID
  4. read some names from the XML
  5. write the IDs and names to a new CSV file

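I'm new to Haskell and really want to learn it, but I'm still at the copy-and-paste stage of understanding. I found a tutorial for each part on its own, but I'm struggling to combine them.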

The CSV is simple, for example:

736572,"Mount Athos"
6697806,"North Aegean"

I use Cassava to read the CSV and HandsomeSoup to read the XML.
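For reference, the CSV-reading step on its own can be sketched as follows. This is a minimal example that assumes the file has no header row and decodes the two sample lines from an in-memory string; `sampleCsv` and `parseRows` are illustrative names, not part of Cassava.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Csv (HasHeader (NoHeader), decode)
import qualified Data.Vector as V

-- Two rows in the same shape as the sample input (assumed to have no header).
sampleCsv :: BL.ByteString
sampleCsv = BL.pack "736572,\"Mount Athos\"\n6697806,\"North Aegean\"\n"

-- Decode every row into an (id, name) pair. The result type tells Cassava
-- how many columns to expect and how to parse each one.
parseRows :: BL.ByteString -> Either String (V.Vector (String, String))
parseRows = decode NoHeader

main :: IO ()
main =
  case parseRows sampleCsv of
    Left err -> putStrLn ("CSV parse error: " ++ err)
    Right rows -> V.forM_ rows $ \(pid, name) ->
      putStrLn (pid ++ " -> " ++ name)
```

Keeping the id as a `String` here is a convenience, since it is later spliced into a URL; decoding it as an `Int` would work as well.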

Here I try to read the ids, load the XML, and at least print the names from the XML.

{-# LANGUAGE ScopedTypeVariables #-}

import qualified Data.ByteString.Lazy as BL
import Data.Csv
import qualified Data.Vector as V

import Text.XML.HXT.Core
import Text.HandsomeSoup

import Data.List
import Data.Char


getPlaceNames::String->String->String
getPlaceNames pid name = do
    let doc = fromUrl ("http://api.geonames.org/get?geonameId="++pid++"&username=demo")

    c<-runX $ doc >>> css "alternateNames" >>> deep getText
    return (head c)


main :: IO ()
main = do
    csvData <- BL.readFile "input.csv"
    case decode NoHeader csvData of
        Left err -> putStrLn err
        Right v -> V.forM_ v $ \ ( pid, name ) ->
          putStrLn $  getPlaceNames pid name

I think I'm doing something wrong when I call getPlaceNames and return the name. I'm not even sure whether I should be using a "do" block in getPlaceNames.

The error says

 Couldn't match expected type ‘[[Char]]’
            with actual type ‘IO [String]’
In a stmt of a 'do' block:
  c <- runX $ doc >>> css "alternateNames" >>> deep getText
In the expression:
  do { let doc
             = fromUrl
                 ("http://api.geonames.org/get?geonameId="
                  ++ pid ++ "&username=demo");
       c <- runX $ doc >>> css "alternateNames" >>> deep getText;
       return (head c) }
In an equation for ‘getPlaceNames’:
    getPlaceNames pid name
      = do { let doc = ...;
             c <- runX $ doc >>> css "alternateNames" >>> deep getText;
             return (head c) }

But this is probably just one of the things I'm doing wrong, given my limited understanding of monads and bind.

Any help is appreciated, even if it is just a pointer to the right documentation.

Cheers

Björn


1 Answer


Thanks to chi, I have worked out the whole process. I'm posting my code for anyone else who needs to do something similar.
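The core of the fix, as I understand it, is that a function which performs I/O has to say so in its type. With the signature `String -> String -> String`, the `do` block in the question is typed as a list computation (a `String` is a list of `Char`), which is why the compiler expected `[[Char]]` where `runX` supplies an `IO [String]`. The sketch below shows the shape of the corrected code with a stand-in for the network call; `fetchName` and `shout` are hypothetical names used only for illustration.

```haskell
import Data.Char (toUpper)

-- Stand-in for the network call (hypothetical). Anything that performs I/O
-- returns an IO value, so the result type is IO String rather than String.
fetchName :: String -> IO String
fetchName pid = do
  putStrLn ("pretending to fetch " ++ pid) -- placeholder side effect
  return ("name-for-" ++ pid)

-- A pure helper: no IO in the type, so no do-block or <- is needed.
shout :: String -> String
shout = map toUpper

main :: IO ()
main = do
  name <- fetchName "736572" -- <- binds the String produced by the IO action
  putStrLn (shout name)
```

The caller changes in the same way: instead of passing the result straight to `putStrLn`, it first binds it with `<-` inside its own `do` block.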

In the end I fetched not only the names from the XML but several other fields as well, so I renamed getPlaceNames to getPlaceDetails.

I'm showing the complete code because it also shows how I read different fields from the XML and how I merge the alternateName elements in the XML into a single string.
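The merging step can be tried in isolation, without any network access. This small sketch reuses the `toLanguageStr` helper from the full program and feeds it hand-written (language code, name) pairs; `joinTranslations` is a name introduced here only for the example.

```haskell
import Data.Char (toUpper)
import Data.List (intercalate)

-- Format one (language code, name) pair, e.g. ("fr", "Mont Athos").
-- An empty language code simply yields a leading ":".
toLanguageStr :: (String, String) -> String
toLanguageStr (lan, name) = map toUpper lan ++ ":" ++ name

-- Join all formatted pairs with "|" so they fit into a single CSV field.
joinTranslations :: [(String, String)] -> String
joinTranslations = intercalate "|" . map toLanguageStr

main :: IO ()
main =
  putStrLn (joinTranslations [("fr", "Mont Athos"), ("", "Athos"), ("en", "Mount Athos")])
  -- prints: FR:Mont Athos|:Athos|EN:Mount Athos
```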

{-# LANGUAGE ScopedTypeVariables #-}


import qualified Data.ByteString.Lazy.Char8 as BL


import Data.Csv
import qualified Data.Vector as V

import Text.XML.HXT.Core
import Text.HandsomeSoup
import Data.List
import Data.Char


uppercase :: String -> String
uppercase = map toUpper


toLanguageStr :: (String, String) -> String
toLanguageStr (lan,name) = uppercase lan ++ ":" ++ name


getPlaceDetails::String->String->IO (Int,String,Float,Float,Float,Float,Float,Float,String,String)
getPlaceDetails pid name = do
    -- Arrow that downloads and parses the XML for this id.
    -- Note: every runX below runs this arrow again.
    let doc = fromUrl ("http://api.geonames.org/get?geonameId="++pid++"&username=demo")

    -- Each selector yields the list of matching text nodes.
    -- id and name shadow the Prelude function and the CSV name argument.
    id<-runX $ doc >>> css "geonameId" >>> deep getText
    name<-runX $ doc >>> css "name" >>> deep getText
    s<- runX $ doc >>> css "south" >>> deep getText
    w<- runX $ doc >>> css "west" >>> deep getText
    n<- runX $ doc >>> css "north" >>>  deep getText
    e<- runX $ doc >>> css "east" >>> deep getText
    lat<- runX $ doc >>> css "lat" >>> deep getText
    lng<- runX $ doc >>> css "lng" >>> deep getText
    -- One (lang attribute, text) pair per alternateName element.
    translations<- runX $ doc >>> css "alternateName" >>> (getAttrValue "lang" &&& (deep getText))
    terms<- runX $ doc >>> css "alternateNames" >>> deep getText
    -- head and read assume every field is present and well formed.
    return ( read (head id),head name, read (head lat), read (head lng), read (head s), read (head w), read (head n), read (head e), intercalate "|" $ map toLanguageStr translations, head terms )



main :: IO ()
main = do
    csvData <- BL.readFile "input.csv"
    case decode NoHeader csvData of
        Left err -> putStrLn err
        Right v -> V.forM_ v $ \ ( pid, name )->do
            details <- getPlaceDetails pid name
            BL.appendFile "out.csv" $ encode [details]
            BL.putStrLn  (encode [details]) 

For example, the input.csv line

736572,"Mount Athos"

maps to this line in out.csv

736572,"Mount Athos",40.15798,24.33021,40.11294,23.99234,40.4563,24.40044,"KO:아토스 산|:Aftónomos Periochí Agíou Órous|:Ágion Óros|:Ágio Óros|:Athos|NO:Áthos|EN:Autonomous Monastic State of the Holy Mountain|:Avtonómos Periokhí Ayíou Órous|:Áyion Óros|:Dhioíkisis Ayíou Órous|:Hagion Oros|:Holy Athonite Republic|LINK:http://en.wikipedia.org/wiki/Mount_Athos|CA:Mont Athos|FR:Mont Athos|EN:Mount Athos|FR:République monastique du Mont Athos|EL:Αυτόνομη Μοναστική Πολιτεία Αγίου Όρους","Aftonomos Periochi Agiou Orous,Aftónomos Periochí Agíou Órous,Agio Oros,Agion Oros,Athos,Autonome Monastike Politeia Agiou Orous,Autonomous Monastic State of the Holy Mountain,Avtonomos Periokhi Ayiou Orous,Avtonómos Periokhí Ayíou Órous,Ayion Oros,Dhioikisis Ayiou Orous,Dhioíkisis Ayíou Órous,Hagion Oros,Holy Athonite Republic,Mont Athos,Mount Athos,Republique monastique du Mont Athos,République monastique du Mont Athos,atoseu san,Ágio Óros,Ágion Óros,Áthos,Áyion Óros,Αυτόνομη Μοναστική Πολιτεία Αγίου Όρους,아토스 산"
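One caveat about the code above: `head` and `read` both throw an exception when a field is missing or not numeric, which could happen if the service returns a document without the expected elements. A more defensive variant might look like the sketch below. It is not what the program above does; `firstNumber` is a hypothetical helper, and it uses `Double` where the original uses `Float`.

```haskell
import Data.Maybe (listToMaybe)
import Text.Read (readMaybe)

-- First text node of a selector result parsed as a number, or Nothing when
-- the list is empty or the text is not numeric.
firstNumber :: [String] -> Maybe Double
firstNumber texts = listToMaybe texts >>= readMaybe

main :: IO ()
main = do
  print (firstNumber ["40.15798"]) -- Just 40.15798
  print (firstNumber [])           -- Nothing
  print (firstNumber ["n/a"])      -- Nothing
```

The caller can then decide whether to skip the row or write an empty field, instead of the whole run stopping on the first bad id.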
answered 2016-06-16T10:37:38.897