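Yesterday I tried to write a simple RSS downloader in Haskell with the help of the Network.HTTP and Feed libraries. I want to download the link from each RSS item and name the downloaded file after the item's title.
Here is my short code: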
import Control.Monad
import Control.Applicative
import Network.HTTP
import Text.Feed.Import
import Text.Feed.Query
import Text.Feed.Types
import Data.Maybe
import qualified Data.ByteString as B
import Network.URI (parseURI, uriToString)
getTitleAndUrl :: Item -> (Maybe String, Maybe String)
getTitleAndUrl item = (getItemTitle item, getItemLink item)
downloadUri :: (String,String) -> IO ()
downloadUri (title,link) = do
    file <- get link
    B.writeFile title file
  where
    get url =
      let uri = case parseURI url of
                  Nothing -> error $ "invalid uri" ++ url
                  Just u -> u
      in simpleHTTP (defaultGETRequest_ uri) >>= getResponseBody
getTuples :: IO (Maybe [(Maybe String, Maybe String)])
getTuples = fmap (map getTitleAndUrl) <$> fmap (feedItems) <$> parseFeedString <$> (simpleHTTP (getRequest "http://index.hu/24ora/rss/") >>= getResponseBody)
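I have reached the point where I get a list of tuples, each containing a title and the corresponding link. I also have a downloadUri function that correctly downloads a given link into a file named after the RSS item's title.
I already tried to modify downloadUri to work on (Maybe String,Maybe String) by fmap-ing over get and writeFile, but I failed.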
How can I apply my downloadUri function to the result of the getTuples function? I want to implement the following main function:
main :: IO ()
main = some magic incantation downloadUri more incantation getTuples
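One possible way to wire the two functions together, as a sketch under assumptions: keep only the items where both the title and the link are present, then run the download action over that list with mapM_. The names completePairs and processAll are hypothetical and not part of the Feed library; the action argument stands in for the original downloadUri :: (String,String) -> IO ().

```haskell
-- Hypothetical helper: keep only the entries where both the title and
-- the link are present, dropping the Maybe wrappers. Entries that do not
-- match the pattern are silently skipped by the list comprehension.
completePairs :: [(Maybe String, Maybe String)] -> [(String, String)]
completePairs pairs = [ (title, link) | (Just title, Just link) <- pairs ]

-- Hypothetical driver: run an action on every complete pair.
-- A feed that failed to parse (Nothing) is reported instead of crashing.
processAll :: ((String, String) -> IO ())
           -> Maybe [(Maybe String, Maybe String)]
           -> IO ()
processAll _      Nothing      = putStrLn "feed could not be parsed"
processAll action (Just pairs) = mapM_ action (completePairs pairs)
```

With these in place, main could plausibly be written as main = getTuples >>= processAll downloadUri. The pure filtering step is kept separate from the IO step so it can be checked without touching the network.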
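The character encoding of the result of getItemTitle is broken: it puts code points in place of the accented characters. The feed is UTF-8 encoded, and I thought that all Haskell string manipulation functions default to UTF-8. How can I fix this?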
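A likely explanation, offered as an assumption rather than a confirmed diagnosis: a Haskell String is a list of Unicode code points and carries no encoding of its own, so if the response body is read as a String without decoding, every UTF-8 byte ends up as a separate Char. The sketch below decodes the raw bytes explicitly using the text and bytestring packages; decodeBody is a hypothetical name.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- Decode raw UTF-8 bytes into a String of Unicode code points.
-- Invalid byte sequences are replaced with U+FFFD instead of throwing.
decodeBody :: B.ByteString -> String
decodeBody = T.unpack . decodeUtf8With lenientDecode
```

For instance, the two bytes 0xC3 0xA1 decode to the single character U+00E1 (á), which is what the feed parser should then see in the title.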
Edit:
Thanks for your help, I have successfully implemented my main and helper functions. Here is the code:
downloadUri :: (Maybe String,Maybe String) -> IO ()
downloadUri (Just title,Just link) = do
    item <- get link
    B.writeFile title item
  where
    get url =
      let uri = case parseURI url of
                  Nothing -> error $ "invalid uri" ++ url
                  Just u -> u
      in simpleHTTP (defaultGETRequest_ uri) >>= getResponseBody
downloadUri _ = print "Somewhere something went Nothing"
-- decodeString is assumed to come from Codec.Binary.UTF8.String (utf8-string package)
getTuples :: IO (Maybe [(Maybe String, Maybe String)])
getTuples = fmap (map getTitleAndUrl) <$> fmap (feedItems) <$> parseFeedString <$> decodeString <$> (simpleHTTP (getRequest "http://index.hu/24ora/rss/") >>= getResponseBody)
downloadAllItems :: Maybe [(Maybe String, Maybe String)] -> IO ()
downloadAllItems (Just feedlist) = mapM_ downloadUri $ feedlist
downloadAllItems _ = error "feed does not get parsed"
main = getTuples >>= downloadAllItems
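The character encoding problem has been partially solved: I put decodeString before the feed parsing, so the files are named properly. But if I want to print the titles out, the issue still occurs. Minimal working example: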
main = getTuples
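One thing that may be worth checking, assuming the titles are written out with print: print goes through show, and show on a String escapes every non-ASCII character as a numeric code such as \225, even when the String itself is correct. The sketch below contrasts the two; shownTitle and putTitle are hypothetical names used only for illustration.

```haskell
import System.IO (hSetEncoding, stdout, utf8)

-- print goes through show, which escapes every non-ASCII character,
-- so this is the text that print would write for a title.
shownTitle :: String -> String
shownTitle = show

-- Hypothetical helper: write the title itself, with stdout switched
-- to UTF-8 explicitly so the result does not depend on the locale.
putTitle :: String -> IO ()
putTitle title = do
  hSetEncoding stdout utf8
  putStrLn title
```

If the escaped codes only show up when printing, replacing print with putStrLn on the unwrapped String may be enough; setting the handle encoding is an extra precaution for terminals whose locale is not UTF-8.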