python - Using urllib2 in Python to scrape an image?

Question

I'm scraping http://apod.nasa.gov/ to get its image of the day. So far I've been able to return what I think is the image's source tag.

#!/usr/bin/env python
from urllib2 import urlopen
from BeautifulSoup import BeautifulSoup

class Apod:
    def apod_wallpaper(self):
        self.soup = BeautifulSoup(urlopen('http://apod.nasa.gov/').read())
        self.pic = self.soup.find('img')
        return self.pic


print Apod().apod_wallpaper()


>>> ./apod.py

>>> <img src="image/1208/Ma2011-2Tezel900.jpg" name="imagename1" alt="See Explanation.
Moving the cursor over the image will bring up an annotated version.
Clicking on the image will bring up the highest resolution version
available." />

I'm not sure how to download the actual .jpg from here.

score 4 · Accepted Answer

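Step 1: Read the HTML file.

Step 2: Extract the src attribute from the image tag you found. Join the domain http://apod.nasa.gov/ with the src value image/1208/Ma2011-2Tezel900.jpg to get the URL for fetching the image.

Step 3: Run urlopen(...).read() on that URL and write the result to a file.

For example: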

from urllib2 import urlopen
data = urlopen('http://apod.nasa.gov/image/1208/Ma2011-2Tezel900.jpg').read()
with open('mypic.jpg', 'wb') as f:  # binary mode; the file is closed on exit
    f.write(data)
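The three steps can also be put together in one function. What follows is only a minimal sketch, and it makes assumptions the original answer does not: it targets Python 3 (where urllib2 became urllib.request and urlparse became urllib.parse), it uses the standard library's html.parser instead of BeautifulSoup, and it assumes, as in the question's output, that the first img tag on the page is the picture of the day. The names FirstImgSrc, fetch_bytes, and save_first_image are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class FirstImgSrc(HTMLParser):
    """Remember the src attribute of the first <img> tag seen (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.src = None

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <img ... />.
        if tag == "img" and self.src is None:
            self.src = dict(attrs).get("src")


def fetch_bytes(url):
    """Return the raw bytes behind url. This performs a real network request."""
    with urlopen(url) as response:
        return response.read()


def save_first_image(page_url, dest, fetch=fetch_bytes):
    """Download the first image on page_url to dest and return the image URL."""
    # Step 1: read the HTML.
    html = fetch(page_url).decode("utf-8", errors="replace")

    # Step 2: extract src and combine it with the page URL.
    parser = FirstImgSrc()
    parser.feed(html)
    if parser.src is None:
        raise ValueError("no <img> tag found on the page")
    image_url = urljoin(page_url, parser.src)

    # Step 3: fetch the image bytes and write them in binary mode.
    with open(dest, "wb") as f:
        f.write(fetch(image_url))
    return image_url
```

Calling save_first_image('http://apod.nasa.gov/', 'mypic.jpg') would hit the network. The fetch parameter is there so the function can be exercised with canned responses instead.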

score 3 · Accepted Answer

You want urlparse.urljoin().

>>> import urlparse
>>> urlparse.urljoin('http://apod.nasa.gov/', 'image/1208/Ma2011-2Tezel900.jpg')
'http://apod.nasa.gov/image/1208/Ma2011-2Tezel900.jpg'
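To show why urljoin can be safer than plain string concatenation, here is a short sketch, assuming Python 3, where the same function lives in urllib.parse. The /apod/astropix.html path and the example.com URL are illustrative inputs, not values taken from the question.

```python
from urllib.parse import urljoin  # Python 3 location of urlparse.urljoin

base = "http://apod.nasa.gov/"
src = "image/1208/Ma2011-2Tezel900.jpg"

# Relative path: resolved against the base, as in the answer above.
joined = urljoin(base, src)

# Root-relative path: replaces the base's path instead of being appended to it.
rooted = urljoin("http://apod.nasa.gov/apod/astropix.html", "/image/pic.jpg")

# Already-absolute URL: returned unchanged, where concatenation would mangle it.
absolute = urljoin(base, "http://example.com/pic.jpg")

print(joined)    # http://apod.nasa.gov/image/1208/Ma2011-2Tezel900.jpg
print(rooted)    # http://apod.nasa.gov/image/pic.jpg
print(absolute)  # http://example.com/pic.jpg
```

Since a src attribute may come in any of these three forms, passing it through urljoin with the page's own URL as the base covers all of them in one call.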
