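The code below scrapes the headline stories from ESPN/college-football. It follows each link into the article itself, extracts the content of the p tags, and prints it to the console without problems, but I also want to send that content by email. For some reason the email does not include the content. Here is the code: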

from urllib import urlopen
from BeautifulSoup import BeautifulSoup
import datetime
import smtplib

# Copy all of the content from the provided web page
webpage = urlopen('http://espn.go.com/college-football').read()
soup = BeautifulSoup(webpage)
now = datetime.datetime.now()



# to get the contents of <ul> tags w/ attribute class="headlines":
for i in soup.findAll('ul', {'class': 'headlines'}):
    for tag in i.findAll('li'):
        for a in tag.findAll({'a' : True, 'title' : False}):            
            print a.text
            print a['href']           
            print "\n"

            articlePage = urlopen(a['href']).read() # Grab all of the content from original article

            # Pass the article to the Beautiful Soup Module
            soup1 = BeautifulSoup(articlePage)

            # Tell Beautiful Soup to locate all of the p tags and store them in a list
            paragList = soup1.findAll('p')

            # Print all of the paragraphs to screen
            for z in paragList:
                print z.text

            print "\n" 

            # -*- coding: utf-8 -*-
            from email.header    import Header
            from email.mime.text import MIMEText

            msg = MIMEText(a.text + "\n" + str(a.get('href') + "\n" + z.text), 'plain', 'utf-8')
            msg['Subject'] = Header('ESPN Scrape from: '+ now.strftime("%Y-%m-%d %H:%M"), 'utf-8')
            msg['From'] = 'FROM'
            msg['To'] = 'TO'
            print(msg.as_string())   


            # Credentials (if needed)
            username = 'username'
            password = 'password'

            from smtplib import SMTP_SSL

            # send it via gmail
            s = SMTP_SSL('smtp.gmail.com')
            s.set_debuglevel(1)
            try:
                s.login(username, password)
                s.sendmail(msg['From'], msg['To'], msg.as_string())
            finally:
                s.quit()

1 Answer

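When printing each member of paragList to the console you use a for loop, but when creating msg you only add z.text, without looping over paragList. After the print loop finishes, z still refers to the last element, so only the final paragraph ends up in the message body.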

That said, this may just be a copy-paste error.
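
If it is not a copy-paste error, one possible fix is to join every paragraph into the body before building the message. The sketch below is only an illustration: build_message is a hypothetical helper name, not something from the question, and it is written so it also runs on Python 3, whereas the question's code is Python 2. It takes plain strings, so it can be tried without fetching any pages.

```python
from datetime import datetime
from email.header import Header
from email.mime.text import MIMEText


def build_message(title, href, paragraphs, when):
    """Build a UTF-8 plain-text email that contains every paragraph.

    build_message is a hypothetical helper, not part of the original code.
    """
    # Join all paragraphs instead of relying on the leftover loop variable z.
    body = "\n".join([title, href] + list(paragraphs))
    msg = MIMEText(body, 'plain', 'utf-8')
    msg['Subject'] = Header('ESPN Scrape from: ' + when.strftime("%Y-%m-%d %H:%M"), 'utf-8')
    msg['From'] = 'FROM'  # placeholder address, as in the question
    msg['To'] = 'TO'      # placeholder address, as in the question
    return msg


if __name__ == "__main__":
    # Made-up sample data standing in for a.text, a['href'] and the p tags.
    demo = build_message(
        "Headline",
        "http://example.com/story",
        ["First paragraph.", "Second paragraph."],
        datetime(2013, 1, 25, 22, 23),
    )
    print(demo.get_payload(decode=True).decode("utf-8"))
```

Inside the question's loop, the call would presumably look like build_message(a.text, a['href'], [z.text for z in paragList], now), with the result passed to s.sendmail as before.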

Answered 2013-01-25T22:23:46.370